Current Methods in Historical Semantics
Edited by Kathryn Allan and Justyna A. Robinson
Topics in English Linguistics 73
Editors: Elizabeth Closs Traugott, Bernd Kortmann
De Gruyter Mouton
ISBN 978-3-11-025288-0
e-ISBN 978-3-11-025290-3
ISSN 1434-3452

Library of Congress Cataloging-in-Publication Data
Current methods in historical semantics / edited by Kathryn Allan, Justyna A. Robinson.
p. cm. – (Topics in English linguistics; 73)
Includes bibliographical references and index.
ISBN 978-3-11-025288-0 (alk. paper)
1. Semantics, Historical. 2. English language – Semantics, Historical. I. Allan, Kathryn. II. Robinson, Justyna A.
P325.5.H57.C87 2011
417'.7–dc23 2011035604
Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.d-nb.de.

© 2012 Walter de Gruyter GmbH & Co. KG, 10785 Berlin/Boston
Cover image: Brian Stablyk/Photographer’s Choice RF/Getty Images
Typesetting: RoyalStandard, Hong Kong
Printing: Hubert & Co. GmbH & Co. KG, Göttingen
Printed on acid-free paper
Printed in Germany
www.degruyter.com
Preface

This book grew out of a workshop at the 15th International Conference of English Historical Linguistics, held in 2008 in Munich, although it has evolved gradually. We would like to thank Joan Beal, whose initial suggestion prompted our collaboration, and we hugely appreciate the efforts of everyone who took part in the session, both as presenters and as participants in the interesting and thought-provoking discussions between and after papers.

A number of colleagues have supported us in the preparation of this volume. In particular we would like to acknowledge the generous encouragement and advice of Philip Durkin, Dirk Geeraerts and Carole Hough. The anonymous reviewers who commented on papers gave us thoughtful and constructive comments that have been invaluable, and the volume could not have been completed without their collaboration. Above all, we are grateful to the contributors, whose ideas have shaped the volume, to Elizabeth Traugott, Bernd Kortmann and Birgit Sievert, who have been enthusiastic and patient guides throughout the process, and to Wolfgang Konwitschny and all at De Gruyter Mouton for their help.

Finally, we are both fortunate to have understanding family members who have borne with us (and in two cases, who have been born) while we worked together to complete this volume. Thanks for all the time, tolerance and tea.
Table of contents

Preface

Introduction: Exploring the “state of the art” in historical semantics
Kathryn Allan & Justyna A. Robinson

Section 1: Data and sources

Using OED data as evidence
Kathryn Allan

Developing The Historical Thesaurus of the OED
Christian Kay

The NeoCrawler: identifying and retrieving neologisms from the internet and monitoring on-going change
Daphné Kerremans, Susanne Stegmayr & Hans-Jörg Schmid

Commentary
Philip Durkin

Section 2: Corpus-based methods

How anger rose: Hypothesis testing in diachronic semantics
Dirk Geeraerts, Caroline Gevaert & Dirk Speelman

Diachronic collostructional analysis: How to use it and how to deal with confounding factors
Martin Hilpert

Tracing semantic change with Latent Semantic Analysis
Eyal Sagi, Stefan Kaufmann & Brady Clark

Commentary
Stefan Th. Gries

Section 3: Theoretical Approaches

A sociolinguistic perspective on semantic change
Justyna A. Robinson

A pragmatic approach to historical semantics, with special reference to markers of clausal negation in Medieval French
Maj-Britt Mosegaard Hansen

The pervasiveness of contiguity and metonymy in semantic change
Peter Koch

A cognitive approach to the methodology of semantic reconstruction: The case of English chin and knee
Gábor Győri & Irén Hegedűs

Commentary
Terttu Nevalainen

Subject index
Index of word forms and concepts
Introduction: Exploring the “state of the art” in historical semantics
Kathryn Allan and Justyna A. Robinson

This book presents a snapshot of current work from within historical semantics, specifically with the aim of exploring new and established methodologies that are indicative of the “state of the art”. In pulling together papers by scholars from across the discipline, we aim to show how historical semantics is evolving and maturing. Current work in the field uses innovative, data-driven methods, and is informed by a range of different perspectives and theoretical models, and it seems timely to review and showcase these alongside more established analytical techniques. As well as this, the collection of papers in this volume considers the extent to which historical semantics can learn from other disciplines within linguistics.

The title of the book refers to “Historical Semantics”, and we have taken a broad view of what this means: any work that can be called “diachronic” has been taken to be “historical”, whether it deals with very recent time periods or is “historical” in a more traditional sense. The collection therefore includes studies of semantic change across long and early time spans alongside papers on shorter and more recent periods. For example, Allan considers semantic change over several centuries, and Geeraerts et al. consider change during the four centuries of the Middle English period; Győri and Hegedűs work backwards from very early evidence to consider reconstructed forms and meanings. By contrast, Hilpert and notably Kerremans et al. consider change during much shorter and more recent periods of 18 years and 24 months respectively, while Robinson uses the apparent time construct to consider change over a few decades to the present.
The volume is divided into three sections, which focus first on sources of data, then consider methods of analysis, and finally place these within wider theoretical perspectives which look beyond traditional approaches to historical semantics. Each section is followed by a short commentary by a scholar whose core research interest is not historical semantics, but who has a particular interest in the themes of that section and can provide a fresh view informed by his or her own work. In the following pages, after some introductory words about historical background, each of these sections is considered.
The historical context of historical semantics

It is generally recognised that, in some periods of its history, semantics was treated as a “poor relation” of other areas of linguistics; as Lyons comments, “there have been times in the recent past [. . .] when linguistic semantics was very largely neglected” (Lyons 1995: 16). Some scholars even suggested that the study of meaning should not be part of linguistics (Hockett 1954: 152). Views of this kind do not characterise the whole history of linguistic study, though, and interest in semantic change was a key part of the philological tradition that emerged in continental Europe in the nineteenth century, which gradually evolved into modern linguistics. As Geeraerts (2010: 2) points out, historical-philological semantics itself grew out of a very much earlier interest in meaning that can be traced back to classical antiquity. In the past three decades, semantics has become re-established as a valid area for research, partly as a result of the rise of cognitive linguistics, which has revisited some of the earlier questions considered by philologists (see Geeraerts 2010: 277). The recent renaissance of semantics differs from its earlier incarnation in philology, in that the study of semantic change is not at its core, and the vast majority of studies in the past 30 years have been synchronic rather than diachronic.

Historical semantics is associated with particular problems, and this perhaps explains why it has been a relatively minority interest amongst both semanticists and historical linguists. For those interested in synchronic meaning, accessing and analysing data that can be used to trace semantic change can present challenges, particularly for less frequent lexemes (see Davies, in press). Data from earlier periods can be scarce, and what does exist is not always representative of all text types or discourse areas.
Historical linguists are well-accustomed to these kinds of difficulties, but many explore word meaning only where it is incidental to the analysis of other levels of language. Semantic change presents particular difficulties by comparison with other types of change (for example, phonological or morphological change), since it involves a lesser degree of predictability and can often seem less amenable to reconstruction (see Győri and Hegedűs, this volume, for a detailed discussion). In his handbook on etymology, Durkin outlines the challenges semantic change presents to historical linguists:

Semantic changes are notoriously difficult to classify or systematize, and we have no tool comparable to the historical grammar to help us judge what is or is not likely or plausible. Further, although some semantic changes occur in clusters, with a change in one word triggering a change in another, we do not find anything comparable to a regular sound change, affecting all comparable environments within a single historical period. In this respect semantic changes are more similar to sporadic sound changes, but with the major difference that they are much more varied, and show the influence of a much wider set of motivating factors. Additionally, semantic change is much more closely connected with change in the external, non-linguistic world, especially with developments in the spheres of culture and technology. (Durkin 2009: 222–223)
As well as contributing to the particular nature of semantic change, this connection between meaning and extralinguistic change makes the description of lexical meaning difficult. Cruse argues that word meaning must be viewed in its social-cultural context, but comments that “[a] contextual approach to word meaning [. . .] has certain inescapable consequences that some might consider to be disadvantages. One is that any attempt to draw a line between the meaning of a word and ‘encyclopaedic’ facts concerning the extra-linguistic referents of the word can be quite arbitrary” (Cruse 1986: 19). Nevertheless, the issue of regularity in semantic change has received more attention in the past two decades, notably with the work of Traugott and Dasher (2002 and subsequent work). Although they argue that patterns of change can be observed, their findings do not contradict Durkin’s view. At the beginning of their study, they note that such regularities are

prototypical types of change [. . .] They are possible, indeed probable, tendencies, not changes that are replicated across every possible meaning item at a specific point in time in a specific language, such as the Neogrammarians postulated for sound change. (Traugott and Dasher 2002: 1)¹

As this volume shows, a great deal of current work examines these “tendencies”, and tries to work towards a greater understanding of changes in meaning as motivated and explicable phenomena. Lexical meaning may be difficult and unpredictable, but it is not random. In discussing a range of new and established tools and approaches, the papers in the collection are a contribution to a historical semantics that aspires to build on the existing tradition in a methodologically sophisticated way.

1. Traugott and Dasher go on to propose that the extent to which regularity can be observed differs across word classes, suggesting that “irregular meaning changes seem to occur primarily in the nominal domain, which is particularly susceptible to extralinguistic factors such as change in the nature of the social construction of the referent” (Traugott and Dasher 2005: 3–4).
Section 1: Data and sources

The types of sources that are available for research into historical semantics, and the way in which data can be collected and interrogated, have changed dramatically in the recent past. Previously, finding data was time-consuming and labour-intensive; scholars often had to spend long periods collecting examples of the use of a particular lexical item from literary works or other types of source material before they could begin any kind of analysis, and for many periods and text types even locating examples could be difficult. The advent of computers has made a major difference to work in semantics, as in other areas of linguistics, and has made much more thorough examinations of change possible. Large electronic full-text resources and the tools which make these resources searchable offer new possibilities for the collection and exploitation of data: for example, the past three decades have seen the creation of digitised collections such as Early English Books Online and Eighteenth Century Collections Online, along with large corpora such as the Helsinki Corpus. As well as offering excellent sources of primary data for research into semantic change, these have fed into secondary sources which have become invaluable, such as the third edition of the Oxford English Dictionary.

The papers in the first section of this book focus on three data sources, considering the nature of these sources and how best to exploit them. The volume begins with an examination of the Oxford English Dictionary, perhaps the most invaluable tool for work in historical semantics on English: as Kathryn Allan says at the beginning of her article, ‘Using OED data as evidence for researching semantic change’, “few scholars within the discipline would not consult the OED at some stage in any piece of research” (18).
However, like any secondary source the OED cannot be taken at face value, but needs to be used critically, particularly because it is made up of entries from different periods that belong to different editions. Allan’s article considers the relationship between the evidence presented in OED entries and the probable semantic development of three metaphorically polysemous lexemes, milksop, pregnant and dull. In each case, dating evidence for the different senses appears not to be consistent with the semantic development that might be assumed, and which would be suggested by cognitive theories of metaphor. Allan considers the various pieces of evidence that can be used to confirm or challenge a particular semantic history, and argues for the importance of a detailed knowledge of any data source in evaluating the evidence it presents.

The caveats Allan suggests when using OED data are absolutely relevant to the second paper, which considers an exciting new resource for the study of meaning change which is based on the OED. In the paper ‘Developing The Historical Thesaurus of the OED’ Christian Kay gives an account of the creation of the Historical Thesaurus of the Oxford English Dictionary, which was published in 2009 and linked to the OED electronically at the end of 2010. She discusses the nature of the categorisation, and describes the kinds of decisions that had to be taken in handling and classifying the data. HTOED is a reorganisation of the OED into semantic fields, with additional data for Old English added, and as such it is hugely valuable for research that considers the relationships between lexemes and the way in which these might impact upon the semantic development of one another in different periods. For example, the case study of the category Truth considers the lexemes soþ and true and their derivatives, and suggests various reasons for changes in the meanings of both, including the relationship between the two word groups, subsequent borrowing of semantically related lexemes from French, and changing conceptualisations of loyalty. Both OED and HTOED raise questions about the motivation for semantic change, and HTOED provides an interesting perspective by supplying the semantic context of lexemes.

The final paper in the section introduces an innovative new tool for the investigation of semantic change at a much shallower time depth in forensic detail, and this tool exploits the huge amount of data that is offered by the world wide web.
‘The NeoCrawler: Identifying and retrieving neologisms from the internet and monitoring ongoing change’, by Daphné Kerremans, Susanne Stegmayr and Hans-Jörg Schmid, discusses the architecture, theoretical underpinnings and methodological possibilities of the NeoCrawler, a Google-linked web crawler which identifies and retrieves neologisms on the web and then monitors subsequent uses of these neologisms. The great advantage of mining data from the web is its size: even very large corpora contain only a fraction of the data that is found on the web, and it continues to increase in size at a very fast pace. This means that it can be used to retrieve and trace a significant number of tokens for lexemes which are relatively infrequent. The NeoCrawler can only be used to track semantic change from the very recent past onwards, but it gives a unique and interesting view of semantic change in progress that has the potential to inform our ideas about change at a greater time depth, and offers great potential to track change over a longer period into the future. Kerremans, Stegmayr and Schmid’s paper provides an interesting contrast to the others in this section, but one which has important connections; as Durkin comments in the commentary to this section, the NeoCrawler is valuable in giving “a glimpse of the fuller picture that lies behind the level of abstraction and summary that is unavoidable in any dictionary” (102), which also underlies HTOED.

The commentator for this section is the Principal Etymologist to the Oxford English Dictionary, Philip Durkin, who has had links to all three projects. Durkin makes the point that the questions considered in all three papers in the section “have been of central concern in the study of lexis for many decades” (98). He considers the historical and intellectual context of these questions by revisiting the work of Walther von Wartburg, who asked particularly insightful questions about the interactions between the meanings of individual words and semantically related words within the larger structure of the lexicon long before resources like HTOED and the NeoCrawler existed.
Section 2: Corpus-based methods

As the previous section shows, sources of data cannot be separated from the technology that allows interrogation of existing data that was previously difficult to gather and process. In recent years, the technological advances which have made it possible to compile large bodies of data have also enabled the exploitation of this data in sophisticated ways: it has become feasible to consider much larger bodies of data, and to identify patterns of usage and changes in these patterns relatively quickly. Analyses of meaning by means of corpus methods are presented, for example, in Glynn and Fischer (2010) and Glynn and Robinson (forthcoming).

The second section of this volume focuses more explicitly on analytical techniques which use computational tools and statistical methods to inform and test hypotheses about semantic change. As Geeraerts et al. point out, “the advance towards quantitative corpus methods that characterizes current synchronic lexical semantics does not yet pervade the historical study of word meaning” (109); this is an area of linguistics that has yet to benefit fully from the kind of methods that these papers describe, and all three papers demonstrate the possibilities that quantitative methods offer to future research in historical semantics. At the same time, all three also affirm the value of intuition alongside more “objective” methods of data processing. Geeraerts et al. sort their data manually by three semantic variables before performing statistical analyses, and comment that “consistently attributing these features is not an easy matter, and some allowance needs to be made for the effect of subjective interpretations” (117); Hilpert comments that at one stage of his analysis “the processing of mere numbers is complemented by a qualitative approach” (143) and says later that “qualitative interpretation” of the statistical analysis he performs is “necessary” (145); and Sagi et al. conclude their paper by saying that “no such system [of corpus-based computational work] is likely to supplant the researcher’s intuition entirely” (180), while affirming the value of LSA and comparable tools to improve the rigour of research in historical semantics.

The first paper in the section explores the way in which modern statistical techniques can be used to assess earlier claims. ‘How anger rose: Hypothesis testing in diachronic semantics’, by Dirk Geeraerts, Caroline Gevaert and Dirk Speelman, takes a discussion of the changing meaning and use of the lexeme anger in an earlier study by Hans-Jürgen Diller as the starting point for a corpus-based onomasiological study. They monitor and discuss competition between anger, ire and wrath during the Middle English period, and suggest reasons for the increasing dominance of anger by c. 1500. Their investigation begins with a number of exploratory bivariate analyses and then progresses to multivariate analyses that allow them to measure the relative importance of different semantic and lectal factors in the use of lexemes in different periods, and the interaction between these factors.
The statistical tests that they perform allow them to take account of a wider range of texts than Diller included in his study, but also to conduct a finer-grained analysis that considers text type alongside other factors in the use of different terms for ‘anger’; their analysis both supports and qualifies Diller’s ideas (and the ideas of other scholars who have worked on related lexis), and shows elegantly the value of contemporary corpus techniques in tracing and explaining semantic change.

Martin Hilpert’s paper, ‘Diachronic collostructional analysis: How to use it and how to deal with confounding factors’, highlights a second corpus-linguistic technique that has not yet become very established in historical semantics, but one which promises to be enormously helpful in its focus on the connection between lexical meaning and grammatical context. Collostructional analysis allows the identification of collocations that occur in statistically significant frequencies in different periods; Hilpert argues that fluctuations in the relative frequency of particular collocates can be indicative of semantic change, and can provide a starting point for an examination of this change. The paper presents a case study of the keep V-ing construction to exemplify this claim, and focuses on the nature of the verbs found in the construction over an 18-year period. The methodology described in this paper has a number of strengths. Firstly, it can identify key periods in the “life” of a construction which can be masked if the data is partitioned into evenly spaced data ranges: rather than basing his analysis on crudely chosen periods of time (e.g. 50-year “slices”), Hilpert takes a data-driven “bottom-up” approach, in which periods of particularly significant change in collocational patterns are pinpointed and the data is divided into periods on this basis (see also Gries, this volume, for a longer discussion). He is also able to take account of differences in the use of the construction in different genres, since the COCA corpus which provides his data is coded for text type and this information can be fed into the analysis. Secondly, it can draw attention to fine semantic differences in collocational patterns that are not obvious, which might be neglected in other kinds of analysis; as in the previous paper by Geeraerts et al., the statistical techniques used here can both complement and supplement other studies.

The final paper in this section focuses on a technique which has the potential to make the first stage of the analysis of a large corpus more objective and less reliant on time-consuming manual coding. ‘Tracing semantic change with Latent Semantic Analysis’, by Eyal Sagi, Stefan Kaufmann and Brady Clark, is similar to the preceding paper in that it focuses on context as an indicator of meaning; the paper takes as its starting point the premise that “changes in the meaning of a given word will be evident when examining the contexts of its occurrences over time” (171).
Latent Semantic Analysis (LSA) allows the detection of these changes in context, and can thus be used to identify lexemes which have shifted semantically. Because the model that Sagi et al. use can handle large amounts of data, it can be used to track semantic change over long periods. This paper focuses on data from the Middle and Early Modern English sections of the Helsinki Corpus, and tests the hypothesis that lexemes which have undergone broadening and narrowing will appear in a greater and smaller variety of contexts respectively. Case studies of some well-known examples of broadening and narrowing, the lexemes do, dog, hound and deer, show that LSA generates analyses consistent with accounts of the semantic histories of these lexemes; this confirms the predictive value of the method for future work into less-studied cases.

The commentary for this section is written by Stefan Gries, who has been influential in showing how valuable computational methods can be for work in linguistics. He considers future directions of the kind of work discussed in this section, picking up on the methodology used in each of the papers. The first section of the paper deals with the importance and potential of the kind of “bottom-up” approaches to data discussed by Hilpert, particularly focusing on how diachronic data can be clustered differently and how this might affect the results of any analysis. Gries goes on to argue for the value of multifactorial approaches like that of Geeraerts et al., and then finishes by considering the possibilities of greater use of interdisciplinarity and methodological pluralism in historical semantics and other diachronic work.
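The hypothesis that Sagi et al. test, namely that broadening and narrowing show up as a greater or smaller variety of contexts, can be made concrete with a small illustration. The Python sketch below is not the authors' procedure: it replaces an LSA vector space with raw bag-of-words counts, and it assumes that variety of contexts can be summarised as the mean pairwise cosine similarity between the contexts of a word's occurrences (a lower score meaning more varied contexts). The function names, the window size and the two toy "period" samples are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def context_vectors(tokens, target, window=2):
    """Return one bag-of-words Counter per occurrence of `target`,
    built from the `window` tokens on either side of it."""
    vectors = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            vectors.append(Counter(left + right))
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def context_density(tokens, target, window=2):
    """Mean pairwise cosine similarity of the target's contexts.
    Returns None when there are fewer than two occurrences."""
    vectors = context_vectors(tokens, target, window)
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return None
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


# Invented toy samples standing in for two periods (not corpus data).
early = "the dog chased the deer the dog chased the hare the dog chased the fox".split()
late = "a dog slept indoors my dog ate bread the dog chased the deer".split()

if __name__ == "__main__":
    print("early:", context_density(early, "dog"))
    print("late: ", context_density(late, "dog"))
```

On these invented samples the later period yields the lower score, that is, more varied contexts, which is the pattern the hypothesis associates with broadening. A study on real data would need a dimension-reduced vector space and far more occurrences than this sketch uses.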
Section 3: Theoretical Approaches

It has been frequently observed that lexical meaning cannot be studied in isolation, and this is demonstrated clearly by some of the papers in the two previous sections of the volume: for example, Hilpert shows the lack of separation between semantics and grammar, while Kerremans et al. talk about the morphological properties of new coinages alongside their semantic content. The third section of the book focuses more clearly on the overlap between semantics and other sub-disciplines within linguistics, such as pragmatics (in Hansen’s paper). This kind of interdisciplinary approach demonstrates the way in which semantics can be informed by approaches and analytical tools that have not always been associated with the study of lexical meaning. The section also shows that, as a discipline, historical semantics is returning to many early principles and ideas, but using modern techniques to explore these ideas more fully. For example, Robinson’s work, described below, shows a return to Meillet’s claim that the existence of social groups within a community is “le principe essentiel du changement de sens [the essential principle of meaning change]” (1906: 245). Koch reviews and refines early ideas about metonymy, for example those of Jakobson, in light of cognitive theories, concluding that processes of metonymy are perhaps more powerful in accounting for semantic change than is often acknowledged.

The first paper in the section is Justyna Robinson’s examination of ‘A sociolinguistic approach to semantic change’. Robinson explores semantic change from a variationist perspective, and argues for the use of the apparent-time construct in tracing semantic change within living memory. While the study examines the way in which semantic change happens across generations, itself an under-researched process, Robinson also argues that “a thorough exploration of current semantic changes may shed light onto processes affecting meaning changes at larger time depths” (199). While this study, like other papers in the volume, uses dictionary and corpus data to trace the changing meaning of the adjective skinny, it is the only paper in the volume which involves experimentally elicited data as well. The analysis of data is supported by the use of a decision tree analysis which allows Robinson to isolate clearly-defined groups of speakers with the highest or lowest use of each sense (for example, older speakers of a certain socio-economic background). Robinson provides evidence that semantic variation and change are structured socio-demographically and suggests reasons for the rise or decline of senses across groups of speakers.

Maj-Britt Mosegaard Hansen’s paper ‘A pragmatic approach to historical semantics, with special reference to markers of clausal negation in Medieval French’ joins a growing body of work that recognises the lack of a clear boundary between semantics and pragmatics; semantic change frequently “arise[s] out of the pragmatic uses to which speakers or writers and addressees or readers put language” (Traugott & Dasher 2002: xi), so it is essential to examine these pragmatic uses to explain semantic change. Hansen’s paper discusses the relationship between different crosslinguistic tendencies of change (including subjectification, grammaticalization and pragmaticalization), considering examples from a number of European languages. She then focuses especially on the negative markers pas, mie and point in Medieval French; by tracking uses of each across the Medieval French period with particular attention to their contexts of use, she is able to suggest reasons for the functional “division of labor” (242) between them in the 13th and 14th centuries, and for the subsequent rise and eventual dominance of pas in Modern French.
The final two papers in the volume are written very much within the cognitive semantic tradition that has become ‘‘arguably the most popular framework for the study of lexical meaning in contemporary linguistics’’ (Geeraerts 2010: 183). In fact, almost all of the papers in sections 1 and 2 are informed by cognitive approaches to some extent, and several make this explicit: in section 1, Allan’s paper revisits cognitive theories of metaphor, and Kay discusses the way in which the cognitive paradigm has ‘‘retrospectively proved sympathetic (44)’’ to the issues that HTOED editors faced; in section 2, Geeraerts et al. make reference to the connection between their work and cognitive accounts of categorization and embodiment. Peter Koch’s account of ‘The pervasiveness of contiguity and metonymy in semantic change’ explores a phenomenon which has been much-studied
in cognitive linguistics in recent times, but he looks both inside and outside cognitive linguistic accounts to give a broader historical context to his discussion. The paper provides a comprehensive and sophisticated view of the nature of associative relations, and considers the centrality of contiguity to processes of semantic change of different kinds; like Hansen’s paper, it acknowledges the pragmatic dimension of metonymy and other relationships of contiguity, and it draws from both phenomenological philosophy and frame semantics. Koch differentiates and reconciles the mass of terminology that has been used by different scholars, and exploits a wide range of examples of lexical and lexico-grammatical change in English to illustrate the “fundamental nature and [. . .] pragmatic flexibility” (301) of contiguity.

The final paper in the section is ‘A cognitive approach to the methodology of semantic reconstruction: the case of English chin and knee’, by Gábor Győri and Irén Hegedűs. This is informed by both cognitive approaches to meaning and more traditional comparative linguistics, and focuses on the identification and establishment of cognates across Indo-European languages. The paper interrogates the notion of regularity in semantic change, asking more precisely to what extent patterns of change in meaning are comparable to regularity of change in other levels of language; Győri and Hegedűs suggest that the notion of the image schema is useful in establishing convincing semantic correspondences between cognates that can be postulated on the grounds of word form (e.g. by morphological or phonological criteria). Often forms may show phonological correspondences that would suggest they are cognates, but the lack of any obvious semantic connection leads scholars to suggest that they may instead show developments from separate homonymic roots: the terms for ‘chin’ and ‘knee’ provide a good example.
However, evidence from a number of languages builds up a convincing case that these groups can be linked by an image schema, since “the perceptual pattern of ‘curve/bend’” (327) provides the cognitive basis for other body part terms via metaphorical and metonymical projections. The paper shows convincingly that while semantic change may not be regular, in many cases it is generalizable, and careful examination of data makes it possible to build up evidence for generalizations.

The connections between the papers in Section 3 are teased out in detail in the commentary by Terttu Nevalainen, a historical linguist whose main research interest is in historical sociolinguistics. Following on from the paper by Győri and Hegedűs, Nevalainen compares and discusses what each paper has to say about the predictability of semantic change; she
then goes on to consider the way in which they use case studies and the nature of the methodology employed in each study. Nevalainen closes by noting that the conclusions drawn in each paper suggest fruitful areas for future research: “The semantician’s work continues” (341).
Final comment

In their paper in Section 2, Sagi et al. refer to the semantic aspect of the lexicon as “an ever-changing landscape of meaning” (179). This volume represents our attempt to pull together some of the tools and strategies that can help us to navigate, and perhaps even map, this landscape.
References

Cruse, D. Alan 1986 Lexical Semantics. Cambridge: Cambridge University Press.
Davies, Mark In press Synchronic and Diachronic Uses of Corpora. In: Geoff Barnbrook and Vander Viana (eds.), Perspectives on Corpus Linguistics: Connections & Controversies, Philadelphia: John Benjamins.
Durkin, Philip 2009 The Oxford Guide to Etymology. Oxford: Oxford University Press.
Geeraerts, Dirk 2010 Theories of Lexical Semantics. Oxford: Oxford University Press.
Glynn, Dylan and Kerstin Fischer (eds.) 2010 Quantitative Cognitive Semantics. Corpus-Driven Approaches. Berlin: Mouton de Gruyter.
Glynn, Dylan and Justyna Robinson (eds.) Forthcoming Polysemy and Synonymy. Corpus Methods and Applications in Cognitive Semantics. Amsterdam: John Benjamins.
Hockett, Charles F. 1954 Two Models of Grammatical Description. Indianapolis: Bobbs-Merrill.
Lyons, John 1995 Linguistic Semantics: An Introduction. Cambridge: Cambridge University Press.
Meillet, Antoine 1906 Comment les mots changent de sens. Année Sociologique 9: 1–38.
The Oxford English Dictionary 1884–1933 10 vols. Ed. Sir James A. H. Murray, Henry Bradley, Sir William A. Craigie and Charles T. Onions. Supplement, 1972–1986, 4 vols., ed. Robert W. Burchfield; 2nd. edn. 1989, ed. John A. Simpson and Edmund S. C. Weiner; Additions Series, 1993–7, ed. John A. Simpson, Edmund S. C. Weiner, and Michael Proffitt; 3rd. edn. in progress: OED Online, March 2000–, ed. John A. Simpson, www.oed.com.
Traugott, Elizabeth Closs and Richard B. Dasher 2002 Regularity in Semantic Change. (Cambridge Studies in Linguistics 97.) Cambridge: Cambridge University Press.
Section 1: Data and sources
Using OED data as evidence for researching semantic change

Kathryn Allan

Abstract

The OED is integral to research in Historical Linguistics, and has been used for a huge range of purposes in a number of subdisciplines of language study including lexical semantics. Some of the ways in which the OED has been used by scholars in the past century could not have been anticipated by the editors of the first and second editions, and this raises interesting questions about the extent to which OED data alone can underpin and inform current research. Several scholars have pointed out that OED data is not designed to be used uncritically, and needs to be treated with caution because of the nature of the material: for example, Durkin (2002) discusses the difficulties of basing arguments on the dating of quotations, particularly given the high number of ante- and post-datings in the 3rd edition, and Hoffmann (2004) examines the theoretical validity of using OED quotations as a corpus for historical research. In this paper, some of the issues around using OED data specifically as the basis for study into lexical-semantic change are explored. As a starting point, three OED entries will be discussed (milksop, pregnant and dull); each of these appears to show semantic change in a direction that is counter-intuitive and does not follow the concrete > abstract pattern that is generally acknowledged to be most common by cognitive and historical linguists (e.g. see Coulson 2006: 34, Győri and Hegedűs, this volume). It is therefore particularly important to consider how the OED evidence as a whole should be treated, and what stages a researcher should go through in assessing whether dating evidence is likely to reflect the actual diachronic development of a lexeme in individual cases. The paper focuses on the questions raised by using OED entries as the basis for statements about the meanings of individual lexemes and the way in which these change through time.
1. Introduction

The Oxford English Dictionary is perhaps the most trusted and widely used resource for the study of any aspect of the history of the English language, and for the semanticist it offers the raw material to reconstruct the sense development of most of the lexis of English. For much research
in historical semantics, it provides a starting point at the very least, and few scholars within the discipline would not consult the OED at some stage in any piece of research. Yet as an artefact in its own right and the product of its own history, the OED can present challenges of interpretation, and like any other constructed data source it must be interrogated and treated critically. This paper considers some of the questions that are raised by data presented in the OED, and discusses some of the issues that the semanticist faces in using this data to explore the semantic history of an individual word. The main section of the paper (section 4) focuses on three entries that do not evidence the semantic development that might be expected; these are milksop, pregnant and dull. At face value, the current OED entries for these lexemes suggest that their earliest senses in English are abstract senses, or senses from an abstract domain, that might be considered figurative from a modern perspective. The more concrete senses that relate to physical experiences, which are intuitively primary, are not attested until later, and this raises questions about how to treat the available evidence. Sections 2.1 and 2.2 summarise the history of the OED in terms of its content, and go on to consider the suitability of OED as a data source for current work in semantics. These provide some context for section 3, which discusses patterns in the diachronic relationship between “literal” and “figurative” senses and the way in which this relationship is presented in OED, and section 4, which presents the case studies.

2.1. The Oxford English Dictionary: history and content

The diverse ways in which the OED is used by modern scholars could not have been anticipated by the original editors.
In a lecture in 1900, James Murray set out the aims of the dictionary as follows:

[The dictionary] seeks not merely to record every word that has been used in the language for the last 800 years, with its written form and signification, and the pronunciation of the current words, but to furnish a biography of each word, giving as nearly as possible the date of its birth or first known appearance, and, in the case of an obsolete word or sense, of its last appearance, the source from which it was actually derived, the form and sense with which it entered the language or is first found in it, and the successive changes of form and developments of sense which it has since undergone. (Murray 1900: 47)
The core use of the dictionary by modern scholars is still to trace the individual histories of lexemes, considering semantic, syntactic, etymological
and formal changes, but Murray was prescient when he went on to say that “It is never possible to forecast the needs and notions of those who shall come after us” (Murray 1900: 49). The ways the information contained in OED can now be accessed and the specific foci of interest from which researchers approach the data have changed radically, mainly because of technological developments. The electronic editions (i.e. 2nd edition on CD-Rom and 3rd edition online) allow searches by various different criteria, including keywords in definitions, dates of attestation, donor languages and cognates. These options foreground pieces of information that were not designed to be used in this way in the first edition (though this is something that editors of the 3rd edition take into account as they prepare material). Even the quotations that are integrated into the entries have become an important source of data in themselves because of the lack of large historical corpora of English, and several scholars have used them as a corpus (see Israel 1996 and, more recently, Mair 2004 for examples of work of this kind; see Hoffmann 2004 for a more detailed discussion of the issues in this kind of use of the data).

The history of the OED is well documented¹ and will not be discussed in detail here, but it is worth briefly considering how the lengthy production time has affected the content of the entries. The first edition was published in fascicles between 1884 and 1928, first titled the New English Dictionary and later reprinted as the Oxford English Dictionary, and this was followed by supplements in 1933 and in 1972–1986 (in four volumes). A second edition was completed in 1989, and this was published in both print form and (in 1992) on CD-Rom. This edition integrated the supplements and added a further 5000 words and senses, but most of the existing first edition material was not revised.
A full-scale revision for a third edition began in 1990, and revised entries have been published in batches (starting from the letter M, though more recently including ranges across the alphabet) since 2000. With the new revision, editors have been able to integrate new research into entries, for example by adding new entries and new senses in existing entries, completely revising etymologies, including ante- and post-datings for usages and redatings for some quotations, and giving more detailed information on pronunciation in different accents of English.

1. E.g. see Berg (1993) for the history and content of the first and second editions, and OED Online (http://www.oed.com/public/oedhistory) for a more recently updated account that includes the third edition. Béjoint (2010: 96–115) also provides a useful summary.
The result of this long and complex history is that each edition of the OED is unavoidably a patchwork of entries from different periods and by different editorial staff, and this is particularly striking in the evolving online edition, since this includes a combination of 2nd and 3rd edition entries. Some entries early in the alphabet, e.g. those for advertise (v.), astonish (v.) and astray (adv.), have not been revised for over 120 years, whereas others were first written for, or supplemented in, the 2nd edition, and have been revised again very recently for the 3rd edition, e.g. rubbish (v.)². This presents a particular difficulty for scholars using the online edition, which presently offers a mixture of 2nd and 3rd edition entries that changes every quarter.

In some cases, relatively little revision has been made to entries: for example, the 2nd and 3rd edition entries for octagon (n.) are very similar in both the language of the definitions and the structure of the entry (in the 3rd edition an additional, though rare, sense has been added). Although the etymology has been expanded considerably so that more is said about the immediate donor into English and cognates in a wider range of languages and varieties, the essential facts remain the same. This level of similarity seems relatively rare though, and it is much more usual to find more substantial revisions in the content of entries.

The 3rd edition entry for marriage provides a good example, showing changes to nearly all sections. The entry begins with a greater range of spelling variants, followed by an expanded etymology section, which gives more precise and detailed information about the immediate donor and cognates. Some definitions have been reworded: for example, group marriage is no longer defined as ‘the system prevailing amongst some primitive peoples’, as in OED2, but as ‘a system understood to exist in some cultures, religious groups, etc.’, and the label “Anthropol[ogy]” has been removed.
Perhaps most strikingly, though, the overall structure of the entry has been reorganised. In sense 1, the order of subsenses has changed, and in OED3 a further subsense 1d, ‘A particular instance of matrimony between a husband and wife [. . .]’, has been added; this is listed as sense 3a in OED2. Similarly, OED2 sense 7 has become OED3 subsense 5b, part of a restructured sense 5, which has a general heading “fig. and in extended use”; this includes two new subsenses, 5c ‘(An act of) industrial or commercial union; a merger’, and 5d ‘An antique assembled from components differing in provenance, date, etc.’. Finally, many of the quotations have been redated, and earlier and/or later attestations
have been added at some senses. For example, the date of one quotation from Cursor Mundi has been revised from a1300 to a1400 (1325)³, and the first quotation for the sense ‘a dowry’ (sense 6 in OED2 and sense 3 in OED3) is dated c1330 rather than 1391.

2. Revised March 2011.
3. The first date represents the manuscript date, and the date in round brackets is the date of composition.

2.2. The OED and current approaches in historical semantics

The kinds of structural changes that affect revised entries such as marriage seem significant because they alter the picture that is presented of the lexeme’s historical development. The layout of OED entries, which was established in the first edition, sets out a narrative for the semantic history of each individual lexeme, and this implicitly addresses a concern that has been central to past and present work in historical semantics: the question of semantic motivation. Entries are set out in a kind of semantic tree structure, showing the relationship between senses that appear to branch off from one another. This method of presentation aims to make sense of the multiple meanings of lexemes that are highly polysemous, both synchronically and diachronically, and therefore offers some kind of judgement about how and why particular senses have emerged. In a relatively early discussion of types of motivation within what might be considered “modern” linguistics, Ullmann highlights exactly these issues:

A third type of motivation is due to semantic factors [. . .] It is based on the most peculiar trait of changes of meaning [. . .]: the coexistence of old sense and new within the same synchronous system. ‘Foot’ continues to denote the human limb while at the same time applying metaphorically to the lowest part of hills and many other objects. As long as the figurative – metaphorical, metonymic, pars pro toto, etc. – character of such transfers is present to the speakers’ mind, motivation exists. (Ullmann 1959: 89)⁴

4. Ullmann goes on to point out that the motivation for figurative senses can become opaque over time as these senses become more conventional.

Traugott and Dasher point out that this kind of interest has a long heritage, and that questions about semantic motivation were discussed long before the twentieth century. In classical times, Greek and Roman grammarians “argued at length about the arbitrariness or naturalness of form-meaning pairs, homonymy and polysemy”; by the nineteenth century, comparative philologists looked for plausible types of meaning change to support their hypotheses about the formal histories of and connections between Indo-European languages (Traugott and Dasher 2005: 51). In recent times, and particularly with the rise of cognitive linguistics, there has been a revival of interest in the notion of semantic motivation (see, for example, papers in Cuyckens et al. 2003, and Radden and Panther 2004). In particular, many scholars have been concerned with patterns of regularity across semantic histories. For example, Traugott and Dasher’s own research, and their formulation of the Invited Inferencing Theory of Semantic Change, focuses on “predictable paths for semantic change across different conceptual structures and domains of language use” (Traugott and Dasher 2005: 1). They point out that, although some paths of change can be identified, regularities in semantic change are not “absolute”, and some areas of the lexicon appear to be more problematic than others for those examining change and polysemy:

[. . .] irregular meaning changes seem to occur primarily in the nominal domain, which is particularly susceptible to extralinguistic factors such as change in the nature or the social construction of the referent. For example, the referents of towns, armor, rockets, vehicles, pens, communication devices, etc., have changed considerably over time, as have concepts of disease, hence the meanings attached to the words referring to them have changed in ways not subject to linguistic generalization. (Traugott and Dasher 2005: 3–4)
The cases that Traugott and Dasher go on to discuss indicate that the “nominal domain” includes both nouns and adjectives, since it includes both lexemes that name concepts, which tend to be nouns, and lexemes that can be used to describe these concepts. Both types are prone to changes that are highly dependent on extralinguistic factors. The three lexemes that are considered in this paper, milksop, pregnant and dull, can all be taken to belong to this domain.

3.1. Typical patterns of semantic change

Within cognitive semantics, a great deal of work has focused on the relationship between the literal and metaphorical or metonymical senses of polysemous lexemes, and on the underlying cognitive mechanisms that result in mappings across or within conceptual domains (see Geeraerts 2010: 203–222 for a summary of work in this area). Although the connections between literal meanings and figurative meanings of either a metaphorical or a metonymical nature do not seem to exhibit the type of predictability
that Traugott and Dasher (2005) have identified elsewhere, scholars have observed trends of meaning extension that seem to result in metaphorical or metonymical polysemy. A substantial body of research into metaphorically motivated semantic change suggests that this tends to be unidirectional, and that the most usual and likely direction of change is from a physical, concrete sense to a figurative, more abstract one (see Győri and Hegedűs, this volume). Coulson’s description of the way metaphor works is a very typical statement on this kind of shift:

Directionality is thought to reflect the underlying cognitive operations in metaphor, in which an experientially basic source domain is exploited to reason about a more abstract target domain. Indeed, many entrenched metaphors involve the use of a concrete source domain to discuss an abstract target. (Coulson 2006: 34)
As Geeraerts points out, much of the work in this area treats metaphor and metonymy as synchronic phenomena, rather than taking the kind of diachronic perspective that was common in earlier work in historical-philological semantics (Geeraerts 2010: 203). Nevertheless, it is usually assumed that, historically, directionality in mappings will be reflected in the periods in which senses of polysemous lexemes first emerge, so that “experientially basic” concrete senses will predate the “more abstract” senses that might be assumed to be figurative from a synchronic perspective. This pattern has been observed in a large number of cases. Ullmann lists the pattern “from concrete to abstract” as a subtype of one of his four cardinal types of semantic change, stating that “one of the basic tendencies in metaphor is to translate abstract experiences into concrete terms” (Ullmann 1962: 215), and gives a number of examples including define, eliminate, and desire. More recently, Sweetser observes that “certain physical-state and motion verbs are likely sources for vocabulary of certain abstract areas of meaning” (Sweetser 1990: 20).

It seems possible that there may be some cases in which literal and metaphorical senses might emerge simultaneously. Allan (2008: 62–63) suggests that the kind of embodied motivation that has been posited for primary metaphor would support an immediate extension of meaning. On the other hand, it seems logical that metaphorical meanings should not predate “literal” meanings. A word history that appeared to show a development from an earlier metaphorical meaning to a later literal meaning would therefore be treated with caution.
3.2. Patterns of change and dating in the OED

The OED offers an opportunity to test assumptions about the historical relationship between synchronically literal and figurative senses, since it provides dated attestations for all senses within the entry for each lexeme. These attestations were an important and unusual feature of the first edition, which aimed to give a clear picture of the period of currency of each sense from its earliest use onwards. However, although editors worked to find the earliest attestations, they did not claim that they would or could include the very first and (in the case of obsolete words) last examples in print. Murray was very aware of both the limitations of dictionary editors and readers in finding all existing evidence, and problems with the evidence itself⁵, and although the many new and improved resources now available often enable 3rd edition editors to improve the record, the same limitations essentially apply. Even with a large scale, ongoing reading programme in place, it is impossible for editors to be confident that they have checked every extant text published throughout the written record for instances of any lexeme. Many texts are unedited, and so are difficult or impossible to find or access, and even where access to unedited texts is not restricted (for example, because they are privately owned) it might be difficult to trace these texts because of lack of accessible cataloguing. Even if the complete written record became available in error-free searchable electronic format, which remains highly improbable, identifying and checking the relevant material would still be a labour-intensive task. However, apart from this practical difficulty, perhaps a more fundamental problem is presented by the incompleteness of evidence for earlier periods.
Many texts which might offer valuable attestations have not survived, so that there are gaps in the historical record for some lexemes; furthermore, in some periods (such as the early Early Middle English period) relatively few written texts were ever produced, so that it is difficult to compare the evidence from these periods with others which are more fully represented in writing. The combination of these factors means that while attestations indicate currency of any sense of a lexeme at a particular time, lack of attestation cannot be taken to provide any evidence of lack of use.

5. In his 1884 address to the Philological Society, Murray commented that “[. . .] we cannot exhaust the ground, or attain to absolute certainty, except in very exceptional cases [. . .] Earlier instances will, I doubt not, yet be found of three-fourths of all the words recorded [. . .] It must be remembered also that with the majority of words the earliest attainable written instance is after all not the beginning of the history, but only evidence of an indefinitely earlier beginning [. . .]” (Murray 1884: 516–517). See also Mugglestone (2000: 8) for an account of problems with quotation evidence.

Because dating evidence is limited in these ways, OED1 editors did not structure entries in a purely chronological order according to first available attestations, but also took account of intuition to present the senses of each lexeme in the order they logically appeared to have developed. Murray made the point that

Historically . . . a word is often a long series of historically and phonetically connected forms, with a long series of logically and historically related senses. . . The various senses will be arranged in the order in which they seemed to have flowed from the primary sense, now often obsolete. (Murray 1881: 135)
In the third edition, there has been a slight shift away from “logical” ordering of senses, although this is still central to the way entries are structured. Senses that appear to have developed from one another are still presented in branches, but these branches are listed in the order of the earliest date of attestation rather than in the order that they seem most likely to have developed. Equally importantly, the first sense listed in any branch will be the earliest attested sense in that group of meanings. This takes a further step towards a more fully evidence-based presentation, since assumptions about the most likely history of a word are not made unless supported by data, and arguably it places more of a responsibility on the dictionary user to interpret the data. This is the point made by John Simpson in an email discussion with Penny Silva about the restructuring of entries in OED3:

With a mass of new data at their disposal, the editors are following the quotation evidence in ordering the senses, applying the historical method more rigorously than is the case in the first edition of the OED – but in tandem with the logical approach. The imposition of higher level branch-numbering often maintains the existing entry structure, solving the problem of disrupted date order; and ‘when this is not the case, the discipline of maintaining a chronological ordering raises significant issues of semantic development which would otherwise remain unaddressed’ (Simpson 27/12/97). (Silva 2000: 93)
These “issues of semantic development” can be difficult to resolve, particularly because of the problems of dates of attestation discussed above. The remaining part of this article examines three cases of lexemes for which the available dating evidence does not support the semantic history that appears most intuitively likely, and considers how best to assess the evidence in each case.

4.1. From abstract to concrete? The semantic history of milksop

The entry for milksop provides a fairly straightforward example of the mismatch between dating evidence and the most intuitively likely semantic development of a lexeme, and it also illustrates the difference in presentation between OED2 and OED3. The 2nd edition entry is structured in the following way (the date of each illustrative quotation is also supplied):

†1. †a. A piece of bread soaked in milk. Obs. rare. c1420
    †b. fig. in pl. ‘Soft sayings’. Obs. 1577
    †c. milksop dishes, dishes made of ‘milkmeats’. 1628
2.  †a. An infant not advanced beyond a milk diet. Obs. rare. c1460
    b. fig. An effeminate spiritless man or youth; one wanting in courage or manliness. [1246–56], c1386, 1568, a1619, 1749, 1876
The senses of milksop are thus divided into two groups. Branch 1 starts with the primary “literal” sense, which is a physical object, and it also includes a figurative sense (1b ‘soft sayings’) that seems to have developed directly from this sense, and an established phrase that relates to 1a. Branch 2 covers a second literal sense that appears to have developed metonymically from sense 1, i.e. ‘an infant who is associated with this type of food’, and a further figurative sense that has presumably developed metaphorically from 2a. The fact that all three senses in 1 and sense 2a are supported by only one attestation each indicates that the evidence for these senses is limited, since additional quotations would have been included if they existed⁶. The final sense, 2b, is therefore attested much more frequently than any of the others⁷, and the first attestations are the earliest in the entry, but it has been placed last because this appears most logical semantically. This means that the narrative that has been presented shows a sense development from a concrete first sense, ‘a piece of bread’, to a more abstract sense ‘An effeminate spiritless man or youth’.

6. Since the earliest and latest available quotations are given for each sense, a single quotation suggests that no others have been found.
7. The dates in square brackets refer to an attestation of this sense in a surname recorded in a document. This cannot be taken to be the same kind of evidence as a straightforward attestation of the lexeme, but OED2 editors have clearly concluded that the surname is most likely to belong to sense 2b and have recorded it because the document is significantly earlier than the others quoted there. In OED3 the surname evidence is mentioned in the etymology section rather than under any particular sense.

In OED3 the entry is structurally different; the senses have been reorganised to take more account of the dating evidence, and the earliest attested sense is listed first. The earliest dates of attestation are also slightly different. For the early figurative sense (2b in OED2), the quotation containing the surname Milcsop has been removed (and replaced with a fuller note on surname uses in the etymology section) and the first quotation (from Chaucer) has been redated to slightly later. Additionally, the first quotation for the literal ‘piece of bread’ sense has been redated to a1474 rather than c1420. This means that the dates of attestation are still counter-intuitive, and a comment in the etymology section (in square brackets) draws attention to this:

[ [. . .] The figurative use (sense 1a) is attested earlier than the literal use (sense 2a) [. . .] ]
1. a. A feeble, timid, or ineffectual person, esp. a man or boy who is indecisive, effeminate, or lacking in courage. c1390>
   b. An infant still on a milk diet. Obs. rare. a1500
2. a. A piece of bread soaked in milk. rare. a1475>
In this case, evaluating the evidence to decide on the most likely semantic development of the lexeme seems relatively straightforward. Milksop is a native English word that is formed from milk + sop, and the literal sense appears to be a transparent compound where the first base modifies the second: a milksop is a sop soaked in milk. This literal sense of the compound is not attested frequently until the 18th century onwards (in new evidence added to the OED3 entry⁸), but the OED2 entry for sop, i.e. the unmodified base, shows that it is much more frequent with a literal sense. Sop n1 meaning ‘soaked piece of bread’ is attested frequently from a1100 onwards, and this is very much earlier than any figurative sense; the first quotation under a figurative sense which may be relevant, sense 2b ‘Used of persons in respect of some pervading quality or property’, is dated c1480, and sense 2c ‘A dull or foolish fellow; a milksop’ is attested even later, in a1625. In OED Online, an even later first date of 1593 is given for sense 2b.⁹ If sop n1 was in common currency from the 12th century onwards, as the illustrative quotations suggest, a modified form of the lexeme with a related “literal” sense such as milksop would have been easily interpretable even if it were unfamiliar or relatively infrequent. This suggests that the evidence for this sense of milksop is incomplete, either because no attestations survive or because they have not been available to editors, and this is particularly plausible given the early period involved. There is relatively little written material for the 14th and 15th centuries compared to later centuries, so it is unsurprising that one sense of the lexeme might not occur frequently in surviving records. This is especially likely because milksop belongs largely to one particular discourse area, culinary texts, which are relatively scarce in this period compared to other discourse types, such as religious texts. In fact, the earliest attestation for the literal sense of milksop is from one of the major collections of recipes of this time. Furthermore, the surname attestation for the figurative sense is earlier than the quotations for any sense, and this suggests that the lexeme was already established by the 13th century, supporting the hypothesis that there are gaps in the historical record for this lexeme. In the light of all of this evidence, it seems highly likely that the literal sense precedes the figurative senses historically, and the dates of attestation are simply the result of a lack of supporting data.

8. I have also checked for earlier evidence of the literal sense in Early English Books Online, which has added new texts since work on OED3 began, but none exists.
One further possibility might be that milk and sop were combined to form a compound with the figurative meaning that is recorded earliest, but this is also problematic, since the earliest date for a related figurative sense of sop (discussed above) is at least a century later than the figurative sense of milksop; again, this assumes that earlier evidence is missing from the record. If the figurative meaning of sop were already established but is not recorded, though, a new coining of milksop would have been interpretable even if no literal use of the compound was already available to speakers.¹⁰ If this is the case, the literal sense might be regarded as the semantic equivalent of a back-formation:¹¹ it would seem logical to speakers familiar with both elements of the compound, who were aware of the polysemous nature of sop, that the compound had a primary literal meaning, so this meaning could be “reconstructed” by analogy with sop. In a sense this would support the hypothesis that literal meaning precedes figurative meaning in this case, even if the compound did not have an earlier literal meaning. It seems impossible to go beyond conjecture in considering this possibility, though, and it reinforces the possibility that evidence which would complete the historical picture is not available.

9. In OED Online, the dates of attestations have sometimes changed for 2nd edition entries; when a text is redated during the revision process, the date of that text changes across all entries in both 2nd and 3rd edition entries. This affects the date here: the manuscript of the text from which the attestation has been taken, Henryson’s Testament of Cresseid, has been redated to 1593, with a composition date of 1505. This makes the earliest manuscript date for this figurative sense 1575 (in another redated text).
10. With thanks to an anonymous reviewer for this suggestion.
11. The term “semantic back(-)formation” is used by a number of scholars, e.g. Buck (1917: 175) and more recently Geeraerts (2003: 460) and Queller (2003: 237).

4.2. The semantic history of a borrowing: pregnant

A more complex case is presented by pregnant (adj. and n.). This is a highly polysemous lexeme, but again there is a mismatch between the most intuitively and cognitively likely order of development of the senses and the dates of attestation in English. An abridged version of the OED2 entry is given below:

I.
1. That has conceived in the womb; with child or with young; gravid. Const. with, of (the offspring), by (the male parent). 1545, 1656, 1665–6, 1667, 1774, a1827, 1844, 1899
   b. fig. (or in figurative context). c1630, 1641, 1764, 1873
II. In various mental or non-physical uses.
3. a. Of a person or his mind: Teeming with ideas, fertile, imaginative, inventive, resourceful, ready. Const. of, in, or to with inf. arch. or Obs. 1413, 1432–50, 1513, a1591, 1624, 1632, 1711, 1853
4. Of words, symbolic acts, etc.: Full of meaning, highly significant; containing a hidden sense, implying more than is obvious, suggestive; also, full of, replete with (something significant). c1450, c1480, a1626, 1659, a1661, 1838–9, 1860, 1879

The first sense that is listed is the main physical sense in current use, ‘with child’, and this is clearly the concrete sense from which other senses could be assumed to develop. However, sense 3a, which denotes a human characteristic, is attested over a century earlier in 1413, and sense 4, ‘of words [. . .] etc.’ is attested slightly later in c1450. In the etymology note (in smaller type after the etymology) specific attention is drawn to this by the comment that “It is remarkable that this should appear so much earlier than the literal sense”. The revised 3rd edition entry therefore restructures the order of the senses:

I. Of the mind, language, behaviour, etc.
1. a. Full of meaning, highly significant; suggestive, implying more than is obvious or stated. 1402>
2. a. Of a person or the mind: full of ideas; imaginative, inventive; resourceful; (of wit) quick, sharp. Now rare. ?a1475>
II. Of the body or physical phenomena.
3. a. Of a woman or other female mammal: having offspring developing in the uterus. Also of the womb (obs.). Freq. with with (the offspring), by (the male parent). ?a1425>

In this revised entry, all three senses have different first dates of attestation, and the two branches of meaning have been reordered. A redefined version of OED2 sense 4 (‘Of words [. . .]’ etc.) is now the first sense listed, but because of a new quotation the time difference between the earliest attestations for this sense and for the concrete physical sense (now 3a) has become much smaller and is only around 25 years. This seems a much less significant gap; both quotations are from the 15th century, and it is not unlikely that the written record for the lexeme is incomplete given the early period involved. Nevertheless, the dating evidence still does not support the semantic history that one would expect from this particular lexeme. In this case, information that is given in the rest of the OED3 entry provides some clues about how to treat the evidence.
According to this entry, pregnant is first attested in the late Middle English period, and it is ultimately from Classical Latin praegnant-, praegnāns; its borrowing into English appears to be influenced by both Middle French pregnant and the classical Latin etymon, since both of these would be known to those using English during the period. The earliest sense in both French and Latin appears to be the physical one, and the earliest figurative sense found in English (OED3 1a) is not attested in either language until later than English, although other figurative senses (‘imaginative, inventive’) are recorded in post-classical Latin in the 6th century. There are also cognates in other languages including Italian, although the entry editor notes that the physical sense is recorded significantly earlier than any figurative senses in these languages and sense 1a is not attested until later than in English.

It seems fairly certain that, etymologically, the lexeme pregnant has undergone the kind of shift from a concrete physical sense to an abstract metaphorical sense that is suggested by studies in cognitive linguistics. The evidence in Latin and cognate languages suggests strongly that the physical sense is the earliest and that figurative senses develop significantly later. However, there are several possibilities about the semantic development of the lexeme after it has entered English. Firstly, it may be that the literal and figurative senses were borrowed at roughly the same time, and it is chance that the figurative sense is attested slightly earlier in English. Although sense 1a is not found in Latin (and only later in French than in English), evidence from Latin in this period is patchy, and it is plausible that the historical record for this sense is incomplete and this figurative sense existed and was available alongside the literal sense; the meanings ‘inventive’ and ‘compelling, cogent’ are found much earlier in medieval Latin. Dictionary evidence for early use in French is more secure, but it is still possible that an earlier figurative sense existed but is not recorded. Secondly, if the figurative sense did exist in Latin or French it is also possible that it was borrowed earliest, and the dates of attestation reflect the actual semantic history in English. This is perhaps supported by the fact that there is a lengthy gap between the first two attestations for the physical sense, which might indicate that it was relatively rare in English until later; by contrast, three 15th century attestations are given for sense 1a, indicating that it may have been in more widespread use.
The fact that there were other lexemes in English with the same meaning, such as with child,¹² may have meant that there was no need for pregnant to be borrowed with this sense until much later.¹³ Finally, a third possibility is that incomplete records in English have distorted the historical picture, and the physical sense was borrowed earlier even though attestations have not been found by OED3 editors. This seems less likely because there is no shortage of surviving medical texts from the period, but it is still possible. Related lexemes are not helpful here; pregnancy (n.) and pregnate (adj. and v.) are not attested until later than pregnant (adj.), and pregnation (n.) ‘pregnancy’ occurs earliest in the same text as the ‘with child’ sense of pregnant, dated ?a1425. Further investigation might provide better evidence for one of these three hypotheses, but it does not seem possible to prove any one definitively, and it provides a clear case of the limitations of research into semantic change at this kind of time depth.

12. A full list of synonyms and their dates of attestations is available in section 01.02.03.03.16.06 (adj.) Pregnant in the Historical Thesaurus of the Oxford English Dictionary (Kay et al. 2009).
13. If this is the case, the increase in borrowing from classical languages in the 16th century could partly explain why this sense became more established later.

4.3. Evidence from cognates and related forms: the semantic history of dull

The final case that will be examined here is a lexeme that has not yet been revised for OED3, dull. According to Collins COBUILD English Dictionary (CCEB), which lists senses in frequency order, the most frequent Present-Day English sense of dull is ‘boring’. It also has several senses designating qualities in the external, physical world, including those which CCEB defines as ‘not bright’ (sense 3, in full ‘A dull colour or light is not bright’), and as ‘not sharp’ (sense 7, ‘If a knife or blade is dull, it is not sharp; a fairly old-fashioned use’). An informal survey among colleagues suggests that, intuitively, most people assume that one of these senses is the primary sense, and that the senses that can be used to describe people, ‘boring’ and ‘lacking intelligence’, are metaphorical extensions of one of these. Again, this intuition is not supported by OED2 dating evidence. In this case, the structure of the entry is consistent with this evidence so that the more concrete senses are listed after the mental sense:

1. Not quick in intelligence or mental perception; slow of understanding; not sharp of wit; obtuse, stupid, inapprehensive. In early use, sometimes: Wanting wit, fatuous, foolish. [c940, c975], a1000, a1250>
2. a. Wanting sensibility or keenness of perception in the bodily senses and feelings; insensible, obtuse, senseless, inanimate. In dialect use, esp. Hard of hearing, deaf. c1340>
   b. Of pain or other sensation: Not keen or intense; slightly or indistinctly felt. 1725>
3. a. Slow in motion or action; not brisk; inert, sluggish, inactive; heavy, drowsy. 1393>
   b. Of trade: Sluggish, stagnant; the opposite of brisk. Hence transf. of goods or merchandise: Not much in demand, not easily saleable. 1705>
4. Of persons, or their mood: Having the natural vivacity or cheerfulness blunted; having the spirits somewhat depressed; listless; in a state approaching gloom, melancholy, or sadness: the opposite of lively or cheerful. c1393>
5. Causing depression or ennui; tedious, uninteresting, uneventful; the reverse of exhilarating or enlivening. 1590>
6. Not sharp or keen; blunt (in lit. sense). [c1400], c1440>
7. a. Of or in reference to physical qualities, as colour or luminosity, sound, taste: Not clear, bright, vivid, or intense; obscure, dim; indistinct, muffled; flat, insipid.
   b. Of the weather: Not clear or bright; cheerless, gloomy, overcast. (Here there is app. some mixture of sense 5.) c1430>
As this shows, the first attestations for a mental sense of dull (sense 1) are significantly earlier than the first attestations for either current sense designating qualities in the external, physical world (6 and 7a), which are both in the 1400s. There is also a third physical sense listed at 3a, ‘Slow in motion or action; not brisk; inert, sluggish, inactive; heavy, drowsy’, but this is not attested much earlier, with a first quotation date of 1393.

The evidence for the first attestations of dull is complex, and any judgement about which sense might be considered earliest requires careful attention to etymology. The first two quotations for the earliest attested sense (1), ‘Not quick in intelligence. . .’ are in square brackets, as is the first quotation for sense 6, ‘not sharp or keen’. The brackets are used to indicate that these are attestations for a closely related lexeme, dol, rather than for dull itself; in fact, the a1000 quotation shows the same form, and should also be in square brackets. The OED entry includes citations of dol because it is so closely related to dull, and it provides further evidence about its most likely semantic development. In Old English, there are several attestations for the lexeme dol, meaning ‘foolish’, but this does not survive into Middle English. Sound change evidence means that dull cannot be a reflex of dol, since the vowel in dol would remain /o/; dull is probably more closely related to Middle English dil, dille, dylle, ultimately reflecting a morphological variant from the same Germanic base as dol. All of these forms are developed from the same Germanic base (with different suffixation), as is supported by cognate forms in other Germanic languages. dull is therefore related to dol as a kind of “cousin”, though dol is not attested after the OE period.

If the quotations attesting dol are disregarded, the earliest attestation for the form dull is therefore the a1250 quotation; this brings the first dates of attestation for the mental senses and the physical senses of the lexeme significantly closer, but still leaves a gap of over a century. Viewed as a separate but closely related lexeme, though, dol provides helpful evidence, since the surviving evidence suggests that it undergoes the same semantic shift from an earlier mental sense to later physical, external senses; if dull had an earlier physical sense that is simply unattested, it seems likely that the same would be true of dol, since the two forms share an etymon. However, citations in the Dictionary of Old English (henceforth DOE; Healey 2007) do not support the possibility of an earlier non-mental sense. The main adjectival sense listed for dol is ‘foolish, stupid, unwise (mainly of people)’; this is followed by three subsenses (A1-3) and then by nominal uses which all denote unwise, foolish or rash people. Senses A1 and A2 both show straightforward narrowing: A1 is defined as ‘of those whom drink has made foolish’, and A2 ‘of those who unwisely oppose God’. Sense A3 does appear to show a physical meaning relating to lack of sharpness, and is simply defined as ‘of a sword’, but the attestation is problematic. It appears in a riddle, so might represent an unusual or intentionally untypical use, and a note supplied by the DOE editors marks the use as “perh[aps] to be emended”. A second related form found in Middle English, dill, is mentioned in the etymology section of the entry and suggested as a possible early form of dull, and this is only recorded with the meaning ‘sluggish, slow, stupid, dull’ in quotations dating from c1200 to c1440. The fact that neither dol nor dill shows evidence of early physical senses makes it more plausible that the semantic history proposed for dull is not simply the result of a gap in the records, although it is also possible that in all three cases evidence has been lost.
A comparison between the meaning of this set of forms in English and cognates in other languages provides a further clue to the credibility of the existing semantic picture. OED2 notes cognates in three Germanic languages, Old Saxon, Dutch and Old High German (with a reflex in Modern German). An examination of sources relating to each language confirms that no physical meanings are recorded for any language. All three cognates are attested with the meaning ‘stupid, foolish’, probably reflecting a verbal base meaning ‘to be foolish’.¹⁴

14. I am grateful to Philip Durkin for his advice about the relationship between these cognates.

All of this evidence indicates that the dates of attestation of different senses of dull do seem consistent with its semantic development; in this case, it seems highly likely that there has not been the typical shift from a concrete (physical) to an abstract (mental) sense that would be predicted by cognitive models of semantic change. This leaves a question about how to account for the change, and finding a definitive answer (if one is possible) would require further investigation. One line of enquiry might be to examine the meanings of other semantically related lexemes in the period in which dull changes, to see if these might influence the development of new senses of dull. As shown above in the OED entry, dull has several senses that could be considered “literal”, perhaps most notably ‘not sharp or keen; blunt (in lit. sense)’. In its mental sense, one antonym of dull is sharp, and the antonymous sense of sharp is attested earliest in the Old English period, when sharp ‘physically pointed’ develops the metaphorical meaning ‘intelligent, shrewd’. It is possible that because one sense of sharp is antonymous to dull, dull develops a further sense on the model of sharp which is antonymous to the other physical sense, i.e. by a kind of semantic “proportional analogy”. However, a first glance at the evidence shows that the same kind of relationship between the ‘not bright’ sense of dull and the lexeme bright itself is not convincing, since the ‘intelligent’ sense of bright is attested later than the relevant sense of dull. Further research into the full range of lexemes in each semantic field is needed before any conclusion can be drawn, but at least this tentative suggestion shows that some account of the motivation for the semantic history of dull might be found.
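The comparison of first attestation dates on which this case study relies can be illustrated with a short script. The Python sketch below is offered only as an illustration and is not part of the OED's practice or of the method described here. The function names (nominal_year, order_by_first_attestation, gap_in_years) are hypothetical, the dates are those quoted above for OED2 dull with the bracketed dol quotations (and the a1000 quotation) set aside, and the handling of date prefixes is a simplifying assumption: c, a and ?a are simply ignored, so that c1393 and 1393 count as the same nominal year.

```python
import re

# First attestation dates for OED2 "dull" as quoted in section 4.3.
# Assumption: the bracketed dol quotations and the a1000 quotation are set
# aside, so a1250 stands as the first date for sense 1.
FIRST_ATTESTATIONS = {
    "1": "a1250",   # mental: slow of understanding
    "2a": "c1340",  # wanting keenness in the bodily senses
    "3a": "1393",   # slow in motion or action
    "4": "c1393",   # of persons or their mood: listless
    "5": "1590",    # tedious, uninteresting
    "6": "c1440",   # not sharp; blunt
}


def nominal_year(date_label):
    """Return the first year found in an OED-style date label.

    Prefixes such as c (circa), a (ante) and ? are ignored, which is a
    deliberate simplification: the result is a nominal year, not a precise one.
    """
    match = re.search(r"\d{3,4}", date_label)
    if match is None:
        raise ValueError(f"no year found in {date_label!r}")
    return int(match.group())


def order_by_first_attestation(senses):
    """Return (sense, date label) pairs sorted by nominal year."""
    return sorted(senses.items(), key=lambda item: nominal_year(item[1]))


def gap_in_years(earlier_label, later_label):
    """Return the difference in nominal years between two date labels."""
    return nominal_year(later_label) - nominal_year(earlier_label)


if __name__ == "__main__":
    for sense, label in order_by_first_attestation(FIRST_ATTESTATIONS):
        print(sense, label)
    print(gap_in_years(FIRST_ATTESTATIONS["1"], FIRST_ATTESTATIONS["3a"]))
```

On these figures the earliest physical sense (3a) follows the mental sense (1) by 143 nominal years, which is in line with the gap of over a century noted above. Because circa and ante dates are approximate, any such figure is indicative only, and the ordering it produces is a starting point for the kind of evaluation carried out in this section rather than a substitute for it.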
5. Conclusion

The semantic histories of milksop, pregnant and dull show that, while the principle of semantic shift from more concrete, experientially basic “external” meaning to more abstract figurative meaning is supported by a large number of examples, there are also cases where it is not proven. It is only possible to make a judgement about the plausibility of a shift in the opposite direction by careful attention to the nature of the dating evidence for each lexeme. In the cases of milksop and pregnant, it seems probable that there is a discrepancy between the available evidence presented in the OED and the actual semantic history of each lexeme, although in both cases, this is not certain; even further research might not prove a discrepancy. milksop appears to present a relatively straightforward case, where the “figurative” sense of milksop is not the earliest despite the dating evidence in OED, though the meaning of the composite parts also needs to be taken into account. The evidence for pregnant is trickier, and highlights the difficulty of considering the diachronic relationship between polysemous senses of a borrowed lexeme. While a sense may clearly have emerged through metaphor in the source language(s), it is difficult to establish which senses have been borrowed into the borrowing language, since the evidence in either or both the source language(s) and the borrowing language may not be complete. The revised OED entry for pregnant reflects these difficulties, and therefore its structure may not show the diachronic shift of meaning that has occurred. On the other hand, the evidence for dull suggests strongly that the structure presented in the OED is consistent with the actual semantic development of the lexeme; the typical pattern of more concrete, experientially basic meaning > more abstract figurative meaning, suggested as a principle of metaphorical meaning change, has not taken place. Other similar examples that have been observed (e.g. see Shindo 2009: 176) show that this ‘basic > figurative’ pattern of semantic change cannot be assumed in every case, and an alternative account for the motivation of this sense development is needed.

Each of these case studies therefore highlights particular issues that arise when examining the semantic histories of individual lexemes and the relationship between theories of semantic change and recorded evidence. In each instance, the OED offers an account of the history of the lexeme based on available attestations, but this is not intended to be an argument for the credibility of a particular sense development. In OED3, the structure of entries consistently represents the order in which senses are attested, but cannot be taken to represent the actual chronological sequence of sense development. OED2 takes a mixture of approaches, sometimes prioritising attestations but sometimes suggesting the most “logical” sense development even where this is not evidenced.
This means that, across both editions, the way in which the entry is presented places a responsibility on the reader. It is important to look beyond the historical record for the individual lexeme in various ways.¹⁵ Firstly, related lexemes of various kinds must be taken into account. Derivationally related lexemes might shed light on the most likely sense development of a lexeme, as in the case of compounds like milksop. Etymologically related lexemes must also be considered: where a lexeme is borrowed, like pregnant, the senses and sense development of etymons might shed light on the evidence for the English borrowing; where a lexeme is native or of uncertain origin, like dull, cognates in other languages might offer clues to its early meaning, and related lexemes within English like dol might also add pieces to the historical picture. The nature of evidence in different periods must also be treated carefully. In early periods (for English and other languages), evidence is not unlikely to be incomplete, and a lack of attestations cannot be taken as evidence of the non-existence of a lexeme or a particular sense of a lexeme. The discourse areas in which lexemes are likely to be used are also relevant, since not all text types are represented well in all periods. For example, a large number of religious texts exist for early periods of English, so that lexemes with religious senses are more likely to be attested in these periods, whereas culinary terms are rarer in the surviving historical record, as the evidence for milksop suggests.

15. The notion of “ecological motivation” that has been discussed within cognitive linguistics is relevant here. It picks up earlier ideas about the systematic nature of language, and the relationship between elements: Radden and Panther explain that “The ecology of a linguistic unit is to be understood in the sense that it has ‘pointers’ to other units and, to the extent that the unit is related to other units in the language, it is motivated. . . [This] applies to changes affecting the semantic system [as well as other levels of language]” (Radden and Panther 2004b: 24–25).

For the modern historical semanticist, the evolving OED still offers unparalleled access to a large amount of information about word histories, and alongside other data sources it presents an opportunity to interrogate current theories about semantic motivation and patterns of change. A closer look at individual word histories clearly demonstrates how important it is to pay close and critical attention to the chronology of semantic change presented in OED entries, and to view this chronology as a starting point for further research.
References

Allan, Kathryn 2008 Metaphor and Metonymy: A Diachronic Approach. (Publications of the Philological Society 42). Chichester: Wiley-Blackwell.
Béjoint, Henri 2010 The Lexicography of English. Oxford: Oxford University Press.
Berg, Donna L. 1993 A Guide to the Oxford English Dictionary. Oxford: Oxford University Press.
Buck, Charles D. 1917 Studies in Greek noun-formation: Dental terminations I. 2. Classical Philology, Vol. 12 (2): 173–189.
Coulson, Seana 2006 Metaphor and conceptual blending. In: Keith Brown (ed.), Encyclopedia of Language and Linguistics, 2nd ed., Vol. 8, 32–39. Oxford: Elsevier.
Cuyckens, Hubert, Thomas Berg, René Dirven and Klaus-Uwe Panther (eds.) 2003 Motivation in Language: Studies in Honor of Günter Radden. Amsterdam: John Benjamins.
Durkin, Philip 2002 Changing documentation in the third edition of the Oxford English Dictionary: sixteenth-century vocabulary as a test case. In: Teresa Fanego, Belén Méndez-Naya and Elena Seoane (eds.), Sounds, Words and Change: Selected Papers from 11 ICEHL, Santiago de Compostela, 7–11 September 2000, 65–81. Amsterdam: John Benjamins.
Geeraerts, Dirk 2003 The interaction of metaphor and metonymy in composite expressions. In: René Dirven and Ralf Pörings (eds.), Metaphor and Metonymy in Comparison and Contrast, 435–466. Berlin: Mouton de Gruyter.
Geeraerts, Dirk 2010 Theories of Lexical Semantics. Oxford: Oxford University Press.
Healey, Antonette diPaolo 2007 The Dictionary of Old English: A–G Online. University of Toronto. http://www.doe.utoronto.ca/
Hoffmann, Sebastian 2004 Using the OED quotations database as a corpus – a linguistic appraisal. ICAME Journal 28: 17–30.
Kay, Christian J., Jane Roberts, Michael Samuels and Irené Wotherspoon (eds.) 2009 The Historical Thesaurus of the Oxford English Dictionary. Oxford: Oxford University Press.
Mugglestone, Lynda 2000 ‘Pioneers in the Untrodden Forest’: The New English Dictionary. In: Lynda Mugglestone (ed.), Lexicography and the OED: Pioneers in the Untrodden Forest, 1–21. Oxford: Oxford University Press.
Murray, James A.H. 1881 Ninth annual address of the President to the Philological Society. Transactions of the Philological Society 18 (1): 117–176.
Murray, James A.H. 1884 Thirteenth address of the President to the Philological Society. Transactions of the Philological Society 19 (1): 501–527.
Murray, James A.H. 1900 The Evolution of English Lexicography (Romanes Lecture). Oxford: Clarendon.
The Oxford English Dictionary 1884–1933 10 vols. Ed. Sir James A. H. Murray, Henry Bradley, Sir William A. Craigie and Charles T. Onions. Supplement, 1972–1986, 4 vols., ed. Robert W. Burchfield; 2nd. edn. 1989, ed. John A. Simpson and Edmund S. C. Weiner; Additions Series, 1993–7, ed. John A. Simpson, Edmund S. C. Weiner, and Michael Proffitt; 3rd. edn. in progress: OED Online, March 2000–, ed. John A. Simpson, www.oed.com.
Queller, Kurt 2003 Metonymic sense shift: its origins in abductive construal of usage in context. In: Hubert Cuyckens, René Dirven and John Taylor (eds.), Cognitive Approaches to Lexical Semantics, 211–242. Berlin: Mouton de Gruyter.
Radden, Günter and Klaus-Uwe Panther 2004a Introduction: Reflections on motivation. In: Günter Radden and Klaus-Uwe Panther (eds.), 2004b, 1–46.
Radden, Günter and Klaus-Uwe Panther 2004b Studies in Linguistic Motivation. Berlin: Mouton de Gruyter.
Shindo, Mika 2009 Semantic Extension, Subjectification and Verbalization. Lanham, Maryland: University Press of America.
Silva, Penny 2000 Time and meaning: Sense and definition in the OED. In: Lynda Mugglestone (ed.), Lexicography and the OED: Pioneers in the Untrodden Forest, 77–95. Oxford: Oxford University Press.
Sinclair, John (ed.) 1995 Collins COBUILD English Dictionary. London: HarperCollins.
Sweetser, Eve E. 1990 From Etymology to Pragmatics. Cambridge: Cambridge University Press.
Traugott, Elizabeth C. and Richard B. Dasher 2005 Regularity in Semantic Change. Cambridge: Cambridge University Press.
Ullmann, Stephen 1959 The Principles of Semantics. (Glasgow University Publications LXXXIV.) Oxford: Basil Blackwell.
Ullmann, Stephen 1962 Semantics: An Introduction to the Science of Meaning. Oxford: Basil Blackwell.
Developing The Historical Thesaurus of the OED

Christian J. Kay

Abstract

The Historical Thesaurus of the OED (HTOED) consists of the contents of the second edition of the Oxford English Dictionary (OED), supplemented by Old English vocabulary not included in the OED, all arranged in hierarchically structured conceptual fields. These fields contain lists of synonyms with their dates of use under brief explanatory headings. HTOED was completed in 2008 and published in book form by Oxford University Press in October 2009 (Kay et al. 2009). Since 2010 it has been linked to the online OED and thus revised in conjunction with OED3. The original database is used in Glasgow for research and development purposes. One long-term ambition is to develop the Thesaurus as part of a suite of tools for tackling two key problems in computational lexicology, multiple meaning and variable spelling. Both of these have particular implications for people working with historical data. The paper demonstrates the principles of classification used in the Thesaurus. An example of what can be learned from words displayed in this way is given from the abstract field of meaning, Truth and Error. Although the project was started in 1965, long before the cognitive semantics paradigm became dominant, that paradigm has retrospectively proved sympathetic to the problems involved in categorizing large quantities of lexical data.
1. Introduction
The Historical Thesaurus of English project began in 1965, when Michael Samuels announced in an address to the Philological Society that the English Language Department at the University of Glasgow would undertake the task of turning the headwords in the Oxford English Dictionary (OED) into a conceptual thesaurus as a collective research project (Samuels 1965). At the time it was thought that, with help from postgraduate students and interested scholars, the project would take about fifteen years. In fact, for a variety of reasons, ranging from the expansion of source materials in the OED to funding problems, it has taken 44 years for the work to reach the printed page (Kay et al. 2009).1 In the interim, as evidence of good intentions, the team has produced A Thesaurus of Old English (TOE; Roberts and Kay 2000), an online Historical Thesaurus of English, and associated online teaching packages for Old and modern English (see www.glasgow.ac.uk/historicalthesaurus/).2
Samuels’ motivation in starting the project was to supply a perceived gap in the materials available for studying the history of the English language, and especially the reasons for vocabulary change. For individual words, English has a unique resource in the OED, but, he argued, seeing those words in the context of others of similar meaning would offer a different and illuminating perspective on the development of the lexicon. In 1972, he wrote:
[. . .] no solution to the problem of push- and drag-chains in lexis will be forthcoming until it is possible to study simultaneously all the forms involved in a complex series of semantic shifts and replacements. The required data exist in multivolume historical dictionaries like the OED, but they cannot be utilised because the presentation is alphabetical, not notional. The need is for a historical thesaurus which will bring together under single heads all the words, current or obsolete (and all the obsolete meanings of words still current) that have ever been used to express single and related notions. (1972: 180)
In HTOED, “single notions” are expressed as synonym groups, while “related notions” can encompass as much of the lexical field within which the particular group of lexical items is embedded as the researcher wishes to pursue. The extent and relevance of such semantic contextualization is particularly apparent when browsing the compact hierarchical presentation of the wordlists on the printed page; users of the version linked to OED3 can view the skeleton classification alongside the words they are examining.
1. The project can be viewed online on the Glasgow English Language and Enroller sites and linked to the OED (see end for urls). We are grateful to the bodies which have provided funding and support over the years, principally the Arts and Humanities Research Council, the British Academy, the Carnegie Trust for the Universities of Scotland, the Leverhulme Trust, the University of Glasgow, and King’s College London.
2. The electronic Historical Thesaurus of English was supported by Arts and Humanities Research Council ICT Strategy Project Grant 112456. British Academy Large Grant 37362 supported the online TOE; the teaching packages, Learning and Teaching with the Thesaurus of Old English, and Word Webs: Exploring English Vocabulary, were funded by the English Subject Centre.
Figure 1. Adjectives meaning “white”. © University of Glasgow 2009
Figure 1 shows the beginning of the adjective section in the category “white”. Within each category and subcategory of meaning, groups of approximately synonymous words are arranged in chronological order, starting with the earliest to be recorded in English. It is thus possible to see how expressions for a particular concept have developed through the history of the language, and how they relate to concepts of equivalent, more general, or more specific meaning. It would, for example, be possible to move on from here to compare the objects used as typical exemplars of other colours, or to establish whether, as here, there had been ultimately unsuccessful attempts to introduce synonyms from other languages. From a cultural point of view, one might speculate about the objects themselves. Was ivory unknown to the Anglo-Saxons, or was it just not used as a type of whiteness? Looking up ivory as a material in HTOED category 03.10.12.04.14 reveals that the substance was known to the Anglo-Saxons as elpendban, ‘elephant bone’.
HTOED is organized at the highest level in a series of broad conceptual fields (see section 3 below). Groups of words are organized vertically by hyponymous relationships such as “type of” or “part of”. Individual
meanings are organized horizontally by synonymy, including antonymy. Although the project was begun before the cognitive semantics paradigm became influential, that paradigm has retrospectively proved sympathetic to the problems involved in categorizing large quantities of lexical data. Within the HTOED classification, synonym groups are determined on the basis of prototypical instances, with a clear core of obvious members shading off into the less obvious and then into cognate categories, i.e. the “related notions” referred to above. Just as you can have a less birdy bird, so you can have a less synonymous synonym. Such categorial fluidity is especially important in historical semantics, where precise information about meaning may be hard to find, and may also point to actual or potential meaning change (see Section 4 for comments on sooth). Ultimately, of course, a decision has to be made about where to place each individual meaning.
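The chronological arrangement of a synonym group described above can be sketched in a few lines of Python. This is only an illustration of the ordering principle, not the project's own software: the `Sense` record, its field names, and the placeholder words and dates are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sense:
    """One meaning in a synonym group (hypothetical record layout)."""
    word: str
    first_date: int            # year of first recorded use
    last_date: Optional[int]   # None if the sense is still current


def chronological(senses: List[Sense]) -> List[Sense]:
    """Order a synonym group by first recorded date, earliest first."""
    return sorted(senses, key=lambda s: s.first_date)


# Placeholder words and invented dates; not HTOED data.
group = [
    Sense("word_c", 1600, None),
    Sense("word_a", 900, 1400),
    Sense("word_b", 1250, None),
]

ordered = [s.word for s in chronological(group)]
still_current = [s.word for s in chronological(group) if s.last_date is None]
```

Sorting on the first date gives the display order, while filtering on the last date separates obsolete senses from current ones, which is the kind of comparison the chronological layout is meant to support.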
2. The data
Data were collected initially from the first edition of the OED (Murray et al. 1884–1933). Each member of the team took a volume of the dictionary and went through it systematically, transcribing the required information onto paper slips. This information included the word itself, its part of speech, its definition, its dates of use as recorded by the OED, and any restrictive labels such as “dialectal” or “poetic” which might assist classification. The final step was to add a number or numbers from the 990 categories of the 1962 edition of Roget’s Thesaurus of English Words and Phrases (Dutch 1962). While there was never any intention to produce a historical version of Roget, there were advantages in using his work as a preliminary filing system which would allow the words to be retrieved in conceptually organized groups when we began to work on our own classification.
Compiling slips was by no means a mechanical process. For polysemous words, we generally followed the OED sense divisions, but in some cases we felt that these divisions were too broad, and that other senses could be extracted, or, more usually, too narrow, and that senses could be amalgamated. The latter situation arose, for example, when the OED divided up verb meanings by essentially grammatical criteria, such as the type of object following a transitive verb. In other cases, such as participial adjectives linked to a verb, or adverbial senses derived from an adjective, the OED often presents the quotations as a single block without specifying a
particular sense of the parent form. In order to include these in a thesaurus category, it was necessary to analyze the citations and supply the links.3
We also made one substantial addition to the data by including meanings from A Thesaurus of Old English (Roberts and Kay 2000) which did not survive into the Middle English period and are thus omitted as a matter of policy from the OED. HTOED thus covers the entire period of English for which evidence survives.
Adding the Roget numbers could present problems, thus identifying issues of classification even at this early stage. Most obviously these problems occurred where there was no single location for what seemed like a clear-cut category, such as Kinship or the Body and its parts, or where meanings were classified by abstract properties of the referent rather than their domain of use. Thus a category such as Roget 209 Height contains not only general terms for the concept, but subcategories for High land, Small hill, High structure (ranging from lampposts to skyscrapers) and Tall creature, including giraffe, elephant, and (presumably human) lampposts. Such a system was sometimes difficult to apply (valley, for example, was in 209 Height, but also in 255 Concavity) and often seemed counterintuitive.
Slip-making continued throughout the 1960s and 1970s. By 1980, work on OED1 was almost complete. However, in 1972 Oxford University Press began to publish Supplements to the OED (Burchfield 1972–1986), followed by the revised OED2 (Simpson and Weiner 1989). After much discussion, the decision was taken to continue excerpting revisions and new material from these volumes, thus improving the project but also adding to its duration. Later on, a further decision was taken to draw the line after the Additions Series (Simpson, Weiner and Proffitt 1993–1997) rather than attempt to cope with the ongoing revision of the online OED3 (Simpson 2000–), which targets words of particular interest as well as alphabetic ranges.
As a result of these decisions, HTOED adheres to the linguistic principle of allowing verification of source data, but the material in the printed volume and the OED3 version will diverge as the latter develops, for example when new words are added or dates are revised (see Allan, this volume, for a discussion of changes in OED3 material).
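The information recorded on each paper slip, and the use of Roget numbers as a preliminary filing system, can be pictured as a small data structure. The sketch below is a hedged illustration only: the `Slip` class, its field names and the `file_by_roget` helper are assumptions made here, and the definitions and dates are deliberately left as placeholders. The Roget numbers for valley (209 Height and 255 Concavity) and giraffe (209 Height) are the ones mentioned in this section.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Slip:
    """One meaning transcribed from the dictionary (hypothetical layout)."""
    word: str
    part_of_speech: str
    definition: str
    dates: str                                   # dates of use as recorded by the OED
    labels: List[str] = field(default_factory=list)         # e.g. "dialectal", "poetic"
    roget_numbers: List[int] = field(default_factory=list)  # preliminary filing numbers


def file_by_roget(slips: List[Slip]) -> Dict[int, List[str]]:
    """Group words under each Roget number so they can be retrieved together."""
    index: Dict[int, List[str]] = {}
    for slip in slips:
        for number in slip.roget_numbers:
            index.setdefault(number, []).append(slip.word)
    return index


slips = [
    Slip("valley", "n", "(definition omitted)", "(dates omitted)", roget_numbers=[209, 255]),
    Slip("giraffe", "n", "(definition omitted)", "(dates omitted)", roget_numbers=[209]),
]
index = file_by_roget(slips)
```

Because a slip may carry more than one number, the same meaning can surface in several provisional groups, which mirrors the difficulty with valley noted above.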
3. Details and examples of the issues encountered in adapting the dictionary materials for thesaurus purposes are given in Kay and Wotherspoon (2002).
3. The classification
By the end of the project, the HTOED archive held around 800,000 slips, each containing one meaning extracted from the OED or TOE. It should be stressed that these are different meanings, not different words: for polysemous words, each sense was included in the appropriate category. Simultaneously with compiling the slips, members of the team turned their attention to how the material was to be classified. There was no precedent for classifying such a large body of diachronic data, and therefore no off-the-shelf package which could be applied to it. Instead it was felt that, as far as possible, the semantic categories should be developed from analysis of the data. Some initial structure had been imposed by using Roget’s 990 heads, but increasingly this system was felt to be inadequate, both for the reasons described in Section 2 above, and because it lacked the detail necessary to organize such a large body of data in a relatively transparent way.
There was a consensus of opinion that the largest unit of organization should be the conceptual field, i.e. the domain of experience where the word was likely to be used.4 Since dictionary definitions are themselves a form of categorization, assigning things to classes or referring them to words of like or opposite meaning, and since the 1960s and 1970s were the age of feature analysis in semantics, Michael Samuels and I undertook the task of identifying the semantic components in OED definitions of key words, and sorting these into larger classes (Kay and Samuels 1975). This exercise led to three major divisions at the first level and 26 categories at the second, expressed eventually in the database and the printed HTOED by hierarchical numbering, as shown in Figure 2. This system was subsequently adapted for the classification of TOE, which was intended as a pilot study for the work as a whole, although it subsequently proved to be a useful tool in its own right.
Because of the relatively small amount of surviving Old English vocabulary (around 50,000 meanings), and the different nature of Anglo-Saxon society, not all the level 2 categories were required; for example, the split shown between 01.07 The Supernatural and 03.07 Faith made no sense for Old English in either linguistic or cultural terms.
4. The role of semantic fields in HTOED is discussed in Kay (2011).
Figure 2. First and second level categories
3.1. Level 1
At the top level, HTOED is presented in three major divisions, deriving from the main areas of human experience, at least as they are represented in the lexicon of English speakers: I the External (or physical) world, II the Mental world, and III the Social world. Of these, Section I is by far the largest, containing as it does the vocabulary used to describe the physical universe, the creatures living in it, and the operations of human beings upon it. For a diachronic thesaurus, this seemed to be the obvious starting point for the work as a whole, since one could hypothesize that the earliest conversations (long before the advent of the English language) are likely to have been about individual needs and the most readily observable phenomena of the environment.
Deciding where to start a conceptually organized thesaurus can be a problem, since there is nothing in semantic structure comparable to the fixed alphabetical order available to dictionaries. For Roget, the answer was to start with abstractions, notions such as Identity, Quantity, Order, etc., which inform later sections. In the case of HTOED, the historical nature of the data supported the placing of such attempts at analyzing and interpreting the world at the end of Section I, after the vocabulary of more concrete concepts. Although we have not attempted this yet, it would be possible to use the
database to calculate the relative “linguistic age” of categories at various levels, and thus justify this hypothesis (or not).
Section II, the Mental world, contains categories such as Perception, Emotion and Will, and follows on logically from Section I, since much of its core lexis derives metaphorically from the vocabulary of the material universe. Allan (2008), for instance, offers extended examples of links between a concrete category such as Density and an abstract one such as Stupidity. Using such insights, a project is being planned at Glasgow to build up a map of the development of metaphor across the history of the English language.
When it comes to classification, there are no right answers, since semantic categories are inherently fuzzy; there are only better answers (which can be an advantage as, by the same token, there are no wrong answers). Section II contains one of the categories that was most difficult to place: Having / possession. H-J. Diller (2008: 125) queries why it is here rather than in III Society, where many of the material instruments and outcomes of possession, such as trade and commerce, occur in 03.10 Occupation / work. Our decision was influenced by a comment under have in OED2, describing have, alongside be and do, as “the most generalized representatives of the verbal classes”, predicating, in its weakened senses, “merely a static relation between the subject and object”. The presence of this relationship suggested a mental process. We therefore decided that this more abstract notion of possession should be separated from the huge body of material in Section III. Somewhat similar issues were raised by the split between Language in Section II as an intellectual activity, and Communication in Section III, or between Sincerity, classified in II as a subsection of Truth, and Trustworthiness, classified in III as a type of social relation within the broader concept of Morality.
Although it contains fewer meanings than Section I, Section III has the largest number of categories, reflecting the huge changes in society in the approximately 1300-year period covered by HTOED. The level and rate of change is reflected in the expanding vocabulary denoting families, government, law, manufacture, trade, communications, and so on. Categories such as Leisure and entertainment, which are very small in the early period, have grown almost beyond recognition as a result of changing life-styles. The same, of course, can be said of major scientific categories in Section I, such as Chemistry or Medicine, representing advances in knowledge.
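The “linguistic age” calculation raised earlier in this section is left undefined in the text. One possible reading, offered here purely as an assumption, is to summarize each category by the earliest and the median first-recorded date of its meanings. The sketch below follows that reading; the section keys stand for the three top-level divisions, but every date is invented for illustration and says nothing about what the HTOED data actually show.

```python
from statistics import median
from typing import Dict, List


def linguistic_age(first_dates: List[int]) -> Dict[str, float]:
    """Summarize a category by its earliest and median first-recorded dates.

    This measure is an assumption; the chapter does not define "linguistic age".
    """
    return {"earliest": min(first_dates), "median": median(first_dates)}


# Invented first-attestation years for three hypothetical categories.
records = {
    "01": [700, 800, 1200, 1500],
    "02": [900, 1300, 1600],
    "03": [1000, 1400, 1700, 1900],
}

ages = {section: linguistic_age(dates) for section, dates in records.items()}

# Rank categories from "oldest" to "youngest" by median first date.
ranked = sorted(ages, key=lambda section: ages[section]["median"])
```

A median is less sensitive than a minimum to a single very early attestation, which may matter for categories that contain a few Old English survivals among much later vocabulary; other summaries would be equally defensible.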
3.2. Level 3 and beyond
Overall, HTOED allows twelve places in the classification, seven categories and five subcategories. These are further subdivided by part of speech, following a fixed order. Initially, we experimented with identifying a dominant part of speech for each category. Thornton (1988), for example, identified the adjective as the basic grammatical category in her thesis on Good and Evil; it would also dominate in the Colour section mentioned below. Earlier, Chase (1983 and 1988) had gone further, allowing semantic considerations to override grammatical categories at all times. In many if not most categories, however, the noun was dominant, and in others no particularly dominant part of speech could be identified. We therefore felt that the structure offered by a fixed order was likely to be most useful to the reader.5
A typical array of categories is illustrated in Figure 3 below, showing the progression from Level 1 The world to the place of the colour adjective white (see Figure 1) in the overall hierarchy. Not every link in the chain can be shown, so not all numbers are consecutive. Subcategories, which can be added at any level, are preceded by a slash in the example.
Figure 3. The place of white in HTOED
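The hierarchical numbering can be made concrete with a short parsing sketch. It assumes, as the examples in this chapter suggest, that category levels are separated by full stops and that subcategory levels follow a slash; the function names and the error handling are inventions for illustration rather than part of the HTOED database. The example number, 01.04.09.07.01.02 / 07, is one that appears in this chapter.

```python
from typing import List, Tuple

MAX_CATEGORY_LEVELS = 7     # seven category places
MAX_SUBCATEGORY_LEVELS = 5  # five subcategory places


def parse_category(code: str) -> Tuple[List[str], List[str]]:
    """Split a number such as '01.04.09.07.01.02 / 07' into its two parts."""
    main, _, sub = code.partition("/")
    categories = main.strip().split(".")
    subcategories = sub.strip().split(".") if sub.strip() else []
    if len(categories) > MAX_CATEGORY_LEVELS:
        raise ValueError("more than seven category levels")
    if len(subcategories) > MAX_SUBCATEGORY_LEVELS:
        raise ValueError("more than five subcategory levels")
    return categories, subcategories


def ancestors(code: str) -> List[str]:
    """List the higher-level category numbers above the given one."""
    categories, _ = parse_category(code)
    return [".".join(categories[:i]) for i in range(1, len(categories))]


cats, subs = parse_category("01.04.09.07.01.02 / 07")
depth = len(cats) + len(subs)
chain = ancestors("01.04.09.07.01.02 / 07")
```

Each prefix in `chain` names a more general category, so walking the list from left to right retraces the progression from the top-level division down to the category in question.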
5. Semantic categories also override grammatical categories in TOE. In the printed version there are no grammatical labels since the editors thought that the part of speech was obvious from the form of the heading. However, in a user survey conducted prior to producing the online TOE, the addition of part of speech labels was one of the most commonly mentioned desiderata, so they were duly added.
Thereafter, the classification proceeds to 01.04.09.07.02 Black, following the same order; as far as possible, the structure of parallel categories follows a common order so as to facilitate comparison within HTOED.
A considerable degree of detail is provided by the category headings, which both indicate the place of a concept within the overall structure and, if the user supplies the gaps, can be read backwards from the lowest to the highest number to form a quasi-definition. If we start with the subcategory 01.04.09.07.01.02 / 07 Bleached, we read that it indicates a specific way of making white, which is a type of the general adjective white, which is a named colour, which is a colour, which is a property of matter, which occurs in the world – which may be excessive granularity for this particular example, but illustrates the general point. Formulating headings which worked in this way was one of the skills developed by HTOED editors. If we read in the other direction, we find subcategories of greater specificity describing types of bleaching, ending with the negative, “not bleached”. Antonymic categories with few exponents are often placed at the ends of sections in this way: an antonym is, after all, a synonym with a single differentiating component of meaning.
Also worth noting is the fact that Colour is a level 3 category, although in strict taxonomic terms it should have been a level 4 category subordinate to 01.04.08 Light. Since the object of HTOED was to produce data for linguistic research rather than to explore taxonomy per se, heavily lexicalized categories were sometimes “promoted” in this way on the grounds that the degree of lexicalization reflected the importance of the concept to speakers of the language (as well as, in this case, to linguists). On a purely practical level, for each step down the taxonomic scale that a category starts, it loses a potential level of delicacy of classification at the lower end.
However, we are now working on a more strictly hierarchical version of the classification in order to facilitate comparison across levels.
3.3. Methodology
Most of the classification was done initially at level 3, which contains 354 categories, as shown in Kay et al. (2009: xxix–xxx) and on the wallchart accompanying HTOED. Below these are a further 236,400 categories and subcategories, covering level 4 to level 12 as required. Level 3 categories produced a manageable number of slips for an individual lexicographer to tackle, usually a few thousand, and are conceptually coherent. In semantic terms, level 3 is the prototype level, the level at which categories are most salient to users of the language. Thus we talk about sleeping or
seeing rather than physical sensibility or tunnel vision, and about thought rather than mental capacity. As Diller, who has made extensive use of HTOED data in his research, puts it when discussing the growth of generic terms for emotions (2002: 110): “[. . .] talk about emotions in general presupposes talk about specific emotions. This is the historical reflex of a phenomenon well-known from the work of Rosch (e.g. 1978), who has shown that we begin categorization at an intermediate ‘basic’ level from which we proceed to the superordinate level by abstraction and to the subordinate level by differentiation”. In HTOED, differentiation is achieved by the categories at levels 4–7 and the five mobile subcategories, so that a level 7 category with five subcategories achieves twelve levels of conceptual delicacy.
Within the general taxonomic framework described above, classifiers were given a free hand to determine the most appropriate classification for their particular set of data, thus fulfilling the intention of allowing the structure to emerge from the meanings rather than being imposed upon them. Although no attempt was made to perform feature analysis on the entire corpus, the regularities in the OED definitions were used for guidance when sorting the slips. The basic instruction given to classifiers was simply to “sort, sort, and sort again” until an acceptable structure emerged; in other words, the classification was “bottom up” from the data rather than imposed “top down”. A balance had to be maintained between semantic clarity and economy of presentation. Thus, in Figure 1, concepts such as “white as snow” and “white as milk” were considered sufficiently salient to merit a subcategory, but subcategory 04.09 (not displayed in Figure 1) is headed “as other typical things” and contains a rather mixed collection of rarely-occurring items which are found in this particular context but are not synonymous in the usual sense of the term.
Our overall aim was to produce a folk taxonomy, informed by what Hallig and von Wartburg describe as “naïve realism”, setting forth “the intelligent average individual’s view of the world, based on pre-scientific general concepts made available by language” (Ullmann 1962: 255). However, since in some sections, such as Animals and Plants,6 we found that an established scientific taxonomy was the best way of dealing with some of the data, we ended up with what we describe as a “modified folk taxonomy”, where the naïve view may be combined with an expert one as appropriate.
6. In the case of Plants, both types of taxonomy were attempted. See the study reported in O’Hare (2004).
Editing a thesaurus is an endlessly circular process, since the whole cannot be considered complete until every available meaning is slotted into place. In addition to the initial classification, which was often undertaken by postgraduate students and assistants, each section was reviewed at least twice by an experienced editor. At the same time slips forwarded from other categories and new slips from the OED had to be added in a recursive process. The latter task could be tricky when it involved matching an original OED1 slip to an updated slip from a subsequent edition, which sometimes required an understanding of one’s colleagues’ thought processes as well as of the structure of a category. All this became much easier, of course, when, from the early 1980s, we began to store the data electronically.
Classifying was thus a painstaking business and one which involved a lot of sorting and resorting of piles of slips. Despite some attempts to use computers in the classifying process, we discovered early on that it takes a human brain to do this kind of work effectively. Anyone who looks for a single identifiable set of principles of classification in operation throughout HTOED will look in vain. Equally, there is no attempt to make a sharp distinction between encyclopaedic and linguistic knowledge. Rather, each piece of classification will be affected by the perceptions and knowledge of those who worked on it, more remotely by the OED editors who selected the citations and formulated the definitions, and, most importantly, by the material itself. Categories dealing with concrete objects, such as items of furniture or musical instruments, lend themselves to detailed classification by features such as “type of” or “part of”, whereas more abstract sections such as Thought rarely require the full 12-place taxonomy (see further Kay and Wotherspoon 2005).
4. The example of Truth and Error
The HTOED category of Truth contains some 6000 records, of which about 25% are headings and the remainder lexical entries. It occurs in Section II as a level 4 category subordinate to level 3 Knowledge 02.01.12. The main noun headings for about half the section are given in Figure 4 below. Indentations indicate a lower category level.
Truth is quite a complex concept. The OED divides the lexeme truth into three main branches. Branch I, sense 1, is defined as “The character of being, or disposition to be, true to a person, principle, cause, etc.; faithfulness, fidelity, loyalty, constancy, steadfast allegiance. Now rare or arch”,
Figure 4. The position of Truth and Error in the hierarchy
which is classified in HTOED Section III (see 3.1 above). Sense 4, “Disposition to speak or act truly or without deceit; truthfulness, veracity, sincerity; formerly sometimes in wider sense: Honesty, uprightness, righteousness, virtue, integrity”, approaches the territory covered by 02.01.12.08 above, which begins with OED branch II, sense 5, “Conformity with fact; agreement with reality; accuracy, correctness, verity (of statement or thought)”, followed by three other senses and then branch III, “Something that is true”. Senses for the adjective true follow the same pattern.
From this starting point, the HTOED section is organized round a cline or chain of meaning, proceeding through various degrees of truth, inaccuracy and semi-truth, such as exaggeration, until it reaches deliberate untruth and finally deception (which is one of the largest and most entertaining sections in HTOED). This structure reflects the fact that the category involves different kinds of opposition. Non-conformity with truth can arise either through ignorance or misunderstanding, resulting in error, or through falsehood, with intent to deceive. The question of which of these takes priority is discussed with reference to the word lie in Lakoff (1987: 71–74), using research by Coleman and Kay (1981: 43), who found that “[. . .] falsity of belief is the most important element of the prototype of lie, intended deception the next most important element, and factual falsity is the least important”. In the same experiment, however, informants consistently defined a lie as a false statement, despite the fact that this
condition was ranked as the least important of the three. Lakoff explains this by supporting an Idealized Cognitive Model in which falsity is the defining characteristic of lying, entailing both lack of belief on the part of the speaker and intention to deceive. By different routes and for different purposes, the HTOED classifier has reached a somewhat similar conclusion, as did Kay and Samuels in the original analysis, where true is defined as same as world (1975: 54).
Figure 5. 02.01.12.08 truth and true. © University of Glasgow 2009
Figure 5 shows the noun and adjective synonym groups occurring under the first heading 02.01.12.08, shown in Figure 4. Ways of expressing that something is true have varied considerably over the years. In Old English, the main word for conformity to fact was soþ, which survived
into the 17th century as sooth and marginally thereafter in archaic use and in compounds. The word true, which in Old English meant “loyal, faithful, trustworthy”, started taking over the “conformity to fact” meaning in early Middle English and has retained it despite the subsequent introduction or attempted introduction of various words from French and Latin, such as verity and veracity. Why sooth should have been replaced in this way by true, a word that was already quite hard-worked, is an interesting question. One reason, drawing on cultural factors, might be the later association of sooth with witchcraft and magic. The word soothsayer, for example, is recorded in the OED as meaning “one who speaks the truth; a truthful or veracious person” in the mid-fourteenth century and in 1642, but by 1381 has also taken on its modern meaning of “one who claims or pretends to the power of foretelling future events”, often with implications of fraudulence, in conflict with the older meaning. Another reason for the shift might be that true in the sense of “loyal” was marginalized to some extent by the introduction of French words for that concept such as loyal and faithful, along with changing ideas about loyalty in the post-Conquest period.
It is questions such as these, of both linguistic and sociolinguistic interest, which HTOED is intended to raise by displaying words in their semantic contexts; as a data source, it therefore offers new possibilities for the study of historical semantics.
5. Present and future
For many years, we have been releasing sections of data to scholars who wished to make use of our material in articles, theses, etc. Many of these are included in the bibliography of works associated with the project (Kay et al. 2009: xiii–xx, xxxiii and on the Glasgow website; see also Kay 2009b). HTOED is offered to the academic community as an exploratory tool rather than a set of solutions, one which will pose questions, some of which we ourselves may not have thought of raising, and at least suggest some possible answers. At Glasgow, in addition to the projects on metaphor, linguistic age, and taxonomy mentioned above, we plan to develop the potential of our web version through the Enroller project,7 which is creating an integrated online repository aiming to bring together electronic resources for the study of language and literature. As part of
7. Funded by JISC (Joint Information Systems Committee).
Enroller, we are looking at ways of dealing with spelling variation in nonstandard texts, both synchronic and diachronic. When variants are lemmatized under HTOED headwords, it will be possible to link occurrences in texts to groups of synonyms, thus facilitating automatic text-processing if probabilistic models of co-occurrence can be applied to the disambiguation of polysemous words. However, a good deal of work remains to be done before anything like this can be attempted (see further Kay 2009).
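The first step envisaged here, lemmatizing spelling variants under a headword and then linking each occurrence to candidate synonym groups, can be sketched as two lookups. Everything in the sketch is hypothetical: the variant spellings are illustrative, the tables are tiny stand-ins for real resources, and the function is not part of Enroller or HTOED. The category number 02.01.12.08 for truth and the placing of the “loyal” sense of true in Section III follow the discussion in Section 4.

```python
from typing import Dict, List, Tuple

# Illustrative variant spellings mapped to headwords (assumed, not from HTOED).
VARIANT_TO_HEADWORD: Dict[str, str] = {
    "soth": "sooth",
    "sothe": "sooth",
    "trewe": "true",
    "treu": "true",
}

# Candidate categories per headword; a polysemous word has several.
HEADWORD_TO_CATEGORIES: Dict[str, List[str]] = {
    "sooth": ["02.01.12.08 Truth"],
    "true": [
        "02.01.12.08 Truth",
        "Section III: loyalty sense (category number not given in the chapter)",
    ],
}


def candidate_categories(token: str) -> Tuple[str, List[str]]:
    """Normalize a token to its headword and list the categories it might belong to."""
    key = token.lower()
    headword = VARIANT_TO_HEADWORD.get(key, key)
    return headword, HEADWORD_TO_CATEGORIES.get(headword, [])


headword, categories = candidate_categories("Trewe")
```

For a polysemous headword the lookup returns more than one candidate, and choosing between them is the disambiguation step that the text says would need probabilistic models of co-occurrence; that step is not attempted here.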
References
Allan, Kathryn 2008 Metaphor and Metonymy: A Diachronic Approach (Publications of the Philological Society 42). Chichester: Wiley-Blackwell.
Burchfield, Robert W. (ed.) 1972–1986 Supplements to The Oxford English Dictionary. Oxford: Oxford University Press.
Chase, Thomas J. P. 1983 A diachronic semantic classification of the English religious lexis. Ph.D. dissertation, Department of English Language, University of Glasgow.
Chase, Thomas J. P. 1988 The English Religious Lexis (Texts and Studies in Religion 37). Queenston, Ontario: Edwin Mellen Press.
Coleman, Linda and Paul Kay 1981 Prototype semantics: The English verb Lie. Language 57 (1): 26–44.
Diller, Hans-Jürgen 2002 The growth of the English emotion lexicon. In: Katja Lenz and Ruth Möhlig (eds.), Of Dyuersitie & Chaunge of Langage: Essays Presented to Manfred Görlach on the Occasion of his 65th Birthday, 103–114. Heidelberg: Winter.
Diller, Hans-Jürgen 2008 A lexical field takes shape: The use of corpora and thesauri in historical semantics. Anglistik 19 (1): 123–140.
Dutch, Robert A. (ed.) 1962 Roget’s Thesaurus of English Words and Phrases. Harlow: Longman.
Kay, Christian 2009a Issues for historical corpora: first catch your word. In: Dawn Archer (ed.), What’s in A Word-list? Investigating Frequency and Keyword Extraction, 66–76. Farnham, Surrey, and Burlington, Vermont: Ashgate.
Kay, Christian 2009b The Historical Thesaurus of English: Past, present and future. In: Heli Tissari (ed.), Approaches to Language and Cognition. Helsinki: Research Unit for Variation, Contacts and Change in English (VARIENG), University of Helsinki. http://www.helsinki.fi/varieng/journal/volumes/03/kay/
Kay, Christian 2010 Classification: Principles and practice. In: Michael Adams (ed.), Cunning Passages, Contrived Corridors: Unexpected Essays in the History of Lexicography, 255–270. Monza: Polimetrica.
Kay, Christian, Jane Roberts, Michael Samuels, and Irené Wotherspoon 2009 Historical Thesaurus of the Oxford English Dictionary, 2 volumes. Oxford: Oxford University Press.
Kay, Christian and M. L. Samuels 1975 Componential analysis in semantics: Its validity and applications. Transactions of the Philological Society 74 (1): 49–81.
Kay, Christian and Irené Wotherspoon 2002 Turning the dictionary inside out: Some issues in the compilation of a historical thesaurus. In: Javier E. Diaz Vera (ed.), A Changing World of Words: Studies in English Historical Semantics and Lexis, 109–135. Amsterdam: Rodopi.
Kay, Christian and Irené Wotherspoon 2005 Semantic relationships in the Historical Thesaurus of English. Lexicographica 21: 47–57.
Lakoff, George 1987 Women, Fire and Dangerous Things. Chicago: University of Chicago Press.
Murray, Sir James A. H., Henry Bradley, Sir William A. Craigie, and Charles T. Onions (eds.) 1884–1933 The Oxford English Dictionary. Oxford: Oxford University Press.
O’Hare, Cerwyss 2004 Folk classification in the HTE Plants category. In: Christian J. Kay and Jeremy J. Smith (eds.), Categorization in the History of English, 179–191. Amsterdam: John Benjamins.
Roberts, Jane and Christian Kay with Lynne Grundy 1995 A Thesaurus of Old English (King’s College London Medieval Studies XI). Second edition 2000. Amsterdam: Rodopi.
Rosch, Eleanor 1978 Principles of categorization. In: Eleanor Rosch and Barbara B. Lloyd (eds.), Cognition and Categorization, 27–48. Hillsdale, N. J.: Erlbaum.
Samuels, M. L. 1965 The role of functional selection in the history of English. Transactions of the Philological Society 64 (1): 15–40.
Samuels, M. L. 1972 Linguistic Evolution. Cambridge: Cambridge University Press.
Simpson, John A. (ed.) 2000– OED Online. Oxford: Oxford University Press.
Simpson, John A. and Edmund S. C. Weiner (eds.) 1989 The Oxford English Dictionary, 2nd ed. Oxford: Oxford University Press.
Simpson, John A., Edmund S. C. Weiner, and Michael Proffitt (eds.) 1993–1997 OED Additions Series. Oxford: Oxford University Press.
Thornton, Freda J. 1988 A classification of the semantic field “Good and Evil” in the vocabulary of English. Ph.D. dissertation, Department of English Language, University of Glasgow.
Ullmann, Stephen 1962 Semantics. Oxford: Blackwell.
Online versions of the Historical Thesaurus project are available at:
http://www.oed.com/public/htoed/historical-thesaurus-of-the-oed
www.glasgow.ac.uk/historicalthesaurus/
http://www.gla.ac.uk/departments/stella/enroller/
The NeoCrawler: identifying and retrieving neologisms from the internet and monitoring ongoing change
Daphné Kerremans, Susanne Stegmayr and Hans-Jörg Schmid
Abstract
Why do some new words manage to enter the lexicon and stay there while others drop out of use and are neither used nor heard anymore? Of interest to both lay people and linguists, this question has not been answered in an empirically convincing manner to date, mainly because systematic methods have not yet been found for spotting new words as soon as possible after their first occurrence and monitoring their early development and diffusion as exhaustively as possible. In this paper we present a new and improved tool which is designed to accomplish precisely these tasks when applied to material from the Internet. Following a brief review of existing tools for retrieving linguistic data from the Web (Section 2), we will introduce in some detail a tailor-made webcrawler, the so-called NeoCrawler, which identifies and retrieves neologisms from the Internet and stores data necessary for the systematic monitoring of their early development with regard to form and meaning as well as diffusion (Section 3). Following this description, we will present a case study discussing the results of an analysis of the neologism detweet with regard to its diffusion, institutionalization, lexicalization and lexical network-formation (Section 4). The study indicates that the NeoCrawler can indeed be applied fruitfully in the study of ongoing processes relating to how the meanings and forms of new words are negotiated in the speech community, how words spread in the early stages of their life cycles and how they begin to establish themselves in lexical and semantic networks.
1. Introduction
Which mechanisms are involved in lexical change and what language-internal factors (such as the morphological and phonological make-up of words) and language-external factors (such as the salience of the concept or referent and the authority of the coiner or early users) control these mechanisms? The methodological approach presented in this paper tries to tackle these long-standing and central questions in historical semantics by introducing a new method and by investigating – literally – new material, i.e. very recently coined neologisms. A neologism is defined here as a recently coined word¹ which is new to the majority of the members of the speech community. Unlike nonce-formations², however, neologisms are used with recurrent frequency, but are nevertheless still rare enough not to have become fixed and stable elements of the language.
While it may seem strange to look at new words in order to investigate historical change, the study of new words has a number of crucial advantages. Firstly, probably the most prominent asset – especially if one focuses on material retrieved from the Internet, as we do – lies in the possibility of collecting a more or less exhaustive sample of all authentic tokens of a new form within a certain period of time subsequent to its coinage. Secondly, the monitoring of recently coined words gives us the unique opportunity to study processes of ongoing change so to speak ‘in vitro’. While lexicological theory has made a large number of claims concerning the early development of new words (cf. e.g. Bauer 1983: 42–61, Schmid 2011: 69–83), to the best of our knowledge these have never been tested empirically and systematically³. Is it true that meanings oscillate for a while and tend to rely on the context and co-text before they begin to stabilize? Is it true that forms are subject to variation before the speech community begins to agree on spelling, hyphenation and other formal properties? Is it true that changes in form and meaning (lexicalization) tend to go hand in hand with an increase in frequency of usage (diffusion) and the spread of words within the speech community, across text-types, registers and discourse domains and functions (institutionalization)? The web-based methodology described in this paper aims to provide the means for answering questions of precisely this type.
Before we embark on this endeavour, we would like to emphasize that we are well aware of the limitations involved in using only data from the Internet rather than ‘real-life’ texts and conversations. To an extent this limitation, which could only be overcome by means of very costly field work, is mitigated by the fact that many of the words we study are indeed ‘born’ on the Internet and are mainly used and spread there as well. And since the Internet plays an increasingly important role in the lives of an ever-growing number of people and is becoming more and more interactive⁴, the general mechanisms and principles of new-word developments may not be too different from what goes on outside the Web after all.
This paper is a report on an undertaking which is very much in its infancy, as are the words it aims to investigate. It is therefore important to point out that the ‘answers’ suggested to the questions raised above are somewhat preliminary and will have to be investigated in future work.
1. Strictly speaking, the term lexical unit would be more appropriate here than the vague term word, since lexical innovations can concern various aspects of new linguistic signs. As such, a novel lexical unit can arise because both form and meaning are new, but also because a new form is paired with an existing meaning (very often for creative or pragmatic purposes) and vice versa (the traditional polysemy case). Tournier (1985) distinguishes between morphosemantic, morphological and semantic neologisms. Since this paper deals exclusively with new words, i.e. new forms with new meanings, we have used the general terms new word, new lexeme and neologism, all of which are treated as being semantically interchangeable here.
2. See Hohenhaus (1996) and Štekauer (2002) for a detailed overview of nonce-formations.
3. Hohenhaus (2006) studies the diffusion process of the noun bouncebackability on the Internet, but does not consider other aspects of the lexicalization and institutionalization process. More recently, Buchstaller et al. (2010) use Google newsgroups to investigate a grammatical innovation, i.e. the decline and narrowing of usage of quotative all in favour of quotative like.
4. Even though the myth of the doubling of Internet traffic every three months has been proven wrong (Odlyzko 2003), the percentage of Internet users is still increasing steadily worldwide (Andrés, Cuberes, Diouf and Serebrisky 2007).
2. Linguistic approaches to dynamic web-crawling
With an estimated 13.7 billion pages and an indefinite number of words (see www.worldwidewebsize.com)⁵, the Web offers an amount and variety of language material that corpora cannot compete with. Even the currently largest corpus, the Oxford English Corpus (OEC), contains ‘only’ two billion words. Despite their careful compilation regarding text types as well as social, regional and stylistic varieties, corpora remain static snapshots of the language at a given time. Corpora using the Web for their language make-up, such as ukWaC or the OEC, are also affected by this temporal rigidity, despite regular updates. While in principle language change can be studied with the help of comparable static corpora representing different synchronic cross-sections of a language (see e.g. Mair 2006, Leech et al. 2009), for the purpose of neologism-monitoring the time lag between data collection and public access is a crucial problem. This is also true for continuously augmented corpora such as the Bank of English, which are also known as monitor corpora (cf. McEnery, Xiao and Tono 2006: 67–69), because words that are new at the time of corpus compilation tend to be either obsolete or firmly lexicalized and institutionalized by the time the corpus is available for research. As a result a detailed investigation of these processes has become impossible or can only be carried out in hindsight and with great difficulty. Therefore, the timely discovery of potential candidates is of utmost importance for the study of the processes going on in the early phases of the establishment of neologisms.
Before we introduce the NeoCrawler, we will briefly discuss two types of existing crawling approaches in linguistics: downloadable crawlers, which are not available for online use on the Net, but are installed on and operated from a desktop computer (Section 2.1), and on-demand crawlers accessible online (Section 2.2).
5. World Wide Web Size is a homepage run by Maurice de Kunder, who developed a method for estimating the size of the Surface Web (cf. de Kunder 2007). This figure, updated on a more or less daily basis, is based on the average of the indexes of Google, Bing, Yahoo Search and Ask, from which the amount of overlap between these search engines is subtracted (cf. Gulli and Signorini 2005). The size of the index in turn is calculated through a daily query of 50 words extracted from a one-million-word corpus following Zipf’s Law. In order to calculate the size of the search engine’s index, the number of returned pages is divided by the relative frequency of the word in the corpus.
2.1. Downloadable crawlers
2.1.1. KWiCFinder
Like the NeoCrawler, KWiCFinder (cf. http://kwicfinder.com/) uses a commercial search engine to access the Web and generate user-defined language material. Queries are submitted to AltaVista, and the results are downloaded as HTML or .txt, summarized and documented with KWiC display.
In addition, users have the option to search the Web with the Java application WebKWiC, which retrieves cached website copies from Google and is considered to be more user-friendly by the developer (cf. Fletcher 2007: 36). Special search features include enhanced wildcard and “tamecard” options (Fletcher 2007: 34), which yield syntactic and orthographic alternatives for any given word. Queries can be expanded or narrowed down by means of “inclusion and exclusion” criteria (Fletcher 2001: 34), i.e. by restricting searches to specific words, pages, dates and hosts, which are entered together with the search string. Post-processing tools include conversion into XML format as well as annotation and classification options. Unfortunately, Fletcher remains rather vague in this respect.
2.1.2. GlossaNet 2
Unlike KWiCFinder, GlossaNet 2 (cf. http://glossa.fltr.ucl.ac.be/) uses RSS and Atom feeds⁶ to collect linguistic data. The original GlossaNet of 1998 was restricted to newspaper texts. In both versions, the user selects predefined feeds or adds some of their own and compiles a corpus to which the query is submitted. These pages are crawled at regular intervals and added to the corpus via the so-called “Manager” (Fairon, Macé and Naets 2008: 3). The Manager not only retrieves the feeds from the server, but also sends them to the next server, which will perform boilerplate stripping, i.e. removal of programming code and duplicates. The second server subsequently assembles the corpus and is responsible for tokenization, lemmatization and tagging. The final results are then returned to the Manager, which informs the user that their queries have been performed and the corpus has been created and/or updated. Despite creating a dynamic corpus, which would enable neologism researchers to keep track of chronological developments, GlossaNet 2’s reliance on a selection of RSS and Atom feeds provides only very specific information within a fairly narrow range of genres and semantic domains.
6. RSS and Atom feeds are tools that enable users to update, publish and exchange web content easily. They contain basic information about the content, such as title, link, description and publication date in XML format. GlossaNet 2 uses this link to access and download the page into the corpus.
2.2. On-demand crawlers
In contrast to the crawlers described above, on-demand crawlers are available on the Web, where any user can consult them whenever necessary.
2.2.1. Kilgarriff’s Linguistic Search Engine
Kilgarriff’s Linguistic Search Engine (LSE) consists of five components⁷. The first one, the web crawler, performs daily crawls and feeds them into the LSE database, which is updated once or twice a year. While this may be sufficient for all kinds of applications of LSE (cf. Kilgarriff 2003: 3), this restriction poses a serious problem for the systematic study of very recent neologisms. The second component is responsible for filtering and classifying the crawled results. All material that does not contain ‘real’⁸ sentences, such as images, sound, lists of prices and people, is removed. The remaining pages are converted into standard XML format and their language is automatically identified with a Unicode-compliant classifier. Pages are classified according to parameters such as text type and semantic domain with the help of TypTex and TypWeb tools (Folch et al. 2000). After filtering and classification, the linguistic processor, supported by the IMS Corpus Workbench where possible⁹, performs tagging, parsing and lemmatization. After completion of linguistic post-processing, the results are stored in a database. Subsequently, the statistical summarizer Word Sketch (cf. http://wasps.itri.bton.ac.uk/) can be used to create automatic summaries of a given word’s behaviour.
7. To our current knowledge, the LSE has not been realized (yet).
8. Kilgarriff defines a sentence in terms of prototypical characteristics and suggests a heuristic formula to detect these in the flow of diverse language material on the Internet (cf. 2003: 3).
9. http://www.ims.uni-stuttgart.de/projekte/CorpusWorkbench/
2.2.2. WebCorp Linguistic Search Engine
The WebCorp Linguistic Search Engine represents an improved and expanded version of the 1998 WebCorp programme (Renouf 1998). The most important change is the development of an independent linguistic search engine to access the Web, because of the various problems caused by commercial search engines (see 3.2.2 below). The proposed independent linguistic search engine is currently limited to The Guardian and The Independent newspaper websites and works progressively, i.e. only results collected on the crawling day are fed into the corpus (cf. Renouf, Kehoe and Banerjee 2005: 8). The authors have developed a vast and impressive array of crawling and post-processing features, such as exclusion lists, requerying of failed pages, wildcard and POS search options, neologism detection and collocation extraction. Despite the enormous potential for linguistic research, the WebCorp Linguistic Search Engine is not yet available for public use. At present the WebCorp version (http://www.webcorp.org.uk/) available on the Internet still operates with commercial search engines. Query options include case sensitivity, output format in HTML or plain text, the size of the concordance span, the number of pages to visit (500 maximum) and options to search specific domains only and include or exclude specific words. Before the results are displayed to the user, HTML code, banners, links and ads are stripped and duplicates removed. In addition, the date, author, headline and subheadline of the page are automatically extracted. The user receives a list of all the tokens per page, highlighted in red, and has the option to visit the original page. A valuable feature is the error logging of failed pages: the user is able to see how many and which pages returned errors. Unfortunately, the results cannot be downloaded in any form, so that further linguistic analysis is complicated. Moreover, the results only remain available for 24 hours on the WebCorp homepage.
3. The Architecture of the NeoCrawler
3.1. Overview of the architecture
While the crawlers and linguistic search engines discussed in the previous sections are very valuable and sophisticated tools for the study of language material culled from the Web, none of them is ideally suited to supplying the kind of data needed for answering the questions posed in the introduction. The NeoCrawler, which tries to improve this situation, was initially developed to replace a downloadable crawler used in our first tests. At that time our focus was on observing a selection of neologisms, so the crawler’s first module, the Observer (see 3.3), was designed to serve this purpose. Because of the extendable architecture, which relied on a database (see 3.2), the second module, the Discoverer (see 3.4), integrated seamlessly into the existing project. In order to explain the mechanisms behind the web interface of the NeoCrawler, we will give an overview of the basic structure first. The figure below outlines the main tasks of the two central modules.
Figure 1.
Module I, the Discoverer, attempts to detect new words on the whole Web as close to their date of coinage as possible. Since the module is comparatively young and still in its testing phase, we confine ourselves for now to crawling the latest blogs from Google Blog Search¹⁰ in the first step (a). The NeoCrawler retrieves a list of the blogs offered for all of Google’s categories (see Section 3.4) (b) and follows the hyperlinks to obtain the contents of the blog pages (c). The pages are stripped to plain text and split into single words; then each word is compared to a previously compiled dictionary (cf. 3.4) to detect possibly unknown words (d). The words are subsequently analyzed with a trigram filter that compares the sequence of letters in the potential neologism with known typical patterns and rates the potential neologisms accordingly. The Discoverer then outputs a rated list of unknown words to the user interface in the web browser (e). The automatically generated suggestions have to be reviewed manually (f). Researchers can use the web interface to easily select the neologisms to be added to a database of neologisms (g), which will be crawled automatically in the future by the crawler’s second module.
10. http://blogsearch.google.com
Module II, the Observer, handles the periodical searches for selected neologisms (1a), provides a public interface to the NeoCrawler (1b), and semi-automatically classifies the results. For the periodical observations, the NeoCrawler conducts a search for each neologism in the database. It compiles a web address with the search string and other parameters for Google, and passes the request to the search engine (2). Google treats the query like any other search process and searches the Web for relevant pages (3). The addresses of these pages are then returned to the NeoCrawler (4), which in turn follows each address and retrieves the contents of the pages from the Web (5). In the next step, the NeoCrawler partitions each web page to prepare its contents for the database (6). Both the entire HTML file and the automatically analyzed content of the search results are saved to the database (see 3.2) (7). From there, the data is passed to the web interface of the NeoCrawler (8), where the search results are permanently available to the researchers.¹¹ The user interface offers various representations of the data, ranging from an outline of the diffusion progress of a neologism to basic statistics, detailed linguistic information and concordance lines.
The data can also be downloaded in different formats, HTML and plain text, as well as in chronological order or classified structure, for example in order to import the results into a concordancer. With this survey in mind we will now have a closer look at the individual modules, beginning with the foundation of the NeoCrawler, its database.
11. Due to the restrictions imposed by Google’s University Research Program (http://research.google.com/university/search/terms.html), the data obtained by the Observer is only accessible to our own researchers for the time being.
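The core of the Discoverer, i.e. splitting page text into words, checking them against a dictionary and rating the unknown ones with a trigram filter, can be illustrated with a small sketch. It is not the NeoCrawler's actual code: the tokenizer, the tiny word list, the boundary markers and the scoring rule (the share of a candidate's letter trigrams that also occur in known words) are all assumptions made for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Split plain text into lowercase alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def trigrams(word):
    """Return the letter trigrams of a word, padded with boundary markers."""
    padded = f"^{word}$"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

def build_trigram_model(dictionary):
    """Count the trigrams that occur in the known words."""
    model = Counter()
    for word in dictionary:
        model.update(trigrams(word))
    return model

def rate(word, model):
    """Share of the word's trigrams attested in known words (0.0 to 1.0)."""
    grams = trigrams(word)
    return sum(1 for g in grams if g in model) / len(grams)

def discover(text, dictionary, model):
    """Return (word, rating) pairs for unknown words, best-rated first."""
    unknown = {w for w in tokenize(text) if w not in dictionary}
    return sorted(((w, rate(w, model)) for w in unknown),
                  key=lambda pair: (-pair[1], pair[0]))

# Hypothetical miniature dictionary; a real one would hold many thousands of words.
DICTIONARY = {"tweet", "detect", "delete", "the", "a", "to"}
MODEL = build_trigram_model(DICTIONARY)

for candidate, score in discover("to detweet a xqzv", DICTIONARY, MODEL):
    print(candidate, round(score, 2))
```

In this toy setting a well-formed candidate such as detweet is rated far higher than a random letter string, which mirrors the purpose of the rated list that is passed on for manual review.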
3.2. The database: Laying the foundation
Since it is common¹² in present-day corpus linguistics to annotate texts using the XML format¹³, some explanation of why a database approach was chosen for this project may be required. The main reason is that despite its flexibility, XML is subject to a number of restrictions that make it insufficient for demands more complex than mere descriptive tasks. Basically, the structure of XML files is designed in such a way as to facilitate the hierarchical categorizing of (textual) data. Each unit or element, from page to morpheme level, is tagged, and the tags can be extended by any number of specifications. This facilitates very profound descriptions and in principle offers an unlimited number of markup options. The hierarchical structure offers many possibilities for single-user desktop utilization (cf. Carletta 2005). However, it is this very freedom in manual editing that allows for the danger of inconsistencies in categorization and labelling, which make documents prone to errors in automatic processing. As a result, the file format has considerable drawbacks for the kind of large-scale server data mining required in this project. For example, the fact that it is virtually impossible to process complex computations with a large amount of data in the XML format has proven problematical. Processing XML files is slower in general, especially when it comes to searching and filtering, both central requirements for all kinds of data retrieval. In addition, complex relations in the source material need to be converted into the simpler hierarchical structure, which results in loss of expressiveness, unnecessary complication of data structures or redundancy of data. This either imposes restrictions on later analyses or requires duplication of data, especially when errors in the raw data have to be corrected.
A common alternative, which was chosen for the NeoCrawler project, is to store structured data in a relational database like MySQL or PostgreSQL. A relational database consists of a number of tables, each comprising columns with unambiguous headlines, and rows with the actual data (see figure 2).
12. Among others, Eckart (2008), Ide et al. (2002) and Dipper (2005) outline the methods of XML-based corpus annotation. 13. The Extensible Markup Language (XML) is specified by the World Wide Web Consortium (W3C, http://www.w3.org/XML/)
Figure 2.
The rows of the tables are identifiable with a unique ID, which can be referred to in other tables as well. In a relational database, the smallest unit, such as a single token of a crawled neologism, is linked to rows of tables with more general information, for example the web page and its author(s). Thus, indirectly, the single tokens carry all the information available for them. The key feature of relational databases is that fields are linked, so any token can be tied up with any number of other tables. The advantage of this network of relations, unlimited in principle (compared to the hierarchical structure of the XML format), lies in the possibility of modelling facts of unlimited complexity.
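How a single token can indirectly carry page-level and author-level information may be easier to see in a toy example. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL or PostgreSQL; the table names, column names and sample rows are invented for illustration and do not reproduce the NeoCrawler's actual schema.

```python
import sqlite3

# In-memory database, so the example leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE page (
    id INTEGER PRIMARY KEY,
    url TEXT,
    id_author INTEGER REFERENCES author(id)
);
CREATE TABLE token (
    id INTEGER PRIMARY KEY,
    form TEXT,
    cotext TEXT,
    id_page INTEGER REFERENCES page(id)
);
""")

# Hypothetical sample data: one author, one page, two tokens found on it.
conn.execute("INSERT INTO author VALUES (1, 'example_blogger')")
conn.execute("INSERT INTO page VALUES (1, 'http://example.org/post', 1)")
conn.executemany(
    "INSERT INTO token VALUES (?, ?, ?, ?)",
    [(1, "detweet", "I had to detweet that", 1),
     (2, "detweeted", "she detweeted it", 1)],
)

def token_details(connection, token_id):
    """Follow the ID links from one token to its page and its author."""
    return connection.execute(
        """SELECT token.form, page.url, author.name
           FROM token
           JOIN page ON token.id_page = page.id
           JOIN author ON page.id_author = author.id
           WHERE token.id = ?""",
        (token_id,),
    ).fetchone()

print(token_details(conn, 2))
```

The page URL and the author name are stored only once, yet every token row can reach them through its ID links, which is the avoidance of redundancy that the relational design is meant to deliver.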
Figure 3.
In the case of our periodical observations (cf. step 1a in Figure 1), the database behind the web interface of the NeoCrawler is modelled in exactly this way (see outline in Figure 3) and serves as both the source for the queries and destination for the results. Once a neologism has been added to the database for regular observation (table “lemma”), the NeoCrawler gets the list of neologisms and initiates a search for each one. After the search process (see 3.) is completed, categorized information on the search results is stored in the database on four levels: process, lemma, page and token. The headlines of the boxes in the figure above provide labels for their contents:
– The table “process_info” saves information retrieved for a given neologism in one crawling session, with one session corresponding to one ‘process’ uniquely identifiable and stored in the database. On the process level, the total number of pages and tokens found for the respective neologism are stored along with the date, the time restriction set in the query and the search string.
– As can be seen, the table “process_info” is linked to table “lemma”, i.e. the type level: here information pertaining to the neologism can be specified and stored, e.g. the word-formation pattern, types of semantic transfer, such as metaphor and metonymy, semantic competitors (e.g. google-cooking as a competitor for fridge-googling), and, last but not least, meanings. This information has to be entered manually.
– Information on the page level is represented in the tables “source” and “version”, which contain details about the web pages (see Section 3.3.3) that are retrieved in a search process, as well as their possible versions. Information on authors is specified in the table “author”.
– Every single token of a neologism identified by the NeoCrawler receives one row in the table “token”, containing a large number of cells including a co-text of 1000 characters and many other features such as the part of speech or the mode and style of use (see Section 3.3.3).
The connecting lines between the boxes point out the links to other tables and levels, which are represented by IDs (e.g. “id_source”, “id_version”, “id_author” in table “token”) in the table rows. The last table, “blacklist”, contains lists of strings that are to be excluded from the search results when crawling. The blacklist is the only table that is not directly linked to the “token” table, but connected with the lemmata instead, because its content applies to all results found for a lemma.
The principle of inter-linked tables containing information of increasing specificity avoids redundancy, which in turn enables complex queries and fast access to a large amount of data. With the linked data, the NeoCrawler is prepared for virtually any representation of the data and any kind of query, even though only basic computations are performed at present. The data does not need to be modified for more complex statistics, and server-based usage makes it possible for multiple researchers to edit the data simultaneously, even while the NeoCrawler is adding more results in the background.
3.3. The Observer: Monitoring neologisms
While in principle the Discoverer is the more basic module, as it identifies neologisms, we will nevertheless begin by describing the Observer, because some of its principles also provide the foundation for the Discoverer. Basically, the Observer contributes three crucial steps to the systematic acquisition of data on neologisms and their further processing for linguistic analysis: the web search, linguistic post-processing and classification.
3.3.1. Web search
The NeoCrawler uses Google to search for and monitor neologisms by means of an automated version of the same processes carried out in ‘normal’ manual Google searches. In a normal search scenario, a user enters a search string into Google’s standard web interface, optionally adds a number of parameters such as date and language, and receives Google’s response web page with a list of matching links. In responding to such queries, the Google Search web interface has the web browser encode the parameters set by the user. Following the user’s click on the “submit” button, the web browser encodes a web address, also known as uniform resource locator (URL), with the search details. For example, typing the string “detweet” in the Google search form and opting for “100 results”, “English” and “past week” in the advanced search menu will result in the creation of a URL like this (represented in slightly simplified form):¹⁴
http://www.google.com/search?q=detweet&num=100&hl=en&tbs=qdr:w&start=100
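An address of this kind can also be assembled programmatically. The following sketch uses only Python's standard library and the parameter names visible in the URL above; the function names and default values are assumptions for illustration, and "start" is taken to be the offset of the first result on a page. It does not claim to show how the NeoCrawler itself builds its requests.

```python
from urllib.parse import urlencode

def build_search_url(query, num=100, lang="en", period="qdr:w", start=0):
    """Compose a search URL from the parameters named in the text."""
    params = {"q": query, "num": num, "hl": lang, "tbs": period, "start": start}
    # safe=":" keeps the colon in "qdr:w" readable instead of encoding it as %3A.
    return "http://www.google.com/search?" + urlencode(params, safe=":")

def result_page_urls(query, pages=3, per_page=100):
    """One URL per result page, obtained by stepping the 'start' value."""
    return [build_search_url(query, num=per_page, start=i * per_page)
            for i in range(pages)]

for url in result_page_urls("detweet", pages=2):
    print(url)
```

Keeping the parameters in a dictionary makes it easy to change a single value, for instance the time restriction, without touching the rest of the address.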
The parameters included in the search are more or less recognizable in this code, following abbreviations such as ‘‘q’’, ‘‘num’’, ‘‘hl’’ and ‘‘tbs’’. As an answer to the web browser sending this address, the search engine
14. For details see http://yoast.com/wp-content/uploads/2007/07/google-url-parameters.pdf
72
Daphne´ Kerremans, Susanne Stegmayr and Hans-Jo¨rg Schmid
compiles an HTML web page containing links to pages that match the selected criteria. All common web browsers display this HTML file as the well-known Google results page. Rather than using Google’s main search page manually, the NeoCrawler assembles the URL codes with all specified parameters itself, and fetches Google’s answer by pretending to be a web browser. Since the periodical searches are carried out by the server at weekly intervals, the time parameter is currently set to one week, which ensures seamless retrieval of data more or less at the time they enter the Internet. Since 100 is the maximum number of results that Google returns for each call, the NeoCrawler requests a series of result pages for each neologism by varying the “start” value.15

Each HTML page returned by the Google server is then parsed by the NeoCrawler. It extracts all web links from the page, i.e. links to pages containing the search string, and filters out Google-internal tracking links, blacklisted sites (see 3.2.1) and Google cache links. In this way, outdated and duplicate versions of websites are prevented from spamming the database, and the search process is kept as efficient as possible. In the next step, the NeoCrawler follows all remaining links from the search results and downloads the exact contents of each page, excluding pictures.

While the use of a commercial search engine like Google is not uncontroversial (cf. Kilgarriff 2003, Renouf, Kehoe and Banerjee 2005)16, it can be argued in favour of this decision that Google allegedly has the largest number of indexed pages (cf. http://googleblog.blogspot.com/2008/07/we-knew-web-was-big.html). Moreover, its index is updated faster than those of other search engines, for many pages even on a daily basis (cf. Lewandowski 2008a: 820). As a result, Google shows the latest, updated versions of pages and is the leader in “freshness” regarding its

15. It should be noted that the NeoCrawler used Google’s standard search interface in the pilot phase, which has a limited query rate. In the meantime, our project has been accepted by Google’s “University Research Program for Google Search” (http://research.google.com/university/search/), which gives us permission to run automatic queries with full access to the Google repository.
16. The main criticism concerns the commercial ranking of results. Because of this ranking, statistical analyses are distorted, since the displayed pages might not accurately reflect the real use of a lexeme. Secondly, the absence of a wildcard search restricts the researcher’s query options, but this can be solved by incorporating a search engine like Yahoo, with which such searches are possible. The problematic display of a limited co-text on the Google interface has been solved by setting the NeoCrawler’s co-text extraction to 1000 characters.
The NeoCrawler
index (Lewandowski 2008a: 824). Lewandowski furthermore investigated display delay and found that it is again Google that shows the lowest delay, 2 days on average, between the retrieval of updated pages and their inclusion in the search engine (cf. 2008a: 823). Fast discovery of new pages and re-retrieval of updates is qualitatively important, because research has shown that although the majority of pages change only marginally, approximately 8% of the web consists of new pages that go online every week and 20% of all web pages vanish within a year of their publication (cf. Ntoulas, Cho and Olson 2004: 3). Since Google scores best on quantity (the number of indexed pages), quality (their freshness) and speed (concerning both retrieval and re-retrieval of updates), our current reliance on Google for web access appears justifiable.

3.3.2. Post-processing Features

When a web page has been retrieved and the full HTML version has been stored in the database, the NeoCrawler performs a number of automated analyses on the individual pages. It features further filters, syntactic parsing and suggestions for subsequent manual evaluation.

As users of Google know, Google’s harvest tends to be quite confusing. Often a large number of potential hits turn out to be either false positives, i.e. pages that do not feature the string searched for (which is usually due to the fact that pages indexed by Google have been changed since indexing), duplicate copies, or otherwise useless pages. To increase the integrity and validity of the collected material, the NeoCrawler therefore checks each page for false positives and identifies exact duplicates or near-identical versions of the same page with no relevant changes. Both types of page are removed from the list of pages prepared for parsing. Duplicates are reliably detected by comparing the title and the file size to all previous results of the same search.
The NeoCrawler ignores the invalid pages in all subsequent computations and does not store their contents in order to keep both the database and the final output slim, but stores the addresses to ensure gapless coverage. Subsequently, the remaining pages are stripped of all content irrelevant for linguistic analysis, such as HTML tags and script code. The result is the human-readable content of the web pages that can be displayed in any text editor and can be passed on for further linguistic processing to a concordancer, for example. Nevertheless, the complete page is still available in the database and can be viewed and downloaded in its original form at any time.
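The filtering and stripping steps just described can be illustrated with a minimal sketch. It is not the NeoCrawler's actual code: the function names `strip_page` and `filter_pages`, the use of Python's standard `html.parser` module, and the choice of the UTF-8 byte length as "file size" are all assumptions made for illustration. Only the logic (reject pages that lack the search string, reject pages whose title and size match an earlier result, strip tags and script code) follows the text.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the title and the readable text, skipping script and style code."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.text_parts = []
        self._open = []  # stack of open title/script/style elements

    def handle_starttag(self, tag, attrs):
        if tag == "title" or tag in self.SKIPPED:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if self._open and self._open[-1] == tag:
            self._open.pop()

    def handle_data(self, data):
        current = self._open[-1] if self._open else None
        if current in self.SKIPPED:
            return  # script and style content is irrelevant for linguistic analysis
        if current == "title":
            self.title_parts.append(data)
        else:
            self.text_parts.append(data)


def strip_page(html):
    """Return (title, plain text) with tags removed and whitespace normalized."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    title = " ".join(" ".join(parser.title_parts).split())
    text = " ".join(" ".join(parser.text_parts).split())
    return title, text


def filter_pages(pages, search_string):
    """pages: iterable of (url, html). Returns (kept, rejected).

    kept holds (url, title, text); rejected holds (url, reason), so that the
    addresses of invalid pages remain on record without their contents.
    """
    seen = set()
    kept, rejected = [], []
    needle = search_string.lower()
    for url, html in pages:
        title, text = strip_page(html)
        fingerprint = (title, len(html.encode("utf-8")))  # title + size (assumed in bytes)
        if needle not in (title + " " + text).lower():
            rejected.append((url, "false positive"))
        elif fingerprint in seen:
            rejected.append((url, "duplicate"))
        else:
            seen.add(fingerprint)
            kept.append((url, title, text))
    return kept, rejected
```

Note that a page on which the search string occurs only inside script code counts as a false positive here, because the check is run on the stripped text rather than on the raw HTML.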
Some result pages are not useful because their content is either encrypted or a mere compilation of links to other pages without linguistically valuable content. Facebook, for example, allows Google to search the content of the private member sites and returns their links, but their body is readable only by users logged in with a Facebook account17. Because of this, the NeoCrawler allows researchers to individually blacklist sites for the neologisms. Blacklisted sites will no longer be displayed in the current search results or previous ones, but they are kept in the database.

The next steps in preparing the pages for linguistic analysis relate to the content level. First, the NeoCrawler extracts the title of the document, breaks up the stripped content into words and sentences and identifies the relevant tokens, that is, the instances of the requested neologism. This is the process of tokenization. For each token found, the NeoCrawler saves a co-text of 1000 characters around the target word, which can be used later for fully searchable concordance lines. The NeoCrawler also counts the number of tokens found on each page, adds up the number found on all pages of the corresponding search process, and stores the information in the database. With this information, the NeoCrawler can provide basic statistical data such as the page/token ratio. The second step is part-of-speech tagging. The stripped contents are automatically analyzed with an open-source part-of-speech tagger18, which considerably facilitates later analyses, e.g. concerning the collocational behaviour of the new words. Last but not least, the NeoCrawler detects novelty markers (e.g. so-called, quotes etc.) and adds information about them to the token table of the database.

3.3.3. Linguistic Classification

After post-processing, the pages are available in a form that linguists can use for further research.
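The content-level steps of Section 3.3.2 (token identification, co-text extraction, token counts and novelty markers) can be made concrete with a small sketch. Several details are assumptions rather than documented NeoCrawler behaviour: the 1000 characters are split evenly before and after the token, inflected forms are matched with a simple regular expression, the marker list is a tiny illustrative one, and the page/token ratio is read as pages divided by tokens.

```python
import re

COTEXT_CHARS = 1000  # co-text size from the text; split evenly around the token (assumption)
NOVELTY_MARKERS = ("so-called", "so called")  # illustrative list, not the project's
QUOTE_CHARS = "\"'\u2018\u2019\u201c\u201d"


def find_tokens(text, lexeme):
    """Return one record per occurrence of the lexeme or a suffixed form of it."""
    pattern = re.compile(r"\b" + re.escape(lexeme) + r"\w*", re.IGNORECASE)
    half = COTEXT_CHARS // 2
    records = []
    for match in pattern.finditer(text):
        start, end = match.span()
        preceding = text[max(0, start - 30):start].lower()
        in_quotes = (
            start > 0 and text[start - 1] in QUOTE_CHARS
            and end < len(text) and text[end] in QUOTE_CHARS
        )
        records.append({
            "form": match.group(0),
            "offset": start,
            "cotext": text[max(0, start - half):end + half],
            "novelty_marker": in_quotes or any(m in preceding for m in NOVELTY_MARKERS),
        })
    return records


def page_token_ratio(page_texts, lexeme):
    """Return (tokens per page, number of pages divided by total tokens)."""
    counts = [len(find_tokens(text, lexeme)) for text in page_texts]
    total = sum(counts)
    return counts, (len(page_texts) / total if total else None)
```

A pattern of this kind would miss spelling variants such as De-Tweet, so a real system would need a more permissive matcher or a list of variants per neologism.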
If the aim is to investigate the behaviour and development of new words from a language-internal and language-external perspective, as suggested in the introduction to this paper, one has to set up a classificatory system which captures not only their formal, morphological and semantic properties, but also textual and socio-pragmatic characteristics of their environment. The establishment of such

17. As a result, only publicly accessible Facebook pages are included.
18. The Stanford Log-linear Part-Of-Speech Tagger is licensed under the GNU General Public License (http://www.gnu.org/licenses/gpl.html) and can be downloaded free of charge from http://nlp.stanford.edu/software/tagger.shtml.
a framework is not entirely unproblematic, because the research undertaken in the field of computer-mediated discourse (CMD) has not yielded any reliable classification schemes for Internet text-types and genres, while categories established in traditional discourse analysis and stylistics (cf. e.g. Wehrlich 1976, Beaugrande and Dressler 1981, Biber 1988, 1989, 1995, 2007) are largely inadequate for capturing the variability, dynamicity and fuzziness of the material found on the Internet. Biber (2007: 116), for example, proposes the four text-type dimensions “personal, involved narration”, “persuasive/argumentative discourse”, “addressee-focused discourse” and “abstract/technical discourse” on the basis of statistical multi-dimensional analysis, which uses text type-specific linguistic features. However, suitable as this framework may be for “traditional” texts, these four types seem to be too broad to reflect the range of variation found on the web.

An approach which comes closer to meeting the demands of this project is Herring’s “faceted classification scheme” (2007), which adapts Dell Hymes’ (1974) SPEAKING model to CMD. Herring argues that the various CMD forms are the result of interaction between technological and situational influence factors, which she calls “facets” (2007: 10). Both facets are open-ended and dynamic. Social-situational facets include topic, purpose, tone of the message as well as structure and characteristics of the participants. The technological dimension captures several medium factors such as synchronicity or 1-way vs. 2-way message transmission. This dimension is indeed very important for linguistic issues, because technological innovations have created new forms of communication, e.g. Twitter, and are of utmost importance in the diffusion process of neologisms.
Since Herring’s system is too fine-grained to be applied for the present purpose, where thousands of pages await linguistic classification, we have taken it as an inspiration for a somewhat simpler three-level multi-dimensional19 classification, which tries to balance practicability and adequacy (cf. Table 1).

A primary distinction at page level is made between meta- and object-linguistic modes of use. Since profuse talking about, rather than referential use of, a new lexeme is assumed to inhibit lexicalization (cf. Metcalf 2002: 155–157), we first identify those instances that merely define, paraphrase
19. We do not use dimension in the sense intended by Biber (2007), i.e. as a synonym for text type. In our approach, the dimensions represent linguistic perspectives on classification.
Table 1. Page-level classification scheme

Mode of use
    Metalinguistic
    Object-linguistic

Semantic features*
    Field of Discourse          Sub-field of Discourse
    general
    politics
    law
    business
    sports
    science
    advertising
    lifestyle                   celebrities, food and drink, fashion, health, other
    entertainment               radio and TV, movie, music, other
    computing/Internet          gaming, technology, business, other
    other

Socio-pragmatic features
    Type of Source              Sub-type of Source
    Blog
    News
    Discussion groups
    Portal                      directory, jobs, community, Hollers, Gather, Bebo, Blippy, other
    Social Networks             Facebook (public), MySpace, Meetup, other
    Filesharing                 documents, music, video, photo, blog
    Microblogging               Twitter, Tumblr, other
    Self-reference
    Academic
    Dictionary and thesaurus**
    Other

    Authorship
    Private
    Professional

* not applicable to metalinguistic uses
** only applies to metalinguistic uses
or comment on the given neologism. The top level furthermore involves the dimensions “field of discourse” and “type of source”, both of which are more or less explained by the categories listed in Table 1.

A third dimension is concerned with authorship and only applied to a small number of categories. Certain types of discourse contain an inherent authorship status: the people who write for established newspapers will be professional journalists, but the majority of discussion group users will use the forum for personal reasons. Blogs, however, can fulfil both functions: on the one hand they replace the old-fashioned diary or internal monologue, and on the other hand, they are used by professionals as an extension of or a complement to their work. We therefore distinguish between private and professional authorship. Although the distinction between private and professional blogs is not always straightforward, several linguistic and visual differences set them apart from each other. In professional blogs, for instance, a lot of space is filled with advertisements, much more so than in private blogs. Furthermore, professional blogs more frequently name the author or use the generic admin, whereas private blogs are characterised by authors publishing under nicknames or pseudonyms.

Unfortunately, the geographic origin of a page20 does not necessarily correspond to the current location of a user, let alone to his or her background. For some pages only, regions can be determined manually by relying on the information users share, for instance in discussion groups or blogs. The location of the author is thus deemed too unreliable to be included as a variable.

The lower classification level is concerned with a linguistic description of the individual tokens.
Whereas we have assumed semantic, socio-pragmatic and to a certain extent also textual homogeneity on the page level, the different tokens contained on a single page might differ with regard to a range of linguistic properties. Table 2 shows the classification scheme on the token level, which contains categories that are all more or less well established in linguistic terminology.

At present, classification proceeds manually, assisted by drop-down menus on the interface of the Observer. This process is to be automated as far as possible by means of URL parsing for the semantic and socio-pragmatic types and fields of discourse. Apart from automatic part-of-speech identification with parser and tagger, we aim to integrate further
20. The geographical location of a web server can be determined from its IP address, a practice called geolocation.
Table 2. Token-level classification scheme

Linguistic dimension        Class label        Class realization label
Syntax feature              Part of speech     verb, noun, adjective, adverb, interjection, phrase, other
Text feature                Position           banner, title, headline, body, footer, signature, caption, teaser, category, tag
Metalinguistic feature*     Explanation        definition, paraphrase, none
Sociolinguistic feature     Style of use       neutral, formal, informal, vulgar, e-speak
Cognitive feature           New referent       yes/no

* only applies to metalinguistic uses
tools that reduce the amount of manual classification required; a certain degree of manual labour will most likely remain indispensable.

3.4. The Discoverer: Identifying neologisms

Besides monitoring the development of known neologisms, one of the most important aims of the NeoCrawler project is to identify new words on the World Wide Web. Our vision is to find them on the very date of coinage and observe their development from that point on, but given the current size of the Internet – Google’s index listed eight billion pages in 2005 (Uyar 2009) – and the complexity of web technologies in general, this is an ambitious aim which we can only approximate for now. The NeoCrawler rises to this challenge in two ways.

The first method tackles the task with the help of the Observer by targeting metalinguistic markers of linguistic novelty. This means that the NeoCrawler searches for strings such as

– came up / made up / with a/the (new) term/word
– invented a/the (new) term/word
– coined / heard / read / stumbled upon a/the (new) term/word.

The output produced by the NeoCrawler is a table that displays the search strings in context along with the option to save a new word to the database for future observation. Once added to the database, the
neologism will be automatically included in the upcoming and all future crawling rounds. In the list of results to be reviewed manually, however, only the search string, such as “stumbled upon a new term”, can be automatically identified within the web page and thus highlighted. As a consequence, the researcher has to read and analyze large parts of the co-text to detect a new word, which is a time-consuming procedure. Another obvious disadvantage concerns the time of detection. Since we are relying on pages where people already talk about a new word, we are always one step behind, even though first attestations of neologisms are usually found in the first search, which is always conducted without time restriction.

The second method, implemented more recently and referred to as The Discoverer, tries to reduce the time gap between coinage and identification by means of a direct automatic analysis of web pages. This also has the advantage of drastically decreasing the necessary amount of manual intervention. The Discoverer was programmed by René Mattern, to whom we are greatly indebted, as part of his M.A. thesis in computational linguistics. At the time of writing, the Discoverer is in its testing phase, in which it does not yet crawl the entire Web for neologisms.

The Discoverer module is operated with a separate web interface that currently offers two possibilities: on request, the NeoCrawler searches for neologisms either in blogs on the Internet or in files on a local hard disk. In the case of the blog search, we have so far consulted only a few blogs preselected by Google on Google Blog Search21. For the time being, the blog search retrieves an individually specified number of blogs of all available categories22. In the next step, both blogs and files from the hard disk are prepared for processing.
The downloaded HTML files are stripped of all linguistically irrelevant content such as HTML tags and programming code, date and time, email addresses and URLs, and the NeoCrawler extracts the body of the blogs. The files and the plain text of the blogs are then split into single words, using capital letters and punctuation marks as delimiters between words. The remaining words are compiled into a list sorted by frequency in the text. This list is then passed through a set of filters. In

21. http://blogspot.google.com. Admittedly, we are subject here to commercially motivated selection by Google, but we intend to detach from the search engine in the near future to extend our blog search to all major blog providers.
22. At the time of writing, Google Blog Search presents current blogs of the following categories: politics, US, world, business, technology, video games, science, entertainment, movies, television and sports.
this process, the NeoCrawler eliminates stop words23, words with fewer than three letters and words containing more than two digits. Proper names are filtered out by consulting a database of proper names contributed by a cooperating department24. All remaining words are then compared to a reference dictionary and a user-generated catalogue of known words, which is currently based on a reduced version of Google’s web-scale N-grams25. The N-gram Corpus was created in 2006 and consists of about a trillion running words taken from web pages. This data is organized as unigrams, bigrams and so on up to five-grams. Taking into consideration the size of the corpus, we decided to use the approximately 14 million unigrams, i.e. single words, as a start, and also removed nonwords according to the same criteria later applied to the blogs. The resulting dictionary still contains more than 7.8 million tokens, which helps the NeoCrawler to filter out most of the words used before 2006, as well as common typing errors and misspellings.

The general output of the Discoverer still contains many items which clearly are not new words, or in fact are not words at all. Therefore, the Discoverer rates the remaining words by performing a trigram analysis on the sequences of letters. The NeoCrawler contains a database of trigrams (sequences of three letters), which is a list of all three-letter substrings of Google’s N-grams database and their respective frequencies. We assume that the trigrams represent typical sequences of letters in English words. With this reference, the frequencies of all trigrams within a potential neologism are used to calculate the probability that it is an English word. The words with the lowest values are dropped. At this point, the number of potential neologisms per average web page is down to fewer than ten, and the researcher has to go through this list of candidates manually and decide for each word whether it is a neologism, a known word or not a word at all.
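The trigram rating can be sketched as follows. The text does not give the formula the Discoverer uses, so the scoring below (the mean log relative frequency of a word's trigrams, with add-one smoothing) is an assumption, as are the function names and the `keep_fraction` cut-off. The sketch only illustrates the general idea that words built from rare letter sequences receive low values and are dropped.

```python
import math
from collections import Counter


def trigrams(word):
    """All three-letter substrings of a word, lowercased."""
    word = word.lower()
    return [word[i:i + 3] for i in range(len(word) - 2)]


def build_trigram_table(reference_counts):
    """reference_counts maps known words to their corpus frequencies."""
    table = Counter()
    for word, freq in reference_counts.items():
        for tri in trigrams(word):
            table[tri] += freq
    return table


def wordlikeness(word, table):
    """Mean log relative trigram frequency, add-one smoothed (assumed formula)."""
    tris = trigrams(word)
    if not tris:
        return float("-inf")
    total = sum(table.values())
    vocabulary = len(table) + 1
    log_probs = [math.log((table[t] + 1) / (total + vocabulary)) for t in tris]
    return sum(log_probs) / len(log_probs)


def drop_lowest(candidates, table, keep_fraction=0.5):
    """Keep the best-rated share of the candidates (hypothetical cut-off)."""
    ranked = sorted(candidates, key=lambda w: wordlikeness(w, table), reverse=True)
    keep = max(1, math.ceil(len(ranked) * keep_fraction))
    return ranked[:keep]
```

Averaging over the trigrams rather than summing keeps long candidates from being penalized merely for their length; whether the Discoverer normalizes in this way is not stated in the text.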
The NeoCrawler saves all words marked as “known” and “not a word” (including typing errors and misspellings) in two user-generated catalogues, which augment the N-grams database, so they will be ignored in future analyses. With these growing catalogues, we hope to soon decrease manual intervention to a minimum.

23. Stop words are extremely common words that typically cause problems in natural language processing and are therefore removed prior to processing (Luhn 1958).
24. We are indebted to Michaela Geierhos and the Centrum für Informations- und Sprachverarbeitung, LMU München; cf. Geierhos (2007).
25. Google’s N-grams are freely available at http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13.
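The filter cascade and the growing catalogues described above might look roughly like this in code. This is a sketch under stated assumptions, not the Discoverer's implementation: the stop-word list is a tiny placeholder, the regular expression used for splitting is a simplification of the delimiter rule mentioned in the text, and the function and parameter names are hypothetical. All look-up sets are expected to contain lowercased entries.

```python
import re
from collections import Counter

STOP_WORDS = {"the", "and", "for", "with", "that", "this"}  # placeholder list


def candidate_words(text, proper_names, known_words, rejected_words):
    """Return (word, frequency) pairs that survive all filters, most frequent first."""
    words = re.findall(r"[^\W_]+", text)  # runs of letters and digits
    frequencies = Counter(word.lower() for word in words)
    candidates = []
    for word, count in frequencies.most_common():
        if word in STOP_WORDS:
            continue
        if sum(ch.isalpha() for ch in word) < 3:  # fewer than three letters
            continue
        if sum(ch.isdigit() for ch in word) > 2:  # more than two digits
            continue
        if word in proper_names or word in known_words or word in rejected_words:
            continue
        candidates.append((word, count))
    return candidates


def record_decision(word, decision, known_words, rejected_words):
    """Store a manual verdict so the word is ignored in future analyses."""
    if decision == "known":
        known_words.add(word)
    elif decision == "not a word":
        rejected_words.add(word)
```

Because `record_decision` mutates the two catalogue sets in place, every manual verdict shrinks the candidate list of all later runs, which mirrors the intended reduction of manual work over time.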
If a word is marked as a neologism, the NeoCrawler saves it to the database. From then on, the Observer module will include it in the periodical crawling processes and analyze the results in the way described above.
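For the first, marker-based discovery method described earlier in this section, the search strings could be enumerated programmatically rather than typed by hand. The sketch below rests on one reading of the listed patterns (“came up with” and “made up” are treated as two separate verb phrases), and the quotation marks for exact-phrase search are an assumption about the query syntax, not a documented NeoCrawler setting.

```python
from itertools import product

# Verb phrases taken from the list of novelty markers (assumed reading).
VERBS = ("came up with", "made up", "invented", "coined",
         "heard", "read", "stumbled upon")
ARTICLES = ("a", "the")
MODIFIERS = ("", "new")  # the modifier "new" is optional
NOUNS = ("term", "word")


def marker_queries():
    """Build one quoted exact-phrase query per combination of the parts."""
    queries = []
    for parts in product(VERBS, ARTICLES, MODIFIERS, NOUNS):
        phrase = " ".join(part for part in parts if part)
        queries.append('"' + phrase + '"')
    return queries
```

With seven verb phrases, two articles, an optional modifier and two nouns, this yields 56 query strings.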
4. Applied NeoCrawling: detweet

In the following section we present a case study which illustrates some of the NeoCrawler’s functions and applications in the field of neologism monitoring. Our focus will be on the practical aspects of monitoring the diffusion, lexicalization and institutionalization processes observable for the young lexeme detweet. The study is based on 144 tokens of this form found up to April 2010 and of course cannot claim to come close to presenting statistically reliable analyses and interpretations. We have selected this small dataset for our case study because it provides maximum transparency for all stages of the application of the NeoCrawler.

The notion of diffusion is used to refer to the spread of a new word as measured in terms of discourse frequency, or more precisely in the present context, in terms of the number of tokens and types of new words found on Internet websites. Institutionalization is defined in a fairly narrow sense (as compared to, e.g. Bauer 1983: 48, Lipka 2002: 112, Brinton and Traugott 2005: 45–47) as a process of spread across text-types, register and genres, both within and outside the Internet, as well as across the fields of discourse mentioned in 3.3.3. The rationale behind this notion is that in addition to sheer frequency, the “success” of a new word is reflected in its spread across different socio-pragmatic situations and the purposes for which it is used. In line with existing suggestions (cf. e.g. Bauer 1983: 42–61, Brinton and Traugott 2005, Schmid 2011: 69 ff.), lexicalization is regarded as a cover term for structural changes undergone by neologisms, i.e. morphological, grammatical or semantic developments. Conventionalization will be used as a cover term subsuming diffusion and institutionalization, while establishment includes all three types of process.

4.1. First recorded occurrence

Detweet is one of the more recent coinages that have arisen after the introduction of the popular microblogging service Twitter. The sentence in (1) represents the first use that was found by the NeoCrawler in May 2008, when it appeared on a Question and Answer portal page called AskMosio. From a morphological perspective, detweet is the result of a prefixation
process and consists of the prefix de- and the base tweet, which is used as a noun and verb referring to ‘a Twitter message’ and ‘to post messages on Twitter’ respectively. Using the ablative prefix de-, detweet denotes the removal of Twitter messages or tweets, i.e. ‘to delete a tweet’.

(1) Can you delete your twitters? yup, login to twitter.com, then select the trashcan by the tweet you want detweeted. (my 1000th answer!!!).

In spite of the fact that the meaning of detweet in (1) is fairly clearly ‘delete’, not all of the word’s uses during its early stage of conventionalization allow for a similarly unambiguous semantic analysis. An occurrence of detweet in a tweet in October 2008, given in (2), poses a problem, for example. Although the co-text, which is restricted to 140 characters on Twitter, does not provide enough clues to assign a distinct meaning, it seems certain that the sense ‘to delete’ does not apply here:

(2) What is everyone going to do with their Twitter withdrawal time tonight? Is there a cure for the DT’s (DeTweets)?

Judging from the preceding phrase Twitter withdrawal time, the prefix de- might be interpreted as a negation of to tweet, yielding ‘not to tweet’. The presence of the definite article the, however, excludes a reading as a verb and suggests that DeTweet functions as a noun. This not only shows that, as predicted by lexicological theory, the meanings of new words are variable and subject to modifications, but also that their grammatical status seems to stay flexible. This should be kept in mind when we now proceed to report on the early diffusion of the form detweet and its semantic development.

4.2. Diffusion

By April 2010, the NeoCrawler had identified a total of 117 web pages that contained detweet in one of its word forms. Table 3 shows the distribution of tokens grouped according to word classes and word forms.
As the table shows, the majority of the 144 tokens extracted by the concordancing software CasualPConc26 are verbal forms (130 tokens, constituting 90.2%). Within the verbal paradigm, base-form occurrences accounted

26. CasualPConc is a freeware concordancing programme for Mac OS X. It works similarly to other concordancers like AntConc, but includes the advantage of concordancing parallel corpora. CasualPConc can be downloaded from http://sites.google.com/site/casualconc/, together with other CasualConc tools.
Table 3. Tokens per word form ratio

          Verbal forms                                  Nominal forms         Total
          detweet  detweets  detweeting  detweeted      detweet  detweets
Tokens    74       2         34          20             12       2            144
for the lion’s share (56.9%), followed by the present participle form detweeting (26.1%). Although one of the first known uses, as illustrated in (2), was in nominal form, the token analysis suggests that detweet is spreading in the Internet/speech community mainly as a verb.

To provide an idea of how the diffusion of detweet has proceeded so far, Figure 4 represents the overall number of pages per month that contain this form and its variants.27
Figure 4. Cumulated pages per month
27. Seven pages whose publication date could not be traced have been omitted.
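The shares quoted in this section follow directly from the counts in Table 3, as the short calculation below shows. The variable names are illustrative only; the counts are those of the table. The computed values agree with the percentages given in the text up to rounding of the last digit.

```python
# Token counts from Table 3, keyed by (word class, word form).
TABLE_3 = {
    ("verb", "detweet"): 74,
    ("verb", "detweets"): 2,
    ("verb", "detweeting"): 34,
    ("verb", "detweeted"): 20,
    ("noun", "detweet"): 12,
    ("noun", "detweets"): 2,
}


def share(part, whole):
    """Percentage of `whole` accounted for by `part`."""
    return 100 * part / whole


total = sum(TABLE_3.values())
verbal = sum(n for (word_class, _), n in TABLE_3.items() if word_class == "verb")

verbal_share = share(verbal, total)  # verbal forms among all tokens
base_share = share(TABLE_3[("verb", "detweet")], verbal)  # base form among verbal forms
participle_share = share(TABLE_3[("verb", "detweeting")], verbal)  # -ing form among verbal forms

print(f"{verbal} of {total} tokens are verbal ({verbal_share:.2f}%)")
print(f"base form: {base_share:.2f}% of verbal forms")
print(f"detweeting: {participle_share:.2f}% of verbal forms")
```

Note that the last two shares are calculated within the verbal paradigm (130 tokens), not over all 144 tokens.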
While the curve in Figure 4 suggests a continuous and constant increase in numbers of websites, this does not in fact do justice to the dynamics of the diffusion process. To provide a more detailed picture, Figure 5 charts the number of newly uploaded pages which were identified by the NeoCrawler at weekly intervals in the period from the first attested use in May 2008 up to April 2010.
Figure 5. New pages per month
This figure indicates that the number of websites containing detweet does not increase linearly. Instead, ups and downs can be observed, reflecting more or less intense communicative activity using the form detweet. The most striking peaks in Figure 5 are found around August 2009 and in early 2010. Two extra-linguistic events appear to be responsible for the increased use of detweet in August 2009. Firstly, at that time, J.R. Smith, a well-known NBA player, decided to suspend his Twitter account in the wake of some controversial tweets which stylistically resembled the discourse of a certain street gang. The original article entitled “J.R. Smith decides to deTweet” appeared in the Denver Post28 and was afterwards taken up in a specialized blog and discussion forum29.
28. http://www.denverpost.com/nuggets/ci_12993784
29. http://www.binarybasketball.com/forums/threads/9718-J.R.-Smith-decides-to-deTweet
Almost simultaneously, the Twitter account of a somewhat dubious businessman was deleted by Twitter itself, because he had been trying to raise money for another one of his suspicious activities30. Although this news did not spark an article in any of the established newspapers, it was passed around in several community portals, among them everyjoe.com (31 August 2009):

(3) [. . .] In the End, Rawman Was Detweeted. (http://www.everyjoe.com/articles/franchise-founder-loses-twitter-food-fight/)

This example confirms our earlier observation that different meanings, ‘to give up tweeting’ in the J.R. Smith case and ‘to be kicked out by Twitter’, are competing with each other. In addition to (3), there are only two more uses in which the passive form be detweeted refers to the act of being removed from the Twitter service.

4.3. Lexicalization

As predicted by lexicological theory (cf. e.g. Lipka 2002: 110 ff.; Schmid 2011: 73–83), then, the recent coinage detweet still seems to be both grammatically and semantically – and, incidentally, orthographically – unstable, or, and this remains to be observed in the future, has already embarked on developing a system of polysemous senses associated with the form. In this section we will leave the level of the diffusion of the form in the (cyber-)speech community and move to a semantic investigation of the data reaped by the NeoCrawler.31

The most frequently used meaning in the data available so far, which can be rendered as ‘sign off’, is illustrated in a tweet from April 2010 in (4):

(4) Detweeting until 3–5 pm. If needed DM/text/email me.

This sense is instantiated in 29.5% of the records. What is important is that of the 36 tokens, only one is metalinguistic in nature, which indicates that this sense is currently the preferred ‘normal’, i.e. object-linguistic, use in the speech community. Since the denotatum is clearly an action, it is
30. http://www.everyjoe.com/articles/franchise-founder-loses-twitter-food-fight/
31. Eighteen pages, where the meaning could not be disambiguated or determined on the basis of the often insufficiently informative co-text, were omitted from further analysis.
hardly surprising that detweet typically occurs as a verb (in the infinitive or as the present participle).

In contrast, the second most frequently found sense of detweet is mostly used as a noun and in metalinguistic uses. Detweet in this sense of ‘forwarding a tweet with disapproval’ accounts for 23.7% of the tokens, but the majority of these occurrences are metalinguistic comments such as the definition in (5) or references to this blog entry. In example (5), which was taken from the author-coiner’s blog post in February 2009, the author explicitly defines detweet as the opposite of the better-known retweet32.

(5) So I’m going to just De-Tweet it in the same way people Re-Tweet stuff. I hope to start a trend. The DeTweet Defined: DeTweet (AKA: De-Tweet or DT) = Passing along the tweet of another with some degree of disapproval. It can range from strong (that’s a lie) to mild (there are exceptions or conditions).

The meaning evoked by (6), synonymous with ‘to unfollow’, i.e. to stop following someone’s tweets, was identified in 17 tokens (13.9%). For this sense, only one metalinguistic result was recorded. Similarly to the first meaning ‘sign off’, the action-like character of the word is reflected in its exclusive use as a verb in the entire inflectional paradigm. Although the third person singular form was found only once, the other morphological options did not show any preferences. This particular meaning is illustrated in (6), which was found in a private blog post in March 2010.

(6) I mean Barack Obama, Martha Stewart, Dame Elizabeth (whom I had to detweet for spamming me about that whole Michael Jackson nonsense) never started following me.

Finally, two usage types can be identified which occur predominantly in the passive voice.
The first, ‘be removed from Twitter’, was already illustrated in example (3) above (“Rawman was detweeted”). In addition, the object
32. To retweet means ‘to post a tweet of another user on your page, because it is funny, important, meaningful, etc.’ It is followed by the abbreviation RT.
of the detweeting process can also be a Twitter message deleted by the Twitter team, as demonstrated in example (7) from a private blog in March 2010.

(7) Detweeted. One of my tweets disappeared today. It wasn’t a latency issue – sometimes text tweets to Twitter appear several hours later or never appear at all. This tweet was in my stream long enough to receive a reply and to be referenced in another tweet before it went missing. I didn’t delete it, and I’ve never experienced or heard chatter about spontaneously combusting tweets before, which led me to wonder if Twitter administrators deleted it because they considered it offensive.

It could be argued that the sense in (7) is a semantic narrowing of ‘to delete’, as it is not the individual user that decides to remove their tweets, but the Twitter authorities. A mere 8% of the tokens are uses of this type. In terms of grammatical form, 8 out of 10 tokens were past participles, and the third person plural form, preceded by Twitter as subject, was found once. Meaning and grammatical form thus strongly correlate. Table 4 provides a summary of the five senses identified in the dataset and cross-tabulates them with their grammatical distribution. While it is of course impossible to predict whether some or only one of the five meanings will eventually win the race for establishment and push out the others, or whether a system of five polysemous senses will stabilize, it is interesting to chart the temporal development of the senses. This is rendered in Figure 6, which gives the timeline of the frequencies for each of the five semantic usage types.

Table 4. Grammatical-semantic distribution per word form

Sense                                detweet (V)   Other forms   Total
1) to sign off                            15            20          35
2) to delete                              22             8          30
3) to pass along with disapproval         16             9          25
4) to unfollow                             7            10          17
5) to be removed from Twitter              1             9          10

Other forms: detweet (N), detweets (V), detweets (N), detweeting and detweeted.
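The percentage quoted for the sense ‘to be removed from Twitter’ can be checked against the row totals of Table 4. The following is a minimal sketch, assuming that the totals per sense are 35, 30, 25, 17 and 10; the names `SENSE_TOTALS` and `sense_shares` are illustrative and not part of the NeoCrawler.

```python
# Row totals per sense, as read from Table 4 (an assumption about the table layout).
SENSE_TOTALS = {
    "to sign off": 35,
    "to delete": 30,
    "to pass along with disapproval": 25,
    "to unfollow": 17,
    "to be removed from Twitter": 10,
}


def sense_shares(totals):
    """Return each sense's percentage share of all annotated tokens."""
    grand_total = sum(totals.values())
    return {sense: 100 * count / grand_total for sense, count in totals.items()}


shares = sense_shares(SENSE_TOTALS)

# Print the senses from most to least frequent, with one decimal place.
for sense, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{sense:32s} {pct:5.1f}%")
```

Under these assumptions the ten tokens of the fifth sense amount to roughly 8.5% of 117 tokens, which is in line with the figure of about 8% given in the text.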
Figure 6. Monthly page frequency per assigned meaning
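A timeline of this kind can in principle be derived by grouping annotated pages by month and by assigned meaning. The sketch below is only an illustration of that tallying step: the records are invented for the example and are not the study's data, and `monthly_frequency` is a hypothetical helper, not a NeoCrawler function.

```python
from collections import Counter, defaultdict

# Invented records for illustration only: (ISO date of the page, assigned sense).
RECORDS = [
    ("2009-02-11", "to pass along with disapproval"),
    ("2009-08-03", "to be removed from Twitter"),
    ("2009-08-19", "to be removed from Twitter"),
    ("2010-03-05", "to sign off"),
    ("2010-03-21", "to sign off"),
    ("2010-03-28", "to unfollow"),
]


def monthly_frequency(records):
    """Map each sense to a Counter of 'YYYY-MM' keys and page counts."""
    timeline = defaultdict(Counter)
    for date, sense in records:
        month = date[:7]  # keep only the year and month of the ISO date
        timeline[sense][month] += 1
    return timeline


timeline = monthly_frequency(RECORDS)

# List the monthly page frequency for each sense in chronological order.
for sense in sorted(timeline):
    for month in sorted(timeline[sense]):
        print(month, sense, timeline[sense][month])
```

Counting pages rather than tokens is an assumption suggested by the caption; a token-based count would simply add one record per occurrence instead of one per page.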
As mentioned above, the rather irregular peak in August 2009 is caused by an increased frequency of detweet with the meaning ‘being removed from Twitter’. The graph shows that except for this peak, this meaning of detweet has apparently not caught on and has disappeared from use. The same pattern is found for ‘to pass along with disapproval’. After its deliberate coinage in February 2009, an effort was made by the author to facilitate the spread of detweet in this particular sense. The many metalinguistic results in our data set confirm this development. However, these efforts were rather unsuccessful, since the graph shows that frequencies did not increase, but rather dropped. As Metcalf (2002: 185) notes, attempts at establishing a new word will stand a better chance if the word is ‘sneaked’ into the language without creating a buzz around it.

Having begun its lexicalization process with the meaning of ‘to delete’, detweet has now acquired other and indeed more frequently used meanings. Its original meaning is still in use, but to a lesser degree. At the time of writing, ‘to unfollow’ and most notably ‘to sign off’ prevail. While we do not want to engage in new-word astrology, we can venture the prediction that the latter meaning will become fixed for reasons of language economy, as unfollow has already become conventionalized in the meaning in question, which might make a new word form for the same concept redundant.

4.4. Institutionalization

As we have mentioned, describing the diffusion of a new word in a speech community, even if it is just a limited one of the type studied here, is not just a matter of monitoring the frequency of use as discussed in Section 4.2, but also relates to the socio-pragmatic spread of a new lexical item across text-types, semantic domains and registers. Figure 7 presents a text-type analysis of occurrences of detweet in the five different meanings, which is based on the categories used for annotating NeoCrawler data (cf. 3.3.3).
Unsurprisingly, all meanings are used to some degree on Twitter. Specifically, ‘to sign off’ is frequently found in this discourse domain, because detweeting has become a common expression among Twitter users to indicate their upcoming off-line status. The text-type distribution, however, shows that this usage-type is by no means restricted to the microblogging genre, as detweet also appears in personal blogs and community portals. These three kinds of text type represent the informal end of the Internet genre continuum; other genres on the more formal side, such as news media, do not feature the word detweet so far, with the exception of the
Denver Post mention. This suggests that so far detweet has only been institutionalized somewhat tentatively, because it has not started to disperse into more formal registers and text types. It also remains doubtful whether this spread will take place at all, since the concept is, at this stage at least, used exclusively with respect to Twitter activities. It is not unlikely that its morphological make-up, i.e. the Twitter-specific base tweet, will prevent a future cross-over into other registers and discourse types, because of its strong cognitive association with Twitter. The current results would support this claim, but further monitoring is necessary.

Figure 7. Overall text type distribution of detweet

Characteristic of neologisms, furthermore, is the presence of metalinguistic activity. Nearly all of the observed meanings of detweet have been written about and commented upon linguistically by users. Two developments can be distinguished here. In the first a metalinguistic comment is the earliest occurrence and the word is subsequently used in an object-linguistic manner. This is the case for the oldest meaning ‘to delete’. In the complementary type, the word is first used in the speech community and then commented upon at a later stage. Detweet with the meaning of ‘signing off’ represents this case. One of the earlier occurrences was on Twitter in June 2009. In the subsequent months, detweeting stayed under
the radar of linguistic observation and did not receive metalinguistic attention until March 2010. Interestingly, it is precisely this unobtrusive, unremarked use that prevails. Although cognitively more prominent in its sense as the lexical opposite of retweet and actively propagated by the inventor, the meaning of ‘pass along with disapproval’ has not become established. Whether the presence or absence of metalinguistic comments is a merely coincidental factor, or whether it has a significant influence on the conventionalization process, is also a topic for further research.

4.5. Lexical network formation

The NeoCrawler not only allows us to investigate the diffusion of a neologism throughout the language community, registers and genres, but also to describe the lexical networks it starts to develop after its introduction. Arguably, this is an important indicator for the establishment of new words, not just from a language-systemic point of view, but also from a cognitive one, since network-building is a crucial step in lexical acquisition and the life-long reorganization of the mental lexicon (Aitchison 2003: 189–199). The following section will discuss some of the paradigmatic and syntagmatic patterns that detweet has already established in its early stages, which are also seen as initial evidence of the emergence of cognitive routines in the minds of language users. In just over 30% (7 out of 22 tokens) of its occurrences with the meaning ‘to delete’, detweet is complemented by the noun tweet, which is of course identical in form to the base of the prefixed verb. These occurrences are all metalinguistic uses providing definitions. For detweeted, too, tweet was found to collocate in almost half of the subsequent co-texts. These observations will hardly come as a surprise, since it is only reasonable to explain the meaning of a prefixed verb with reference to its base.
On the other hand, neglecting the metalinguistic function of these uses, to detweet a tweet can be regarded as an incipient lexical collocation or a ‘cognate’ verb-object construction acquiring the status of a collostruction (cf. Stefanowitsch and Gries 2003). The restricted, metalinguistic use of detweet in the sense ‘to pass along with disapproval’ is also confirmed by the collocational analysis. Firstly, it is mainly preceded by introducing, which is part of the title of the article in which its coinage is explained. Secondly, the antonym retweet is also found in the immediate co-text, which indicates that the writer consciously tries to establish a lexical and cognitive reference to a word that is supposedly known to the readers.
The synonyms unfollow and not follow and the antonym follow occur in 30% of the neighbouring co-texts of detweet as ‘to unfollow’. Collostructional preference for an object or a subject was not observed. The tokens furthermore occurred in object-linguistic use, so that the synonyms and the opposite serve as valuable cognitive and lexical anchoring points in the meaning negotiation process required by the reader. These preliminary results indicate that since its inception, users of detweet have relied on strong morphological, lexical and semantic connections to the co-text. Whether and how long these initial semantico-lexical relationships are retained during the lexicalization and institutionalization process, when the need for co-textual clues is reduced due to the strengthening and disambiguation of meaning, and, more importantly, the extent of their positive or negative effect on diffusion constitute further interesting questions for future research.
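The co-text observations in this section rest on counting which word forms occur near a node word. The following is a minimal sketch of such a window-based count; the function `cotext_collocates`, the window size of five tokens and the sample sentences are all assumptions made for illustration and do not reproduce the NeoCrawler's own procedure or data.

```python
import re
from collections import Counter


def cotext_collocates(texts, node_pattern, window=5):
    """Count word forms within `window` tokens to the left or right of each node match."""
    node = re.compile(node_pattern)
    counts = Counter()
    for text in texts:
        # Very simple tokenization: lower-cased runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        for i, token in enumerate(tokens):
            if node.fullmatch(token):
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                counts.update(left + right)
    return counts


# Invented example sentences, not taken from the NeoCrawler data.
SAMPLE = [
    "You can detweet a tweet you regret.",
    "I detweeted that tweet within minutes.",
    "Time to detweet and unfollow a few accounts.",
]

collocates = cotext_collocates(SAMPLE, r"detweet(s|ed|ing)?")
print(collocates.most_common(3))
```

A raw count of this kind only shows co-occurrence; deciding whether a pairing such as detweet a tweet is a genuine collocation would additionally require an association measure computed against a reference corpus.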
5. Summary and Outlook

In this paper we have described a new methodology for the identification, retrieval and linguistic analysis of neologisms. We hope that the case study presented in Section 4 has provided an idea of the potential of the NeoCrawler for supplying the means to address long-standing questions in historical semantics and lexicology. Specifically, the case study on the neologism detweet has demonstrated how the NeoCrawler can facilitate the study of processes such as

– semantic disambiguation, competition-resolution and semantic change (i.e. lexicalization processes);
– semantic-grammatical correlations between word classes and meanings;
– diffusion, i.e. changes in discourse frequency;
– institutionalization, i.e. spread across text-types, genres, fields of discourse, functions (including meta-linguistic vs. object-linguistic uses);
– incipient network-formation manifested in evidence for a gradual establishing of paradigmatic and syntagmatic relations.

In short, possible applications of the NeoCrawler pertain to the fields of semantic change, early morphological and grammatical change, the establishment of collocations, collostructions and valency patterns, as well as use-related aspects. In the future, the NeoCrawler is to be optimized in a number of directions including automatic classification of fields of discourse, addition of
another module to search microblogging services and extension to other search engines. Our impression is that the combination of the Discoverer and the Observer as well as reliance on the relational database approach have proven quite rewarding and promising.
References

Aitchison, Jean. 2003. Words in the Mind: An Introduction to the Mental Lexicon, 3rd ed. Malden, MA: Blackwell.
Andrés, Louis, David Cuberes, Mame Astou Diouf and Tomás Serebrisky. 2007. Diffusion of the Internet: A Cross-Country Analysis. World Bank Policy Research Paper WPS4420.
Bauer, Laurie. 1983. English Word-formation. Cambridge: Cambridge University Press.
Beaugrande, Robert-Alain and Wolfgang Dressler. 1981. Introduction to Text Linguistics. London: Longman.
Bergh, Gunnar. 2005. Min(d)ing English language data on the Web. What can Google tell us? ICAME Journal 29: 25–46.
Biber, Douglas. 1988. Variation across Speech and Writing. Cambridge: Cambridge University Press.
Biber, Douglas. 1989. A typology of English texts. Linguistics 27: 3–43.
Biber, Douglas. 1995. Dimensions of Register Variation. Cambridge: Cambridge University Press.
Biber, Douglas. 2007. Towards a taxonomy of web registers and text types: a multidimensional analysis. In: Marianne Hundt, Nadja Nesselhauf and Carolin Biewer (eds.), Corpus Linguistics and the Web, 109–131. Amsterdam: Rodopi.
Brinton, Laurel J. and Elizabeth Closs Traugott. 2005. Lexicalization and Language Change. Cambridge: Cambridge University Press.
Buchstaller, Isabelle, John R. Rickford, Elizabeth Closs Traugott, Thomas Wasow and Arnold Zwicky. 2010. The sociolinguistics of a short-lived innovation: tracing the development of quotative all across spoken and internet newsgroup data. Language Variation and Change 22: 191–219.
Carletta, Jean, Stefan Evert, Ulrich Heid, and J Kilgour. 2005. The NITE XML toolkit: Data model and query language. Language Resources and Evaluation 39 (4): 313–334.
de Kunder, Maurice. 2007. Geschatte Grootte van het Geïndexeerde World Wide Web. MA thesis, Tilburg University. www.dekunder.nl (accessed December 21, 2010).
Dipper, Stefanie. 2005. XML-based stand-off representation and exploitation of multilevel linguistic annotation. Proceedings of Berliner XML Tage 2005 (BXML 2005).
Eckart, Richard. 2008. Choosing an XML database for linguistically annotated corpora. Sprache und Datenverarbeitung 32 (1): 7–22.
Evert, Stefan. 2010. Google Web 1T 5-Grams made easy (but not for the computer). Sixth Web as Corpus Workshop (WAC-6).
Fairon, Cédrick, Kévin Macé and Hubert Naets. 2008. GlossaNet 2: a linguistic search engine for RSS-based corpora. In: Proceedings of LREC 2008, Workshop Web As Corpus (WAC4), Marrakesh. http://cental.fltr.ucl.ac.be/team/~ced/papers/2008-wac4-glossanet.pdf (accessed May 27, 2010).
Ferrara, Kathleen, Hans Brunner and Greg Whittemore. 1991. Interactive written discourse as an emergent register. Written Communication 8 (1): 8–34.
Fletcher, William. 2001. Concordancing the Web with KWiCFinder. American Association for Applied Corpus Linguistics. Third North American Symposium on Corpus Linguistics and Language Teaching. Boston, MA, 23–25 March 2001. http://kwicfinder.com/FletcherCLLT2001.pdf (accessed May 27, 2010).
Fletcher, William. 2007. Concordancing the web: promise and problems, tools and techniques. In: Marianne Hundt, Nadja Nesselhauf and Carolin Biewer (eds.), Corpus Linguistics and the Web, 25–45. Amsterdam: Rodopi.
Folch, Helka, Serge Heiden, Benoît Habert, Serge Fleury, Gabriel Illouz, Pierre Lafon, Julien Nioche and Sophie Prévost. 2000. TypTex: Inductive typological text classification by multivariate statistical analysis for NLP systems tuning/evaluation. Proceedings of the Second Language Resources and Evaluation Conference.
Ghodke, Sumukh, and Steven Bird. 2008. Querying linguistic annotations. Proceedings of the Thirteenth Australasian Document Computing Symposium.
Gulli, Antonio and Alessio Signorini. 2005. The indexable web is more than 11.5 billion pages. Proceedings WWW ’05 Special Interest Tracks and Posters of the 14th International Conference on World Wide Web, 903–904. New York: ACM.
Hayashi, Larry, and John Hatton. 2001. Combining UML, XML and relational database technologies. The best of all worlds for robust linguistic databases. Proceedings of the IRCS Workshop on Linguistic Databases.
Herring, Susan C. 2007. A faceted classification scheme for Computer-Mediated Discourse. Language@internet 4. http://www.languageatinternet.de/articles/2007/761/ (accessed June 2, 2010).
Hohenhaus, Peter. 1996. Ad-hoc-Wortbildung. Terminologie, Typologie und Theorie kreativer Wortbildung im Englischen. Frankfurt: Peter Lang.
Hohenhaus, Peter. 2006. Bouncebackability. A web-as-corpus-based study of a new formation, its interpretation, generalization/spread and subsequent decline. SKASE Journal of Theoretical Linguistics 3: 17–27.
Hymes, Dell. 1974. Foundations in Sociolinguistics: An Ethnographic Approach. Philadelphia: University of Pennsylvania Press.
Ide, Nancy, Patrice Bonhomme, and Laurent Romary. 2002. XCES: An XML-based encoding standard for linguistic corpora. LREC 2000 2nd International Conference on Language Resources and Evaluation, Athens.
Kilgarriff, Adam. 2003. Linguistic Search Engine. Proceedings of the Shallow Processing of Large Corpora Workshop (SProLaC 2003), Corpus Linguistics 2003. Lancaster University. http://www.kilgarriff.co.uk/Publications/2003-K-LSEsprolac.pdf (accessed May 27, 2010).
Leech, Geoffrey, Marianne Hundt, Christian Mair and Nicolas Smith. 2009. Change in Contemporary English: A Grammatical Study. Cambridge: Cambridge University Press.
Lewandowski, Dirk. 2008a. A three-year study on the freshness of web search engine databases. Journal of Information Science 34 (6): 817–831.
Lewandowski, Dirk. 2008b. The retrieval effectiveness of web search engines considering results descriptions. Journal of Documentation 64 (6): 915–937.
Lipka, Leonhard. 2002. English Lexicology: Lexical Structure, Word Semantics and Wordformation. Tübingen: Narr.
Luhn, Hans Peter. 1958. The automatic creation of literature abstracts. IBM Journal of Research Development 2 (2): 159–165.
Mair, Christian. 2006. Twentieth-Century English: History, Variation and Standardization. Cambridge: Cambridge University Press.
McEnery, Tony, Richard Xiao and Yukio Tono. 2006. Corpus-based Language Studies. London: Routledge.
Metcalf, Allan. 2002. Predicting New Words. Boston: Houghton Mifflin Company.
Murray, Denise E. 1990. CmC. English Today 23: 42–46.
Ntoulas, Alexandros, Junghoo Cho and Christopher Olston. 2004. What’s new on the Web? The evolution of the Web from a search engine perspective. http://www.cs.cmu.edu/~olston/publications/webstudy.pdf (accessed May 12, 2010).
Odlyzko, Andrew. 2003. Internet growth: Myth and reality, use and abuse. SPIE – Optical Transmission Systems and Equipment WDM Networking II, Vol. 5247: 1–15.
Renouf, Antoinette (ed.). 1998. Explorations in Corpus Linguistics. Amsterdam & Atlanta: Rodopi.
Renouf, Antoinette, Andrew Kehoe and Jay Banerjee. 2005. The WebCorp Search Engine. A holistic approach to web text search. Electronic Proceedings of CL2005, University of Birmingham. http://www.webcorp.org.uk/publications.html (accessed May 27, 2010).
Schmid, Hans-Jörg. 2011. English Morphology and Word-formation: An Introduction, 2nd ed. Berlin: Erich Schmidt Verlag.
Stefanowitsch, Anatol and Stefan Th. Gries. 2003. Collostructions: investigating the interaction of words and constructions. International Journal of Corpus Linguistics 8 (2): 209–243.
Štekauer, Pavol. 2002. On the theory of neologisms and nonce-formations. Australian Journal of Linguistics 22 (1): 97–112.
Tournier, Jean. 1985. Introduction Descriptive à la Lexicogénétique de l’Anglais Contemporain. Paris: Champion-Slatkine.
Uyar, A. 2009. Investigation of the accuracy of search engine hit counts. Journal of Information Science 35 (4): 469–480.
Werlich, Egon. 1976. A Text Grammar of English. Heidelberg: Quelle and Meyer.
Commentary: Data and Sources

Philip Durkin

The three contributions in the first part of this volume have in common at least three strands: firstly (as made clear by the section heading), an engagement with questions of data; secondly, more specifically, an engagement or dialogue with the data of dictionaries, especially the Oxford English Dictionary (OED); thirdly, an engagement with some questions that have been of central concern in the study of lexis for many decades, here shown in a new light through contemporary approaches (particularly from within the cognitive semantics paradigm).

Perhaps I may begin with some observations very much from the standpoint of an OED lexicographer. Allan’s contribution notes that electronic publication of the OED has opened up new possibilities for researchers that could not have been envisaged by the editors of its first edition in the closing decades of the nineteenth century and the first few decades of the twentieth. It should also be noted that editing a dictionary in an electronic format has opened up new possibilities and new challenges for lexicographers. Most fundamentally, it has become much more possible to identify and trace connections between words than was possible within the limits of alphabetical ordering in a printed dictionary. In this respect, as described beautifully by Kay’s contribution, a new resource, the Historical Thesaurus of the Oxford English Dictionary (HTOED), is having an equally transformative role. And, crucially, it is in its electronic form, integrated as part of OED Online, that HTOED becomes most powerful as a tool for exploring word histories: it is now possible to move from OED senses to the corresponding location in the HTOED structure, and likewise from any item in HTOED to the corresponding OED entry. (Although, as noted in Kay’s contribution, the complex structure of HTOED categories can still sometimes be most readily appreciated on the printed page.)
A seamless interface between OED and HTOED is more than just a convenient feature on a website, in two fundamental respects: firstly, it enables the HTOED data to be updated as OED is revised and updated; and secondly, it is of crucial importance for exploring the relationships between large-scale lexical networks and the detailed histories of individual words.
The three contributions here neatly illustrate the importance of paying attention to various different levels of detail in describing and researching the histories of lexemes. The contribution from Kay shows the importance of the structure of HTOED for gaining an overview of the lexical history of English at a high level of abstraction. The contribution from Allan illustrates the importance of looking at dictionary data in detail, rather than relying solely on a summary of first and last dates of attestation. The contribution from Kerremans et al. gives an exciting illustration of the complexity we can find when evidence is abundant and it is possible to move beyond the level of abstraction and summary found in even the largest dictionaries.

As background to these issues, I would like to look first in a little more detail at the third of the common strands in these papers identified above: engagement with, and new approaches to, some questions that have been of central concern in the study of lexis for many decades. The contributions by both Allan and Kay mention the work of Ullmann in the 1950s and 1960s, while that by Kay also mentions von Wartburg, with whom Ullmann collaborated closely in the latter stages of von Wartburg’s career. The work of these scholars provided an important impetus for HTOED. It also left some important open questions for future research, which the contributions in this volume move us closer towards answering.

Von Wartburg’s work on lexical change and etymology was of course firmly anchored in his lifelong lexicographical labours on the Französisches etymologisches Wörterbuch (FEW). The subsequent editions of his Einführung in Problematik und Methodik der Sprachwissenschaft (1943, 1962; English translation, Problems and Methods in Linguistics, 1969) show a keen consciousness of the importance of interaction between the individual lexemes within any linguistic system.
The FEW itself collects material under etymons, rather than under their reflexes, as in the conventional format of a historical dictionary such as the Oxford English Dictionary (OED). Thus modern French mettre is found under its etymon Latin mittĕre. This has the considerable advantage of bringing etymologically related items together in a single dictionary entry; thus, under mittĕre we find also message, messager, mise, entremettre, démettre, émettre, and others. Relationships and divergences within an etymologically related word group can be traced along various axes: regional variation (since regional forms are included from the whole of the Gallo-Romance area), morphological variation, and, crucially for current purposes, semantic
development and variation. Thus divergent and convergent semantic developments within an etymologically related word group can be traced with great ease, as can differences of semantic development in different regional varieties, as can semantic differentiation between morphologically distinct items within a word group.¹ Such a structure thus has distinct advantages for anyone wanting to explore lexical relations within an etymologically related word group, albeit bringing with it disadvantages for locating particular lexical items that make it unattractive for the general reader, as well as real methodological problems in dealing with cases of etymological merger.² However, few scholars have shown more penetrating insight than von Wartburg himself into the complex interactions between lexical items which do not belong to the same etymologically related group, both within more or less narrowly definable semantic fields and more widely:

Der Wortschatz ein großes Ganzes ist, innerhalb dessen jedes Glied, jedes einzelne Wort seine besondere Stellung in seiner Umwelt hat, in Beziehung steht zu den benachbarten Gliedern. Innerhalb dieses Ganzen gibt es gewisse Gebiete, die einem klar abgegrenzten Teil der subjektiven Welt entsprechen; in diesen schließen sich die einzelnen Elemente des Wortschatzes zu einem gegliederten Feld zusammen; an andern Stellen ist das Gefüge lockerer, sind die Wörter nicht in gleichem Maße zur Einheit und Untereinheit versammelt. (von Wartburg 1962: 174)

[Published translation: The vocabulary is a vast whole within which each element, each individual word, has its own particular position in its environment and bears a relation to neighbouring elements. Within this whole there are some domains which correspond to a clearly circumscribed sector of the subjective world; in these domains, the different elements of the vocabulary come together to form a co-ordinated field. Elsewhere, however, the structure of the system is looser and there is no such strict co-ordination. (von Wartburg 1969: 174)]
1. On the structures of FEW see especially Büchi (1996).
2. Compare Durkin (2009) 79–83, 86–88.

Crucially, he goes on to note the fundamental problems posed by the differing lexical repertoire available to different speakers of a language even within a single time period:
Dieser Wortschatz als Ausdruck des Weltbildes ist natürlich innerhalb einer Sprachgemeinschaft in sehr verschiedenem Maße lebendig. Niemand kann das ganze Vokabular seiner Sprache besitzen. Auch wenn man von regionalen Verschiedenheiten absieht, so bleibt doch das gewaltige Gebiet der Berufssprachen. (von Wartburg 1962: 174)

[Published translation: This vocabulary, as the expression of a world-picture, is naturally present in very varying degrees among the members of a linguistic community; quite apart from regional differences, there is also the vast domain of the professional languages. (von Wartburg 1969: 174)]
He goes on to note the implications for the tools needed by the researcher, with a typically bold manifesto:

Das wissenschaftliche, deskriptive Wörterbuch muß die nichts-sagende, unwissenschaftliche Anordnung nach dem Alphabet aufgeben. Es wird nie möglich sein, das Wesen des Sprachschatzes als der Gestaltung des Weltbildes einer Volksgemeinschaft zu einer bestimmten Zeit und ihre innere Ökonomie zu erfassen, solange nicht die alphabetische Reihenfolge ersetzt wird durch ein in der Sprache in ihrem jeweiligen Zustand selbst abgelauschtes System. Das Alphabet ist natürlich als praktische Nachschlageordnung nicht zu entbehren, aber es gehört als Ordnungsprinzip ins Schlußregister. Ein solches System kann nie durchgehend das gleiche sein, weil eben das Weltbild selber und damit auch sein Ausdruck sich dauernd verschiebt. Schon durch die Modifikationen, die das System erleidet, wird sich ein Bild ergeben von der neuen Gestalt des geistigen und materiellen Lebens eines Volkes. (von Wartburg 1962: 175)

[Published translation: A scientific descriptive dictionary must abandon the meaningless and unscientific principle of alphabetical order. It will never be possible to understand the true nature of the vocabulary qua manifestation of the world-picture current in the community at a particular period, or to discern the general pattern of its internal economy, until alphabetical order is replaced by a system dictated by the state of the language itself at a given moment in time. Alphabetical order is obviously indispensable for purposes of reference, but as a principle of classification its place is in the index. Such a system as we envisage cannot, of course, be the same for all periods, since the world-picture – and therefore also its expression – is constantly changing. The very modifications in the system will provide a picture of the changes which have taken place in the spiritual and material life of a people. (von Wartburg 1969: 174)]
As a step towards such an approach to the lexicon, Hallig and von Wartburg (1962 [ed. 1 1952]) presented a classification of concepts from a “naively realist” viewpoint (see further Kay’s contribution in this volume).
The 1962 edition of von Wartburg’s Einführung in Problematik und Methodik der Sprachwissenschaft was prepared in collaboration with Stephen Ullmann, and similar ideas permeate Ullmann (1959 [1951]) and Ullmann (1962), which were both enormously influential on the development of semantics in the English-speaking world. The influence on the conception and development of HTOED is very clear. In a passage near the end of his Linguistic Evolution (quoted also in the contribution by Kay), Samuels notes:

No solution to the problem of push- and drag-chains in lexis will be forthcoming until it is possible to study simultaneously all the forms involved in a complex series of semantic shifts and replacements. The required data exist in multivolume historical dictionaries like the OED, but they cannot be utilised because the presentation is alphabetical, not notional. The need is for a historical thesaurus which will bring together under single heads all the words, current or obsolete (and all the obsolete meanings of words still current) that have ever been used to express single and related notions. (Samuels 1972: 180)
Tellingly, this passage has one footnote, referring to the passage from von Wartburg (1962) from which the above quotations are taken, and to a passage from Ullmann (1962) which itself describes Hallig and von Wartburg (1962). The importance of this intellectual context for HTOED is thus very evident (as noted also by Kay, this volume); as I hope to show, the contributions by Allan and by Kerremans et al. can usefully be seen in the same tradition, as moving closer towards answers to some of the questions posed by von Wartburg, Ullmann, and Samuels which have been of perennial concern to lexicologists and etymologists, as well as to questions posed more recently within the cognitive linguistics paradigm.

Returning now to Allan’s contribution in this volume, the first of her three case studies, milksop, demonstrates the importance of close attention to dictionary data. The picture presented by the bare dates of first attestation is startlingly counter-intuitive, since the figurative use denoting a feeble, timid, or ineffectual person is attested earlier than the literal meaning denoting a milk-soaked piece of bread. Close inspection of the dictionary’s quotation evidence, combined with consideration of the survival of relevant text types from the period concerned (late Middle English), and with comparison with the history of the simplex sop, all support the conclusion reached also in the OED entry: the use denoting a person is indeed a figurative development from the use denoting a piece of bread, and what we are dealing with here is simply a failure in the historical record. Hence we see the importance of moving constantly back and forth
between the high-level abstraction of the HTOED data (in which these two senses appear with their first dates of attestation only) and the finer-grained detail of the OED entry, which makes it possible to resolve an apparently counter-intuitive word history with some reasonable degree of certainty. Her second and third test cases, pregnant and dull, both highlight in di¤erent ways the importance of not losing sight of another level of detail in the OED entries, namely its etymologies. pregnant again appears to show a counter-intuitive history if the first dates of attestation of its English senses are viewed in isolation. However, in this instance it is crucial to bear in mind that it is a loanword, which entails that two further dimensions must be investigated: firstly, its meanings in the donor languages, Latin and French, documented in the OED etymology;3 and secondly, via HTOED, the relationship of the borrowed word with existing English lexis in the same semantic fields. The third test case, dull, illustrates the importance of paying attention both to etymology (through OED) and to related lexical items within English (via HTOED) still more graphically: the evidence of the cognates of dull in other Germanic languages strongly supports the supposition that the mental sense was earliest; investigation of the historical relationship with the antonym sharp suggests a mechanism for explaining how an apparently counter-intuitive semantic development may have come about. The contribution by Kerremans et al. gives a glimpse of the fuller picture that lies behind the level of abstraction and summary that is unavoidable in any dictionary. A historical dictionary of a poorly attested stage in the history of a language may well record every surviving attestation of a word, but if so our surviving attestations will themselves be only a meagre reflection of the usage that once existed. 
(Unless perhaps a word may have been used only in writing and that use has survived to us, in which case its existence in the lexis was clearly only ever extremely marginal.) Typically, for recent stages in the history of well-documented languages such as English, the lexicographer’s task for any but the rarest words and meanings is to select representative and informative examples from a sea of possible examples, which in turn represent only a tiny fraction of the use of a word. The work by Kerremans et al. attempts a much closer look at a particular type of lexeme: a new word form, realizing a new meaning, observed in the very earliest stages of its existence, as reflected in web sources. Further, through the focus on a lexical item, detweet, which belongs specifically to the discourse of the online world, the likelihood is increased that the online uses accessible to the researchers may genuinely be the earliest or at least among the earliest examples. All of these factors mean that this precise method can have only very limited transferability to other situations: this sort of data is not available to us for other historical periods, nor for items that have shown their initial spread in areas of discourse not well represented on the web. (Indeed, the OED’s new words team have recently been collaborating with this research team in identifying likely candidate items for this sort of study.) Nor would this method be applicable without considerable modification for investigating new senses of existing words.

Nonetheless, this method provides invaluable, perhaps unique, fine-grained data on the earliest stages of the history of particular words, which can be used to test general assumptions about lexical change. As noted explicitly in the contribution, processes of institutionalization of new word forms and meanings, and of spread between different text types, can here be observed close up, in real time. Additionally, this single case study of detweet provides useful illustrations of the potential of this sort of data for testing whether new word forms are typically coined once only, and spread from user to user, or whether parallel conditions may lead to the same (semantically transparent) coinage occurring independently on more than one separate occasion, and similarly for testing whether meaning development typically occurs once and spreads, or occurs multiple times in similar situations.⁴

The three contributions under consideration thus provide vivid illustrations of the possibilities and challenges posed by data and its analysis in the study of lexical change. The contribution by Kay demonstrates the importance of a thesaurus framework in unlocking the data of a dictionary, as foreseen by Samuels; it also documents with candour the difficult and to some extent subjective decisions which must underlie any attempt to classify the whole lexicon of English on the basis of “naïve realism”. As noted already by von Wartburg, the diachronic dimension makes this task only the harder, since not just the lexicon but the world outlook of speakers have shown fundamental changes in many areas over the very long period which HTOED covers. The contributions by Allan and by Kerremans et al. show the value of close attention to the detail of individual word histories, in conjunction with an understanding of how these individual word histories interact with the larger structures of the lexicon of the language. All three contributions remind us of the complex interplay of factors involved in any instance of lexical change.

3. On this type of “dual” or “multiple” etymology see Coleman (1995), Durkin (2002), Durkin (2008), Durkin (2009).
4. Compare Geeraerts (1997) 62–68, Durkin (2009) 68–74, 228.
References

Büchi, Eva 1996 Les structures du “Französisches etymologisches Wörterbuch”: recherches métalexicographique et métalexicologiques. Tübingen: Niemeyer.
Coleman, Julie 1995 The Chronology of French and Latin Loan Words in English. Transactions of the Philological Society 93: 95–124.
Durkin, Philip 2002 “Mixed” etymologies of Middle English items in OED3: some questions of methodology and policy. Dictionaries: The Journal of the Dictionary Society of North America 23: 142–55.
Durkin, Philip 2008 Latin loanwords of the early modern period: how often did French act as an intermediary? In: Richard Dury, Maurizio Gotti, and Marina Dossena (eds.), Selected Papers from the Fourteenth International Conference on English Historical Linguistics (ICEHL 14), Bergamo, 21–25 August 2006, Vol. II: Lexical and Semantic Change, 185–202. Amsterdam: Benjamins.
Durkin, Philip 2009 The Oxford Guide to Etymology. Oxford: Oxford University Press.
Hallig, R. and Walther von Wartburg 1962 Begriffssystem als Grundlage für die Lexikographie. Versuch eines Ordnungsschemas, 2nd ed. Berlin: Akademie-Verlag.
Ullmann, Stephen 1959 The Principles of Semantics, 2nd ed. Oxford: Blackwell.
Ullmann, Stephen 1962 Semantics: An Introduction to the Science of Meaning. Oxford: Blackwell.
Wartburg, Walther von 1922–1978 Französisches etymologisches Wörterbuch: Eine Darstellung des galloromanischen Sprachschatzes, 25 vols., 2nd ed., in course of publication. Basel: Zbinden.
Wartburg, Walther von 1962 Einführung in Problematik und Methodik der Sprachwissenschaft, 2nd ed., with the collaboration of Stephen Ullmann. Tübingen: Niemeyer.
Wartburg, Walther von 1969 Problems and Methods in Linguistics, revised ed., with the collaboration of Stephen Ullmann. Translation of the second French edition (1963) of von Wartburg (1962) by Joyce M. H. Reid. Oxford: Blackwell.
Section 2: Corpus-based methods
How anger rose: Hypothesis testing in diachronic semantics

Dirk Geeraerts, Caroline Gevaert and Dirk Speelman

Abstract
On the basis of a large database of attested examples of anger, ire and wrath in Middle English texts, we perform a statistical analysis of the factors contributing to the emergence of anger as the dominant term. Specifically, we perform a logistic regression to test the hypothesis formulated by Diller (1994), who suggests that anger was introduced in the lexical field of anger expressions because social changes gave rise to new forms of anger: in contrast with the traditional reference to anger, in which the angry person has a high social rank and typically reacts in a violent way, anger expressed the emotions of lower-ranked persons, who react less violently. Overall, our statistical analysis is consonant with Diller’s hypothesis, but it appears, importantly, that the hypothesis needs to be lectally enriched by means of a reference to the text type in which anger appears.
1. Anger and diachronic onomasiology

By nature and necessity, diachronic semantics is a corpus-based endeavour. But the advance towards quantitative corpus methods that characterizes current synchronic lexical semantics does not yet pervade the historical study of word meaning. In the present case study, we will illustrate how the use of a logistic regression technique may help to shed light on the onomasiological emergence of anger in English. This demarcation of the subject matter of the paper contains two elements that need some further introductory comments: the emergence of anger, and the onomasiological perspective.

Middle English anger is traditionally considered a Scandinavian loanword, derived from Old Norse angr ‘trouble, affliction’, which in its turn is derived from the Indo-European root angh ‘narrow’. The Middle English Dictionary suggests that this loan may have been influenced by Latin angor. Like Old Norse angr and Latin angor, the earliest meaning of English anger is ‘trouble, affliction, vexation, sorrow’ (OED). Its meaning ‘wrath, ire, hot displeasure’ (OED) occurs only sporadically at first but becomes more frequent by the turn of the 15th century, a period in which both meanings still occur, and reaches prototypical status by 1500. (The process through which the prototypical meaning ‘trouble, affliction, vexation, sorrow’ is replaced by the peripheral meaning ‘wrath, ire, hot displeasure’, fits well with other cases of words showing loss of prototypical meanings, as mentioned in Dekeyser 1995, 1996, Geeraerts 1997.) The original meaning only lives on in dialect uses, in which anger means ‘physical affliction’. In the material on which our case study is based (and which will be specified presently), the gradual rise of anger is illustrated by Figure 1, which plots the attested frequency of ire, anger and wrath over four centuries.

Figure 1. The emergence of anger in comparison with ire and wrath

Although this rise to prominence of anger is mentioned in many works on the history of the English lexicon, in most cases no explanation is offered as to why this loanword was introduced and came to be the standard expression of anger in English. The introduction of the word anger is mentioned without any further explanation in, among others, Serjeantson (1935: 85), Geipel (1971: 65), Baugh and Cable (1978: 100), Berndt (1984: 65) and Burnley (1992: 421). Diller (1994) however suggests that this expression was introduced in the lexical field of anger expressions because social changes gave rise to new forms of anger: whereas wrath expressed the traditional type of anger, in which the angry person has a high social
rank and typically reacts in a violent way, anger expressed the emotions of lower-ranked persons, who react less violently. We will consider whether corpus data confirm such a cultural explanation of the introduction and use of the loanword anger.

With regard to the onomasiological perspective in diachronic semantics, let us first observe that, thematically speaking, the present paper links up with older work by the first and second author on the cultural history of emotion concepts: see Geeraerts and Gevaert (2008), and specifically also Geeraerts and Grondelaers (1995). In the framework of metaphor research in Cognitive Linguistics, the latter paper was instrumental in pointing out the importance of a historically and culturally contextualized, rather than just a universalist and physiological, concept of embodiment; for the evolution that Conceptual Metaphor Theory went through in this respect, see Kövecses (2005). More important than this thematic lineage is the recognition that the onomasiological perspective continues a methodological line of research developed by the first and third author in the context of the Quantitative Lexicology and Variational Linguistics (QLVL) research group. Whereas Geeraerts (1997) brought together the prototype-theoretical semasiological work in diachronic semantics done in the 1980s and early 1990s, Geeraerts, Grondelaers and Bakema (1994) laid out the foundations for the onomasiological perspective that constituted the natural outcome of the semasiological work and that continues to be developed – although predominantly on a synchronic rather than diachronic domain – by the QLVL research group. The basic ideas of this approach that are relevant for the current study are the following. First, cognitive phenomena in language are primarily onomasiological rather than semasiological: categorization takes the form of onomasiological choices.
Second, onomasiological choices of this kind are multivariate in nature: they involve conceptual phenomena, but also features like style and register, and the sociolinguistic situatedness of communication. It follows, third, that appropriate statistical techniques are necessary to disentangle the multivariate phenomena. Fourth, because the onomasiological choices are made at the usage level, that is where they have to be studied, and not at the level of linguistic structure. That is to say, diachronic onomasiology needs to assume a pragmatic perspective; see Grondelaers, Speelman and Geeraerts (2007). In the context of the lexicology of English, such a pragmatically oriented diachronic onomasiology is not yet strongly represented. On the one hand, there is a considerable amount of essentially semasiological work, either in the context of the currently popular grammaticalization studies,
or in the context of diachronic Cognitive Linguistics, where concepts such as prototypicality and conceptual metaphor are applied to semantic change. (Examples include Dekeyser 1990, 1995, 1996; Molina 2000, 2005; Koivisto-Alanko 2000; Tissari 2001; Fabiszak 2001; Allan 2008.) On the other hand, onomasiological studies received a major impetus through the development of the Historical Thesaurus of the OED (see Kay, this volume) and a number of associated PhD projects: Chase (1988) on religious language, Thornton (1988) on good and evil, Sylvester (1994) on expectation, and Coleman (1999) on the language of love, sex and marriage. Although this approach allows for a number of quantitative analyses, the perspective is still very much a structural one, and not yet a corpus-based pragmatic one of the type envisaged here.
2. Materials and method

The present study is a further refinement of one of the case studies included in the PhD thesis of the second author (Gevaert 2007). In her dissertation, Gevaert analyzes the development of the lexical field of anger from Old to Early Modern English, with specific reference to the possible influence of cultural factors on the changes in the field; see also Geeraerts and Gevaert (2008). In this section, we first briefly describe the database that was compiled for Gevaert’s study, and then zoom in on the way in which the data were prepared for the logistic regression analysis.

2.1. The database

In order to study the evolution of the lexical and conceptual field of anger, text samples were taken at reference points spread over the whole period covered by Gevaert (2007): texts from c800, 900, 1000, 1100, 1200, 1300, 1400 and 1500. For the Middle and Early Modern English period, which is most relevant for the rise of anger, this was easily realized: all available texts written c1200, 1300, 1400 and 1500 were collected as exhaustively as possible to form the Middle and Early Modern English corpus. Where possible, the chosen texts remain within the boundaries of 25 years before and after this reference date, so that c1400 stands, roughly speaking, for the period 1375–1425. Table 1 gives an overview of the texts used for the c1400 reference point (on which we will focus) and the number of attestations of anger-expressions found per text.
Table 1. Overview of c1400 texts and number of attestations per text

Author               Text                                          N
anonymous            The Romaunt of the Rose (B-fragment)          8
anonymous            Pearl                                         4
anonymous            Sir Gawayn and þe Grene Knyȝt                 6
anonymous            Floris and Blauncheflur Ms Trentham           9
anonymous            Gamelyn                                      13
anonymous            Athelston                                     2
anonymous            Le Morte Arthur                              17
anonymous            The Sege of Melayne                           4
anonymous            Patience                                     14
anonymous            The Cloud of Unknowing                        2
anonymous            The Avowyng of Arthur                         5
anonymous            The Pistil of Swete Susan                     1
anonymous            The Four Leaves of the Truelove               2
anonymous            Erle of Tolous                                2
anonymous            Sir Cleges                                    3
anonymous            Sir Tryamour                                  2
anonymous            Emaré                                         2
anonymous            Richard the Redeless                          8
anonymous            Alliterative Morte Arthur                    41
anonymous            The Sowdone of Babylone                      22
anonymous            The Tale of Beryn                            29
anonymous            The Lanterne of Light                         4
anonymous            Metrical Paraphrase of the Old Testament      1
anonymous            Friar Daw’s Reply                             2
anonymous            Mum and the Sothsegger                       13
Chaucer, Geoffrey    Minor Poems                                   6
Chaucer, Geoffrey    The Booke of the Duchess                      4
Chaucer, Geoffrey    The Romaunt of the Rose (A-fragment)         31
Chaucer, Geoffrey?   The Romaunt of the Rose (C-fragment)          4
Chaucer, Geoffrey    Anelida and Arcite                            4
Chaucer, Geoffrey    The Parliament of Fowls                       1
Chaucer, Geoffrey    Boece                                        17
Chaucer, Geoffrey    Troilus and Criseyde                         41
Chaucer, Geoffrey    The Legend of Good Women                      7
Chaucer, Geoffrey    The Canterbury tales                        204
Chestre, Thomas      Sir Launfal                                   7
Julian of Norwich    The Shewings of Julian of Norwich            32
Gower, John          Confessio Amantis                           127
Hoccleve, Thomas     Regiment of Princes                          44
Hilton, Walter       The Scale of Perfection                      73
Langland, William    Piers Plowman                                44
Lydgate, John        Troy Book                                   300
Lydgate, John        Siege of Thebes                              39
Wycliffe, John       Wycliffe’s Bible                            803
This dataset will not be included in full in the analyses. Three restrictions will be applied to study the influence of the semantic factors suggested by Diller.

First, the corpus data will be restricted to those instances in which anger is seen as a transitory emotional state. In the other cases, reference is often to anger as one of the capital sins or as a permanent personality trait. In such cases, the semantic factors suggested by Diller hardly play a role. If for instance anger refers to an emotion that is contextually triggered by a private offence, then that kind of trigger is not really relevant for decontextualized references to irascibility as a temperament or a capital sin. In Table 2 we present a breakdown of the data for the c1300, c1400 and c1500 reference points, showing how the different types of meaning are semasiologically distributed for each of the lexical items. It becomes clear that the ‘capital sin’ reading is more prominent in ire than in the other items. This is consistent with the observation that ire is a romance loanword, as a continuation of Latin ira as it appears in religious and moralizing texts. (The specifics of ire and its relationship to the Old English form irre constitute an interesting study in itself, but one that lies beyond the boundaries of the present article.)

Table 2. Transitory state, sin and other references in ME texts

                transitory   name   other   transitory   name      other
                state        sin            state (%)    sin (%)   (%)
1300   ire          20        16      3       51.28       41.03     7.69
       wrath       172        31     17       78.18       14.09     7.73
1400   anger        85         5     15       80.95        4.76    14.29
       ire         306        73     70       68.15       16.26    15.59
       wrath       703        46     49       88.10        5.76     6.14
1500   anger       276         9     20       90.49        2.95     6.56
       ire          59        23     17       59.60       23.23    17.17
       wrath       294        48     51       74.81       12.21    12.98

Second, all instances have been excluded in which one of the literal expressions was used as a (near-)synonym of one of the others, in its immediate vicinity. The most typical examples of this stylistic feature are the medieval doublets. The use of near-synonyms “a pour effet de neutraliser les traits distinctifs qui opposent les lexèmes entre eux [has the effect of neutralizing the distinctive features that distinguish the lexemes from each other]” (Kleiber 1978: 60). In other words, it leads to semantic levelling, a phenomenon that had better be excluded in our search for possible semantic differences between (near-)synonyms.

And third, instances stating that a person was not angry have been excluded. While in most of those cases it is possible to identify who might have been angry and sometimes also why, it is hard to say which reaction was expected without entering into speculation.

2.2. The hypothesis

To see how the observations in the database can be prepared for a quantitative analysis, we should first have a closer look at the hypothesis formulated by Diller. Diller (1994: 227) suggests that “it seems a reasonable hypothesis that the growing distinction between private and ‘social’ made the emergence of a lexeme with the meaning of anger1 a necessity”. The reference to anger1 in this quote derives from the fact that Diller considers there to be two homonyms anger: anger1, which has the modern meaning of anger, and anger2, which has the meaning ‘despair, sorrow, need’. In this study, like in most dictionaries, these are considered two different meanings of one lexeme, the second being earlier than the first. To support his suggestion, Diller presents a semantic analysis of anger, wrath, annoy
and grief and their morphological derivations in Chaucer’s works (with a few sporadic side glances at the text of Piers Plowman). He argues that wrath and anger are semantically different in that wrath is used when the experiencer is of a high social rank, typically gods and mighty people, and the offence involves “the dignity, the authority of the Experiencer, whose functioning is relevant to an entire social order” (Diller 1994: 227). When anger is used, the social rank of the experiencer is usually not higher than that of the offender and the offended value is personally rather than socially important. The reaction also seems to be less violent than with wrath.

It may be added that the originality of Diller’s analysis does not lie in the analysis of the rank of the experiencer, but in that of the offended value. In research into Old English, the rank of the experiencers had already been included in Tetzlaff’s earlier semantic analysis and in Fabiszak’s more recent study. While Tetzlaff (1954: 102) claims that “schon in der Bibel werden die gleichen Termini benutzt, wenn der ‘Zorn’ der Menschen als Verfehlung gebrandmarkt wird und wenn der ‘Zorn’ Gottes und seiner rechtschaffenen Männer (rihtwisra manna) und Propheten als geheiligt dargestellt wird. So wird es fast unmöglich, an der Worten selbst den Bedeutungsunterschied zu sehen [Already in the Bible, the same words are used when the anger of people is labelled as a sin as when the anger of God and his righteous men (rihtwisra manna) and prophets is shown as holy. Thus it is impossible to distinguish the difference in meaning by the words themselves]”, Fabiszak claims that “the experiencers of anger are usually those in power” (Fabiszak 2002: 268) and that “common people do not seem to experience it” (Fabiszak 2002: 267). A similar idea is to be found in Kay (2000: 60), when she claims that wrath is typically used with reference to gods.
There are, however, some elements in Diller’s study that indicate that his hypothesis may not hold – or at least, that it needs to be scrutinized more closely. The semantic traits that can be assigned to wrath in Chaucer’s work do not seem to hold for Piers Plowman. The usages of wrath in those two texts are “so much at variance with each other that any generalizations about meaning-in-the-language must be made with the greatest caution” (Diller 1994: 232). Moreover, he mentions frequent co-occurrences of wrath, anger and ire, which may indicate that there is but little semantic difference between those expressions. Also, had ire not been excluded from Diller’s study, the results of his semantic analysis would most probably have been less clear-cut. The reason why Diller does not include ire in his analyses, although “often ire
seems to be synonymous with anger and wrath”, is that “its position as one of the cardinal sins makes it of course stand out [. . .] it is the disposition from which the states of mind and behaviours of wrath and anger arise” (Diller 1994: 226). In our material, however (as we saw in Table 2), ire does not exclusively refer to the cardinal sins or dispositions, nor are such references absent from anger and wrath.

In addition, we may need to add lectal factors to the semantic ones suggested by Diller. The development of the Early Modern English vocabulary is heavily influenced by text type features: romance loanwords for instance will obviously appear more readily in texts of romance origin. So, in our test of the Diller hypothesis, we will add text type related features to the analytic framework. (Terminologically speaking, we here use ‘text type’ in a looser sense than the technical definition given by Biber 1989. Biber identifies text types on the basis of a statistical analysis of a broad array of linguistic features, whereas we refer to text types on the basis of a restricted number of features: the distinction between religious and non-religious texts, and that between texts of romance versus non-romance origin.)

2.3. The coding schema

It follows from our analysis of the Diller hypothesis that the observations we have for anger and its competitors need to be coded for a number of potentially relevant features. With regard to the lectal variables, we make a distinction between religious and non-religious texts, and between texts of romance versus non-romance origin. (As texts of romance origin we consider translations or adaptations of Latin or French sources.) In technical terms, then, these two lectal features take the form of factors with two levels each.
With regard to the semantic variables, we distinguish three factors: the status of the experiencer, which may be ‘high’ or ‘non-high’; the offended value, which may be ‘private’ or ‘non-private’; and the type of reaction, which may be ‘violent’ or ‘non-violent’. Needless to say, consistently attributing these features is not an easy matter, and some allowance needs to be made for the effect of subjective interpretations. The semantic feature that can be labelled most easily is the degree of violence of the experiencer’s reaction, provided that this reaction is mentioned in the text. With regard to the type of offended value, we only apply the label private offended value to very clear cases, like slighted secret love (in which social implications are ruled out because it is secret) or anger at practical
problems (a broken sword, an intractable donkey, a horse that is killed, being in pain, being hungry, etc.). The other category, labelled “other”, contains the attestations in which it is hard to determine whether the offended value is a private or a public one and those situations that affect “the dignity, the authority of the Experiencer, whose functioning is relevant to an entire social order” (Diller 1994: 227), like a king’s status which is threatened by loss of warriors in a battle or by an inferior disobeying his order. With regard to the rank of the experiencer, the occasional difficulties are settled by the following rule of thumb: ranks that are considered high are God, pagan gods, authorities in some field, the elements, kings, knights, husbands (vs. wives), parents (vs. children) and superiors. Ranks that are considered low are animals, children, servants, the author of the text (who often presents himself as the reader’s humble servant) and sinners. When people of the same rank are involved, they were labelled as neutral, as were all other ranks. However, given the low frequency of low ranking participants, the ‘low’ group and the ‘neutral’ group were eventually combined into the value ‘non-high’.

3. Analyses

We will present two types of analysis. As a first step, we will perform a number of exploratory bivariate analyses, in which the presence of anger in the data is set off against each of the three potentially relevant semantic factors separately. The second step consists of a multivariate analysis in which we consider the joint effect of the three factors. Except for the distinction between a bivariate and a multivariate perspective, the analyses in the first section differ in two other respects from that in the second section. First, while the multivariate analysis will zoom in on the c1400 data point (which represents the period when anger receives its major frequency boost), the bivariate analyses take into account the c1300, c1400 and c1500 data.
Second, while we will only consider an onomasiological perspective in the second section, the first will take both an onomasiological and a semasiological point of view, i.e. we will consider both the question whether the presence of anger in the onomasiological profile for a given semantic value changes significantly over time, and the question whether the presence of a given semantic value in the semasiological profile of anger changes significantly over time. These bivariate analyses are emphatically exploratory: the real test of the Diller hypothesis is to be found in the second section.
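
To make the coding schema of section 2.3 and the two kinds of profile concrete, the following Python fragment gives a minimal sketch. It is not the authors' actual coding procedure: the class `Observation`, the helper `code_rank`, the two share functions and the string labels are all hypothetical names introduced for illustration, and the four toy observations are invented.

```python
from dataclasses import dataclass
from typing import Optional

# Rule of thumb for experiencer rank, paraphrased from section 2.3.
# The string labels are illustrative assumptions, not the authors' own tags.
HIGH_RANKS = {"God", "pagan god", "authority", "element", "king",
              "knight", "husband", "parent", "superior"}


def code_rank(experiencer: str, same_rank_as_offender: bool = False) -> str:
    """Return 'high' or 'non-high' ('low' and 'neutral' are merged)."""
    if same_rank_as_offender:
        return "non-high"  # participants of equal rank were coded as neutral
    return "high" if experiencer in HIGH_RANKS else "non-high"


@dataclass(frozen=True)
class Observation:
    lexeme: str                    # 'anger', 'ire' or 'wrath'
    period: int                    # reference point: 1300, 1400 or 1500
    religious: bool                # lectal factor: religious text or not
    romance_origin: bool           # lectal factor: Latin or French source
    rank: Optional[str]            # 'high' / 'non-high'; None if not codable
    offended_value: Optional[str]  # 'private' / 'non-private'
    reaction: Optional[str]        # 'violent' / 'non-violent'


def onomasiological_share(obs, lexeme, feature, value):
    """Share of `lexeme` among all observations with this feature value."""
    pool = [o for o in obs if getattr(o, feature) == value]
    return sum(o.lexeme == lexeme for o in pool) / len(pool)


def semasiological_share(obs, lexeme, feature, value):
    """Share of this feature value among the coded observations of `lexeme`."""
    pool = [o for o in obs
            if o.lexeme == lexeme and getattr(o, feature) is not None]
    return sum(getattr(o, feature) == value for o in pool) / len(pool)


# Invented toy data, for illustration only.
toy = [
    Observation("anger", 1400, False, False, code_rank("servant"),
                "private", "non-violent"),
    Observation("anger", 1400, False, True, code_rank("king"),
                "non-private", None),
    Observation("wrath", 1400, True, True, code_rank("God"),
                "non-private", "violent"),
    Observation("ire", 1400, True, True, code_rank("knight", True),
                "private", "violent"),
]
print(onomasiological_share(toy, "anger", "offended_value", "private"))
print(semasiological_share(toy, "anger", "rank", "non-high"))
```

The onomasiological share starts from a semantic value and asks which word expresses it; the semasiological share starts from a word and asks which values it covers. Observations that could not be coded for a feature (here `None`) drop out of the semasiological pool for that feature, which is one reason why totals can differ from one analysis to the next.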
3.1. Exploratory bivariate analyses

First, let us consider the suggestion that anger correlates with private offences; the relevant data are presented in Table 3. For the semasiological analysis, we consider the distribution of private versus non-private contexts in the range of application of anger from c1400 to c1500. A Pearson’s chi-squared test with Yates’ continuity correction yields a p-value of 0.01580, indicating a significant change. In terms of proportions, the relative frequency of private contexts rises from 23.6% to 40.8% from c1400 to c1500. For the onomasiological analysis, we consider the distribution of anger versus its competitors (ire and wrath combined) in the expression of private offences from c1300 over c1400 to c1500. A Pearson’s chi-squared test yields a p-value of 1.872e-09, indicating a significant change. In terms of proportions, the relative frequency of anger rises from 0% over 29.8% to 50.0%. In other words, both semasiologically and onomasiologically, the association between anger and ‘private offended values’ increases significantly over time. The onomasiological increase should be treated with some caution, however, because we may also note a significant increase from 0% over 5.5% to 44.5% of the onomasiological salience of anger for non-private contexts: the growing success of anger from c1400 to c1500 seems to imply that it becomes an attractor for the expression of all kinds of anger – in this case, for private and non-private contexts alike.

Table 3. Offended value (private vs. non-private) in ME texts

                private   other
1300   ire          6       13
       wrath       42      130
1400   anger       17       55
       ire          9      281
       wrath       31      656
1500   anger       71      183
       ire         13       26
       wrath       58      202

Second, we may have a look at the suggestion that anger is typically used for experiencers with low or neutral social rank; the relevant data are presented in Table 4. For the semasiological analysis, we consider the distribution of high ranking versus non-high ranking experiencers in the range of application of anger from c1400 to c1500. A Pearson’s chi-squared test with Yates’ continuity correction yields a p-value of 0.008816, indicating a significant change. In terms of proportions, the relative frequency of non-high ranking experiencers changes from 66.6% to 48.6% from c1400 to c1500. For the onomasiological analysis, we consider the distribution of anger versus its competitors (ire and wrath combined) in contexts with non-high experiencers from c1300 over c1400 to c1500. A Pearson’s chi-squared test yields a p-value of 2.2e-16, indicating a significant change. In terms of proportions, the relative frequency of anger rises from 0% over 13.0% to 52.7%. In other words, the onomasiological association between anger and “non-high experiencers” increases significantly over time, but the semasiological change shows that the prominent position which the “non-high” feature occupies at the c1400 point is weakened at c1500. This is probably a reflection of the rise of anger as the default term for ‘anger’ at large: the success of anger implies that it becomes an attractor for the expression of all kinds of anger. This interpretation is corroborated by the fact that an onomasiological analysis of the distribution of anger versus its competitors (ire and wrath combined) in contexts with high ranking experiencers from c1300 over c1400 to c1500 also shows a significant increase of the onomasiological salience of anger, from 0% over 3.6% to 42.2%.

Table 4. Rank of experiencer (high vs. non-high) in ME texts

                high   non-high
1300   ire        12       7
       wrath      83      87
1400   anger      25      50
       ire       191     107
       wrath     460      26
1500   anger     131     124
       ire        17      21
       wrath     162      90
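
The semasiological test for experiencer rank can be retraced with a few lines of Python. This is a minimal sketch using only the standard library, not the authors' own script; the function name `chi2_yates_2x2` is introduced here for illustration, and the counts are the anger rows of Table 4 as printed.

```python
import math


def chi2_yates_2x2(a, b, c, d):
    """Pearson chi-squared with Yates' correction for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(((a, b), (c, d))):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            # Yates: shrink |O - E| by 0.5, but never below zero
            diff = max(abs(observed - expected) - 0.5, 0.0)
            stat += diff * diff / expected
    # Upper tail of the chi-squared distribution with 1 df
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# anger rows of Table 4: (high, non-high)
anger_1400 = (25, 50)
anger_1500 = (131, 124)

stat, p_value = chi2_yates_2x2(*anger_1400, *anger_1500)
share_1400 = anger_1400[1] / sum(anger_1400)  # non-high share at c1400
share_1500 = anger_1500[1] / sum(anger_1500)  # non-high share at c1500
print(round(stat, 2), round(p_value, 6), share_1400, share_1500)
```

With these counts the p-value comes out at roughly 0.0088, in line with the value reported for this test, and the non-high share falls from about two thirds to just under one half.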
How anger rose: Hypothesis testing in diachronic semantics
Table 5. Intensity of reaction (violent vs. non-violent) in ME texts

                 violence   affliction   other
1300   ire              8            2       5
       wrath           44           23      33
1400   ire             87          113      45
       wrath          112          287      90
       anger            7            3      26
1500   ire             18            6       7
       wrath           51          101      38
       anger           43           55      80

Third, we may explore whether anger is typically used for non-violent reactions; the relevant data are presented in Table 5. For the semasiological analysis, we consider the distribution of violent versus non-violent reactions in the range of application of anger from c1400 to c1500. (The features “affliction” and “other” are combined as instances of “non-violent”.) A Pearson’s chi-squared test with Yates’ continuity correction yields a p-value of 0.694, which does not indicate a significant change. In terms of proportions, the relative frequency of non-violent reactions changes from 80.5% to 75.8% from c1400 to c1500. For the onomasiological analysis, we consider the distribution of anger versus its competitors (ire and wrath combined) in contexts with non-violent reactions from c1300 via c1400 to c1500. A Pearson’s chi-squared test yields a p-value of 2.2e-16, indicating a significant change. In terms of proportions, the relative frequency of anger rises from 0% via 5.1% to 47.0%. As in the case of the non-high ranking experiencers, the onomasiological association between anger and non-violent reactions increases significantly over time, but the semasiological status quo suggests that the onomasiological change is primarily a reflection of the rise of anger as the default term for ‘anger’ at large. This interpretation is corroborated by the fact that an onomasiological analysis of the distribution of anger versus its competitors (ire and wrath combined) in contexts with violent reactions from c1300 via c1400 to c1500 also shows a significant increase in the onomasiological salience of anger, from 0% via 3.3% to 38.3%.
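The onomasiological proportions can be retraced in the same spirit. The following sketch is an illustration under stated assumptions: the counts are read off Table 5 as (violent, non-violent) pairs, with “affliction” and “other” summed as non-violent as in the text; anger, which has no row at c1300, is entered as zero; and the helper names are hypothetical. The p-value for the three-period comparison uses the closed form for two degrees of freedom. The value 2.2e-16 quoted in the text is presumably the floor below which R stops printing exact p-values.

```python
import math

# (violent, non-violent) token counts per term and period; assumed reading of Table 5.
# Assumption: anger is absent at c1300 and therefore entered as (0, 0).
counts = {
    "c1300": {"ire": (8, 7), "wrath": (44, 56), "anger": (0, 0)},
    "c1400": {"ire": (87, 158), "wrath": (112, 377), "anger": (7, 29)},
    "c1500": {"ire": (18, 13), "wrath": (51, 139), "anger": (43, 135)},
}
VIOLENT, NON_VIOLENT = 0, 1


def anger_vs_competitors(column):
    """Per-period (anger, ire + wrath) counts for one reaction type."""
    return {
        period: (terms["anger"][column],
                 terms["ire"][column] + terms["wrath"][column])
        for period, terms in counts.items()
    }


def anger_share(pairs):
    """Share of anger among all three terms, per period."""
    return {period: anger / (anger + rivals)
            for period, (anger, rivals) in pairs.items()}


def chi2_three_periods(pairs):
    """Pearson chi-squared for a 3 x 2 table; the 2-df upper tail is exp(-x/2)."""
    rows = list(pairs.values())
    assert len(rows) == 3, "the closed-form p-value below assumes 2 df"
    n = sum(sum(row) for row in rows)
    col_totals = [sum(row[j] for row in rows) for j in range(2)]
    stat = 0.0
    for row in rows:
        for j in range(2):
            expected = sum(row) * col_totals[j] / n
            stat += (row[j] - expected) ** 2 / expected
    return stat, math.exp(-stat / 2.0)


non_violent_share = anger_share(anger_vs_competitors(NON_VIOLENT))
violent_share = anger_share(anger_vs_competitors(VIOLENT))
stat, p_value = chi2_three_periods(anger_vs_competitors(NON_VIOLENT))

print("non-violent contexts:", {k: round(v, 3) for k, v in non_violent_share.items()})
print("violent contexts:    ", {k: round(v, 3) for k, v in violent_share.items()})
print(f"chi-squared (non-violent) = {stat:.1f}, p = {p_value:.3g}")
```

Summing the two non-violent columns keeps the sketch independent of the order in which “affliction” and “other” appear in the table.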
Figure 2. Features affecting the choice for anger at c1400
Overall, this initial round of exploratory analyses primarily establishes the onomasiological attraction of anger in the transition from c1300 via c1400 to c1500: by c1500, anger has established itself as a major competitor for ire and wrath. There are some semasiological indications that at least one of the features indicated by Diller, viz. a context of private offence, plays a specific role, but if they play a marked role, it would appear to be at c1400: by c1500, anger is semasiologically widening again. We therefore zoom in on c1400 for a second round of exploratory bivariate analyses. We now also include the lectal variables introduced earlier. In the analyses presented so far, the numbers differed from analysis to analysis, because it was not possible to code all observations for all features. For instance, an observation that could not be coded for “offended value” because the contextual data were missing would have been excluded
from Table 3 but might have been included in Tables 4 and 5, if it was coded for intensity of reaction and experiencer rank. In the analyses to come, however, we will only use observations coded for all variables. The overview in Figure 2 shows that each of the variables, taken separately, seems to have an effect on the presence of anger: the choice for anger is positively affected by the presence of non-high ranking experiencers, private contexts, non-violent reactions, non-religious texts, and non-romance origins of the texts. (The effects are significant. The p-values for a Pearson’s chi-squared test with Yates’ continuity correction are respectively 5.074e-06, 5.718e-14, 1.501e-07, 2.935e-11, 0.01867.)
3.2. Multivariate analysis
The results of the bivariate analyses lead to a multivariate analysis in which we can analyze the combined effect of the variables. Given the onomasiological nature of our subject matter, we can model the lexical choice for anger in the form of a logistic regression. The response variable is the choice for anger versus its immediate competitors ire and wrath, and the explanatory variables are constituted by the five dimensions that we discussed above. The dataset consists of 782 observations. We first perform a regression analysis without interactions, i.e. an analysis in which the relevant variables may each have a separate effect, but in which we do not yet investigate whether combinations of variables may have unpredictable effects. Such an unpredictable effect is called an “interaction”. Linguists who are not familiar with the concept of a statistical interaction may think of it in terms of morphological compositionality. The meaning of a composite expression is often more specific than the meaning that can be predicted from the constituent parts and the compositional mechanism. It is the same with statistical interactions: if there is an interaction effect, the effect that you get in a combination of variables cannot be predicted entirely from the effect of those variables separately. But as a first step, we go through a “main effects” regression analysis that does not include interactions. A forward stepwise procedure for the selection of the variables suggests not including the variable Origin. (This is confirmed by a regression analysis in which all variables are entered immediately, in non-stepwise fashion.) As Figure 3 suggests, there is a strong link between the dimensions Origin and Type: most of the texts of romance origin are religious ones, and most of the texts of a non-religious nature have a non-romance origin. From that perspective, we can understand that Origin would not have a separate
Figure 3. The relation between Origin and Type at c1400
effect on the statistical model: its effect is so to speak pre-empted by the effect of Type. The results of the regression analysis with the four variables Type, Reaction, Offence and Experiencer are given in summary form below. The significance values in the last column show that the presence of a non-religious text, a violent reaction, and a private offence have a significant effect (as indicated by the asterisks); the presence of an experiencer of non-high rank is borderline significant (as indicated by the dot). The column with the estimates shows that the effect of a non-religious text, a private offence and a non-high experiencer is positive: these features favour the use of anger. The effect of a violent reaction is negative: it diminishes the use of anger. Overall, the model corresponds very well with the Diller hypothesis: the use of anger is favoured by contexts with
private, personal offences, with non-violent reactions, and (to a lesser extent) with subjects that do not occupy a high social rank. An important addition to the Diller model is the recognition of a lectal effect that seems to correspond with the non-romance origin of anger: as a non-romance loan, it appears more in a text type that has a predominantly non-romance background, viz. the non-religious texts.

glm(formula = expression ~ type + reaction + offence + experiencer,
    family = binomial(logit), data = c1400)

                     Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)           -4.3896      0.4208  -10.432
typenonreligious       2.8192      0.4644    6.070
reactionviolence      -1.9798      0.4282   -4.624
offenceprivate         1.3347      0.4149    3.217
experiencernonhigh     0.5993      0.3184    1.882

                                     Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)                           -6.4790      1.1916   -5.437  5.42e-08  ***
typenonreligious                       4.9865      1.2113    4.116  3.85e-05  ***
reactionviolence                       2.3356      1.4824    1.576   0.11513
offenceprivate                         1.0984      0.4035    2.722   0.00648  **
experiencernonhigh                     4.0321      1.2565    3.209   0.00133  **
typenonreligious:reactionviolence     -3.5180      1.4860   -2.367   0.01791  *
reactionviolence:experiencernonhigh   -3.2007      1.3597   -2.354   0.01857  *
typenonreligious:experiencernonhigh   -3.4433      1.2820   -2.686   0.00724  **

Null deviance: 413.35 on 781 degrees of freedom
Residual deviance: 279.26 on 774 degrees of freedom
AIC: 295.26
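To make the main-effects estimates more tangible, the log-odds coefficients can be converted into odds ratios and predicted probabilities. The sketch below is not part of the original analysis. It assumes, from the names of the dummy variables, that the reference levels are a religious text, a non-violent reaction, a non-private offence and a high-ranking experiencer. The function name is hypothetical.

```python
import math

# Main-effects coefficients (log-odds) as reported in the regression summary.
coefficients = {
    "(Intercept)": -4.3896,
    "typenonreligious": 2.8192,
    "reactionviolence": -1.9798,
    "offenceprivate": 1.3347,
    "experiencernonhigh": 0.5993,
}


def probability_of_anger(**features):
    """Predicted probability of anger; each feature is 1 (present) or 0 (reference)."""
    log_odds = coefficients["(Intercept)"]
    for name, value in features.items():
        log_odds += coefficients[name] * value
    # Inverse logit turns log-odds into a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# exp(beta) is the multiplicative change in the odds of choosing anger.
odds_ratios = {
    name: math.exp(beta)
    for name, beta in coefficients.items()
    if name != "(Intercept)"
}

# Assumed reference context: religious text, non-violent, non-private, high rank.
baseline = probability_of_anger()
# Context combining all features that favour anger in the main-effects model.
favourable = probability_of_anger(
    typenonreligious=1, offenceprivate=1, experiencernonhigh=1
)

for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio = {ratio:.2f}")
print(f"baseline probability of anger   = {baseline:.3f}")
print(f"favourable context, probability = {favourable:.3f}")
```

On these assumptions the predicted probability of anger rises from roughly 1% in the reference context to well over half in a non-religious text with a private offence, a non-violent reaction and a non-high experiencer.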
Interpreting these results, we first notice a slight change in the main effects: the significance of non-high experiencers increases, but the effect of non-violent reactions disappears. Nevertheless, the main effects still correspond to the general lines of the (lectally enriched) Diller hypothesis. In addition, the variable Reaction does not disappear from the model as a whole: it does play a role in the interactions. To get a better understanding of those interactions, we include Figure 4, which gives a graphical representation of the effects of the three interactions that come out as significant in the regression analysis.
Figure 4. Mosaic plots for significant interactions
The interaction of Type and Reaction reveals that the distinction between violent and other reactions only has a pronounced effect in the context of non-religious texts. In the religious texts, the effect is a minor one (and to the extent that it is present, it goes in the opposite direction to that in the non-religious texts). The lectal factor, in other words, contributes even more markedly to the rise of anger than a main effects model suggests: the effect of the semantic feature Reaction is strongly dependent on the presence of the ‘non-religious’ value of Type. The interaction between Experiencer and Reaction works in the same way as that between Type and Reaction. The effect of non-violent reactions as a trigger for anger is restricted to non-high ranking subjects. For subjects with higher rank, the effect goes in the opposite direction, but it is too weak to merit much attention. The interaction between Type and Experiencer, finally, is the only one that goes against expectations. In the previous two interactions, two factors that are positively associated with anger in the main effects model, and that can – on the basis of the Diller hypothesis and on the basis of the etymological background of anger – be plausibly interpreted as triggering anger, reinforce each other. In the interaction of Type and Experiencer, however, anger is primarily reinforced by non-high experiencers (which is consistent with expectations) in the context of religious texts (which goes against the pattern found elsewhere). Although the regression model suggests that the direction of the effect might be reversed in the non-religious texts, the mosaic plot shows that in the context of non-religious texts, there is also a positive (but much weaker) effect on anger of non-high experiencers. This of course is just as might be expected.
We have no clear explanation for the remarkable reinforcement of anger with non-high experiencers in religious texts. It is the only point on which our findings are not immediately interpretable in the context of a lectally enriched Diller hypothesis.
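What an interaction means in practice can be read off the coefficients: the effect of a violent reaction on the log-odds of anger is no longer a single number but depends on the values of Type and Experiencer. The sketch below simply adds up the relevant estimates from the model with interactions. It is an illustration with hypothetical names, and it ignores the standard errors, so it says nothing about which of the conditional effects are themselves significant.

```python
# Reaction-related estimates (log-odds) from the model with interactions.
REACTION_VIOLENCE = 2.3356          # reactionviolence (main effect)
TYPE_BY_REACTION = -3.5180          # typenonreligious:reactionviolence
REACTION_BY_EXPERIENCER = -3.2007   # reactionviolence:experiencernonhigh


def violence_effect(nonreligious, nonhigh):
    """Change in the log-odds of anger when a reaction is violent rather than not.

    Both arguments are 0 or 1; 0 stands for the assumed reference level
    (religious text, high-ranking experiencer).
    """
    return (
        REACTION_VIOLENCE
        + TYPE_BY_REACTION * nonreligious
        + REACTION_BY_EXPERIENCER * nonhigh
    )


effects = {
    (nonreligious, nonhigh): violence_effect(nonreligious, nonhigh)
    for nonreligious in (0, 1)
    for nonhigh in (0, 1)
}

labels = {0: ("religious", "high"), 1: ("non-religious", "non-high")}
for (nonreligious, nonhigh), effect in effects.items():
    text_type = labels[nonreligious][0]
    rank = labels[nonhigh][1]
    print(f"{text_type:>13}, {rank:>8} experiencer: {effect:+.4f}")
```

The sign flips between the reference context and the other three contexts, which is the sense in which the combined effect cannot be predicted from the separate effects alone.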
4. Coming to conclusions
On the basis of an encompassing database of attestations of anger, ire and wrath in Middle English texts, we have performed a statistical analysis of the factors contributing to the emergence of anger as the dominant term. Specifically, we have tried to test the hypothesis put forward by Diller
(1994), who suggests that anger was introduced in the lexical field of anger expressions because social changes gave rise to new forms of anger: whereas wrath expressed the traditional type of anger, in which the angry person has a high social rank and typically reacts in a violent way, anger expressed the emotions of lower-ranked persons, who react less violently. Testing such a hypothesis is possible by manually coding the attestations for the factors that relate to the hypothesis, and then subjecting the data to a statistical analysis – in our case, a logistic regression – that is able to establish the statistical significance of the relevant features. Overall, our findings support the hypothesis formulated by Diller: in the transition from the 14th to the 15th century, when anger receives a major boost, the onomasiological choice for anger is fostered by private rather than public contexts, by the presence of subjects with a lower social rank, and by the presence of non-violent reactions. At the same time, an important observation has to be added to the hypothesis formulated by Diller. In fact, we find clear evidence that the emergence of anger is not only influenced by semantic factors but also by lectal ones. Anger occurs more in non-religious texts (which then also often happen to be texts of non-romance origin), and for at least one semantic feature, the effect is reinforced when it occurs in texts of this type. While this makes clear that the Diller hypothesis needs to be lectally enriched, the interpretation of the role of the non-religious, non-romance text type can go in two directions. On the one hand, the lectal distribution can be seen as a reflection of the etymological origin of anger as a Scandinavian loan; non-romance loans are not naturally favoured in texts of romance origin. On the other hand, the lectal distribution may also link up with Diller’s cultural interpretation of the rise of anger.
The rise of anger appears to be situated in contexts in which ordinary, non-high ranking people react non-violently (for instance, with introvert emotionality rather than extrovert retaliation) to offences that are perceived as private rather than public. Diller suggests that this is exemplary of a cultural change towards a modern, individualist sensitivity. It is perhaps no surprise then that this sensitivity appears more often in texts of a non-religious kind: the religious texts are more likely to express the traditional worldview, and less likely to incorporate the emerging individual, pre-renaissance sensitivity of the common man. Establishing such a cultural interpretation will need more evidence than what can be derived from the present study of anger alone: more lexical items from the lexical field of emotion terms and cognitive terms would need to be investigated to see if we can indeed trace the development of
a modern sensibility by lexical means. Also, the history of anger as such already contains elements that suggest that a cultural shift is not the whole story. While the changes at c1400 take the form of a conceptual specialization of anger for the “modern” type of feeling, we also saw that anger has already expanded its semasiological range by c1500. The lexical differentiation that fostered anger at c1400 already seems to be on its way out by c1500, when anger begins to emerge as the dominant term for any form of anger. As such, the cultural reading of the changes at c1400 will have to be supplemented with an explanation of the subsequent developments. In other words, with anger and with other lexical items, more remains to be done if we aim at interpreting the history of the vocabulary against the background of the history of culture and mentality. But at least we now know that it can be done by using the analytical techniques of contemporary corpus linguistics.
References
Allan, Kathryn 2008 Metaphor and Metonymy: A Diachronic Approach. (Publications of the Philological Society 42). Chichester: Wiley-Blackwell.
Baugh, Albert C. and Thomas Cable 1978 A History of the English Language, 3rd ed. London: Routledge & Kegan Paul.
Berndt, Rolf 1984 A History of the English Language. Leipzig: Verlag Enzyklopädie.
Biber, Douglas 1989 A typology of English texts. Linguistics 27: 3–43.
Burnley, David 1992 Lexis and semantics. In: Norman Blake (ed.), The Cambridge History of the English Language, Volume II (1066–1476), 409–499. Cambridge: Cambridge University Press.
Chase, Thomas 1988 The English Religious Lexis. Queenston, Ontario: Edwin Mellen Press.
Coleman, Julie 1999 Love, Sex, and Marriage: A Historical Thesaurus. Amsterdam: Rodopi.
Dekeyser, Xavier 1990 The prepositions ‘with’, ‘mid’ and ‘again(st)’ in Old and Middle English: A case study of historical lexical semantics. Belgian Journal of Linguistics 5: 35–48.
Dekeyser, Xavier 1995 Travel, journey and voyage: An exploration into the realm of Middle English lexico-semantics. Nowele 25: 127–136.
Dekeyser, Xavier 1996 Loss of prototypical meanings in the history of English semantics, or Semantic redeployment. Leuvense Bijdragen 85: 283–291.
Diller, Hans-Jürgen 1994 Emotions in the English lexicon: A historical study of a lexical field. In: Francisco Moreno Fernández, Miguel Fuster and Juan Jose Calvo (eds.), English Historical Linguistics 1992, 219–234. Amsterdam: John Benjamins.
Fabiszak, Malgorzata 2001 The Concept of ‘Joy’ in Old and Middle English: A Semantic Analysis. Pila: Wyzsza Szkola Biznesu.
Fabiszak, Malgorzata 2002 A semantic analysis of FEAR, GRIEF and ANGER words in Old English. In: Javier E. Díaz Vera (ed.), A Changing World of Words: Studies in English Historical Lexicography, Lexicology and Semantics, 255–275. Amsterdam: Rodopi.
Geeraerts, Dirk 1997 Diachronic Prototype Semantics. A Contribution to Historical Lexicology. Oxford: Clarendon Press.
Geeraerts, Dirk and Caroline Gevaert 2008 Hearts and (angry) minds in Old English. In: Farzad Sharifian, René Dirven, Ning Yu and Susanne Niemeier (eds.), Culture and Language: Looking for the Mind inside the Body, 319–347. Berlin/New York: Mouton de Gruyter.
Geeraerts, Dirk and Stefan Grondelaers 1995 Looking back at anger: Cultural traditions and metaphorical patterns. In: John Taylor and Robert E. MacLaury (eds.), Language and the Construal of the World, 153–180. Berlin/New York: Mouton de Gruyter.
Geeraerts, Dirk, Stefan Grondelaers and Peter Bakema 1994 The Structure of Lexical Variation: Meaning, Naming, and Context. Berlin/New York: Mouton de Gruyter.
Geipel, John 1971 The Viking Legacy: The Scandinavian Influence on the English and Gaelic Languages. Newton Abbot: David & Charles.
Gevaert, Caroline 2007 The history of ‘anger’: The lexical field of anger from Old to Early Modern English. PhD thesis, University of Leuven.
Grondelaers, Stefan, Dirk Speelman and Dirk Geeraerts 2007 Lexical variation and change. In: Dirk Geeraerts and Hubert Cuyckens (eds.), The Oxford Handbook of Cognitive Linguistics, 988–1011. New York: Oxford University Press.
Kay, Christian J. 2000 Historical semantics and historical lexicography: Will the twain ever meet? In: Julie Coleman and Christian J. Kay (eds.), Lexicology, Semantics and Lexicography, 53–68. Amsterdam: John Benjamins.
Kleiber, Georges 1978 Le mot “Ire” en ancien français. Paris: Klincksieck.
Koivisto-Alanko, Paivi 2000 Abstract Words in Abstract Worlds. Directionality and Prototypical Structure in the Semantic Change in English Nouns of Cognition. Helsinki: Société Néophilologique de Helsinki.
Kövecses, Zoltán 2005 Metaphor in Culture: Universality and Variation. Cambridge; New York: Cambridge University Press.
Molina, Clara 2000 Give sorrow words. PhD thesis, Universidad Complutense de Madrid.
Molina, Clara 2005 On the role of onomasiological profiles in merger discontinuations. In: Nicole Delbecque, Johan Van der Auwera and Dirk Geeraerts (eds.), Perspectives on Variation: Sociolinguistic, Historical, Comparative, 177–194. Berlin: Mouton de Gruyter.
Serjeantson, Mary S. 1935 A History of Foreign Words in English. London: Routledge.
Sylvester, Louise 1994 Studies in the Lexical Field of Expectation. Amsterdam: Rodopi.
Tetzlaff, Gerhard 1954 Bezeichnungen für die Sieben Todsünden in der altenglischen Prosa: Ein Beitrag zur Terminologie der altenglischen Kirchensprache. PhD thesis, Freie Universität Berlin.
Thornton, Freda J. 1988 A Classification of the Semantic Field of Good and Evil in the Vocabulary of English. PhD thesis, Glasgow University.
Tissari, Heli 2001 Metaphors we love by: On the cognitive metaphors of ‘Love’ from the 15th century to the present. Studia Anglica Posnaniensia 36: 217–242.
Diachronic collostructional analysis: How to use it and how to deal with confounding factors
Martin Hilpert
Abstract
This paper discusses an approach to historical semantics that is particularly concerned with meaning change in grammatical constructions. It focuses on a type of change that is less tangible than overt morphosyntactic change, but that is no less indicative of semantic developments, namely the changing interrelations of grammatical constructions and their lexical collocates. The first part of the paper outlines a diachronic corpus-linguistic technique that allows the analyst to track changes in the collocational profile of a construction. The English keep V-ing construction is chosen for a case study. Subsequent parts of the paper address two confounds of the technique. The first one relates to the issue that arbitrary divisions of diachronic corpus data (into decades, half-centuries, or centuries) may map only imperfectly onto actual stages in the development of a linguistic form. A refinement of the analysis, by grouping one’s data in a bottom-up, data-driven way, can alleviate this problem. The second confound relates to differences between genres and the question whether a construction shows genre-specific developments. Another refinement is presented to alleviate this second problem.
1. Introduction
This paper discusses a corpus-linguistic approach to historical semantics that is particularly concerned with meaning change in grammatical constructions. The latter notion is understood here in the sense of Goldberg (1995, 2006), who defines constructions as entrenched symbolic units that couple a form with a meaning. Both form and meaning of constructions are subject to diachronic change, as is discussed in numerous studies of grammaticalization (Hopper and Traugott 2003). Grammaticalizing forms typically display phonological and morphosyntactic change, and at the same time a semantic shift towards more abstract, grammatical meaning. The present paper focuses on a type of change in grammatical constructions that is less tangible than overt morphosyntactic change, but that is no less indicative of semantic developments. This type of change concerns the interrelations of grammatical constructions and the lexical elements that
occur with them. The study of collocates has a long and successful tradition in corpus linguistics (Sinclair 1991, Biber 1993, to name just two influential publications), which has given rise to the collostructional methods of analysis (Stefanowitsch and Gries 2003, Gries and Stefanowitsch 2004) that focus in particular on the lexical collocates of grammatical constructions. Grammatical constructions are more abstract and schematic than lexical items – not only in meaning, but also in form. While the form of an English noun such as dog is fully specified, the form of a construction like [BE going to INF] is not: the construction occurs with different forms of the copula, and it projects a lexical verb in the infinitive, which can be instantiated by just about any verb of the English language. While there is a large amount of variation in the infinitive slot, there are robust tendencies for certain verbs to appear more often than others in the construction when their overall text frequency is controlled for. For instance, verbs that are telic and agentive, such as get and put, occur with be going to more often than expected (Hilpert 2008: 112). It is argued that the meaning of a construction can be characterized in terms of the lexical elements that most typically occur with it. If we accept the idea that the typical collocates of a grammatical construction reflect its meaning, then it follows that shifting collocational patterns are indicative of semantic change. This idea is in fact implicit in many grammaticalization studies. For instance, the English auxiliary can used to denote a mental ability (Heine 1993: 90), and consequently only took infinitive complements that would express the actions of sentient human beings, as in teach, say, or agree. Over time, as the auxiliary developed semantically, these selectional restrictions gradually loosened, and it came to occur with a wider set of collocates.
The occurrence of new lexical elements in a construction thus testifies to on-going semantic change. Besides such qualitative changes in the collocate set of a construction, there are also quantitative changes. Certain collocates may become more or less frequent over time. The most frequent collocates of a construction in the 18th century will be different from the most frequent ones in the 19th and 20th century. It was argued in Hilpert (2006, 2008) that a principled, quantitative analysis of shifting collocational preferences could offer some insight into the historical semantics of grammatical constructions. The basic idea of such an analysis is an application of collostructional analysis (Gries and Stefanowitsch 2004) to diachronic corpus data. Diachronic corpora offer comparable sets of texts that represent subsequent periods of time in the development of a language. Amongst many other things, these resources allow the application of quantitative approaches
such as the diachronic study of collocational patterns. For a construction such as be going to, we can compare its present-day collocates against its most typical collocates in earlier centuries, and this may teach us something about its semantic development. While diachronic collostructional analysis can generate interesting results, it also suffers from several confounds that are common ailments of historical corpus-linguistic approaches. A fundamental issue is of course the reliability of the results, given that many historical corpora are relatively small in size and heterogeneous in genre and variety. A second problem associated specifically with diachronic corpora concerns not so much the data itself, but the partitioning of the data into time-slices (Stefanowitsch 2006, Gries and Hilpert 2008). Arbitrary divisions into decades, half-centuries, or centuries potentially map only imperfectly onto actual historical stages in the development of a linguistic form. This paper addresses both of these problems, as will be explained below in more detail. In line with the aims of the present volume, this paper aims to present the diachronic application of collostructional analysis as a viable method for historical semantics, addressing its potential, its pitfalls, and ways around those pitfalls. This will be done through an illustrative case study of the English aspectual construction keep V-ing, which has been the subject of several accounts (Freed 1979, Brinton 1988, Cappelle 1999, amongst others). In the Longman Grammar, the construction is characterized as “a kind of progressive marker, emphasizing that the action described in the ing-clause is continuous or recurrent” (Biber et al. 1999: 746). Biber et al. also note that the construction is most frequent in spoken conversation, as compared to fiction, news, and academic writing.
The focus of the present study is on very recent changes in the construction, which are studied through shifts in its most typical ing-form complements. The Corpus of Contemporary American English (Davies 2008), a large diachronic corpus of recent American English (1990–2009), is used for this purpose. The choice of this corpus takes care of the problem of data sparseness mentioned above; if there are shifting patterns to be observed, these are unlikely to be due to sampling error. However, several problems remain to be addressed, even when a large corpus is chosen. The remainder of this paper is organized in the following way. Section 2 presents a diachronic collostructional analysis of keep V-ing, discussing the analytical procedure in a step-by-step fashion. Section 3 discusses a first confound, namely the consequences of partitioning the corpus data into arbitrary time periods, and modifies the earlier analysis in order to alleviate this issue. Section 4 brings up a second confound that concerns
the aggregation of data from different genres. Again, the earlier analysis is modified to do justice to differences between genres. Section 5 offers a few concluding remarks on the use of the proposed method. In sum, this paper hopes to demonstrate that diachronic collostructional analysis can be usefully applied in historical semantics and that, given the right resources, its limitations can be kept in check.
2. A diachronic collostructional analysis of English keep V-ing
This section discusses the sequence of steps that constitutes a diachronic collostructional analysis – data retrieval, analysis, and interpretation.
2.1. Data retrieval
The present study is based on the Corpus of Contemporary American English (COCA), which is a large balanced corpus of spoken discourse, fiction, popular magazines, newspapers, and academic texts. The corpus is a monitor corpus, which means that new materials are added incrementally. At the time of writing, the corpus contains more than 400 million words of text, with approximately 20 million words representing each year from 1990 onwards. The years from 1990 to 2007 were used in this study. In order to retrieve tokens of the keep V-ing construction, the corpus was searched for instances of the lemma keep with an ing-form to its immediate right, and no copula to the left, to exclude passives (be kept waiting). The present study excludes word sequences such as keep on trying, which arguably instantiates a separate, if similar, construction (Brinton 1988, Cappelle 1999). Table 1 gives an overview of the database for the present study.1 The numbers of tokens retrieved for each year range between approximately 1,500 and 2,000. A fair question to ask at this point is whether there are any reasons to suspect that keep V-ing has undergone changes in those eighteen years at all. One piece of evidence would be that the construction has become more frequent over time, as is visualized in Figure 1. Over the 18 years, a significant upward trend can
1. The usual starting point of any collostructional analysis (Stefanowitsch and Gries 2003) is an exhaustive retrieval of all tokens of the construction under investigation. The present study minimally departs from this principle by disregarding tokens of keep V-ing that occur only once in the entire corpus.
Table 1. Database for the present study

Year     Corpus Size   keep V-ing tokens   Tokens / MW
1990      20,532,370               1,600         77.93
1991      20,639,513               1,507         73.02
1992      20,728,152               1,714         82.69
1993      20,761,353               1,804         86.89
1994      20,465,062               1,836         89.71
1995      20,684,728               1,819         87.94
1996      20,098,563               1,882         93.64
1997      20,337,971               1,811         89.05
1998      20,592,091               1,908         92.66
1999      20,607,308               1,982         96.18
2000      20,632,590               1,918         92.96
2001      20,110,099               1,780         88.51
2002      20,518,429               1,777         86.61
2003      20,747,753               1,850         89.17
2004      20,844,105               1,975         94.75
2005      20,834,418               1,832         87.93
2006      20,993,988               1,931         91.98
2007      20,619,869               1,980         96.02
Totals   370,748,362              32,906   Mean = 88.76

be observed (Kendall’s τ = 0.425, p(two-tailed) = 0.015).2 While this development does not vouch for the fact that some semantic change occurred as well, it at least establishes that the construction has not remained in complete stasis. Corroborating evidence for the presence of a recent development comes from the TIME corpus (Davies 2007), which shows that the construction has steadily increased its text frequency in journalistic
2. Figure 1 is based on the numbers in the rightmost column in Table 1. A motivation for the use of Kendall’s τ to check for the presence of frequency trends is given in Hilpert and Gries (2009).
Figure 1. Relative frequency development of keep V-ing
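The trend statistic can be recomputed from the corpus sizes and token counts in Table 1. The sketch below is a plain-Python illustration rather than the procedure actually used for the paper: it normalizes the token counts to frequencies per million words, counts concordant and discordant pairs of years to obtain Kendall's τ, and derives an approximate two-tailed p-value from the normal approximation without continuity correction, which is an assumption of this sketch and may differ slightly from the exact value reported in the text.

```python
import math
from itertools import combinations

# (year, corpus size in words, keep V-ing tokens), as listed in Table 1.
table1 = [
    (1990, 20532370, 1600), (1991, 20639513, 1507), (1992, 20728152, 1714),
    (1993, 20761353, 1804), (1994, 20465062, 1836), (1995, 20684728, 1819),
    (1996, 20098563, 1882), (1997, 20337971, 1811), (1998, 20592091, 1908),
    (1999, 20607308, 1982), (2000, 20632590, 1918), (2001, 20110099, 1780),
    (2002, 20518429, 1777), (2003, 20747753, 1850), (2004, 20844105, 1975),
    (2005, 20834418, 1832), (2006, 20993988, 1931), (2007, 20619869, 1980),
]

years = [year for year, _, _ in table1]
# Normalized frequency: tokens per million words of corpus text.
per_million = [tokens / size * 1_000_000 for _, size, tokens in table1]


def kendall_tau(xs, ys):
    """Return (tau, S) where S = concordant pairs minus discordant pairs."""
    s = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x2 - x1) * (y2 - y1)
        if product > 0:
            s += 1
        elif product < 0:
            s -= 1
    n = len(xs)
    return s / (n * (n - 1) / 2), s


tau, s = kendall_tau(years, per_million)

# Normal approximation to the null distribution of S (no ties, no correction).
n = len(years)
z = s / math.sqrt(n * (n - 1) * (2 * n + 5) / 18)
p_two_tailed = math.erfc(abs(z) / math.sqrt(2))

print(f"tau = {tau:.3f}, S = {s}, approximate two-tailed p = {p_two_tailed:.3f}")
```

Because the statistic only uses the rank order of the yearly frequencies, it is insensitive to the size of individual year-to-year jumps.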
prose since the 1920s, and has continued this trend from the 1990s through the 2000s. It is therefore maintained here that the keep V-ing construction has undergone some type of change during the period that is sampled; what needs to be established is what qualitative changes accompanied the mere quantitative developments. It is here that a look at shifting collocational preferences may yield some insights. Table 2 is an excerpt from the data that has been retrieved, showing the absolute frequencies of the six most frequent verbs during the first six years of the corpus. Overall, close to 33,000 tokens were retrieved from
Table 2. ing-form frequencies by year

           1990   1991   1992   1993   1994   1995   ...
going      133    111    156    157    183    174    ...
coming     87     81     78     84     102    99     ...
saying     82     80     117    120    91     78     ...
telling    55     48     64     50     46     52     ...
looking    49     32     52     60     55     58     ...
trying     48     53     57     55     57     59     ...
...        ...    ...    ...    ...    ...    ...    ...
the corpus (cf. Table 1); these represent 875 different ing-form types that range in frequency between 3,040 (going) and 2 (absorbing, importing, pondering, and many others). The information shown in Table 2 constitutes the basic input for a diachronic collostructional analysis: the procedure operates on information about how often each verb occurred in each time slice of the corpus. The next subsection discusses how this input is processed and transformed into results.

2.2. Analysis

The analytical procedure in the present study is an adapted form of distinctive collexeme analysis[3] (Gries and Stefanowitsch 2004), a method that contrasts two or more constructions in their respective collocational preferences.[4] Before turning to its diachronic application, this section briefly explains how the method works on the basis of synchronic data, for which it was originally designed.

Distinctive collexeme analysis particularly lends itself to the study of semantically related constructions, such as English will and be going to. In this respect, the method can be metaphorically understood as a semantic wine tasting. In a wine tasting, tastes are compared that are very similar to begin with. Through the direct comparison of, say, two Chardonnays from the same region that are produced from slightly different combinations of grapes, quite minute differences can be detected. For instance, one of the two may stand out as particularly buttery, smoky, or peppery. This perception is dependent on the contrast of the two; tasting the “smoky” wine in isolation might yield a different impression. Naturally, when we compare two different wines, we are making much finer distinctions than in a comparison of two entirely different types of beverage, such as beer and coffee. In a way then, our comparison exaggerates differences between the two wines, but nevertheless these exaggerations are instructive, rather than misleading.
We learn something real about the taste differences that characterize Chardonnay wines. To end this metaphorical side note, we have to keep in mind that a distinctive collexeme analysis
3. There are other types of collostructional methods that work in slightly different ways. For instance, collexeme analysis (Stefanowitsch and Gries 2003) focuses on only a single construction and compares the frequency of its collocates against the overall frequency of these elements in the chosen corpus.
4. The discussion in this section draws on materials from section 2 in Hilpert (2006).
will help us detect rather fine semantic differences that might not become apparent through the investigation of a single construction in isolation.

In the case of will and be going to, both constructions are used to refer to future events, but besides several other differences, the constructions show distinct co-occurrence patterns with infinitival collocates. Importantly, this difference does not immediately fall out of directly observable raw frequencies. Table 3 lists the ten most frequent verbs in both constructions, based on data from the British National Corpus, henceforth BNC (Leech 1992). As can be seen, there is substantial collocational overlap. In both constructions, the most frequent elements are general verbs that are either semantically light, such as do, or polysemous, such as go. Distinctive collexeme analysis allows the researcher to abstract away from elements that are frequent in both constructions. Instead, it determines whether there are asymmetries in the relative frequencies of the co-occurring lexical verbs. It then highlights those elements that are maximally distinctive for each respective construction. The method identifies all verbs that occur significantly more often with one construction than with the other, and ranks these according to the degree to which they are distinctive. Table 4 gives the example of say, which occurs significantly more often than expected with be going to, as can be determined by
Table 3. Top 10 verbs with will and going to in the BNC

Will                      Be Going To
Verb        Tokens        Verb        Tokens
be          41,947        be          4,756
have        5,906         do          1,907
take        4,150         get         1,403
make        3,182         have        983
do          3,039         take        647
go          2,821         say         643
come        2,732         make        631
give        2,543         go          616
continue    2,477         happen      552
find        2,465         tell        434
Table 4. Input for a distinctive collexeme analysis of say in will and going to

            say      other verbs   Totals
will        813      185,734       186,547
going to    643      26,294        26,937
Totals      1,456    212,028       213,484
feeding the four inner fields of the table into a statistical test (Fisher Exact, p = 5.41E-196). The calculation in Table 4 is done for all verbs that are encountered with the two constructions. The overall result of a distinctive collexeme analysis is a pair of lists presenting the most strongly distinctive collexemes of each construction in ranked order. Table 5 shows the ten most distinctive collexemes for both will and be going to based on data from the BNC.[5] The more distinctive an element is, the higher its numerical value of collostructional strength (CollStr), which
Table 5. Top 10 distinctive collexemes of will and going to in the BNC

Will         CollStr      Be Going To   CollStr
continue     83.57        do            Inf
be           74.17        get           Inf
provide      61.39        say           195.36
include      56.35        happen        135.34
remain       44.76        ask           87.20
receive      42.50        die           78.72
become       41.15        put           74.96
depend       39.41        tell          58.85
enable       37.72        marry         53.99
require      36.58        let           42.95
5. All collostructional analyses discussed in this paper were performed using Coll.analysis (Gries 2004), a script for the open-source statistical software R (R Development Core Team 2007).
is a log-transformed probability value of results that are obtained through a test of the kind illustrated in Table 4. CollStr values that are larger than 1.3 indicate that an element is distinctive at the significance level of p < .05, since the negative base-10 logarithm of 0.05 is approximately 1.3. Higher CollStr values correspond to lower probabilities of error. A value of “Inf” indicates an infinitely small probability that the respective element is only erroneously found to be distinctive.

The two lists in Table 5 can be used for a qualitative assessment of the semantic differences between will and be going to. In a comparison of will and be going to based on the ICE-GB corpus, Gries and Stefanowitsch (2004: 114) argue that among the distinctive collexemes of will, many are non-agentive or low in transitivity. Their results are replicated here with data from the BNC, as Table 5 lists distinctive collexemes of will such as continue, include, remain, and depend. Conversely, be going to has distinctive complements that are agentive and high in transitivity, such as do, say, put, or marry. By bringing these elements into focus, distinctive collexeme analysis accentuates semantic differences between potentially fairly similar constructions.

Distinctive collexeme analysis can not only contrast two constructions; it can also be used to compare three or more constructions (see Gilquin 2006 for an example). The comparison of more than two alternatives lies at the heart of the diachronic application of distinctive collexeme analysis. On the basis of diachronic corpora, the method can determine what types of co-occurring elements were typical of a given construction at different historical stages. Unlike the synchronic variant, the diachronic application does not contrast multiple constructions, but studies a single construction across multiple periods of time. To come back to the wine tasting analogy, this would compare to sampling different vintages of the same wine, or what experts call a “vertical tasting”.
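To make the computation behind Table 4 concrete, the sketch below derives a one-tailed Fisher exact p-value for say with be going to and converts it into a collostructional strength, taken here as the negative base-10 logarithm of p. It is an illustration in Python, not the Coll.analysis script; the choice of a one-tailed test and all function and variable names are assumptions, so the result need not match the reported figures to the last digit.

```python
from math import exp, lgamma, log


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def log_upper_tail(k_obs, row_total, col_total, grand_total):
    """Log of P(X >= k_obs) under the hypergeometric distribution."""
    log_probs = [
        log_choose(col_total, k)
        + log_choose(grand_total - col_total, row_total - k)
        - log_choose(grand_total, row_total)
        for k in range(k_obs, min(row_total, col_total) + 1)
    ]
    peak = max(log_probs)  # log-sum-exp keeps the tiny probabilities stable
    return peak + log(sum(exp(v - peak) for v in log_probs))


# Cell counts and marginal totals from Table 4.
say_going_to, going_to_total = 643, 26_937
say_total, grand_total = 1_456, 213_484

# Frequency of "say" with "going to" expected under independence.
expected = going_to_total * say_total / grand_total

log_p = log_upper_tail(say_going_to, going_to_total, say_total, grand_total)
collstr = -log_p / log(10)  # negative base-10 logarithm of the p-value

print(f"observed = {say_going_to}, expected = {expected:.1f}")
print(f"CollStr (one-tailed, sketch) = {collstr:.2f}")
```

Working in log space matters here: a p-value of this size can still be represented as a floating-point number, but the individual binomial coefficients cannot.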
In the case of the present study, the collocational profile of keep V-ing is compared across 18 years of corpus data. The motivation for applying a distinctive collexeme analysis to synchronic present-day data is the observation that raw frequencies sometimes obscure differences that hold across sets of data (cf. Table 3). This motivation carries over to raw frequencies representing diachronic data. In the case of keep V-ing, the five most frequent ing-forms (as well as many other, less frequent ones) remain in their relative positions across the corpus. Figure 2 shows that going is the most frequent form throughout, followed by saying and coming, which in turn are followed by trying and telling.
For an investigation of the history of keep V-ing, raw frequencies are hence only of limited use. Like a synchronic distinctive collexeme analysis, the diachronic application abstracts away from items that are equally common to all periods, highlighting instead those that are significantly more frequent than expected in one particular period. In this way, differences between the periods are accentuated and actual developments become more prominent. Table 6 presents the observed and expected frequencies of the verb going over the 18 periods. The differences between the respective values translate into a value of collostructional strength, which is shown in the rightmost column. A significant deviation from the expected values (i.e. an absolute value larger than 1.3) is found only in the years 1991 (less than expected) and 1998 (more than expected); the year 2004 approaches significance. These deviations appear as peaks and valleys in the topmost line of Figure 2.
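The expected frequencies for going in Table 6 can be reconstructed from the yearly token counts in Table 1: each year’s expected value is the verb’s overall frequency (3,040) multiplied by that year’s share of all keep V-ing tokens. The short Python sketch below shows the arithmetic; variable names are chosen for illustration, and the CollStr values, which additionally require a significance test per year, are not computed.

```python
YEARS = list(range(1990, 2008))

# keep V-ing tokens per year (Table 1).
KEEP_TOKENS = [
    1600, 1507, 1714, 1804, 1836, 1819, 1882, 1811, 1908,
    1982, 1918, 1780, 1777, 1850, 1975, 1832, 1931, 1980,
]

# Observed frequency of "going" per year (Table 6).
GOING_OBSERVED = [
    133, 111, 156, 157, 183, 174, 158, 165, 215,
    177, 163, 156, 151, 184, 204, 176, 178, 199,
]

total_tokens = sum(KEEP_TOKENS)      # all keep V-ing tokens in the database
total_going = sum(GOING_OBSERVED)    # all tokens of keep going

# Expected frequency: overall frequency of the verb times the year's share.
going_expected = [total_going * t / total_tokens for t in KEEP_TOKENS]

for year, obs, exp_freq in zip(YEARS, GOING_OBSERVED, going_expected):
    direction = "more" if obs > exp_freq else "less"
    print(f"{year}: observed {obs}, expected {exp_freq:.2f} ({direction} than expected)")
```

For 1991 and 1998, the two years singled out in the text, the sketch reproduces the expected values of 139.22 and 176.27 and the direction of the deviation.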
Figure 2. The five most frequent verbs in keep V-ing
The comparisons of observed and expected frequencies are performed automatically for all verbs in the database with the Coll.analysis script (Gries 2004). The analysis then yields ranked lists of the most distinctive items for each year. It is here that the processing of mere numbers is complemented by a qualitative approach: the analyst has to compare the different lists according to semantic criteria, determining how differences between them can be meaningfully interpreted. Do the verbs in earlier and later periods differ with regard to categories such as lexical aspect, agentivity, or transitivity? Is there a change in the level of their relative
Table 6. The frequency of going in the keep V-ing construction

Year   Observed   Expected   CollStr
1990   133        147.81     0.95
1991   111        139.22     2.18
1992   156        158.35     0.35
1993   157        166.66     0.63
1994   183        169.62     0.81
1995   174        168.47     0.48
1996   158        173.87     0.94
1997   165        167.31     0.35
1998   215        176.27     2.71
1999   177        183.11     0.47
2000   163        177.19     0.84
2001   156        164.44     0.58
2002   151        164.17     0.81
2003   184        170.91     0.79
2004   204        182.46     1.25
2005   176        169.25     0.51
2006   178        178.39     0.29
2007   199        182.92     0.93
abstractness? Can we observe that semantic classes such as, for instance, motion verbs, speech act verbs, or mental state verbs changed in their degree of attraction to the construction under investigation? A comparison of 18 different lists may be difficult to accomplish in a satisfactory manner for these types of questions. For practical reasons, it is therefore advisable to limit an analysis to three or four contrasting periods that collapse several years of the database. Since the database comprises 18 successive years, one solution is to partition the data into three six-year periods. If this is done, a distinctive collexeme analysis returns the results that are shown in Table 7.
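As a rough illustration of the partitioning step, the yearly counts can be pooled into three six-year periods before observed and expected frequencies are compared. The Python sketch below does this for going; the helper name pool is hypothetical, and the expected values are derived from the yearly token totals in Table 1, an assumption that need not correspond to the basis used for the figures in Table 7.

```python
# keep V-ing tokens per year, 1990-2007 (Table 1).
KEEP_TOKENS = [
    1600, 1507, 1714, 1804, 1836, 1819, 1882, 1811, 1908,
    1982, 1918, 1780, 1777, 1850, 1975, 1832, 1931, 1980,
]

# Observed frequency of "going" per year (Table 6).
GOING_OBSERVED = [
    133, 111, 156, 157, 183, 174, 158, 165, 215,
    177, 163, 156, 151, 184, 204, 176, 178, 199,
]

PERIODS = ["1990-1995", "1996-2001", "2002-2007"]


def pool(values, size):
    """Sum consecutive, non-overlapping blocks of `size` yearly values."""
    return [sum(values[i:i + size]) for i in range(0, len(values), size)]


period_tokens = pool(KEEP_TOKENS, 6)
period_going = pool(GOING_OBSERVED, 6)

# Expected frequency per period under the assumption stated above.
period_expected = [
    sum(GOING_OBSERVED) * t / sum(KEEP_TOKENS) for t in period_tokens
]

for name, obs, exp_freq in zip(PERIODS, period_going, period_expected):
    print(f"{name}: observed {obs}, expected {exp_freq:.2f}")
```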
Table 7. Distinctive collexemes of keep V-ing across three 6-year corpus periods

1990–1995                          1996–2001                          2002–2007
Verb          Obs   Exp     Sig    Verb          Obs   Exp     Sig    Verb          Obs   Exp      Sig
seeping       7     2.10    ***    reminding     84    63.77   **     pushing       167   140.13   **
chewing       12    4.79    ***    swimming      13    6.57    **     sipping       5     1.86     **
teasing       10    3.89    ***    edging        5     1.64    **     learning      21    13.38    **
lending       8     3.00    **     dragging      17    9.86    **     paying        51    38.66    **
gagging       5     1.50    **     plowing       9     4.27    **     filming       13    7.43     *
reiterating   7     3.00    *      plying        4     1.31    *      believing     20    13.01    *
lurching      5     1.80    *      stomping      4     1.31    *      screwing      7     3.35     *
shaking       29    19.77   *      zooming       4     1.31    *      sniffing      7     3.35     *
having        68    53.32   *      picturing     11    5.92    *      happening     14    8.55     *
glancing      41    29.96   *      throwing      40    29.92   *      getting       425   391.78   *
wandering     12    6.59    *      tuning        6     2.63    *      looping       4     1.49     *
struggling    12    6.89    *      pulling       59    46.68   *      worshipping   4     1.49     *
brushing      7     3.30    *      plugging      29    20.71   *      challenging   9     4.83     *
crying        17    10.78   *      holding       37    27.61   *      providing     9     4.83     *
blinking      6     2.70    *      winning       55    43.39   *      drifting      19    12.64    *
All elements are shown in the order of their relative distinctiveness for the respective period. Alongside the ing-forms, the table lists their observed and expected frequencies and indicates at what level of significance each ing-form is distinctive.[6] In each case, the observed frequency is greater than the expected frequency. The table also shows that the most distinctive elements do not necessarily exhibit very large observed frequencies; it is the relation of observed and expected that counts. Needless to say, the lists in Table 7 do not in themselves constitute a semantic analysis of the keep V-ing construction. A qualitative interpretation, ideally in close relation to previous accounts of the construction, is necessary and will be offered in the following paragraphs.

6. Instead of reporting values of collostructional strength, this table and the following ones indicate significant distinctiveness with asterisks (*** p < 0.001, ** p < 0.01, * p < 0.05).
Earlier research on keep V-ing (Freed 1979, Brinton 1988, Cappelle 1999) has pointed out several characteristics of the construction with regard to Vendlerian situation types (Vendler 1957). As is well known, Vendler distinguishes between states, activities, accomplishments, and achievements. He proposes that English verbs can be categorized according to this distinction, recognizing of course that one and the same verb may often be construed as belonging to different categories. We will return to this issue shortly. To briefly flesh out Vendler’s four-way distinction, his first category is the one of stative verbs (love, believe), which describe acts that usually cannot be performed intentionally. Stative verbs “do not indicate processes going on in time, yet they may be predicated of a subject for a given time with truth or falsity” (1957: 146). By contrast, activity verbs (run, play) denote processes or actions that go on in time and can be temporally extended indefinitely. Accomplishments are defined as temporally extended actions with a clear end point; in English this type is most commonly expressed not by single verbs but instead by verb phrases such as write a letter or peel an orange. Finally, achievements are punctual events that are exemplified by verbs such as break and snap, or phrases such as reach the summit and win the race.

How do these categories relate to the keep V-ing construction? Two issues are of primary interest here. First, the construction shifts between a continuative and an iterative meaning, depending on the verb type of the ing-form (Brinton 1988: 87). To keep doing something may either mean that an activity is drawn out in time, or that it is repeated over and over. Second, the construction does not occur indiscriminately with all four Vendlerian types. Certain tendencies of co-occurrence have been observed, and these would be subject to diachronic change (cf. section 1).
It may thus be fruitful to examine the data in Table 7 with special regard to the distinction between continuative and iterative and the distribution of different verb types across the three periods. As “a kind of progressive marker” (Biber et al. 1999: 746), keep V-ing shows an expectable tendency to occur with activity verbs, which lend themselves well to the continuative meaning that is associated with the construction. The examples below highlight that a particular activity is being drawn out across a period of time, instead of being interrupted or finished.

(1) a. Please keep walking!
    b. The band kept playing.

The keep V-ing construction could be expected to occur sparsely with accomplishment and achievement verbs, since the telic nature of their
lexical aspect clashes with the constructional meaning of a temporally extended process. However, this is not the case; both types are frequently found with the construction. Freed (1979: 91) points out that, in cases where keep V-ing occurs with accomplishments (2a) and achievements (2b), the construction conveys the meaning of a series of events. Even more dramatically, the construction may construe an activity verb as a repetitive sequence of activities (2c).

(2) a. John keeps going to the bathroom. (series from accomplishment)
    b. John kept slamming the door. (series from achievement)
    c. John keeps sleeping on the job. (series from activity)

These semantic shifts can be meaningfully interpreted as a constructional coercion effect (Michaelis 2004), in which the meaning of the keep V-ing construction overrides the lexical semantics of the individual verbs and enforces a repetitive construal of the situation. A classic example for constructional coercion is the sentence John sneezed the napkin off the table, where the normally intransitive verb sneeze is furnished with a more complex argument structure and receives a different, causative interpretation as a consequence.

Despite its potential to coerce the inherent lexical aspect of verbs into meanings more in harmony with its constructional semantics, the keep V-ing construction shows a general reluctance to occur with Vendler’s category of stative verbs (Freed 1979: 57). Of course, many stative verbs do not readily form ing-forms to begin with and are hence barred from occurring in the construction. For instance, the verb know would traditionally be classified as such a case. Example (3a) is fairly unacceptable if we take it to mean that John remains in a state of knowing an answer to a question. However, coercion effects are possible even with stative predicates. The example becomes acceptable if, as in (3b), it is understood as a series of events, in which John repeatedly offers right answers.

(3) a. ? John keeps knowing the right answer.
    b. They have been playing Trivial Pursuit for hours, and John keeps knowing the right answer. (series from state)

Brinton (1988: 88) further elaborates that stative verbs are fully acceptable when the expression may receive an intentional reading, which effectively construes a stative verb such as own as an activity predicate.

(4) Despite high gasoline prices, she keeps owning a large car.
For the purposes of the present study, Vendler’s distinction will be taken as a starting point, but a finer distinction will be made in the domain of activity verbs. In particular, it may be fruitful to distinguish smooth and continuous activities (drift, think, sing) from activities that can be analyzed into smaller, iterative steps. The category of continuous activities also includes actions that are not internally homogeneous. Verbs such as work, learn, or supply act as cover terms for sets of actions that may be quite different from one another, but which are viewed as serving the same goal at a higher level of abstraction. By contrast, verbs such as brush, chew, shake, or sniff are activities that are composed of multiple iterative actions that usually are not separately lexicalized: brushing involves multiple movements of a brush, chewing involves multiple openings and closings of the jaws, and so on. The examples in (5) illustrate the contrast that is made here between continuous and iterative activities.

(5) a. The raft kept drifting across the ocean. (continuous activity)
    b. She kept brushing her hair. (iterative activity)

With these distinctions in place, we can proceed to a first interpretation of the distinctive collexemes in Table 7. First of all, the table as a whole contains only two ing-forms that would instantiate stative predicates, namely having and believing. This is in line with what previous accounts have found. Among the ten most distinctive collexemes of the earliest period, six (seeping, chewing, teasing, gagging, lurching, and shaking) can be characterized as iterative activities. In each case, the examples denote a longer sequence of “little actions”, as shown in (6).

(6) a. I keep chewing the salty, under-cured ham. (a-c: COCA)
    b. The bull could see us plain enough, but I kept teasing him with cow calls.
    c. Scott kept shaking his head, staring down into the dark water.
In the following periods, there are successively fewer iterative activities. The second period has swimming, edging, plying, and stomping; the third has sipping, sniffing, and looping. As a first tentative interpretation of the data, we can thus report a decline of iterative activities. If we turn to continuous activities, we observe the reverse tendency. The first column of Table 7 lists only wandering, struggling, and crying, the latter of which might even be viewed as an iterative activity. In the second period, there are the forms dragging, plowing, zooming, picturing, pulling, and holding. The third period lists concrete continuous activities
such as pushing, filming, and drifting, but also more abstract continuous predicates such as learning, worshipping, and providing. This imbalance across the three periods allows the interpretation that the constructional meaning becomes more tolerant towards continuous activities, as more and semantically more diverse predicates of this kind are distinctive for keep V-ing in the two later stages.

In comparison to activities, achievement and accomplishment verbs are less strongly represented; those that are found are evenly distributed across the three columns of Table 7. Concerning these verb types, the table thus does not suggest the presence of a tangible change. There are only two clear cases of accomplishment verbs (lending, P1, and tuning, P2); the remaining verbs denote highly punctual events (blinking, winning, paying, etc.). As observed by earlier accounts (Freed 1979, Brinton 1988), examples of keep V-ing with these verbs convey the meaning of a series of events.

(7) a. There’s a reason the New England Patriots keep winning. (a-c: COCA)
    b. I had to keep reminding him it was his project, not mine.
    c. In practice, reforms were never implemented and the World Bank kept lending more to the Ivorian government.
As was mentioned above, there are only two stative verbs (believing, having) in Table 7. In line with Brinton’s observation (1988: 88), in the actual examples with believing, it is not a state, but rather an activity that needs to be intentionally upheld. The data suggest that this particular coercion effect remains marginal in the usage of keep V-ing.

(8) No matter what obstacles you face down the road, keep believing and working toward your dream. (COCA)

Summing up, the most important tendency indicated by the diachronic collostructional analysis is a trend within Vendler’s category of activity verbs. Whereas iterative activities are on the decline, continuous activities gain in importance. In order to establish the validity of this conclusion, the analysis needs to be refined through the elimination of certain confounds, as will be explained in the next sections.

3. Confound #1: How the data is partitioned

Diachronic corpus analysis involves the practice of comparing different time slices of data against each other and making sense of the differences
that are observed. Many diachronic corpora come with a pre-established division of time periods. Some corpora are divided into centuries or decades; others base their division on independently established developmental stages of the language at hand. Resources such as COCA or the Oxford English Dictionary allow the researcher to create time slices by pooling data from a chosen sequence of years. This section discusses a problem associated with the partitioning of diachronic data that is rarely appreciated. In brief, cutting up diachronic data in different ways may lead to different results. In the development of a grammatical form, there may be extended periods of stasis followed by short spurts. For instance, a form may remain fairly infrequent for a longer period of time, show a sudden and dramatic increase in frequency, and continue a moderate increase after that. If such developments happen to be grouped together in the same corpus period, computing frequency averages may obscure or distort trends that are present in the data. In the case of diachronic collostructional analysis, choosing different periods will lead to different rankings of distinctive collexemes and potentially different interpretations of the semantic development (Stefanowitsch 2006).

To alleviate this problem, Gries and Hilpert (2008) suggest that diachronic corpus data be partitioned in a data-driven and bottom-up fashion. This entails that depending on the phenomenon that is studied, the same corpus may be divided into different periods. This is in fact a desirable feature, since a partitioning that is meaningful for the development of one grammatical form need not be appropriate for the next one (2008: 62). The developments of grammatical forms also need not coincide with cornerstone dates in the history of the English language. Finally, the developmental phases of a grammatical form will rarely be of equal length.
Gries and Hilpert (2008) use a hierarchical clustering algorithm for the purpose of dividing a diachronic corpus into different periods. The basic conceptual logic behind such an algorithm is to compare all items of a set in order to find out which ones are most similar to one another. On the basis of relative degrees of similarity, the overall set of items can be divided into separate groups that are characterized by shared characteristics. To give a concrete non-linguistic example, hierarchical clustering is used in areas such as market research, where analysts aim to figure out how a population of customers splits up into different groups. Products can then be specifically tailored to meet the preferences of the respective groups. Common linguistic applications include the identification of subgroups in a linguistic community, or the establishment of similarities
between different textual genres. Standard hierarchical clustering algorithms make no prior assumptions about the internal structure of a set, which leads to a problem for their application to diachronic corpus data. If historical periods are compared just on the basis of relative similarity, data from very different, non-adjacent periods of time could be grouped together. In order to overcome this problem, Gries and Hilpert (2008) use a clustering algorithm that only assesses the relative similarity of temporally adjacent periods, hence the name Variability-based Neighbor Clustering (VNC).[7] The algorithm thus takes diachronic order into account and produces only clusters that represent successive periods of time.

Using VNC requires the researcher to make two choices that are common to all hierarchical clustering algorithms. First, one has to choose a similarity measure that will serve as the basis for the comparison. Ideally, the similarity measure should directly reflect the research question that is being investigated. If the frequency development of a linguistic form is at issue, text frequency should be chosen as a similarity measure. If the research focus is on a change from one variant of a grammatical form to another, the relative frequency of either variant will be a good choice. Since the present investigation focuses on collocational change, a similarity measure based on the distinctive collexemes of keep V-ing is chosen. Specifically, the clustering algorithm operates on the results of a diachronic collostructional analysis for all 18 years of the database. Each year is represented by a vector of distinctiveness values for each verb in the database; clusters are established on the basis of relative similarity between those vectors using Pearson’s product-moment correlation. A second methodological decision concerns the choice of an amalgamation rule, which determines by what criteria one complex cluster should be merged with another one.
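The core idea of neighbor-constrained clustering can be sketched in a few lines of Python. The block below is a simplified illustration and not the VNC script referred to in the text: the period labels and distinctiveness vectors are hypothetical toy data, similarity is Pearson’s correlation as described above, and merged clusters are represented by a size-weighted mean of their vectors, which is a stand-in chosen for brevity rather than the amalgamation rule of the original study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length vectors.

    Assumes neither vector is constant (non-zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def neighbor_clustering(labels, vectors, n_clusters):
    """Repeatedly merge the most similar pair of ADJACENT clusters."""
    clusters = [([label], list(vec)) for label, vec in zip(labels, vectors)]
    while len(clusters) > n_clusters:
        # Only temporally neighboring clusters are candidates for merging.
        best = max(
            range(len(clusters) - 1),
            key=lambda i: pearson(clusters[i][1], clusters[i + 1][1]),
        )
        (labels_a, vec_a), (labels_b, vec_b) = clusters[best], clusters[best + 1]
        w_a, w_b = len(labels_a), len(labels_b)
        merged = [(w_a * a + w_b * b) / (w_a + w_b) for a, b in zip(vec_a, vec_b)]
        clusters[best:best + 2] = [(labels_a + labels_b, merged)]
    return [cluster_labels for cluster_labels, _ in clusters]


# Hypothetical distinctiveness values of four verbs in six successive periods.
LABELS = ["t1", "t2", "t3", "t4", "t5", "t6"]
VECTORS = [
    [3.0, 1.0, -2.0, -1.0],
    [2.5, 1.2, -1.8, -1.1],
    [-1.0, 2.0, 1.5, -2.0],
    [-1.2, 2.2, 1.0, -1.8],
    [-2.0, -1.5, 0.5, 3.0],
    [-1.8, -1.0, 0.2, 2.9],
]

periods = neighbor_clustering(LABELS, VECTORS, n_clusters=3)
print(periods)
```

Because only neighbors are ever compared, every resulting cluster is an unbroken stretch of time, which is the property that distinguishes this approach from unconstrained hierarchical clustering.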
Again, different options are available here, out of which the present study uses weighted pairwise means. On the basis of the input and the above-mentioned specifications, the VNC algorithm returns the results visualized in Figure 3. Overall, the results confirm that partitioning the data into three different periods is a reasonable choice. However, instead of producing three clusters that are temporally equidistant, VNC produces a first cluster of four years (1990–1993), a second one of six years (1994–1999), and an eight-year cluster with the remaining years (2000–2007). That is, changes in the collocational behavior of keep V-ing will be most noticeable if a

7. A script for the computation of VNC is available upon request from Stefan Th. Gries.
Figure 3. Variability-based neighbor clustering over the distinctive collexemes of keep V-ing
diachronic collostructional analysis is conducted on the basis of these three periods, rather than arbitrarily chosen ones. Table 8 presents the results of such an adapted analysis. The first thing to notice is that the choice of different corpus periods yields almost entirely new sets of distinctive collexemes, despite the fact that the original six-year periods are in substantial overlap with the new partitioning. However, if we re-assess our earlier observations with regard to Vendlerian verb types, we see that they still hold. There are only three stative predicates (anticipating, wanting, P1, having, P2); accomplishments are only weakly represented (lending and committing, P1, redefining and answering, P2, adjusting, P3). Achievements are spread out across all three periods; no clear trend is observable. Importantly, the imbalance between iterative and continuous activities that was seen in Table 7 shows up again in Table 8. In the first period, the ing-forms chewing, gobbling, pawing, and coughing can be categorized as iterative. The second period lists brushing, plying, skating, and reeling; the third only slicing and sipping. As before, the decline of iterative activities is counterbalanced by an increase in continuous activities. The first period has babying, the second lists holding, grinding, dragging, and tightening. In the third period, pushing, filming, staring, playing, writing, and aiming can be classified as continuous activities.
Table 8. Distinctive collexemes of keep V-ing across three VNC-periods

1990–1993                           1994–1999                          2000–2007
Verb           Obs   Exp      Sig   Verb          Obs   Exp      Sig   Verb        Obs   Exp      Sig
saying         399   352.27   **    hollering     7     2.29     ***   getting     565   505.43   ***
lending        6     1.93     **    hearing       182   150.25   **    pushing     210   180.78   **
chewing        8     3.09     **    brushing      9     3.60     **    pinching    6     2.88     *
starting       7     2.51     **    holding       41    27.50    **    filming     15    9.59     *
nominating     4     0.97     **    grinding      7     2.95     **    staring     127   109.81   *
anticipating   3     0.58     **    having        74    58.27    **    slicing     8     4.32     *
babying        3     0.58     **    plying        4     1.31     *     popping     71    58.98    *
gobbling       3     0.58     **    rebounding    4     1.31     *     playing     205   184.62   *
pawing         3     0.58     **    redefining    4     1.31     *     writing     82    69.53    *
coughing       7     2.70     **    dragging      16    9.82     *     adjusting   9     5.27     *
committing     4     1.16     *     acquiring     5     1.96     *     aiming      9     5.27     *
assuring       7     2.90     *     skating       5     1.96     *     sipping     5     2.40     *
insisting      38    27.04    *     answering     7     3.27     *     stumbling   15    10.07    *
wanting        27    17.96    *     tightening    6     2.62     *     paying      60    49.87    *
blinking       5     1.74     *     reeling       6     2.62     *     checking    55    45.56    *
Despite the fact that the actual distinctive elements are different from the ones in the earlier analysis, the overall results yield similar conclusions. While this is encouraging, there is another confound that needs to be addressed. This confound concerns the fact that the development of keep V-ing need not be a unified phenomenon across the different genres represented in COCA. The next section addresses this problem in more detail.
4. Confound #2: Differences between genres
The corpus-based study of grammatical developments requires the analyst to make several idealizing assumptions, for instance that there is “a construction” that changes over time, and “a language” that represents the systemic habitat of that construction. The first idealization takes multiple linguistic outputs by different speakers and writers, in our case instances of keep followed by an ing-form, and construes them as a single object of study. This may suggest that grammatical forms lead autonomous lives, when in fact grammatical change merely reflects the reality that speakers use language differently at different times. The second idealization takes multiple linguistic practices of partly overlapping communities and treats them as a single entity, in our case “the grammar of English”. It is maintained here that it does make sense to use such a notion, but crucially, processes of change tend to originate in a delimited area of the system, from where they may spread. Since previous discussions of keep V-ing have pointed out its affinity to the spoken modality (Biber et al. 1999), it may prove fruitful to reanalyze the COCA data with regard to speech and different written text types. Figure 4 shows the distribution of the construction across the five COCA genres.
Figure 4. Frequency development of keep V-ing across five genres
The relative frequencies of keep V-ing shown in Figure 4 are closely comparable to the frequencies reported in Biber et al. (1999: 747). Interestingly, Biber et al. find the construction to be more frequent in spoken conversation than in fiction; in the COCA data the two are reversed, which may be due to the heavy representation of speech from TV shows, rather than informal conversation. Crucial for the present purpose is the question of whether keep V-ing displays any developmental differences across the five genres. If a statistical test for the presence of trends is applied, it turns out that the two genres in which keep V-ing is most frequent, fiction and spoken data, show no significant development. By contrast, the three remaining genres converge on an upward trend (magazines: Kendall’s $\tau = 0.419$, $p_{\text{two-tailed}} = 0.017$; news: Kendall’s $\tau = 0.481$, $p_{\text{two-tailed}} = 0.006$; academic writing: Kendall’s $\tau = 0.519$, $p_{\text{two-tailed}} = 0.003$). Incidentally, the upward trend is strongest for the most conservative genre in the set. The recent development of keep V-ing thus testifies to the colloquialization of written English (Mair 2006). In the light of these tendencies, a third collostructional analysis focused on the developing genres, excluding the data from fiction and speech. Table 9 presents the results.
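The trend test reported for the individual genres can be sketched as follows. The code computes Kendall's tau between year and frequency and approximates the two-tailed p-value with the normal distribution. The yearly frequencies are invented for illustration, the function names are hypothetical, and the chapter's own p-values may have been obtained with an exact procedure rather than this approximation.

```python
from math import erfc, sqrt


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant pairs) / all pairs."""
    n = len(xs)
    score = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            direction = (xs[j] - xs[i]) * (ys[j] - ys[i])
            if direction > 0:
                score += 1   # concordant pair
            elif direction < 0:
                score -= 1   # discordant pair; tied pairs add nothing
    return score / (n * (n - 1) / 2)


def two_tailed_p(tau, n):
    """Normal approximation to the two-tailed p-value (no tie correction)."""
    standard_error = sqrt(2 * (2 * n + 5) / (9 * n * (n - 1)))
    z = tau / standard_error
    return erfc(abs(z) / sqrt(2))


# Hypothetical per-million frequencies for 18 yearly samples (1990-2007).
years = list(range(1990, 2008))
freqs = [20.1, 19.4, 21.0, 20.3, 22.5, 21.1, 20.8, 23.0, 22.2,
         21.9, 24.1, 23.5, 22.8, 25.0, 24.4, 23.9, 26.2, 25.1]

tau = kendall_tau(years, freqs)
print(f"tau = {tau:.3f}, p = {two_tailed_p(tau, len(years)):.4f}")
```

A positive tau with a small p-value indicates a monotonic upward trend. Because the test uses only the ordering of the values, a single outlying year has limited influence on the result.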
Table 9. Distinctive collexemes of keep V-ing, FICTION and SPOKEN excluded

1990–1995                        1996–2001                        2002–2007
Verb          Obs     Exp  Sig   Verb          Obs     Exp  Sig   Verb          Obs     Exp  Sig
starting        6    1.48  ***   hearing        63   47.49  **    getting       282  253.56  **
rearranging     3    0.55  **    acquiring       4    1.29  *     filling        11    6.40  **
telling        95   74.57  **    rebounding      4    1.29  *     accumulating    6    2.95  *
applying        7    2.58  **    redefining      4    1.29  *     providing       8    4.43  *
hoping         19   10.89  **    knocking        9    4.52  *     playing       127  110.78  *
blaming         4    1.11  *     kicking         9    4.52  *     pushing       106   92.07  *
assuring        5    1.66  *     plowing         8    3.88  *     working       206  186.60  *
chewing         3    0.74  *     tightening      5    1.94  *     supplying       5    2.46  *
encountering    3    0.74  *     sinking         6    2.58  *     aiming          7    3.94  *
forcing         3    0.74  *     functioning    11    6.46  *     adding         47   38.40  *
predicting      3    0.74  *     brushing        3    0.97  *     learning       20   14.77  *
wondering      20   12.74  *     chopping        3    0.97  *     walking        74   64.01  *
saying        109   91.37  *     coaching        3    0.97  *     mentioning      8    4.92  *
climbing        9    4.61  *     competing       3    0.97  *     apologizing     6    3.45  n.s.
urging          6    2.58  *     flopping        3    0.97  *     attacking       6    3.45  n.s.
Again, the new analysis produces different sets of distinctive elements, but the tendencies observed earlier are corroborated. In fact, the imbalance of iterative and continuous activities that was argued for above shows up much more clearly in the current analysis. In the first period, iterative activities are represented by rearranging, chewing, and climbing. The second period lists knocking, kicking, brushing, and chopping; the third lists walking. Conversely, continuous activities become successively more distinctive, with the third period listing filling, accumulating, providing, playing, pushing, working, supplying, aiming, and learning. These results suggest that the keep V-ing construction was not simply integrated into written genres as another grammatical resource, but that the integration process coincided with changes in its constructional meaning. In particular, the data indicate a waning association of keep V-ing with small, repetitive actions as the construction spreads into more formal writing.
5. Concluding remarks
The preceding sections have outlined how distinctive collexeme analysis can be applied to diachronic corpus data, and how the elimination of confounds such as partitioning and genre effects can bring differences in the data more into focus. To conclude this paper, a few remarks on remaining problematic issues are in order.
First of all, one may raise a general concern about the collostructional approach and ask whether the comparison of distinctive collexeme sets is not a rather subjective business. There undoubtedly are subjective decisions involved, and two analysts might look at the same results and arrive at different, although perhaps not diametrically opposed, interpretations. Earlier in this paper, the idea of a wine tasting was offered as an analogy, and inasmuch as one can exaggerate differences there, an analyst may certainly overinterpret collocational data. With regard to the results at hand, it may further be asked whether a change from three iterative verbs as distinctive collexemes in one period to six such elements in the next period constitutes a truly significant change. While the values of collostructional strength that are computed for each lexical element do establish that the observed changes are significant, the linguistic nature of those changes is a matter of human interpretation.
The best way to safeguard against overly speculative interpretations is to set up definite research questions before conducting the collostructional analysis. In this way, the results can be interpreted as evidence that either contradicts or corroborates a pre-existing hypothesis. In Hilpert (2008) for instance, it was checked whether shifting collocational preferences of future constructions would develop in accordance with grammaticalization paths that had been suggested independently. The collocational evidence corroborated some earlier accounts, but contradicted others, thus leading to alternative proposals (2008: 183–186).
A second concern pertains to the analysis presented in this paper, specifically the choice of diachronic data covering only a very short period of time. A fair question to ask is whether trends that are observed over periods as short as 18 years can be taken to reflect longer, ongoing developments. The choice of data involves a trade-off here: on the one hand, corpus data covering a long, historical period of time will undoubtedly show relatively more pronounced collocational differences, which can then be linked to hypotheses about grammatical change. On the other hand, corpus data from shorter, more recent periods of time occur in greater quantities and can be more adequately controlled for factors such as genre differences. Depending on one’s research question, there is room for both strategies, and they should be seen as complementing each other.
A third inherent limitation of distinctive collexeme analysis is that the method does not take into account the overall corpus frequencies of the lexical elements that occur with the construction. Applied to diachronic data, the method makes the assumption that all lexical elements have the same chance of occurrence in all of the periods that are compared. This is of course an idealization, since some lexemes become more frequent over time, whereas others may cease to be used. These processes will introduce a certain amount of noise into the analysis; it is however unlikely that the results will be distorted in a systematic, confounding fashion.
A fourth issue concerns the question of psychological reality. Stefanowitsch (2006) rightly points out that the collostructional methods have been designed to model the cognitive organization of lexico-grammatical interdependencies, a goal that cannot be maintained when historical data are used. Such data may vastly exceed the lifetime of individual speakers (although not in the present case study), so that they cannot inform questions about cognition.
It therefore has to be kept in mind that diachronic distinctive collexeme analysis models processes of change at an aggregate level of language use, not at the level of the individual speaker’s mind. These two levels are distinct, but both merit investigation.
Fifth, a reviewer raises the issue that all collostructional methods are relatively “data-hungry” procedures that work best with large corpora and at least moderately frequent grammatical constructions. Naturally, this imposes some restrictions on what types of phenomena can be analyzed with these methods. To add to the reviewer’s point, not only do the constructions have to have a high enough token frequency, they also need to have a relatively high type frequency. For instance, a collostructional analysis of what Schmid (2000) calls shell nouns (the fact that CLAUSE, the decision to V, etc.) is very likely to register infinite collostructional strength for nearly all participating nouns, because these form such a small and relatively closed set. While the possible applications of the collostructional methods are thus restricted in several ways, this also holds true for virtually every other quantitative method that is currently on the market: highly specialized tools usually do not have the wide applicability of a Swiss Army knife, but in return they yield high descriptive precision.
Lastly, we can take a step back and consider a more general issue. Given a tool such as diachronic collostructional analysis, what new insights about grammatical change can be provided by analyses of this kind? The study of shifting collocational patterns certainly brings into focus the important role of the lexicon, and lexical semantics, in grammatical variation and change, which is being acknowledged in more and more contexts (Torres Cacoullos and Walker 2009). We learn in fine detail how a given development is sensitive to lexical semantic categories such as animacy, abstractness, situation type, transitivity, or other categories that would not be directly apparent from an introspective analysis, but which stand out in the results of a collostructional analysis. All corpus-based data-mongering thus serves the ultimate purpose of exploring which meaning distinctions matter to human beings and how these are reflected in the change of language use.
References
Biber, Douglas. 1993. Co-occurrence patterns among collocations: A tool for corpus-based lexical knowledge acquisition. Computational Linguistics 19/3: 531–538.
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad, and Edward Finegan. 1999. The Longman Grammar of Spoken and Written English. Essex: Pearson Education Ltd.
Brinton, Laurel. 1988. The Development of English Aspectual Systems: Aspectualizers and Post-verbal Particles. Cambridge: Cambridge University Press.
Cappelle, Bert. 1999. Keep and keep on compared. Leuvense Bijdragen 88: 289–304.
Davies, Mark. 2007. TIME Magazine Corpus (100 million words, 1920s–2000s). Available online at http://corpus.byu.edu/time.
Davies, Mark. 2008. The Corpus of Contemporary American English (COCA): 400+ million words, 1990–present. Available online at http://www.americancorpus.org.
Freed, Alice F. 1979. The Semantics of English Aspectual Complementation. Dordrecht: Reidel.
Gilquin, Gaëtanelle. 2006. The verb slot in causative constructions. Finding the best fit. Constructions, Special Volume 1. Available online at http://elanguage.net/journals/index.php/constructions/issue/view/17
Goldberg, Adele E. 1995. Constructions. A Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press.
Goldberg, Adele E. 2006. Constructions at Work: The Nature of Generalization in Language. Oxford: Oxford University Press.
Gries, Stefan Th. 2004. Coll. analysis 3. A program for R for Windows 2.x.
Gries, Stefan Th. and Martin Hilpert. 2008. The identification of stages in diachronic data: variability-based neighbor clustering. Corpora 3/1: 59–81.
Gries, Stefan Th. and Anatol Stefanowitsch. 2004. Extending collostructional analysis: A corpus-based perspective on “alternations”. International Journal of Corpus Linguistics 9/1: 97–129.
Heine, Bernd. 1993. Auxiliaries. Cognitive Forces and Grammaticalization. Oxford: Oxford University Press.
Hilpert, Martin. 2006. Distinctive collexeme analysis and diachrony. Corpus Linguistics and Linguistic Theory 2/2: 243–57.
Hilpert, Martin. 2008. Germanic Future Constructions: A Usage-based Approach to Language Change. Amsterdam: John Benjamins.
Hilpert, Martin and Stefan Th. Gries. 2009. Assessing frequency changes in multi-stage diachronic corpora: Applications for historical corpus linguistics and the study of language acquisition. Literary and Linguistic Computing 24/4: 385–401.
Hopper, Paul J. and Elizabeth C. Traugott. 2003. Grammaticalization, 2nd ed. Cambridge: Cambridge University Press.
Leech, Geoffrey N. 1992. 100 million words of English: the British National Corpus. Language Research 28/1: 1–13.
Mair, Christian. 2006. Twentieth-Century English: History, Variation and Standardization. Cambridge: Cambridge University Press.
Michaelis, Laura A. 2004. Type Shifting in Construction Grammar: An Integrated Approach to Aspectual Coercion. Cognitive Linguistics 15: 1–67.
R Development Core Team. 2007. R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. Available online at http://www.R-project.org.
Schmid, Hans-Jörg. 2000. English Abstract Nouns as Conceptual Shells. From Corpus to Cognition. Berlin: Mouton de Gruyter.
Sinclair, John. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.
Stefanowitsch, Anatol. 2006. Distinctive collexemes and diachrony: A comment. Corpus Linguistics and Linguistic Theory 2/2: 257–62.
Stefanowitsch, Anatol and Stefan Th. Gries. 2003. Collostructions: Investigating the interaction between words and constructions. International Journal of Corpus Linguistics 8/2: 209–43.
Torres Cacoullos, Rena and James A. Walker. 2009. The present of the English future: Grammatical variation and collocations in discourse. Language 85/2: 321–354.
Vendler, Zeno. 1957. Verbs and times. The Philosophical Review 66/2: 143–160.
Tracing semantic change with Latent Semantic Analysis
Eyal Sagi, Stefan Kaufmann and Brady Clark
Abstract
Research in historical semantics relies on the examination, selection, and interpretation of texts from corpora. Changes in meaning are tracked through the collection and careful inspection of examples that span decades and centuries. This process is inextricably tied to the researcher’s expertise and familiarity with the corpus. Consequently, the results tend to be difficult to quantify and put on an objective footing, and “big-picture” changes in the vocabulary other than the specific ones under investigation may be hard to keep track of. In this paper we present a method that uses Latent Semantic Analysis (Landauer, Foltz & Laham 1998) to automatically track and identify semantic changes across a corpus. This method can take the entire corpus into account when tracing changes in the use of words and phrases, thus potentially allowing researchers to observe the larger context in which these changes occurred, while at the same time considerably reducing the amount of work required. Moreover, because this method relies on readily observable co-occurrence data, it affords the study of semantic change a measure of objectivity that was previously difficult to attain. In this paper we describe our method and demonstrate its potential by applying it to several well-known examples of semantic change in the history of the English language.
1. Introduction
The widespread availability of affordable and powerful computational machinery for the storage, manipulation and analysis of large data sets has had a profound methodological impact on virtually every area of scholarly inquiry. Historical linguistics is no exception to this trend. This is not surprising inasmuch as the diachronic study of language has always relied on the analysis of large amounts of text. But it is an exciting development nonetheless because the new computational tools open up methodological possibilities that were hitherto unavailable.
We see three major ways in which research in historical linguistics has already been affected and will continue to be transformed by data-driven computational methods. First, data-driven computational methods complement traditional qualitative methods of analysis (see Gries, this volume), but with the specific benefit of being reproducible and objectively measurable. Second, phenomena which manifest themselves as statistical trends in large corpora can be observed and quantified precisely and efficiently without enormous investments in manpower. Third, these methods have the potential to help detect interesting trends in the data based on large-scale observations on the entire corpus.
Computational methods have only just begun to have an impact in historical linguistics. At this point, most work in the area is exploratory, testing and refining methods rather than putting them to work to produce new findings. This is also true for the work described in the present paper. Our goal is to demonstrate how an existing method which has enjoyed great success in such areas as natural-language processing and psychology can be used to automate and enhance certain aspects of research in historical semantics. This method is known as Latent Semantic Analysis (LSA). Although linguists would scarcely recognize it as “semantic analysis” in the familiar sense, we use the term here because of its wide currency in the fields in which it was first applied. The details of the method are described in the next section. Here we give a cursory overview of the main ideas and the motivation underlying our application of it.
Our main interest is in semantic change, specifically the shifts in lexical meaning undergone by words¹ in the history of English. Well-known examples of such shifts include the grammaticalization and attendant semantic “bleaching” of the verb do, and the broadening or narrowing of the senses of common nouns like dog and deer. More details on these changes are given below. Semantic change is an area in which computational methods face specific challenges due to the nature of the data. Texts generally carry few overt hints as to the denotations of the words that constitute them.
While changes in morphosyntactic properties (as in grammaticalization) may be observable as differences in the range of grammatical constructions in which a given word occurs, shifts in denotation that are not accompanied by syntactic change (as in broadening or narrowing) manifest themselves in less tangible ways. Add to this the problem that speakers of earlier varieties of English cannot be consulted, and it becomes rather mysterious just how human researchers themselves recognize and track such changes with any confidence, let alone how computers might be fruitfully employed in carrying out the task.
To define the problem in such a way that it can be operationalized, we start from the assumption that intuitive notions like “breadth” or “narrowness” of a word’s denotation are related to the range of topics in whose discussion that word may occur.² Of course, topics are themselves not directly observable, but here we can rely on long-standing and well-established research on the relationship between the topic of a passage of text and the words that constitute it (e.g. Firth 1957).³ Thus what we actually observe is the range of contexts in which the word occurs, where by “context” we mean quite literally the text surrounding its individual occurrences.⁴
As we describe in more detail below, our method provides a measure of distance or (dis-)similarity between the various occurrences (tokens) of a given word (type). This measure is derived from large-scale observations on the co-occurrence patterns of the vocabulary in a corpus. Based on the central assumption that a tendency to occur in similar contexts is an indication of semantic relatedness, the method can be seen as locating each occurrence of a given word in an abstract “semantic space.” With this spatial metaphor in mind, our main interest lies in the overall distribution of large numbers of occurrences of a given word. Our hypothesis is that the “breadth” of the word’s meaning is inversely proportional to the “density” with which its occurrences are distributed in the space, and that shifts in the word’s meaning are accompanied by changes in the distribution of its occurrences in the space.
1. Throughout this paper, we use the term “word” to refer to word types, and “token” or “occurrence” for word tokens.
2. By topic we mean “what is being talked about” or the theme of the surrounding text. This use of the term is congruent with its use by Landauer and Dumais (1997) and the Latent Semantic Analysis literature in general. Importantly, these topics are an abstraction and do not always map to cognitively identified topics. As such, there is no explicit classification of topics but rather a fuzzy set of uses. Consequently, these abstractions are more sensitive to shifts than traditional definitions of topic and might change due to differences in the underlying referential structure that the explicit topical classification is not sensitive to.
3. This is the foundational assumption underlying Latent Semantic Analysis and similar approaches (e.g. Landauer and Dumais 1997).
4. This notion of context is sometimes referred to as the co-text of a word. We continue our use of the term context in this sense because this usage is established in the computational literature. We believe that no confusion will arise from this.
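The hypothesis about density can be made concrete with a small sketch. It assumes that each occurrence of a word is already represented as a vector, and it takes the mean pairwise cosine similarity of those vectors as a stand-in for density. This particular measure, the toy vectors, and the function names are illustrative assumptions rather than a description of the authors' implementation.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norms


def density(occurrence_vectors):
    """Mean pairwise cosine similarity over all pairs of occurrence vectors."""
    pairs = list(combinations(occurrence_vectors, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Toy 3-dimensional vectors (hypothetical): one word used in similar
# contexts throughout, another used in unrelated contexts.
narrow_word = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [1.0, 0.0, 0.1]]
broad_word = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(density(narrow_word) > density(broad_word))
```

On this reading, a word whose occurrences cluster tightly has high density and a narrow meaning, and a drop in density between two periods would be consistent with semantic broadening.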
The next section gives a brief overview of LSA in general and of our application in particular. In Section 3, we describe the results of a study applying the method in the study of semantic change in English. Section 4 concludes with general remarks on the strengths, weaknesses, and future prospects of the method.
2. Latent Semantic Analysis and the Infomap system
Latent Semantic Analysis (LSA) is a collective term for a family of related methods, all of which involve building numerical representations of words based on occurrence patterns in a corpus. The basic underlying assumption is that co-occurrence with the same linguistic contexts can be used as a measure of semantic relatedness. This idea has been around for some time – see Firth (1957), Halliday and Hasan (1976), and Hoey (1991) for early articulations – but applying it in practice only became feasible when large text corpora and powerful computational machinery became available. The first computational implementations in this vein, known at the time as Latent Semantic Indexing (Deerwester et al. 1990), were developed for technological applications in areas like Information Retrieval. There the goal was to build representations of documents which summarized and distilled information about their contents. The guiding idea was that similarities and differences in the vocabulary used in documents could serve as indicators of thematic similarities and differences between them. For more details on the history and current state of the art in this area, see Manning and Schütze (1999), Manning et al. (2008), and references therein.
From its early uses as an engineering tool in practical applications, the method was adapted in the late 1990s, now under the label Latent Semantic Analysis, to address more theoretical questions about the mental lexicon and the structure of conceptual spaces, again via the measure of word similarity it provides. In this tradition, the method has been used as a research tool in a diverse range of fields including Psychology (Landauer and Dumais 1997; Otis and Sagi 2008; see also the papers in Landauer and McNamara 2007) and Education (Dam and Kaufmann 2008; Steinhart 2001; Graesser et al. 1999; Wiemer-Hastings et al. 1999).
For instance, Landauer and Dumais (1997) showed that the acquisition of vocabulary knowledge by school children can be successfully simulated by LSA, and that an LSA-trained automatic system can answer standardized, multiple-choice synonym questions as well as test-takers. Dam and Kaufmann (2008) used an LSA-based classification method in the analysis of interviews with middle school students to assess their scientific knowledge, and achieved high levels of agreement with human coders. The success of LSA in these and other applications has lent empirical support to the underlying assumption that semantic relatedness can be operationalized as similarity of co-occurrence with words in naturally occurring texts.⁵
Most applications of LSA focus on co-occurrence profiles of words in order to explore properties of the lexicon. We go one step beyond this representation and build vectors for all individual occurrences of a given word, thus enabling us to track differences in its use. This method is inspired by ideas first introduced in Word Sense Discrimination (Schütze 1998). Roughly speaking, two steps are involved: first the construction of vectors for word types, second the construction of vectors for individual occurrences of a given target word, based on the vectors obtained in the first step. In the remainder of this section we describe each of these steps in more detail.
Before entering this discussion, it is well to emphasize once again the exploratory character of our study. The method is complex and involves many steps, and its implementation requires numerous parameter settings and design choices which one would ultimately want to base on experience, typically gained through a combination of trial-and-error and extensive empirical tests. However, since our application in historical semantics has no immediate precursors, the method has yet to undergo this long maturation process. Thus while readers familiar with applications of LSA elsewhere in computational linguistics may wish to see comparisons between alternative ways to carry out the various steps of the analysis,⁶ our main goal here is to demonstrate the viability of the idea itself, rather than to tweak the implementation.
5. Importantly, LSA identifies words that appear in similar contexts – i.e. words that have related meanings. Interestingly, because antonyms tend to appear in the same contexts, just as synonyms do, this method cannot effectively distinguish between these two semantic relationships. Rather, the degree of similarity indicated by LSA measures semantic relatedness in a broader sense, akin to the associativity underlying priming and similar psychological phenomena.
6. We are grateful to an anonymous reviewer for raising a few specific questions of this kind to be addressed in subsequent and more technical expositions.
2.1. Word vectors
In building vector representations of words or texts, the crucial mathematical object underlying all flavors of LSA is a co-occurrence matrix, essentially a large table whose rows and columns are labeled by certain entities occurring in the corpus (words or larger units). Cells $c_{ij}$ contain numbers recording how often the $i$-th row label occurs with the $j$-th column label. The array of numbers in each row $i$ can be thought of as a vector in an abstract space whose dimensions correspond to the columns. Two such vectors are similar to the extent that their components are correlated, and the similarity between rows is used as a stand-in for the similarity between the linguistic entities associated with them.
Within the class of LSA methods, there is much variation in the nature of the entities associated with the rows and columns, as well as in the definition of “co-occurrence.” An early and still widely used implementation assembles a term-document matrix in which each vocabulary item (term) is associated with an $n$-dimensional vector representing its distribution over the $n$ documents in the corpus. Thus two words are taken to be similar to the extent that they tend to occur in the same documents. But while using documents as the relevant text unit in this way may be the right thing to do if document retrieval is the ultimate purpose, it is less clear that the document is the right size unit for exploring lexical semantics. Topics may vary widely within a single document, and the properties of documents may depend on factors (genre etc.) that are not straightforwardly linked to word meaning.
In contrast, the version of LSA we use measures co-occurrence in a way that is more independent of the characteristics of the documents in the corpus. It relies on a term-term matrix, each of whose rows encodes the co-occurrence pattern of a word with each of a list of words (column labels) that are deemed “content-bearing.” This approach originated with the WordSpace paradigm developed by Schütze (1996). The software we used is a version of the Infomap package developed at Stanford University (in part by the second author) and available in the public domain (see also Takayama et al.
1990).7 Using a term-term matrix mitigates the impact of the properties of individual documents somewhat, but even so, the information represented in the co-occurrence matrix, and thus ultimately the similarity measure, depends greatly on the genre and subject matter of the corpus (Takayama et al. 1999; Kaufmann 2000). The results reported in this paper used a vector space based on word co-occurrence counts in a corpus composed of the Middle English and Early Modern English parts of the Helsinki Corpus. The word types were ranked by frequency of occurrence, and the Infomap system automatically 7. The default settings of this package were used for many of the parameter settings reported here. A more extensive exploration of the parameter space is left for future work.
Tracing semantic change with Latent Semantic Analysis
167
selected (i) a vocabulary W for which vector representations are to be collected, and (ii) a set C of “content-bearing” words whose occurrence or non-occurrence is taken to be indicative of the subject matter of a given passage of text. Usually, these choices are guided by a “stoplist” of (mostly closed-class) lexical items that are deemed useless to the task and therefore excluded, but because we were interested in tracing changes in the meaning of lexical items, we reduced the stoplist to a bare minimum containing only numbers and single letters. To compensate, we used a rather large number of 2,000 content-bearing words (the Infomap default is 1,000). Specifically, our vocabulary W consisted of the 40,000 most frequent non-stoplist words, and the set C of content-bearing words contained the 50th through 2,049th most frequent non-stoplist words. Thus the choice of words is based solely on frequency, rather than some linguistically more interesting property like semantic content or grammatical category.8 This may seem blunt, but it has the advantage of not requiring any human intervention or antecedently given information about the domain.

The cells in the resulting matrix of 40,000 rows and 2,000 columns were filled with weighted co-occurrence counts recording, for each pair $(w, c) \in W \times C$, the number of times a token of c occurred in the context of a token of w in the corpus. The “context” of a token $w_i$ in our implementation is the set of tokens in a fixed-width window from the 15th item preceding $w_i$ to the 15th item following it (less if a document boundary intervenes).9 The number in each cell $(w, c)$ was transformed in two ways. First, the raw count was weighted with a tf.idf measure10 of the column label c, calculated as follows:

$$\mathrm{tf.idf}(c) = \mathrm{tf}(c) \times \bigl(\log(D + 1) - \log(\mathrm{df}(c))\bigr)$$

8. Discarding the most frequent words in assembling the column labels is a brute-force approach to filtering out words which due to their sheer frequency are unlikely to be very useful in discerning fine thematic distinctions (but see also the weighting by a tf.idf measure discussed below). 49 is not a magic number in this regard, but has simply proven useful in earlier applications of the Infomap system.

9. One reviewer pointed out that one might consider not only document boundaries, but also topic boundaries (i.e. thematic shifts within the document) as natural breaking points for contexts. While LSA has been applied in detecting topic boundaries with relatively good success (see for instance Kaufmann 2000), this is a difficult and error-prone process which does not seem to us to yield substantive overall improvements for our task. More empirical work on this issue is called for.

10. tf and idf stand for “term frequency” and “inverse document frequency,” respectively.
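As a rough illustration of the counting and weighting scheme just described, the following Python sketch collects windowed co-occurrence counts and applies the tf.idf weight of the column label. It is not the Infomap implementation: the function names, the toy documents, the small window and the use of natural logarithms are all assumptions made for the example. In the code, tf(c) is the total number of occurrences of c, df(c) the number of documents containing c, and D the number of documents.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_counts(documents, vocabulary, content_words, window=15):
    """Count how often each content word c occurs within `window` tokens of each w.

    Every document is scanned on its own, so a window never crosses a
    document boundary.
    """
    counts = defaultdict(Counter)
    for tokens in documents:
        for i, w in enumerate(tokens):
            if w not in vocabulary:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i and tokens[j] in content_words:
                    counts[w][tokens[j]] += 1
    return counts


def tf_idf(c, documents):
    """tf.idf(c) = tf(c) * (log(D + 1) - log(df(c))); natural log is an assumption."""
    tf = sum(tokens.count(c) for tokens in documents)
    df = sum(1 for tokens in documents if c in tokens)
    return tf * (math.log(len(documents) + 1) - math.log(df))


# Toy corpus, invented purely for illustration.
docs = [
    "the hound ran after the deer".split(),
    "the dog and the hound slept".split(),
    "a deer stood in the wood".split(),
]
counts = cooccurrence_counts(
    docs, {"hound", "deer", "dog"}, {"the", "ran", "wood"}, window=2
)

# Weight each raw count by the tf.idf value of its column label.
weighted = {
    w: {c: n * tf_idf(c, docs) for c, n in row.items()}
    for w, row in counts.items()
}
```

A word that occurs in every document, such as the in the toy corpus, receives a small weight per occurrence, which is the intended effect of the inverse document frequency term.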
Eyal Sagi, Stefan Kaufmann and Brady Clark
Here $\mathrm{tf}(c)$ and $\mathrm{df}(c)$ are the number of occurrences of c and the number of documents in which c occurs, respectively, and D is the total number of documents. While the column labels are chosen by their term frequency, the weighting by inverse document frequency is intended to scale down those columns labeled by words that are widely dispersed over the corpus. The idea is that words whose occurrences are spread over many documents are less useful as indicators of semantic content.11 Second, the number in each cell is replaced with its square root, in order to approximate a normal distribution of counts and attenuate the potentially distorting influence of high base frequencies (cf. Takayama et al. 1998; Widdows 2004).

The matrix was further transformed by Singular Value Decomposition (SVD), a dimension-reduction technique yielding a new matrix which is less sparse (i.e. has fewer cells with zero counts) and with the property that, roughly speaking, the first n columns, for any $0 < n \le |C|$, capture as much of the information about word similarities from the original matrix as can be preserved in the lower n-dimensional space (Golub and Van Loan 1989). The SVD implementation in the Infomap system relies on the SVDPACKC package (Berry 1992; Berry et al. 1993). The output was a reduced 40,000 × 100 matrix. Thus ultimately each item $w \in W$ is associated with a 100-dimensional vector $\mathbf{w}$.

2.2. Context vectors

Once the vector space for word types is obtained from the corpus, new vectors can be derived for any multi-word unit of text (e.g. paragraphs, queries, or documents), regardless of whether it occurs in the original corpus or not, as the normalized sum of the vectors associated with the words it contains.12

11. Thus for instance, in most corpora the word do or its inflectional forms occur in all documents, making them poor indicators of semantic content. While this property does disqualify do as a “content-bearing” column label, it does not of course impede the study of the use of do itself, based on truly content-bearing words in the contexts of its occurrences. We are grateful to an anonymous reviewer for asking about this case.

12. The sum of m vectors $\mathbf{w}_1, \ldots, \mathbf{w}_m$ with n dimensions is a vector $\mathbf{w} = \left(\sum_{i=1}^{m} w_{1i}, \ldots, \sum_{i=1}^{m} w_{ni}\right)$. The inner product or dot product of two n-dimensional vectors $\mathbf{w}, \mathbf{v}$ is $\mathbf{w} \cdot \mathbf{v} = \sum_{i=1}^{n} w_i v_i$. The length of a vector $\mathbf{w}$ is $\|\mathbf{w}\| = \sqrt{\mathbf{w} \cdot \mathbf{w}}$.

In this way, for each occurrence of a target word
type under investigation, we calculated a context vector from the 15 items preceding and the 15 items following that occurrence.13 Context vectors were first used in Word Sense Discrimination by Schütze (1998). Similarly to that application, we assume that the “second-order” context vectors represent the aggregate meaning or topic of the segment they are associated with, and thus, following the reasoning behind LSA, are indicative of the meaning with which the target word is being used on that particular occurrence. Consequently, for each target word w of interest, the context vectors associated with its occurrences constitute the data points. The analysis is then a matter of grouping these data points according to some criterion (e.g. the period in which the text was written) and conducting an appropriate statistical test. In some cases it might also be possible to use regression or apply a clustering analysis.

2.3. Semantic density analysis

Conducting statistical tests comparing groups of vectors is not trivial. Fortunately, some questions can be answered based on the similarity of vectors within each group, rather than the vectors themselves. The similarity between two vectors $\mathbf{w}$ and $\mathbf{v}$ is measured as the cosine between them:14

$$\cos(\mathbf{w}, \mathbf{v}) = \frac{\mathbf{w} \cdot \mathbf{v}}{\|\mathbf{w}\| \, \|\mathbf{v}\|}$$

The average pairwise similarity of a group of vectors is indicative of its density – a dense group of highly similar vectors will have a high average cosine (and a correspondingly low average angle) whereas a sparse group of dissimilar vectors will have an average cosine that approaches zero (and a correspondingly high average angle).15

13. Since only 40,000 of the word types in the corpus are associated with vectors, not all items in the window surrounding the target contribute to the context vector. If a word occurs more than once in the window, all of its occurrences contribute to the context vector.

14. While the cosine measure is the accepted measure of similarity, the cosine function is non-linear and therefore problematic for many statistical methods. Several transformations can be used to correct this (e.g. Fisher’s z). In this paper we use the angle, in degrees, between the two vectors (i.e. $\cos^{-1}$) because it is easily interpretable.

15. Since the cosine ranges from −1 to +1, it is possible in principle to obtain negative average cosines. In practice, however, the overwhelming majority of vector pairs – both word vectors and context vectors – have a non-negative cosine, hence the average cosine usually does not fall below zero.

Thus since a word that has a single, highly restricted meaning (e.g. palindrome) is likely to occur in a
very restricted set of contexts, its context vectors are also likely to have a low average angle between them, compared to a word that is highly polysemous or appears in a large variety of contexts (e.g. bank, do). From this observation, it follows that it should be possible to compare the density across groups of context vectors in terms of the average pairwise similarity of the vectors of which they are comprised. Because the number of such pairings tends to be prohibitively large (e.g. nearly 1,000,000 for a group of 1,000 vectors), it is advisable to use only a subsample in any single analysis. A Monte-Carlo analysis in which some number of pairwise similarity values is chosen at random from each group of vectors is therefore appropriate.16

However, there is one final complication to consider in the analysis. The passage of time influences not only the meanings of words, but also styles and varieties of writing. For example, texts in the 11th century were much less varied, on average, than those written in the 15th century.17 This will influence the calculation of context vectors as those depend, in part, on the text they are taken from. Because the document as a whole is represented by a vector that is the average of all of its word vectors, it is possible to predict that, if no other factors exist, two contexts are likely to be related to one another to the same degree that their documents are. Controlling for this effect can therefore be achieved by subtracting from the angle between two context vectors the angle between the vectors of the documents in which they appear.18
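The density measure and the document-level correction described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors’ code: the helper names, the number of sampled pairs, the fixed random seed and the toy two-dimensional vectors are all hypothetical, and vectors are plain Python lists rather than rows of the reduced matrix.

```python
import math
import random


def normalized_sum(vectors):
    """Context vector: sum the word vectors, then scale the sum to unit length."""
    total = [sum(component) for component in zip(*vectors)]
    length = math.sqrt(sum(x * x for x in total))
    return [x / length for x in total]


def angle_degrees(w, v):
    """Angle between two vectors, i.e. the inverse cosine of their cosine similarity."""
    dot = sum(a * b for a, b in zip(w, v))
    norm = math.sqrt(sum(a * a for a in w)) * math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cosine))


def sampled_mean_angle(contexts, documents, n_pairs, rng):
    """Monte-Carlo estimate of the mean corrected angle within one group.

    contexts[i] is a context vector and documents[i] the vector of the
    document it was taken from. For each sampled pair, the angle between
    the two document vectors is subtracted from the angle between the
    two context vectors.
    """
    total = 0.0
    for _ in range(n_pairs):
        i, j = rng.sample(range(len(contexts)), 2)  # two distinct tokens
        context_angle = angle_degrees(contexts[i], contexts[j])
        document_angle = angle_degrees(documents[i], documents[j])
        total += context_angle - document_angle
    return total / n_pairs


# Toy data, invented for illustration only.
contexts = [normalized_sum([[1.0, 0.0], [1.0, 1.0]]), [0.0, 1.0], [1.0, 0.0]]
documents = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
estimate = sampled_mean_angle(contexts, documents, n_pairs=200, rng=random.Random(0))
```

On the size of the problem: a group of 1,000 vectors has 1,000 × 999 = 999,000 ordered pairs, or 499,500 pairs if each unordered pair is counted once. Sampling pairs keeps the computation small, but the sampled values all draw on the same underlying vectors and so are not independent of one another.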
16. It is important to note that the number of independent samples in the analysis is determined not by the number of similarity values compared but by the number of individual vectors used in the analysis.

17. Tracking changes in the distribution of the document vectors in a corpus over time might itself be of interest, but is beyond the scope of the current paper.

18. Subtraction of the angle between the document vectors was chosen because it was the simplest and easiest method to implement. However, future work might benefit from an approach that more fully explores the differences between the documents within which the contexts are found and controls for them.

3. A diachronic investigation: Semantic change

3.1. Some background

Semantics is the study of the mapping between forms and meanings. Consequently, the formal study of semantic change takes form-meaning pairs
as its object and explores changes in the association between the two. One way to approach this task is to consider a fixed form F throughout various periods $t_0, t_1, t_2, \ldots$ in the history of the language and ask about the resulting sequence $(F, M_0), (F, M_1), (F, M_2), \ldots$ of form-meaning pairs, what changes the meaning underwent. For instance, the expression as long as underwent the change ‘equal in length’ > ‘equal in time’ > ‘provided that’. This is the kind of change we explore in our study. Another approach would be to hold the meaning constant and look for changes in the forms that express it (see Traugott 1999 for discussion).

In this work we examine two of the traditionally recognized categories of semantic change (Traugott 2005: 2–4; Campbell 2004: 254–262; Forston 2003: 648–650):

– Broadening (generalization, extension, borrowing): A restricted meaning becomes less restricted (e.g. Late Old English docga ‘a (specific) powerful breed of dog’ > dog ‘any member of the species Canis familiaris’)
– Narrowing (specialization, restriction): A relatively general meaning becomes more specific (e.g. Old English deor ‘animal’ > deer ‘deer’)

Semantic change is generally the result of the use of language in varying contexts, both linguistic and extralinguistic. Furthermore, the subsequent meanings of a form are related to its earlier ones. As a result, the first sign of semantic change is often the coexistence of the old and new meanings (i.e. polysemy). Sometimes the new meanings become dissociated from the earlier ones over time, resulting in homonymy (e.g. mistress ‘woman in a position of authority, head of household’ > ‘woman in a continuing extra-marital relationship with a man’).

3.2. Hypotheses

As noted above, the main assumption underlying this project is that changes in the meaning of a given word will be evident when examining the contexts of its occurrences over time.
For example, semantic broadening results in a meaning that is less restricted and as a result can be used in a larger variety of contexts. In a semantic space that spans the period during which the change occurred, the word’s increase in versatility can be measured as a decrease in the density of its tokens, i.e. higher average angles between the context vectors of the occurrences, across the time span of the corpus. For instance, because the Old English word docga applied to a specific breed of dog, we predict that earlier occurrences of the lexemes
docga and dog, in a corpus of documents of the appropriate time period, will show less variety and therefore higher density than later occurrences.19 The process of grammaticalization (Traugott and Dasher 2002), in which a content word becomes a function word, provides an even more extreme case of semantic broadening. Since the distributions of function words generally depend much less on the topic of the text than those of content words, a word that underwent grammaticalization should appear in a substantially larger variety of contexts than it did prior to becoming a function word. One well-studied case of grammaticalization is that of periphrastic do. While in Old English do was used as a verb with a causative sense (e.g. ‘did him gyuen up’, the Peterborough Chronicle, ca. 1154), later in English it took on a functional role that is nearly devoid of meaning (e.g. ‘did you know him?’). Because this change occurred in Middle English, we predict that earlier occurrences of do will show less variety than later ones. By contrast, semantic narrowing refers to changes that result in a meaning that is more restricted. As a result, a word that underwent semantic narrowing is applicable in fewer contexts than before. This decrease in versatility of the type should result in higher vector density and thus be measurable as a decrease in the average angle between the context vectors of its tokens. For example, the Old English word deor denoted a larger class of living creatures than does its Modern English descendant deer. We therefore predict that earlier occurrences of the words deor and deer, in a corpus spanning the appropriate time period, will show more variety than later occurrences. A similar prediction can also be made regarding the meaning of the word hound and its Old English counterpart hund, which was originally used to refer to canines in general, but in subsequent use its meaning was narrowed to refer only to dogs bred for hunting. 
19. It is important to recall that because we measure variability of context compared to the variability of the documents in question, the differences in the variability of the documents between Middle English and Early Modern English are controlled for and should not influence the analysis.

To be sure, this reasoning is not without limitations and pitfalls. The shifts in the meanings of the words we are interested in occurred in the context of an overall lexicon which was itself subject to incessant change. There are no absolute “poles” in the semantic space in which we represent the context vectors, and it is possible in principle that a meaning shift in one word eludes us completely if all the other words of interest underwent just the right kind of shift themselves. This risk is of course not limited to
computational methods, but faced by human investigators as well. We believe that it could be minimized by tracking changes on a “global” scale, looking for patterns in the vocabulary as a whole. Computational methods like ours are in principle well-suited to this task, which is why we mentioned this application as one of their potential advantages. Implementing and testing our method on such a large scale is not trivial, however, and beyond the scope of the present study. Meanwhile, we believe that such a case of simultaneous shifts is highly unlikely, and our results suggest that the method can be used fruitfully despite this caveat.

3.3. Materials

We used a corpus derived from the Helsinki Corpus (Rissanen 1994) to test these predictions. The Helsinki Corpus is comprised of texts spanning the periods of Old English (prior to 1150 A.D.), Middle English (1150–1500 A.D.), and Early Modern English (1500–1710 A.D.). Because spelling in Old English was highly variable, we decided to exclude that part of the corpus and focused our analysis on the Middle English and Early Modern English periods.20 The resulting corpus included 504 distinct documents totaling approximately 1.15 million words (approximately 200,000 from early Middle English texts, 400,000 from late Middle English texts, and 550,000 from Early Modern English texts).

3.4. Case studies

In order to test our predictions concerning semantic change in the words dog, do, deer, and hound, we identified all of the contexts in which they occur in our subset of the Helsinki Corpus. This resulted in 130 contexts for dog, 4,298 contexts for do, 61 contexts for deer, and 36 contexts for hound. Because there were relatively few occurrences of dog, deer, and hound in the corpus, it was possible to compute the angles between all pairs of context vectors. Consequently, for those three words we elected to run a full analysis instead of using the Monte-Carlo method described above.
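To see why an exhaustive comparison is feasible for three of the words but not for do, one can count the pairs involved. The short calculation below uses the context counts reported above. It gives only an upper bound, since it pools all contexts of a word, whereas the analysis compares vectors within groups such as periods.

```python
import math

# Number of contexts per target word, as reported in the text.
contexts = {"dog": 130, "do": 4298, "deer": 61, "hound": 36}

# Unordered pairs of context vectors: n choose 2 = n * (n - 1) / 2.
pairs = {word: math.comb(n, 2) for word, n in contexts.items()}

for word, count in pairs.items():
    print(f"{word}: {count:,} pairs")
```

This yields 8,385 pairs for dog, 1,830 for deer and 630 for hound, against more than nine million (9,234,253) for do.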
The results of our analyses for all four words (and the word science which we discuss in Section 5) are given in Table 1. These results were congruent with our prediction. The average angle between context vectors

20. While the spelling in Middle English, especially during the earlier periods, is also quite variable, it is still less variable than that found in Old English. Because semantic change takes time, we expect to see at least part of these shifts in Middle English and Early Modern English.
Table 1. Mean angle between context vectors for target words in different periods in the Helsinki Corpus (standard deviations are given in parentheses, sample size given below the mean)
Unknown composition date ( 0.38)). Nevertheless, it is not possible to be sure of this without the type of multifactorial analysis the papers in this section – in particular Geeraerts et al. – have conducted. Thus, obviously, multifactorial questions require multifactorial methods. For Hundt and Smith’s data, the results of the above Poisson regression are represented in Figure 3: the x-axes represent independent variables and the y-axis and the figures in the bars represent the predicted frequencies from the regression.
Commentary: Corpus-based methods
Table 2. Redesigned Table 2 (Appendix 2) from Hundt and Smith (2009)

Tense         Variety   Time    Frequency
pres. perf.   BrE       1960s        4196
pres. perf.   BrE       1990s        4073
pres. perf.   AmE       1960s        3538
pres. perf.   AmE       1990s        3499
simple past   BrE       1960s       35821
simple past   BrE       1990s       35276
simple past   AmE       1960s       37223
simple past   AmE       1990s       36250
Figure 3. The effects of Tense (upper left panel), Variety × Tense (upper right panel), and Tense × Time × Variety (lower panel) of Hundt and Smith’s Table 5
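For readers who want to inspect the frequencies in Table 2 directly, the following Python sketch computes the share of present perfects in each variety and period, together with a Pearson chi-square for Tense by Time within each variety. It is a simpler descriptive check, not the Poisson regression reported in the text, and it does not replace a multifactorial model. The function names are hypothetical. The expected counts used here (row total × column total / N) are also the fitted values that a Poisson log-linear model with main effects only would give for the same 2 × 2 table.

```python
# Frequencies from Table 2 (Hundt and Smith 2009), keyed by (tense, variety, time).
freq = {
    ("pres. perf.", "BrE", "1960s"): 4196,
    ("pres. perf.", "BrE", "1990s"): 4073,
    ("pres. perf.", "AmE", "1960s"): 3538,
    ("pres. perf.", "AmE", "1990s"): 3499,
    ("simple past", "BrE", "1960s"): 35821,
    ("simple past", "BrE", "1990s"): 35276,
    ("simple past", "AmE", "1960s"): 37223,
    ("simple past", "AmE", "1990s"): 36250,
}
TENSES = ("pres. perf.", "simple past")
TIMES = ("1960s", "1990s")


def perfect_share(variety, time):
    """Proportion of present perfects among the two tenses in one cell."""
    perfect = freq[("pres. perf.", variety, time)]
    return perfect / (perfect + freq[("simple past", variety, time)])


def chi_square_tense_by_time(variety):
    """Pearson chi-square for the 2 x 2 table Tense x Time within one variety."""
    n = sum(freq[(t, variety, p)] for t in TENSES for p in TIMES)
    chi2 = 0.0
    for t in TENSES:
        row = sum(freq[(t, variety, p)] for p in TIMES)
        for p in TIMES:
            col = sum(freq[(u, variety, p)] for u in TENSES)
            expected = row * col / n  # fitted value under independence
            chi2 += (freq[(t, variety, p)] - expected) ** 2 / expected
    return chi2


for variety in ("BrE", "AmE"):
    shares = [round(perfect_share(variety, p), 4) for p in TIMES]
    print(variety, shares, round(chi_square_tense_by_time(variety), 2))
```

The share of present perfects is roughly 10.5% and 10.4% in BrE (1960s and 1990s) and roughly 8.7% and 8.8% in AmE. Both chi-square values fall well below 3.84, the 5% critical value for one degree of freedom, so taken one variety at a time the table shows no reliable change between the two periods.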
Stefan Th. Gries
Geeraerts et al. laudably use exactly one such multifactorial approach, one of the multifactorial methods most widely used in corpus-linguistic studies: binary logistic regression. However, many other tools are available and should also find their way into the historical semanticist’s toolbox. The most straightforward extension is a method that I thought Geeraerts et al. were going to use: a multinomial logistic regression, which conceptually differs from its binary counterpart in that the dependent variable can have more than two levels. (The dependent variable in Geeraerts et al. could have been Noun (anger vs. ire vs. wrath) . . .) Also, Poisson regressions of the type exemplified above are a useful tool for when the dependent variable consists of frequencies. Then, a very important recent development is the use of mixed-effects models (or multi-level models), a class of regression models that allows the user to include both fixed effects (variables whose levels exhaust all the possible levels such as SpeakerSex, where levels other than male and female are unlikely to be attested) and random effects (variables whose levels in the analysis are only a sample of those in the population such as Speaker or Verb, where we would like to generalize to more than just the few speakers or verbs in our samples). These models can handle samples with dependent data points and uneven sample sizes much better than traditional regressions, provide much more precise results, and are becoming a more and more widespread technique in all areas of linguistics. (It has to be noted, though, that the method is still being developed and fine-tuned.) Finally, there is a large number of alternative approaches out there that may be of use to researchers working with noisy observational data. Classification and regression trees, support vector machines, learning algorithms, and neural networks are a few of the currently hot methods that are worth keeping an eye on (cf.
Baayen 2011): once applications are available (or, even better, R packages) that make the applications of these tools easier, historical corpus-based studies will be able to explore even the most complicated data with renewed vigor.
4. Concluding remarks

There are a few final comments I wish to make. Again, these comments must not be understood as a critique of the papers in this section, which already do a lot of the things I would like to see in corpus-based work (in historical semantics/linguistics).
First, there are the notions of interdisciplinarity and methodological pluralism. The papers in this volume and in this section are already interdisciplinary in how they bring together methods and insights from different linguistic disciplines. In this section alone, historical semantics is enriched by methods used in cognitive linguistics (distinctive collexeme analysis), first language acquisition (VNC), sociolinguistics (lectal variables in Geeraerts et al.’s logistic regression), computational linguistics / information retrieval (LSA), so when I advocate even more of this, then my target group is not the present authors. There are many more fields that have exciting methods to offer. I have already mentioned quite a few but as one additional example let me mention work in corpus-based dialectology, where statistical techniques and exciting visualization tools are now used for the bottom-up identification and characterization of dialect continua on the basis of corpus data (cf. Szmrecsanyi and Wolk 2011). Methods like these, which add a geographical perspective to the data, are just waiting to be added on top of the bottom-up methods discussed in this section, and there are many more interesting approaches out there once we look beyond linguistics proper (neighbor-clustering approaches to two- or three-dimensional data are common in the study of ecosystems, for instance). Given the above, methodological pluralism follows naturally: many studies can benefit from using several of the approaches advocated here together. For example, Gries and Hilpert (2010) first use VNC to arrive at temporal stages of the diachronic development of the third person singular marker in English, and then they use these stages as a predictor in a generalized linear mixed-effects model to explore which linguistic features accompanied and/or drove that change, and similar applications are conceivable even for the papers in this section, as when Geeraerts et al. and Sagi et al.
might benefit from the VNC approach to obtain the best temporal divisions in their data, or when Hilpert’s data might be explored with the above regression-with-breakpoints approach, etc. The second recommendation I want to make to researchers can only be made very briefly and programmatically: the more complicated one’s data and methods and the more they are borrowed from outside of one’s core area, the more one needs to use illuminating visualization tools. For example, few people really know what the coefficients of regressions mean (esp. for logistic and Poisson regressions), and few people understand odds ratios or log odds, etc., which makes it all the more important that graphs are used that provide all and only the important information in a way that readers who are not (yet) statistically savvy can digest
them. Obviously, this is very subjective, but we should all be aware of this and make the time we spend on developing meaningful and interpretable visualizations a function of the statistical complexity of our data and tools and, wherever necessary, provide not just p-values but also effect sizes.

By its very nature, this commentary can only scratch the surface, but I hope to have underscored a point that the three papers in this section already made beautifully. Historical corpus linguists and semanticists have a lot to benefit from being open to what new methodologies have to offer. Quantitative methods are being newly developed and popularized all the time, and staying informed about how these methods can help us along in our research should be a prime objective, especially given that linguistics as a whole is undergoing this move towards empiricism. Standard references such as Baayen (2008), Johnson (2008), or Gries (2009) provide easy entries to a whole new world out there that offers possibilities too exciting to be ignored.

References

Baayen, R. Harald. 2008. Analyzing linguistic data: a practical introduction to statistics using R. Cambridge: Cambridge University Press.
Baayen, R. Harald. 2011. Corpus linguistics and naive discriminative learning. Brazilian Journal of Applied Linguistics 11 (2): 295–328.
Bresnan, Joan, Anna Cueni, Tatiana Nikitina, and R. Harald Baayen. 2007. Predicting the Dative Alternation. In: Gerlof Bouma, Ineke Kraemer and Joost Zwarts (eds.), Cognitive foundations of interpretation, 69–94. Amsterdam: Royal Netherlands Academy of Science.
Church, Kenneth W. and William Gale. 1990. Word association norms, mutual information, and lexicography. Computational Linguistics 16 (2): 22–29.
Gries, Stefan Th. 1999. Particle movement: a cognitive and functional approach. Cognitive Linguistics 10 (2): 105–145.
Gries, Stefan Th. 2001. A multifactorial analysis of syntactic variation: particle movement revisited. Journal of Quantitative Linguistics 8 (1): 33–50.
Gries, Stefan Th. 2006. Exploring variability within and between corpora: some methodological considerations. Corpora 1 (2): 109–151.
Gries, Stefan Th. 2008. Dispersions and adjusted frequencies in corpora. International Journal of Corpus Linguistics 13 (4): 403–437.
Gries, Stefan Th. 2009. Statistics for linguistics with R: a practical introduction. Berlin and New York: Mouton de Gruyter.
Gries, Stefan Th. 2010. Dispersions and adjusted frequencies in corpora: further explorations. In: Stefan Th. Gries, Stefanie Wulff and Mark Davies (eds.), Corpus linguistic applications: current studies, new directions, 197–212. Amsterdam: Rodopi.
Gries, Stefan Th. To appear. Corpus data in usage-based linguistics: What’s the right degree of granularity for the analysis of argument structure constructions? In: Mario Brdar, Milena Žic Fuchs, and Stefan Th. Gries (eds.), Convergence and expansion in cognitive linguistics. Amsterdam and Philadelphia: John Benjamins.
Gries, Stefan Th. and Martin Hilpert. 2008. The identification of stages in diachronic data: variability-based neighbor clustering. Corpora 3 (1): 59–81.
Gries, Stefan Th. and Martin Hilpert. 2010. From interdental to alveolar in the third person singular: a multifactorial, verb- and author-specific exploratory approach. English Language and Linguistics 14 (3): 293–320.
Gries, Stefan Th. and Sabine Stoll. 2009. Finding developmental groups in acquisition data: variability-based neighbor clustering. Journal of Quantitative Linguistics 16 (3): 217–242.
Hilpert, Martin and Stefan Th. Gries. 2009. Assessing frequency changes in multistage diachronic corpora: applications for historical corpus linguistics and the study of second language acquisition. Literary and Linguistic Computing 34 (4): 385–401.
Hundt, Marianne and Nicholas Smith. 2009. The present perfect in British and American English: has there been any change recently? ICAME Journal 33: 45–63.
Johnson, Keith. 2008. Quantitative methods in linguistics. Malden, MA: Blackwell.
Leech, Geoffrey and Roger Fallon. 1992. Computer corpora – what do they tell us about culture? ICAME Journal 16: 29–50.
R Development Core Team. 2011. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. ISBN 3-90005107-0, URL .
Szmrecsanyi, Benedikt and Christoph Wolk. 2011. Holistic corpus-based dialectology. Brazilian Journal of Applied Linguistics 11 (2): 561–592.
Section 3: Theoretical Approaches
A sociolinguistic approach to semantic change1

Justyna A. Robinson

Abstract

The current study uses several sources of data to explore sociolinguistic dimensions of semantic change. More specifically, the investigation focuses on semasiological changes affecting the adjective skinny in present-day English. In order to capture on-going meaning changes the apparent-time construct (Labov 1966) is employed. The results of the exploration indicate that meaning change largely progresses through a speech community in the way that the apparent-time hypothesis predicts, with younger and older speakers being responsible for innovative and conservative usage, respectively. Additionally, semantic variation and change is explained by employing the notion of socio-economic status or the social practices of speakers. Conclusions drawn from the current study indicate the considerable potential for adopting sociolinguistic approaches for investigations of semantic change in current and past communities.
1. The scope of the paper

Labov said that “historical linguistics [. . .] is the art of making the best use of bad data” (1994: 11). Since historical documents survive by chance rather than through any deliberate selection researchers often need to deal with incomplete or scarce data in order to reconstruct past processes. Problems concerning gaps in historical evidence are also apparent for historical semanticists who usually base their studies around investigations of changes which took place centuries ago (for an overview of relevant studies see Traugott and Dasher 2005; for examples of case studies see Allan 2008; Blank and Koch 1999; Diaz Vera 2002; Fisiak 1988; Geeraerts 1997; Koivisto-Alanko 2000; McConchie et al. 2005). One of the potential solutions for reconstructing gaps in the historical evidence for semantic change may lie in learning from meaning changes which are happening as we speak. Thus, lessons learnt from a thorough exploration of current semantic changes may shed light onto processes affecting meaning changes at larger time depths.

1. With thanks to Kathryn Allan, Joan Beal, Elizabeth Traugott and two anonymous reviewers for their helpful comments.
The idea of looking into the present to explain the past is not new. More than 150 years ago scientists working in geology and biology noticed that the “knowledge of processes that operated in the past can be inferred by observing on-going processes in the present” (Christy 1983: ix). Although a number of linguists of that time were interested in what is known as a Neogrammarian approach, this perspective was not seriously embraced in linguistics until the 1960s when Labov published his studies on the sociolinguistic aspects of phonological variation and change in Martha’s Vineyard (1963) and New York (1966). These studies demonstrated that linguistic variation is not random, but is structured according to speakers’ age, gender, or socio-economic status. Moreover, Labov has shown that there is a link between observed sociolinguistic variation and linguistic change in progress. Both of the studies (Labov 1963, 1966) indicated that the linguistic differences among different age groups or generations reflected the actual diachronic developments in the language. In other words, apparent-time variations mirrored real-time linguistic changes. Predictions concerning the variation of the diphthongs (ay) and (aw) in Martha’s Vineyard were confirmed by the real-time evidence provided earlier by Kurath (1941) (see Labov 1963: 275–276) and by a recently carried out follow-up study by Pope et al. (2007). In the past 50 years sociolinguistics has demonstrated that similar forces operate in various modern-day speech communities, thus confirming the validity of the variationist research programme, and the apparent-time construct specifically (Sankoff 2006). Recently, the apparent-time construct has also been successfully employed by historical linguists to account for linguistic changes that occurred in past speech communities. The seminal work in the area is Nevalainen and Raumolin-Brunberg’s (2003) investigation of a number of morphosyntactic changes in Early Modern English.
Although historical sociolinguistics has embraced the apparent-time construct in the areas of phonology and morpho-syntax, there are hardly any studies that employ the same approach in order to account for gaps in evidence in diachronic semantics. In actual fact, there have been very few attempts to employ variationist sociolinguistic frameworks in order to account for meaning variation and change in the context of present-day communities (but cf. Robinson 2010a, 2010b).2

2. There have been several attempts to investigate meaning variation within functional paradigms (Hasan 1989, 1992, 2009) or discourse analysis frameworks (Cheshire 2007, Macaulay 2005, 2006, Stenström 2000, Wong 2002, 2008). These studies suggest that further exploration of semantic variation is worthwhile.
A sociolinguistic approach to semantic change
The current study aims to take the first steps towards exploring the sociolinguistics of semantic change by focusing on semasiological change in progress. First, I aim to establish whether semantic variation³ is socio-demographically structured. I also assess whether social structure is reflected in semantic usage in the same way as it is reflected in phonological and morphosyntactic usage. Finally, I attempt to answer the question of whether semantic change in progress can be detected. This will involve employing the apparent-time construct in the interpretation of demographic information on the usage of incoming/disappearing senses. If the findings do indicate that semantic variation is socio-demographically structured and that meaning change in progress parallels the trajectories found in linguistic change within other levels of language, one could attempt to employ the same construct to account for gaps in historical evidence and explain semantic changes that happened at larger time depths. However, investigating semantic change in past communities goes beyond the scope of the current study. The discussion in the current paper revolves around the analysis of the adjective skinny, for which change in progress is hypothesised. The traditional Labovian study of linguistic change will be complemented with the investigation of other sources of data, such as corpus and dictionary evidence. The use of several sources of data aims to highlight a number of methodological issues in researching semantic change.
2. Method and data

Although there are certain problems accompanying investigations of meaning at larger time depths, exploring semantic change in progress is not an entirely straightforward task either. First, one must determine how to empirically detect change in progress.⁴ Then, one must decide whether the observed variation indicates a genuine change in progress or whether it is merely a temporary ephemeral fluctuation in language use; how to do this is not entirely clear.

³ Semantic variation refers here to differences between speakers in their frequency of use of salient senses (semantic variants) of polysemous words (semantic variables).
⁴ Kerremans et al.’s paper in the current volume offers a possible solution to this methodological issue. The authors use a tailor-made webcrawler to detect neologisms that appear on the Internet.
Justyna A. Robinson
The method suitable for this variationist research would have to achieve two primary aims. First of all, the usage of salient meanings (semantic variants) of a given polysemous lexeme would have to be detected in a way which would allow for comparisons between different participants. Secondly, the data on semantic usage would have to come from a sample of speakers who would be representative of different generations, gender, social status, etc. In order to address these two aims I collected data in a survey during which salient uses of investigated adjectives were experimentally elicited.⁵ Seventy-two speakers from South Yorkshire volunteered to take part in the study. The youngest participant was 11 and the oldest 94 years old. There were equal numbers of males and females and speakers were representative of socio-economic groups. All of the participants took part in one-to-one interviews during which I asked a series of questions which aimed to elicit the most salient uses of given adjectives. The questions followed the format of asking for a referent that could be described by the adjectives and then asking for a justification for the use of a given referent. Here, I present an example showing how the usage of the adjective skinny was elicited:

Q: Who or what is skinny?
A: My dog.
Q: Why is your dog – skinny?
A: Because it is very thin.

The procedure of asking for a referent described by the adjective in question rather than directly asking for a meaning of the adjective was chosen as a way of eliciting a more natural and real usage of the word. Following the evidence of various cognitive semantic studies (for a summary see Geeraerts and Cuyckens 2007), a more salient and typical usage of a word for a given speaker is likely to emerge before its peripheral applications. Thus, it is possible that not all uses of the adjective active for a given speaker were elicited, but the ones that emerged were the more central ones for every speaker.
Additionally, all participants were instructed to provide the answers that first came to their mind and were assured that they should say whatever they wished as there were no “bad answers”.

⁵ Although sociolinguists tend to rely on usage data in making generalisations, there is also evidence that elicited material can be successfully employed in sociolinguistic studies (cf. Labov’s (1966) department store survey).
This instruction was introduced to reinforce the elicitation of spontaneous, unconstrained responses that would be as close to their natural usage as possible. As a consequence, a set of usage data that was easily comparable between speakers was obtained. The rationale that participants provided for the use of each of the referents helped to gloss the meaning of each of the examples. The glosses of the meaning were mainly established on the basis of matching usage with dictionary definitions, although there were also instances of idiolectal or local uses that were not recorded under a separate definition in dictionaries. In total, fifteen adjectives were included in the interview: eight adjectives which are undergoing a change in progress⁶ and seven controlling adjectives⁷ (polysemous adjectives without recently developed sense extensions and (broadly understood) monosemous adjectives). The adjectives undergoing the change in progress were selected on the basis of real-time information on usage from relevant dictionaries and corpora. The most salient senses of each adjective were elicited during a structured interview and this was followed by a more casual conversation about the use and users of different words, speakers’ perceptions concerning the language of different generations and their comments on the use of a local dialect. This phase of the interview provided me with the opportunity to potentially clarify responses obtained in the elicitation experiment and also to record speakers’ metalinguistic comments on the linguistic variation. These comments turned out to be invaluable in understanding the factors involved in semantic change. In the current paper, I report findings on one of the investigated variables, namely the adjective skinny.
This adjective was chosen for discussion in the current article since its investigation raises a range of methodological issues that illustrate some of the key problems that researchers face when investigating semantic change, e.g. the scarcity or interpretation of real-time data. Before presenting the analysis of the data elicited from the survey, I will draw on corpus and dictionary evidence in order to establish the real-time development of individual senses of the adjective skinny. This real-time account of semantic change will serve as a benchmark for the interpretation of the apparent-time evidence. From a methodological point of view, subsequent sections will use several sources of data and demonstrate the value of combining different epistemological approaches in investigating changes in the meaning of words.

⁶ For example, the adjectives awesome, gay, skinny.
⁷ For example, the adjectives rectangular, stripy.
3. Sources of real-time evidence for the adjective skinny

3.1. Dictionary evidence

In the Oxford English Dictionary on-line (hereafter, the OED), the adjective skinny is first attested in c1400 in an archaic sense of ‘beautiful or splendid of skin’. From the middle of the 16th century it is also recorded with the sense ‘consisting or formed of skin, membranous’ and then with the sense ‘covered by skin’, ‘relating to skin’. Several colloquial uses of the adjective skinny also developed. One refers to the size of inanimate objects (e.g. skinny pencil), whereas the other one refers to people, animals, or body parts lacking flesh, especially to an unusual or unattractive degree. One could argue that this pejorative reading could have facilitated the development of another evaluative sense extension of skinny, ‘mean, stingy’, which is illustrated by the following OED quotation from 1890:

As a rule, the whole of the men in a factory would contribute, and ‘skinny’ ones were not let off easily.
This sense seems to represent a regional use of the adjective skinny, as evidenced in Holloway’s General Dictionary of Provincialisms from 1838 where skinny is synonymous with ‘mean’, or ‘inhospitable’. Although this sense is currently used by older speakers in Northern Britain, the written evidence of its usage is rather scarce. The Dialect of Leeds (Smith 1862: 409) explains skinny as ‘niggardly’, as in the example “Ommast tuh skinny tuh live” (‘almost too niggardly/poor to live’). A recent written record of the regional use of skinny ‘mean’ can be found in Rollinson’s Cumbrian Dictionary of Dialect Tradition and Folklore (1997) (see the OED) and in sense-network questionnaires completed by speakers from Leeds in 1997 (Dr Carmen Llamas, personal communication) and from Sheffield in 2005 (Dr Katie Finnegan, personal communication). The OED also evidences other more recent developments of skinny. One of them (first attested in 1915) is ‘tight fitting’ with reference to clothing (e.g. skinny sweater). Another one is skinny in the sense of ‘naked’, mainly referring to the action of swimming naked, and used in collocation with dip or dipping. This colloquial use of the adjective skinny originated in North America and is supported by quotations starting from 1947. Another one, also marked as originating in North America, refers
to ‘low-calorie food and drink’, as in skinny latte. The earliest OED quotations for this sense in British English date back to 2002.⁸ In the current project, the following major uses of skinny are distinguished:

– Skinny ‘low fat’: in reference to low-calorie food and drink.
– Skinny ‘thin’: mainly refers to people and animals but is also used to describe inanimate objects, e.g. skinny pencil, green bean.
– Skinny ‘showing skin’: ‘having skin prominently shown’ and found in phrases such as skinny dip(ping) ‘a naked swim’.
– Skinny ‘mean’: ‘mean, not wanting to part with food, money, not generous’.
– Skinny ‘tight fitting’: refers to clothes and is used in the sense of ‘tight fitting’ and also as a parasynthetic adjective such as skinny-rib ‘fitting tightly (across the ribs)’.
– Skinny ‘other and non applicable’: mainly includes uses of skinny in proper names.

3.1.1. Corpus evidence

In order to complement dictionary evidence with respect to the historical development of the adjective skinny, the distribution of individual senses across the relevant corpora of British English is also presented. The following corpora are used in the current project: the British National Corpus (hereafter the BNC) and the Oxford English Corpus (hereafter the OEC). While all data from the BNC (1960–1974) and the BNC (1975–1984) is analysed, the analysis of data from the BNC (1985–1993) is restricted to the following text types: spoken demographic, spoken context-governed, and written-to-be-spoken. The OEC is a two-billion-word collection of English as it was used between 2000 and 2006. For the purpose of the current project I also selected a sub-corpus from the OEC. This sub-corpus includes data from British English which are tagged as spoken and written unedited. The decision to use this sub-corpus is motivated by the desire to focus on texts that reflect more spontaneous speech, since these are a likely source for slang, regionalisms, and neologisms (cf. Beeching 2005: 172).
While all the concordances from the BNC are used, a random sample of 250 concordances has been selected from the OEC.

⁸ The Encarta World English Dictionary, published in 1999, records the sense of skinny ‘low-fat’ as ‘made with skimmed milk (informal)’, but it gives no further information about which varieties of English use the sense.
Table 1. Number of tokens of senses of the adjective skinny in corpora (count of meaning variants)

Corpus          Low fat  Thin  Showing skin  Mean  Other/NA  Tight fitting  Sum
BNC 1960–1974         0     9             0     0         0              0    9
BNC 1975–1984         0    24             0     0         0              0   24
BNC 1985–1993         1    68             0     0         1              0   70
OEC 2000–2006         4   230             8     1         6              1  250
Sum of senses         5   331             8     1         7              1
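The relative frequencies reported in Table 2 follow directly from the counts in Table 1. As a minimal sketch (the function and variable names are illustrative and not part of the original study), the calculation can be reproduced as follows. Note that the unrounded OEC share of skinny ‘showing skin’ (8 of 250, i.e. 3.2%) yields a slightly higher expected count for a 24-token sample than the rounded 3% quoted in the text.

```python
# Token counts copied from Table 1 (one row per corpus slice).
SENSES = ["low fat", "thin", "showing skin", "mean", "other/NA", "tight fitting"]
COUNTS = {
    "BNC 1960-1974": [0, 9, 0, 0, 0, 0],
    "BNC 1975-1984": [0, 24, 0, 0, 0, 0],
    "BNC 1985-1993": [1, 68, 0, 0, 1, 0],
    "OEC 2000-2006": [4, 230, 8, 1, 6, 1],
}


def relative_frequencies(counts):
    """Divide each sense count by the total number of tokens in its row."""
    result = {}
    for corpus, row in counts.items():
        total = sum(row)
        result[corpus] = [n / total for n in row]
    return result


REL = relative_frequencies(COUNTS)

for corpus, shares in REL.items():
    cells = "  ".join(f"{share:.2f}" for share in shares)
    print(f"{corpus}: {cells}")

# Expected number of 'showing skin' tokens in a sample of 24 concordances,
# assuming (hypothetically) that the OEC share applied to that sample too.
oec_row = COUNTS["OEC 2000-2006"]
share_showing_skin = oec_row[SENSES.index("showing skin")] / sum(oec_row)
expected_in_24 = share_showing_skin * 24
print(f"expected 'showing skin' tokens in 24 concordances: {expected_in_24:.2f}")
```

Either way the expected count is below one token, which is the point the small-sample argument relies on.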
Tables 1 and 2 present, respectively, the token counts and the relative frequencies of the senses of the adjective skinny in the corpora.

Table 2. Relative frequency of tokens of senses of the adjective skinny in corpora (frequency of meaning variants)

Corpus          Low fat  Thin  Showing skin  Mean  Other/NA  Tight fitting
BNC 1960–1974      0.00  1.00          0.00  0.00      0.00           0.00
BNC 1975–1984      0.00  1.00          0.00  0.00      0.00           0.00
BNC 1985–1993      0.01  0.97          0.00  0.00      0.01           0.00
OEC 2000–2006      0.02  0.92          0.03  0.00      0.02           0.00

Skinny ‘thin’ is the leading sense across all the periods investigated. The remaining senses of the adjective skinny have very low frequencies of use. This observation points to problems with the available corpus evidence in investigating change in progress. Firstly, the small size of available corpora makes it difficult to obtain sufficient frequencies of use for comparisons across time (cf. Davies 2010a). For instance, skinny ‘showing skin’ was used in only 8 examples, which account for 3% of the concordances of skinny extracted from the OEC. With only 24 instances of the adjective skinny available in the BNC (1975–1984), there is little evidence to suggest a change in the usage of this sense, since the sample is simply too small (e.g. 3% of 24 would amount to 0.72 of an example). Secondly, the size and the composition of the corpora are problematic in cases where dialectal uses are investigated. Only one instance of skinny ‘mean’ is registered in the corpora. Since the OED points to the regional use of this sense, it is questionable whether corpora (such as the BNC) which typically under-represent regional usage are the best source for tracing the development of this meaning. Therefore, one may want to determine if variationist evidence could compensate for gaps in real-time evidence of the peripheral senses of the adjective skinny.

4. A sociolinguistic exploration of the semantic change in progress of the adjective skinny

4.1. Age-related variation of the adjective skinny

The sociolinguistic analysis of the usage of the adjective skinny is based on a corpus of elicited interview data consisting of 195 observations (each participant provided on average three instances of usage of the investigated adjective and all instances of use are included in the analysis). The frequency of use of individual senses of the adjective skinny has been recorded and plotted against participants’ age on the x-axis. An initial inspection of Figure 1 indicates that individual senses of the adjective skinny are non-randomly distributed across the different age groups. In order to assess whether the four generations significantly differ in their use of senses of the adjective skinny I carried out Kruskal-Wallis non-parametric tests.⁹ The results indicate that significant age-related differences are observed with regard to the use of skinny ‘showing skin’ (p = .036), skinny ‘mean’ (p < .001), and skinny ‘low fat’ (p = .04).
Since the significance of age in variationist studies is usually interpreted as indicative of change in progress (Bailey 2002), one could attempt to examine the extent to which the distribution of senses in apparent time (Figure 1) reflects real-time semantic change of the adjective skinny.

⁹ The Kruskal-Wallis test involves the Test for Several Independent Samples procedure, which compares two or more groups of cases on one variable. From the Kruskal-Wallis one-way analysis of variance, one might learn whether the four age groups of speakers differ in average use of different senses of the adjective skinny. The tests were run via SPSS18.
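For readers unfamiliar with the test, the following sketch computes the tie-corrected Kruskal-Wallis H statistic in plain Python. It is an illustration only: the study itself ran the tests in SPSS, and the per-speaker counts below are invented for the example rather than taken from the survey. With four groups, H is compared against a chi-square distribution with three degrees of freedom, whose .05 critical value is 7.815.

```python
from collections import defaultdict


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic for several samples."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)

    # Tied values share the mean of the rank positions they occupy (mid-ranks).
    positions = defaultdict(list)
    for rank, value in enumerate(pooled, start=1):
        positions[value].append(rank)
    mid_rank = {value: sum(ranks) / len(ranks) for value, ranks in positions.items()}

    # H = 12 / (N (N + 1)) * sum(R_j^2 / n_j) - 3 (N + 1)
    rank_sum_term = sum(
        sum(mid_rank[value] for value in group) ** 2 / len(group) for group in groups
    )
    h = 12.0 / (n_total * (n_total + 1)) * rank_sum_term - 3 * (n_total + 1)

    # Tie correction: divide H by 1 - sum(t^3 - t) / (N^3 - N).
    tie_term = sum(len(ranks) ** 3 - len(ranks) for ranks in positions.values())
    correction = 1 - tie_term / (n_total ** 3 - n_total)
    if correction == 0:
        return 0.0  # every observation is identical, so no group can differ
    return h / correction


# Hypothetical data: number of times each speaker produced a given sense,
# listed by age group (these figures are NOT the survey results).
hypothetical_groups = [
    [0, 0, 0, 0, 0, 1],  # hypothetical age group 1 (youngest)
    [0, 0, 0, 1, 0, 0],  # hypothetical age group 2
    [0, 1, 0, 1, 0, 0],  # hypothetical age group 3
    [1, 2, 1, 0, 2, 1],  # hypothetical age group 4 (oldest)
]

CRITICAL_05_DF3 = 7.815  # chi-square critical value for alpha = .05, df = 3
h_value = kruskal_wallis_h(hypothetical_groups)
print(f"H = {h_value:.2f}, exceeds critical value: {h_value > CRITICAL_05_DF3}")
```

The heavy ties that arise when most speakers never produce a sense are the reason the tie correction matters for data of this kind.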
Figure 1. Distribution of senses of the adjective skinny across age groups
Firstly, it is apparent (based on the corpus and dictionary evidence) that the central and most frequent sense skinny ‘thin’ is also the most salient sense for each generation. This observation suggests that the results of the current experiment largely correspond to the real-time information on the usage of this adjective and therefore provide support for the validity of the methodology employed in the current study. Although semantic variation in the adjective skinny in this speech community is characterised by the same central sense, age-related differences in peripheral readings are evident. For example, by comparing the referential range for the youngest and the oldest participants presented in Figure 1, one immediately notices differences in the type and number of salient senses of skinny used by each of these generations. There are four sense groups of the adjective skinny which are salient for the oldest speakers and only one that emerges for the young speakers. According to the
apparent-time construct these seemingly random differences could actually be an outcome of processes of linguistic change. Let us then consider real and apparent-time data for the peripheral senses of the adjective skinny to verify this suggestion. The results of the survey presented in Figure 1 indicate that skinny ‘mean’ is used most frequently by the older generation and that its use clearly diminishes in the younger age groups. This finding would suggest that the adjective skinny is undergoing semantic change in progress, with the sense skinny ‘mean’ gradually going out of use. This interpretation can be supported by considering the available real-time information. Although the corpus evidence is too limited to account for the usage of skinny ‘mean’, the OED presents information that sheds more light on the development of this sense. The OED indicates that skinny ‘mean’ is used regionally, by quoting dialect dictionaries as the main source of evidence for the use of this sense. A number of dialect surveys have shown that the use of regional lexis has been gradually diminishing in varieties of British English, with processes such as dialect levelling, diffusion, or standardisation usually accounting for this trend (cf. Britain 2001, 2002, 2009, Foulkes and Docherty 1999, and Trudgill 1986). Since skinny ‘mean’ is used regionally, one could suspect that the decrease in its use is not exceptional in the context of the general decrease in the use of dialect lexis. This real-time development is thus reflected in the decreasing use of skinny ‘mean’ in apparent time. Other senses considered here are skinny ‘showing skin’ and skinny ‘low fat’, the use of which significantly differs across generations. Again, since the frequency of use of these senses in the available corpora is very low, the observations are drawn mainly on the basis of the OED evidence. The use of skinny ‘showing skin’ emerged in the late 1940s.
The average age¹⁰ of the four oldest participants using skinny ‘showing skin’ is 70, which would suggest that they could have picked up this borrowing from American English when they were teenagers or young adults. Since the apparent-time hypothesis assumes the stability of vernaculars (Bailey 2002), the example of skinny ‘showing skin’ demonstrates that a given semantic innovation indeed remains mostly reflected in the language of the speakers who could have been responsible for diffusing this innovative use. In other words, the apparent-time evidence gathered in the survey corresponds to the real-time information and can be successfully used in tracing the steps of semantic change.

¹⁰ Speakers’ ID numbers and age in 2006: ID51 was 71 years old, ID1 was 67, ID102 was 70, and ID94 was 74.
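The apparent-time reasoning here rests on simple date arithmetic, which can be made explicit. The sketch below takes the four ages listed in footnote 10 and the year of the earliest OED quotation (1947) from the text; the second reference year, 1957, is purely an assumption chosen to illustrate a possible delay before the North American sense reached British speakers, since the source gives no adoption date for British English.

```python
SURVEY_YEAR = 2006
FIRST_OED_QUOTATION = 1947      # earliest OED quotation (North American use)
HYPOTHETICAL_UK_ADOPTION = 1957  # assumption for illustration only

# Ages in 2006, as listed in footnote 10.
AGES_IN_2006 = {"ID51": 71, "ID1": 67, "ID102": 70, "ID94": 74}


def ages_at(year, ages_at_survey=AGES_IN_2006, survey_year=SURVEY_YEAR):
    """Return each speaker's age in `year`, given their age in the survey year."""
    offset = survey_year - year
    return {speaker: age - offset for speaker, age in ages_at_survey.items()}


mean_age = sum(AGES_IN_2006.values()) / len(AGES_IN_2006)
print(f"mean age in {SURVEY_YEAR}: {mean_age}")
print(f"ages in {FIRST_OED_QUOTATION}: {ages_at(FIRST_OED_QUOTATION)}")
print(f"ages in {HYPOTHETICAL_UK_ADOPTION}: {ages_at(HYPOTHETICAL_UK_ADOPTION)}")
```

Under these assumptions the speakers would have been children or young teenagers at the time of the first quotation and young adults a decade later, which is compatible with the interpretation offered above but does not by itself date the borrowing.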
The analysis of the generational distribution of the use of peripheral senses of the adjective skinny indicates that the usage of these senses is far from random. It has been demonstrated that the difference in the number and type of senses salient for different age groups actually reflects language that the speakers learnt when they were young. One could therefore conclude that the incorporation of the apparent-time construct in investigations of semantic change is viable and also beneficial since it offers another methodological tool to explore linguistic change. Although the use of the construct of apparent time can shed light on semantic change processes, selected aspects of the observed semantic variation require further attention. For instance, the analysis of skinny ‘showing skin’ indicates an increased use of this innovative sense for one generation (over 60s) but it is not clear why there is hardly any evidence of the use of this sense by younger generations. In order to address these questions, one could try to extend the sociolinguistic investigation beyond the category of age. Since a number of sociolinguistic studies have shown that apart from age, categories of gender and social status play an important role in explaining linguistic variation (see the review of relevant studies in Coates 2004, Eckert 1998, or Kerswill 2006), these sociolinguistic categories will be engaged to explain the observed usage of the adjective skinny.

4.2. Gender and social class

The linguistic data explored in the current study comes from a sample of speakers which is not only representative of different age groups but also genders and social classes. The socio-economic status of participants was assessed on the basis of their education, occupation, and current place of residence.
Speakers’ occupation is measured against the National Statistics Socio-Economic Classification (hereafter, NSEC) system, which is used for official statistics and surveys in the United Kingdom.¹¹ The professional status of speakers was classified into three NSEC groups, with 1 referring to higher level professions and 3 to lower level occupations.¹² The value of the current place of residence was measured on the basis of postcode information provided by each participant. These postcodes were used to obtain the average house prices within the respective districts from the Land Registry House Prices Database (2008). In this way the relative status of each neighbourhood was determined and speakers were divided into those living in more and less affluent areas.

¹¹ The reduced method was used, the summary of which is available via http://www.ons.gov.uk/about-statistics/classifications/current/ns-sec/deriving/reduced-method.pdf.
¹² For further details of the classification see Robinson (2010b).

In order to assess whether sense extensions of the polysemous adjective skinny are used significantly differently by a particular gender or socio-economic group, Kruskal-Wallis statistical tests have been carried out. Only two of these calculations yield statistically significant results. The education of participants is important in the case of skinny ‘mean’ (p = .002), since speakers who left school before the age of 16 use this sense most. Occupation is significant in accounting for differences in the use of skinny ‘showing skin’ (p = .05), with professionals (NSEC1) leading the use of this sense.

Although interesting results are obtained when social factors are considered in isolation, it could also be beneficial to examine the effect that a combination of socio-demographic factors may have on the variation of skinny. External factors working in conjunction with each other may yield subgroups of speakers, for instance young males or working-class females, who exhibit significantly more or less frequent use of a given sense extension. In order to determine whether a combination of any socio-demographic factors could explain the observed variation in the use of the adjective skinny, a multifactor statistical model needs to be employed. While statistical tests such as the Kruskal-Wallis test can only tell us whether there is a significant difference in use between groups of speakers from the point of view of one independent variable, multifactor models consider several external factors simultaneously and measure their effect (both individually and in combination with other external factors) on the use of a given sense extension.
In order to achieve the aims of a multifactor analysis in the current project, a decision tree analysis has been selected as most suitable.

4.2.1. Decision trees analysis

A decision tree analysis is a technique based on separating cases into segments that are as different from each other as possible. For instance, with a decision tree analysis one can easily detect segments and patterns such as ‘females who graduated from university and living in the most affluent areas are likely to use a given linguistic structure’. In other words, this procedure predicts the class (belonging) of a dependent variable from values of predictor (independent) variables. The output of the analysis is usually presented in the form of a decision tree.
This procedure requires the use of appropriate algorithms. The choice of algorithms largely depends on the type of data. The most appropriate algorithm chosen for the current analysis is Chi-square Automatic Interaction Detection (hereafter, CHAID).¹³ This is a non-parametric stepwise regression procedure that keeps producing splits for as long as each new split yields a significant p-value. All p-values in the CHAID analysis were adjusted for multiple comparisons using the Bonferroni method. The analysis was conducted via AnswerTree 3.1. For further explanation of the method see Schmid (2010). In the current project a decision tree analysis is run in order to verify the importance of socio-demographic factors in predicting the use of individual senses of the adjective skinny. This analysis will indicate whether there are any significant socio-demographic groups or subgroups (e.g. age by gender) that use individual senses of skinny. Decision tree analyses of all the sense extensions of the adjective skinny generated trees for only two meaning groups: skinny ‘mean’ and skinny ‘showing skin’ (see Figures 2 and 3 respectively). The fact that decision trees failed to grow for senses that produced significant results in the Kruskal-Wallis tests (e.g. skinny ‘low fat’) does not mean that there is a flaw in statistical calculations. Decision tree analysis involves more rigorous calculations that have predictive power, whereas the results of non-parametric tests are of descriptive value only.

4.2.1.1. Decision tree on skinny ‘mean’

The multivariate analysis via Answer Tree 3.0 is presented in Figure 2 and it shows that the age of participants and the value of the place of residence predict the use of skinny ‘mean’. The output presents several levels of significant splits (here two levels). The tree stops producing new splits when further splits become statistically insignificant.
In decision tree diagrams, the square at the top (Node 0) represents the characteristics of a variable to be analysed. There are two categories in the variable skinny ‘mean’: non-use and use of skinny ‘mean’, represented by 0 and 1 respectively (Figure 2). There are 58 cases of the former and 14 cases of the latter, making up respectively 80.56% and 19.44% of the variable. These frequencies are visually presented in the form of bars at the bottom of the square. Non-use of skinny ‘mean’ is represented by a darker shade of grey, whereas use of this sense is represented by a lighter shade of grey. The first significant split takes into consideration the age group to which speakers belong. The statistics for the significance of this split are described just above the split. The statistics summary indicates that age group is the most significant predictor for using skinny ‘mean’ (p < .001, χ² = 26.6, df = 1). Speakers who use this sense most frequently belong to the oldest generation, who exhibit the use of skinny ‘mean’ in 61.11% of cases (by 11 speakers). The second significant split is based on the postcode values of speakers’ residences (p = .0469, χ² = 5.84, df = 1). The multivariate analysis combined speakers living in medium and cheaper areas and separated them from those living in more affluent neighbourhoods. In other words, the results indicate that among people who are over 60 years old, those living in medium and cheaper areas use skinny ‘mean’ significantly differently (more frequently) from those living in more affluent neighbourhoods. The risk estimate for this decision tree is 0.21,¹⁴ which indicates that if I use the decision rule based on the current decision tree I would correctly classify 79% (100% minus 21%) of cases. The calculations of risk are not presented on the decision tree.

Figure 2. Decision tree of skinny ‘mean’

¹³ Other algorithms have also been considered: CandRT, QUEST (SPSS White Paper, Answer Tree Algorithm Summary, 1999).

4.2.1.2. Decision tree on skinny ‘showing skin’

The decision tree analysis of the variation of skinny ‘showing skin’ presents two significant splits (Figure 3). The first significant split considers age group to be the most significant factor in using skinny ‘showing skin’ and segregates speakers into those who are above and under 60 years old (p = .0227, χ² = 8.67, df = 1). The second split considers speakers’ professions to be important in determining users of skinny ‘showing skin’.
The analysis demonstrates that this sense can be mostly predicted from the speech of the oldest speakers who also belong to NSEC1 (p < .0134, χ² = 8.08, df = 1). The risk estimate for this decision tree is 0.25, which indicates that if I use the decision rule based on the current decision tree I would correctly classify 75% of cases.
¹⁴ The exact calculations of risk estimates are not presented here.
Figure 3. Decision tree of skinny ‘showing skin’
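The statistic reported for the first split of the skinny ‘mean’ tree can be checked by hand from the figures given in Section 4.2.1.1. The sketch below assumes that the split contrasts the oldest group (18 speakers, 11 of whom use the sense, i.e. 61.11%) with all remaining speakers merged into one group, and that an uncorrected Pearson chi-square is used; both are inferences from the reported numbers, not details confirmed by the AnswerTree output. The Bonferroni adjustment applied in the study is not reproduced, so the p-value computed here is unadjusted.

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variate with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Figures reported in the text: 72 speakers, 14 users and 58 non-users overall;
# 11 users among the oldest group, assumed here to contain 18 speakers.
TOTAL_USERS, TOTAL_NONUSERS = 14, 58
OLDEST_GROUP_SIZE, OLDEST_USERS = 18, 11

oldest_nonusers = OLDEST_GROUP_SIZE - OLDEST_USERS
rest_users = TOTAL_USERS - OLDEST_USERS
rest_nonusers = TOTAL_NONUSERS - oldest_nonusers

chi2 = pearson_chi_square_2x2(OLDEST_USERS, oldest_nonusers, rest_users, rest_nonusers)
print(f"chi-square = {chi2:.1f}, unadjusted p = {p_value_df1(chi2):.2e}")
```

Under these assumptions the statistic rounds to the reported value of 26.6, which supports reading the first split as a contrast between the over-60s and everyone else.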
5. Discussion of the results of the sociolinguistic analyses of the adjective skinny

The following sections discuss the results of the sociolinguistic analyses of the adjective skinny. While the effects of age have already been considered in Section 4.1, the gender and the socio-economic status of speakers are now additionally taken into consideration. Skinny ‘mean’, ‘showing skin’, and ‘low fat’ are discussed in the most detail since statistical tests yielded significant results for these senses. Although no significant statistical outcomes emerge for the usage of skinny ‘thin’, a brief discussion of survey responses that signal potential variation of its usage is also included. Skinny ‘tight fitting’ is not discussed here since it has been used too infrequently¹⁵ in the survey for generalisations on its use to be made.

5.1. skinny ‘mean’: a disappearing sense?

The decision tree analysis presented in Figure 2 not only confirms that the age of speakers is a significant factor for explaining the variation in the use of skinny ‘mean’ but also suggests that this sense extension is mostly accounted for in the speech of lower socio-economic groups (represented by the lowest and middle postcode values). This finding contributes to previously stated suggestions that the regional sense skinny ‘mean’ is likely to be disappearing (Section 4.1). Dialect studies indicate that regional vocabulary is mostly retained in the speech of people who are members of close-knit networks (e.g. Britain 2001). Members of such networks are less geographically and socially mobile and such speakers typically belong to lower socio-economic classes (e.g. Milroy and Milroy 1992). Since the majority of the oldest users of skinny ‘mean’ in the current study belong to lower rather than higher social classes, it is not surprising that this regional sense extension is evidenced to be retained mostly in their vocabulary.
Although the more affluent oldest middle-class speakers are the first to stop using skinny ‘mean’, three out of eighteen of these participants still exhibit this regional use. In order to understand the reasons for the presence of the dialectal sense in the speech of middle-class speakers, contextual evidence of usage from the interviews is now considered.

15. There are three instances of use of skinny ‘tight fitting’ in the sample of interviewed speakers.
Additional comments provided during the interviews indicate that middle-class speakers who use skinny ‘mean’ place constraints on their usage of this sense extension. For example, a 67-year-old male speaker (ID3) states that skinny ‘mean’ is a dialectal term. By defining skinny ‘mean’ as a regional sense, he may not only be indicating limitations on its use but may even be signalling a way of distancing himself from this term, or potentially from the general use of regional dialect lexis. A 74-year-old female speaker (ID94) constrains her usage of skinny ‘mean’ more explicitly by saying that she only uses this sense with friends of a similar age. This comment could be indicative of several issues. Firstly, she might have noticed that this sense extension was not used by younger generations and she might therefore have been reporting on the observed semantic change in progress. Secondly, her indication that this term is used among friends would suggest that its usage is restricted to informal contexts only. One could conclude that a certain level of familiarity, privacy, and potentially trust between speakers is needed in order for skinny ‘mean’ to be successfully used and understood. There is also a possibility that skinny ‘mean’ is to some extent stigmatised (e.g. because it is used by dialect speakers, or because it is present in the speech of lower social classes), and is therefore pragmatically too ‘risky’ to use with strangers. The conclusion that skinny ‘mean’ is used between people who are in a close relationship with each other, such as friends or family members, can also be drawn from a comment made by the third middle-class speaker using this sense (ID51, age 71), who said “My sons thought I was a ‘skinny parent’ in that I didn’t give them enough pocket money”.
The brief analysis of comments provided by middle-class speakers in reference to their use of skinny ‘mean’ contributes to a greater understanding of how semantic change in dialect senses progresses through a speech community. Although regional terms are mostly retained in the speech of lower social classes, for middle classes disappearing dialect senses are kept at more informal levels only.

Although skinny ‘mean’ is mainly used by speakers who are over 60 years old, the decision tree analysis (Figure 2) indicates that three younger participants also use skinny ‘mean’. Further scrutiny of the socio-demographic characteristics of these speakers shows that all of them are males, aged 27, 46, and 59; the youngest and the eldest speakers are from the lower working class, and the remaining speaker is from the upper working class. The analysis of the socio-demographic makeup of these three conservative speakers also confirms earlier observations that upper and lower working classes rather than middle classes are
retaining the dialect uses. All of these speakers are males, which raises the question of whether males are more likely to use older dialectal senses than females. Although the sample of speakers is too small here to draw any solid conclusions, this question may be an avenue for further research, especially as available sociolinguistic studies indicate that females are more likely to be leading linguistic change (Foulkes and Docherty 1999: 16; Labov 1990: 218–219, 2001: 284; Robinson 2010b; Tagliamonte and D’Arcy 2009: 90).

The 27-year-old speaker is the only one here who provides additional information on his conservative usage. He says that he uses skinny ‘mean’ only with his parents and not with his peers, as his friends do not tend to use this term. These comments would suggest that semantic change is indeed a function of the age of participants, with younger speakers using disappearing senses only when speaking to older generations. His comments also confirm what has been concluded from the analysis of middle-class speakers: that dialectal or older senses are likely to be retained mostly in informal contexts and in close relationships.

During the interviews, I occasionally asked younger participants who did not use skinny ‘mean’ in the elicitation survey whether they had heard of this term. They stated that they were unaware of this sense extension, occasionally suggesting that I must have confused it with skint as in skinflint.¹⁶ I also asked a number of participants for whom skinny ‘mean’ was a salient sense whether they were aware that this sense was undergoing a change in progress. Apart from a couple of comments discussed earlier, they said that they had not realised this sense was going out of use and were surprised to hear that younger people did not even realise that skinny is used in the sense ‘mean’. This was apparent even when I spoke to younger and older people from the same neighbourhood.
Although semantic change can often be noticeable and salient in a speech community (for example, gay ‘happy’ > ‘homosexual’, wicked ‘evil’ > ‘good’), there are examples of meaning change that seem to go largely unnoticed. I think that the change of skinny ‘mean’ could be an example of such quiet, unnoticed change.
16. These comments could suggest the existence of folk associations of skinny ‘mean/niggardly’ with skinflint or even skint. Explorations of the lexical field of MEAN/NOT GENEROUS could further establish whether these folk associations could have any effect on usage of skinny ‘mean/niggardly’.
5.2. skinny ‘showing skin’: an analysis of a borrowing

While it has been shown (Section 4.1) that an increased use of skinny ‘showing skin’ among speakers over 60 years old corresponds to the real-time information on the emergence of this sense extension in British English, it is still unclear why the use of skinny ‘showing skin’ diminishes so drastically for younger generations in apparent time. Perhaps for these speakers the use of skinny ‘showing skin’ is so entrenched in the prefab collocation skinny dipping that this meaning does not come to mind when the word is used alone. However, a detailed investigation of this idea is beyond the scope of the current paper.

The decision tree analysis (Figure 3) demonstrates that this sense is used significantly by the oldest speakers who also hold more professional occupations (NSEC1). Other information on the socio-economic status of these speakers (education, neighbourhood) clearly places them as middle-class speakers. Why would the oldest speakers of higher socio-economic classes use skinny ‘showing skin’ more frequently than other classes and age groups? Before answering both of these questions, background information on the etymology of skinny dipping and the practice of naked swimming in Britain needs to be outlined.

All the uses of skinny ‘showing skin’ in my data and corpus materials refer solely to the action of swimming naked and almost solely occur in a collocation with dip, dipping or dipper. On reading various sources discussing nude swimming (Ayriss 2009; Deakin 1999) one could distinguish two “types” of naked swimming: a “traditional” and a “new” one. A more “traditional” type would reflect a common, largely socially acceptable past-time practice of bathing in rivers, lakes and ponds, which would usually involve individual persons or same-sex groups.
A “new” type of naked swimming is associated more with recreational swimming, swimming after dark, mixed-gender swimming, ideas of organic living, ecology and possibly sexual liberation. The “new” type of naked swimming also carries connotations of doing something forbidden, of an activity on the edge of legal or appropriate behaviour. The OED evidence indicates that skinny in reference to naked swimming originated in North America and the first records of its use date back to 1947. The OED quotations and several available examples from the Corpus of Historical American English (Davies 2010b) suggest that skinny dipping referred to the “new”, rather than the “traditional”, type of naked swimming. Moreover, I could not locate any specific terms used to denote ‘swimming without clothes on’ before skinny ‘showing skin’
was borrowed into British English, apart from nude or naked swimming. One could suggest that until naked swimming became an exceptional activity there was also no “special” term for it. When bathing naked became an unusual, and in some contexts stigmatised, practice, a new name was used to refer to it. In this context, skinny ‘showing skin’ in skinny dipping could be considered to be a transgressive label.

This discussion contextualises possible answers as to why the oldest speakers of higher socio-economic classes use skinny ‘showing skin’ more frequently than other classes. I will first consider the effects of age and then class. From the point of view of the apparent-time construct it is unsurprising to see that the oldest speakers show an increased usage of skinny ‘showing skin’. The average age of these speakers is 70, which would mean that when skinny ‘showing skin’ was coming into use they were young adults. The subsequent decrease of its use among younger generations could be associated with the overall decrease in outdoor swimming in the UK.¹⁷ Very recently, however, one can notice an increased interest in ‘wild swimming’ (Deakin 1999; Perraton 2005; Rew and Tyler 2009; Start 2008) as a part of a wider interest in outdoor living. This could potentially lead to an increased use of skinny ‘showing skin’ among younger speakers in the near future. Moreover, since the lifestyle has been mainly discussed in the media (e.g. BBC4)¹⁸ with a largely middle-class audience,¹⁹ one could also suggest that the usage of the sense skinny ‘showing skin’ might increase among middle-class rather than working-class speakers.

This conclusion leads us to consider the fact that in the current project all of the oldest speakers using skinny ‘showing skin’ are members of the middle classes. Why would the middle classes pick up and use this sense extension more than other classes?
Since semantic usage is experiential in nature, the presence of this sense extension in the speech of the middle classes could suggest that they have more exposure to, or at least more knowledge of, the practices that are glossed by this sense than members of other classes. The following paragraphs consider the extent to which this could be the case.

17. There were various concerns about swimming in polluted water and accidental deaths occurring during outdoor swimming. As a consequence, in the 1970s children were actively discouraged from outdoor swimming by the screening of a short public information film called The Spirit of Dark and Lonely Water (1973) by Jeff Grant, which was commissioned by the British government.
18. For example, see the programme Wild Swimming (2010) by David Johnson.
19. See National Readership Survey and Broadcasters’ Audience Research Board.
Historical sources indicate that in certain English schools, especially public schools such as Manchester Grammar School, nude swimming was compulsory until the 1970s (Cohen 2005). So there is a possibility that middle-class teenagers at that time could have been at least more aware of this practice than their working-class peers, who would attend state schools which rarely provided comparable swimming pool facilities. However, in the discussion of naked swimming in Manchester Grammar School (Cohen 2005) contributors refer to this activity as naked swimming (18 instances) and occasionally as nude swimming (3 instances). The term skinny dipping is not mentioned at all, which may be due to the fact that naked swimming in schools could be classified as the “traditional” swimming type whereas skinny dipping signifies the “new” type of swimming naked.

Since skinny ‘showing skin’ is a North American borrowing, another possibility would be to consider the extent to which the middle classes were more likely to be borrowing this term than other classes. On the one hand, middle-class speakers could have been exposed to American English more than other social groups. This could happen because of an increased possibility that these speakers might meet Americans, e.g. via work, since middle-class speakers are usually more socially mobile and are members of looser and more open social networks than working-class speakers.²⁰ On the other hand, one could consider whether it was not exposure of these speakers to American English as such but to the social practice represented by ‘skinny dipping’ that was responsible for an increased frequency of skinny ‘showing skin’ in middle-class speech. These middle-class speakers could afford to travel more and therefore might have had more opportunity to become familiar with the new practice of skinny dipping, for example whilst on holiday.
The new practice of skinny dipping could have been an interesting novelty for some but unacceptable and possibly offensive for others. One could suggest that an additional emotional load (regardless of whether positive or negative) accompanying the meaning of this term could also have facilitated the acquisition of this use. Although it is difficult to establish a definitive answer to the question of why middle-class speakers in the survey use skinny ‘showing skin’, the discussion highlights possible explanations that could be further investigated in follow-up interviews with speakers.

20. The occupations of the four older participants using skinny ‘showing skin’ were as follows: civil servant, school teacher, IT manager, and artist.
5.3. skinny ‘low fat’: an additional dimension to the apparent-time hypothesis

Although skinny ‘low fat’ is a recently developed sense extension of the adjective skinny, it is not recorded in the speech of the youngest speakers as the apparent-time hypothesis might predict. However, this finding is not surprising, since the absence of this sense for the under 18-year-old speakers is probably a function of their not drinking many lattes. Let us then look more closely at speakers who do use skinny ‘low fat’. Section 4.1 shows that an increased use of the innovative sense skinny ‘low fat’ is present in the speech of two age groups and . Statistical tests that considered external factors such as gender and socio-economic status did not yield significant results. In this case, I attempt to account for at least certain aspects of the variation of skinny ‘low fat’ by considering the socio-demographic characteristics of individual speakers who use this sense extension.

Skinny ‘low fat’ is a borrowing from American English, which refers to both food and drink. In British English, skinny is mainly used to describe cafe latte made with skimmed milk. The use and spread of this sense and, more importantly, its concept, is associated with “coffee culture”, a relatively new phenomenon in a tea-drinking country. Coffee shops are more visible than tearooms in city centres: in Sheffield, where the majority of participants in this study live, five Starbucks coffee shops selling skinny latte and skinny muffins have opened within just the last few years. If skinny ‘low fat’ is associated with new social practices, people who use this sense are the ones who are more exposed to these practices. The socio-demographic information on participants who use skinny ‘low fat’ indicates that they are mainly females from the upper working and middle classes who work at the University of Sheffield²¹ (see Table 3).
It is feasible to suggest that working in the city centre would provide more opportunities to go to coffee shops and potentially to be exposed to what the concept skinny ‘low fat’ represents. So from the point of view of the apparent-time paradigm, one could infer that not all meanings are introduced by younger generations, especially if the youngest speakers lack the world experience associated with a given word. Certain meanings can be introduced by other socio-demographic groups, for whom a need to express a new concept arises, and these meanings may afterwards spread to other sections in a community (cf. the development of French arriver ‘arrive at the shore’, which initially was used by sailors before its use generalised to ‘arrive’). The apparent-time hypothesis applied to meaning variation and change also has to consider the referential meaning of a new variant and its functional properties, an issue that is not so important in the case of phonological and morpho-syntactic variation and change. Although evidence gathered in this study shows that innovations are likely to be introduced by younger rather than older people, cases of skinny ‘low fat’ or French arriver ‘arrive at the shore’ suggest that certain types of semantic change may originate anywhere in a socio-demographic structure.

21. The first Starbucks in Sheffield was opened in the vicinity of the University of Sheffield (Western Bank).

Table 3. Socio-demographic information on speakers who use skinny ‘low fat’ (mean age = 39.8 years)

Gender   Age   Class           Employment
Female   24    Middle          University: Postgraduate student
Female   25    Upper working   Bar Manager
Female   49    Middle          University: Employee
Male     49    Middle          University: Employee
Female   52    Upper working   University: Employee

5.4. Skinny ‘thin’: cultural change

While skinny ‘thin’ is the most salient sense for all speakers interviewed in the survey, there are differences in the evaluative load of the examples provided by participants of the survey. Younger speakers and women tended to provide more neutral or positive examples of the use of this sense (such as people, my friend, my daughter, someone who is running a lot, Kate Moss, Posh Spice, Twiggy, people on TV, celebrities). Also, one female²² exclaimed “What I long to be” when I asked her about skinny. On the other hand, older speakers and men provided more negatively loaded examples of the use of this sense (anorexics, unhealthy people, old people). Additionally, there was one male²³ who specifically said that he does not like this word being used of humans as it is not very nice. Although a detailed analysis of individual responses is not discussed in this paper, my initial observations lead to a conclusion that skinny ‘thin’ is changing towards a more ameliorated meaning. While for older speakers being skinny evokes associations of sickness, malnourishment, and potentially poverty, younger speakers see this adjective almost as a compliment. This difference in the usage of skinny ‘thin’ could be considered a reflection of changes in the perception of beauty that are visible on a number of levels in western societies, such as media, fashion and art. Being skinny seems to be considered a desirable rather than a detestable quality, and this cultural shift surfaces as variation in the connotative value of the sense skinny ‘thin’ used by different generations in the current survey.

22. ID67, upper working class, age 52.
23. ID8, upper working class, age 63.
6. Discussion

The sociolinguistic analysis of semantic variation of the adjective skinny provides a number of insights into why different speakers use different senses of the same polysemous word. Variationist analytical tools show that the usage of different senses of the adjective skinny is not random, but relates to the socio-demographic characteristics of speakers. Out of all of the external factors considered, the age of participants is the most important factor that accounts for the observed variation. Historically older senses are more frequently used by older generations and more recent meanings are present in the speech of younger speakers. This significant correlation between usage and the age of speakers is considered in the context of the apparent-time construct and therefore taken as indicative of semantic change. Additionally, a consideration of real-time evidence confirms predictions of hypothesised changes in meanings of the adjective skinny.

The most important conclusion here is that semantic change in progress can be observed, which opens the door to a number of investigative possibilities and leads to a better understanding of the processes of meaning development. Moreover, semantic change seems to be following similar routes of development to the types of change observed at other levels of language, such as phonology or morpho-syntax. A further investigation into speakers who are, for example, innovative on several linguistic levels could contribute to further development of theories of linguistic change.

Although the significance of age is crucial to the interpretation of the observed variation, demographic factors rarely work in isolation from the
other social characteristics of a speaker. By incorporating categories of education or profession, one can identify more precisely areas of semantic innovation or conservatism. This knowledge can then give researchers a better chance of locating circumstances of semantic development and of hypothesising about potential motivations for changes in meaning (cf. the analysis of skinny ‘mean’). Although the gender of speakers has not turned out to be a statistically significant factor for explaining semantic variation, the discussion of certain peripheral uses of the adjective skinny (Section 5.1) points to the fact that the possibility of the effects of gender on semantic variation should not be discarded.

It needs to be emphasised that a successful sociolinguistic analysis of semantic variation would be difficult without the incorporation of quantitative statistical methods. The decision tree analysis clearly shows how a combination of external factors can explain semantic usage. An additional advantage of using this technique is that it does not require much manual coding to quickly classify speakers into groups of similar usage.

The knowledge of social practices of speakers and their socio-economic characteristics can even in some cases outweigh the importance of age in explaining their semantic usage. For instance, the analysis of skinny ‘low fat’ demonstrates that semantic innovations are not necessarily introduced by the youngest members of the community, but can also originate elsewhere in the socio-demographic structure. This finding suggests that further dimensions (as compared with studies of change on other levels of language), such as the actual referential meaning of a given sense, need to be considered when incorporating the apparent-time hypothesis in semantic change research.
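As a rough illustration of how a tree-based analysis of the kind mentioned above groups speakers, the sketch below chooses the single most informative split for a toy sample by comparing Pearson chi-square statistics. It is a minimal sketch under stated assumptions, not a reconstruction of the analysis reported in this chapter: the eight speaker records, the field names and the helper functions chi_square and best_split are all invented for illustration, and a chi-square criterion is only one of several ways in which tree-growing software may select splits.

```python
from collections import Counter

# Hypothetical speaker records, invented purely for illustration.
# "uses_sense" marks whether a speaker volunteered a given sense of a word.
SPEAKERS = [
    {"age_group": "older", "gender": "F", "uses_sense": True},
    {"age_group": "older", "gender": "M", "uses_sense": True},
    {"age_group": "older", "gender": "F", "uses_sense": True},
    {"age_group": "older", "gender": "M", "uses_sense": False},
    {"age_group": "younger", "gender": "F", "uses_sense": False},
    {"age_group": "younger", "gender": "M", "uses_sense": False},
    {"age_group": "younger", "gender": "F", "uses_sense": False},
    {"age_group": "younger", "gender": "M", "uses_sense": True},
]


def chi_square(rows, factor, outcome="uses_sense"):
    """Pearson chi-square statistic for one categorical factor against a boolean outcome."""
    counts = Counter((r[factor], r[outcome]) for r in rows)
    levels = sorted({r[factor] for r in rows})
    total = len(rows)
    stat = 0.0
    for level in levels:
        row_total = sum(counts[(level, o)] for o in (True, False))
        for o in (True, False):
            col_total = sum(counts[(lv, o)] for lv in levels)
            expected = row_total * col_total / total
            if expected > 0:
                stat += (counts[(level, o)] - expected) ** 2 / expected
    return stat


def best_split(rows, factors):
    """Return the factor with the largest chi-square statistic, plus all scores."""
    scores = {f: chi_square(rows, f) for f in factors}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    factor, scores = best_split(SPEAKERS, ["age_group", "gender"])
    print(factor, scores)
```

In this invented sample the split falls on age group, because usage differs between the two age groups but not between the genders. A full tree would repeat the same comparison within each resulting subgroup, which is how combinations of factors such as age and occupation can surface together.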
The referential range of a word together with any connotations and attitudes associated with its different uses may also provide an explanation for the change of individual senses within a polysemous word. For older speakers, the adjective skinny seems to project more negative connotations both literally (‘thin’) and metaphorically (‘mean’). One may suggest that the subsequent amelioration in the use of skinny ‘thin’ could have contributed to the decrease of the pragmatic application of derogatory skinny ‘mean’.

Although the main aim of the current study involves the sociolinguistic analysis of a polysemous adjective, there are other possible avenues of exploration of semantic usage that could complement the conclusions drawn. Since semantic change does not happen in isolation, taking an onomasiological perspective would allow for a consideration of the effect other items in the lexical field may have on individual uses of skinny. Also,
a closer analysis of the referents used by participants to contextualise uses of the adjective skinny could shed more light onto the role of collocations in the development of particular senses. This could be particularly useful when analysing senses of ‘showing skin’ and ‘low fat’. For example, the use of skinny in skinny dipping may be so entrenched in the collocation that this meaning may not come to mind when used alone. There is a possibility that someone may use skinny-dip but the word skinny would not trigger that meaning for them.
7. Summary and conclusions

The wealth of observations made about semantic variation and change suggests that taking a sociolinguistic perspective in studies of meaning is a viable and beneficial methodological choice. Apart from being able to capture semantic change in progress and explore recent change, the variationist approach is sensitive to relatively small fluctuations in the usage of a given polysemous word. As a consequence, this method allows investigation into more ephemeral meaning changes which are completed in a relatively short space of time. Moreover, the current study shows the value of drawing on different sources of data in explaining meaning change. It has been shown that fine-grained evidence on usage generated in a variationist study could be employed to complement dictionary or corpus materials in historical language research. Finally, the current study demonstrates the benefits of employing quantitative statistical methods in semantic change research.

The employment of variationist analytical methods in semantics still needs to be more extensively researched. However, the conclusions drawn from the current study indicate the considerable benefits of taking a sociolinguistic approach in order to investigate semantic change in current communities, and suggest its potential for tracing semantic change in past communities.
Bibliography

Allan, Kathryn 2008 Metaphor and Metonymy: A Diachronic Approach. (Publications of the Philological Society 42.) Chichester: Wiley-Blackwell.
AnswerTree, Version 3.1 2002 Chicago: SPSS Inc.
Ayriss, Chris 2009 Hung out to Dry. Swimming and British Culture. Lulu.com.
Bailey, Guy 2002 Real and apparent time. In: Jack K. Chambers, Peter Trudgill and Natalie Schilling-Estes (eds.), The Handbook of Language Variation and Change, 312–332. Oxford: Blackwell.
Beeching, Kate 2005 Politeness-induced semantic change. The case of quand même. Language Variation and Change 17: 155–180.
Blank, Andreas and Peter Koch 1999 Historical Semantics and Cognition. Berlin/New York: Mouton de Gruyter.
Britain, David 2001 Space and spatial diffusion. In: Jack K. Chambers, Peter Trudgill and Natalie Schilling-Estes (eds.), The Handbook of Language Variation and Change, 603–637. Oxford: Blackwell.
Britain, David 2002 Phoenix from the ashes? The death, contact, and birth of dialects in England. Essex Research Reports in Linguistics 41: 42–73.
Britain, David 2009 One foot in the grave?: Dialect death, dialect contact and dialect birth in England. International Journal of the Sociology of Language 196/197: 121–155.
British National Corpus, Version 3 (BNC XML Edition) 2007 Accessed via http://www.sketchengine.co.uk.
Broadcasters’ Audience Research Board http://www.barb.co.uk/ (accessed January 2011).
Chambers, Jack K., Peter Trudgill and Natalie Schilling-Estes (eds.) 2002 The Handbook of Language Variation and Change. Malden, Mass./Oxford: Blackwell Publishers.
Cheshire, Jenny 2007 Discourse variation, grammaticalisation and stuff like that. Journal of Sociolinguistics 11(2): 155–193.
Clark, Alexander, Chris Fox and Shalom Lappin (eds.) 2010 The Handbook of Computational Linguistics and Natural Language Processing. Oxford: Blackwell.
Coates, Jennifer 2004 Women, Men and Language. London: Routledge.
Cohen, Michael 2005 Swimming naked at MGS. The Mancunian, December 2005. http://www.oldmancunians.org/html/news/current/TOM%2021December%202005.pdf (accessed in January 2011).
Christy, Craig T. 1983 Uniformitarianism in Linguistics. Amsterdam: Benjamins.
Davies, Mark 2010a The 400-million-word Corpus of Historical American English (1810–2009): A new tool for in-depth research on Late Modern English. Paper presented at the 16th International Conference of English Historical Linguistics. Pécs, Hungary.
Davies, Mark 2010b The Corpus of Historical American English (COHA): 400+ million words, 1810–2009. Available online at http://corpus.byu.edu/coha.
Deakin, Roger 1999 Waterlog: A Swimmer’s Journey Through Britain. London: Chatto and Windus.
Diaz Vera, Javier E. (ed.) 2002 A Changing World of Words: Studies in English Historical Lexicography, Lexicology and Semantics. Amsterdam/New York: Rodopi.
Eckert, Penelope 1998 Gender and sociolinguistic variation. In: Jennifer Coates (ed.), Language and Gender: A Reader, 64–75. Oxford: Blackwell.
Encarta World English Dictionary 1999 London: Bloomsbury.
Fisiak, Jacek (ed.) 1988 Historical Dialectology: Regional and Social. (Trends in Linguistics. Studies and Monographs 37.) Berlin: Mouton de Gruyter.
Foulkes, Paul and Gerard J. Docherty 1999 Urban Voices: Accent Studies in the British Isles. London: Arnold.
Geeraerts, Dirk 1997 Diachronic Prototype Semantics: A Contribution to Historical Lexicology. Oxford: Clarendon Press.
Geeraerts, Dirk and Hubert Cuyckens (eds.) 2007 The Oxford Handbook of Cognitive Linguistics. Oxford: Oxford University Press.
Grant, Jeff 1973 The Spirit of Dark and Lonely Water. Central Office of Information. http://www.nationalarchives.gov.uk/films/1964to1979/filmpage_lonely.htm (accessed in January 2011).
Hasan, Ruqaiya 1989 Semantic variation and sociolinguistics. Australian Journal of Linguistics 9: 221–275.
Hasan, Ruqaiya 1992 Meaning in sociolinguistic theory. In: Kingsley Bolton and Helen Kwok (eds.), Sociolinguistics Today: International Perspectives, 80–119. London: Routledge.
Hasan, Ruqaiya (ed.) 2009 Semantic Variation: Meaning in Society and in Sociolinguistics. The Collected Works of Ruqaiya Hasan. London: Equinox.
Johnson, David 2010 Wild Swimming. Produced for BBC 4. http://www.bbc.co.uk/programmes/b00t9r28 (accessed in January 2011).
Kerswill, Paul 2006 Socio-economic class. In: Carmen Llamas and Peter Stockwell (eds.), The Routledge Companion to Sociolinguistics, 51–61. London: Routledge.
Koivisto-Alanko, Päivi 2000 Abstract Words in Abstract Worlds: Directionality and Prototypical Structure in the Semantic Change in English Nouns of Cognition. Helsinki: Société Néophilologique.
Kurath, Hans (ed.) 1941 Linguistic Atlas of New England. Providence, R. I.: Brown University.
Labov, William 1963 The social motivations of a sound change. Word 19: 273–309.
Labov, William 1966 The Social Stratification of English in New York City. Washington: Center for Applied Linguistics.
Labov, William 1990 The intersection of sex and social class in the course of linguistic change. Language Variation and Change 2: 205–254.
Labov, William 1994 Principles of Linguistic Change. Oxford: Blackwell.
Labov, William 2001 Principles of Linguistic Change: Social Factors. Oxford: Blackwell.
Land Registry House Prices Database http://www.landregistry.gov.uk/ (accessed between January 2007 and January 2008).
Macaulay, Ronald K. S. 2005 Talk that Counts: Age, Gender, and Social Class Differences in Discourse. New York/Oxford: Oxford University Press.
Macaulay, Ronald K. S. 2006 Pure grammaticalization: The development of a teenage intensifier. Language Variation and Change 18(3): 267–283.
McConchie, Ronald W., Olga Timofeeva, Heli Tissari and Tanja Säily (eds.) 2005 Selected Proceedings of the 2005 Symposium on New Approaches in English Historical Lexis (HEL-LEX). Somerville, MA: Cascadilla Proceedings Project.
Milroy, Lesley and James Milroy 1992 Social network and social class. Toward an integrated model. Language in Society 21: 1–26.
National Readership Survey http://www.nrs.co.uk/ (accessed in January 2011).
Nerlich, Brigitte and David D. Clarke 1992 Semantic change: case studies based on traditional and cognitive semantics. Journal of Literary Semantics 21: 204–225.
Nevalainen, Terttu and Helena Raumolin-Brunberg 2003 Historical Sociolinguistics: Language Change in Tudor and Stuart England. London: Longman.
Office for National Statistics 2008 http://www.statistics.gov.uk/pdfdir/fng0908.pdf (accessed in January 2010).
Office for National Statistics: Standard Occupational Classification 2000, Volume 2 – Coding Index 2000 London: The Stationery Office.
Oxford English Corpus 2008 http://www.sketchengine.co.uk (accessed in January 2009).
Oxford English Dictionary (OED) Online 2000– http://dictionary.oed.com (accessed in January 2011).
Perraton, Jean 2005 Swimming against the Stream. Oxford: Jon Carpenter.
Pope, Jennifer, Miriam Meyerhoff and Robert D. Ladd 2007 Forty years of language change on Martha’s Vineyard. Language 83(3): 615–627.
Rew, Kate and Dominick Tyler 2009 Wild Swim: River, Lake, Lido and Sea. The Best Places to Swim Outdoors in Britain. London: Random House UK.
Robinson, Justyna 2010a Awesome insights into semantic variation. In: Dirk Geeraerts, Gitte Kristiansen and Yves Piersman (eds.), Advances in Cognitive Sociolinguistics, 85–110. Berlin: Mouton de Gruyter.
Robinson, Justyna 2010b Semantic variation and change in present-day English. Unpublished PhD Dissertation. The University of Sheffield.
Rollinson, William 1997 Dictionary of Cumbrian Dialect, Tradition and Folklore. Otley: Smith Settle.
Sankoff, Gillian 2006 Apparent time and real time. In: Keith Brown (ed.), Elsevier Encyclopedia of Language and Linguistics, Volume 1, 110–116. Amsterdam: Elsevier.
Schmid, Helmut 2010 Decision trees. In: Alexander Clark, Chris Fox and Shalom Lappin (eds.), The Handbook of Computational Linguistics and Natural Language Processing, 180–196. Oxford: Blackwell.
Smith, John Russel (ed.) 1862 The Dialect of Leeds. London: F. Pickton.
SPSS 18 for Windows 2007 Chicago: SPSS Inc.
SPSS White Paper, AnswerTree Algorithm Summary 1999 Printed in the USA.
Start, Daniel 2008 Wild Swimming. Richmond: Portfolio Books Limited.
Stenström, Anna-Brita 2000 It’s enough funny, man: Intensifiers in teenage talk. In: John M. Kirk (ed.), Corpora Galore: Analyses and Techniques in Describing English, 177–190. Amsterdam: Rodopi.
Tagliamonte, Sali A. and Alexandra D’Arcy 2009 Peaks beyond phonology: Adolescence, incrementation, and language change. Language 85(1): 58–108.
Traugott, Elizabeth C. and Richard B. Dasher 2005 Regularity in Semantic Change. Cambridge: Cambridge University Press.
Trudgill, Peter 1986 Dialects in Contact. Oxford: Blackwell.
Wong, Andrew 2002 The semantic derogation of tongzhi: A synchronic perspective. In: Kathryn Campbell-Kibler, Robert Podesva, Sarah J. Roberts and Andrew Wong (eds.), Language and Sexuality: Contesting Meaning in Theory and Practice, 161–174. Stanford, California: CSLI.
Wong, Andrew 2008 On the actuation of semantic change: The case of tongzhi. Language Sciences 30(4): 423–449.
A pragmatic approach to historical semantics, with special reference to markers of clausal negation in Medieval French
Maj-Britt Mosegaard Hansen

Abstract
In this chapter I discuss the role of pragmatic inferencing in semantic change. I take semantics to be, not the study of truth-conditional meaning, but rather the study of meanings that are coded in linguistic items and structures, be they truth-conditional or not. Semantic change is thus defined as any change in the conventional content of items or constructions, including (but of course not limited to) non-truth-conditional textual or interpersonal markers. The exposition focuses on the following key issues: (i) the nature of the pragmatic entities involved in semantic change (implicatures vs. presuppositions, generalized vs. particularized implicatures); (ii) the nature of the mechanisms by which change is brought about (metaphor vs. metonymy); (iii) the nature of the contexts in which semantic change takes place; (iv) the nature of and relationships among observed cross-linguistic tendencies of change (e.g. subjectification, increase in procedural content, scope increase; grammaticalization vs. pragmaticalization). Moreover, some consideration is given to the phenomenon of “persistence” (i.e. the retention of aspects of a source meaning), which is argued to play a role in constraining semantic change. The points made are illustrated by examples from English and French. The chapter moreover features a comparative analysis of the rise and subsequent evolution of three negative reinforcing particles, pas, mie, and point, in medieval French.
1. Introduction
This chapter discusses the role of pragmatic inferencing in semantic change. I take semantics to be, not the study of truth-conditional meaning, but rather the study of meanings that are coded in linguistic items and structures, be they truth-conditional or not. Thus, although the mainstream Anglo-American tradition considers the meaning difference between and
and but in (1)–(2), for instance, to be the result of a conventional implicature (Grice 1989: 26), hence a matter of pragmatics, an alternative view represented among others by the Theory of Language-Inherent Argumentation (e.g. Ducrot et al. 1980), and by Relevance Theory (e.g. Blakemore 1987) considers it to be very much a matter of semantics, for while the two conjunctions are truth-conditionally equivalent, the adversative element added by the choice of but is context-independent and will only be understood by language users who have learned the coded meaning of that word. In other words, it is not the object of a pragmatic inference:¹
(1) Gaby is young, and eager to learn.
(2) Gaby is young, but eager to learn.
In consequence, semantic change is defined for the purposes of this chapter as any change in the conventional content of items or constructions, and it includes not only the rise of conventional non-truth-conditional textual or interpersonal markers out of meanings that were originally truth-conditional, but also the development of new non-truth-conditional meanings of such items. Thus, both the development of the Modern English coordinating conjunction but from the Old English preposition butan (‘outside’, ‘without’), and the development of a clause-final concessive discourse marker but from the adversative conjunction but in Scottish English (cf. (3), from Beeching 2009: 101) are just as much cases of semantic change as is the extension of the meaning of the noun mouse from ‘small rodent’ to ‘computer pointing device’, where both the source and target meanings are truth-conditional.
(3) A. Real Madrid are playing tonight.
    B. They’re not televising it but.
1. The exact nature of any conclusions drawn from the use of (1) vs. (2) in any given context, and how those conclusions are arrived at, on the other hand, is of course a matter of pragmatic analysis.
Traditionally, the study of semantic change has focused on truth-conditional items like mouse, but in the past couple of decades, the field – which had to a large extent been dormant since the mid-20th century – has gained a renewed impetus from focusing precisely on the rise and subsequent evolution of non-truth-conditional items and on items whose content (whether truth-conditional or not) is procedural rather than conceptual in nature (cf. Hansen 2008: 19ff.). The discussion in this chapter, leading up to and including the case study of negative markers in medieval
French in section 3, will concentrate on these latter types of expressions, and the exposition will focus partly on the types of pragmatic inferences, mechanisms, and contexts involved in semantic change, and partly on the nature of and relationships among observed cross-linguistic tendencies of change.
2. From pragmatics to semantics
In his seminal paper on implicature, Grice (1989: 39) suggested that “it may not be impossible for what starts life, so to speak, as a conversational implicature to become conventionalized”. Overwhelmingly, recent developments in diachronic semantics, most saliently the pioneering work of Elizabeth Traugott, have built on that idea, arguing that semantic change frequently “arise[s] out of the pragmatic uses to which speakers or writers and addressees or readers put language” (Traugott & Dasher 2002: xi). Thus, according to Traugott & Dasher (2002: 281), a large number of observed meaning changes appear to instantiate a set of nine semantic/pragmatic tendencies, of which the following five appear most significant:²
1. Meanings tend to become increasingly subjective (i.e. increasingly grounded in the speaker’s subjective perspective), and possibly even intersubjective (i.e. explicitly grounded in the relationship between speaker and hearer).
2. Meanings that were conceptual at the outset tend to become increasingly procedural in nature.
3. Items and constructions that originally had scope within the host proposition tend to progressively enlarge that scope, potentially even up to the level of the discourse.
4. Meanings that were truth-conditional at the outset tend to become non-truth-conditional.
5. Meanings that originally made reference to the described event progressively come to refer to the speech event itself.
2. These clines represent a revision of three tendencies posited by Traugott (1982), viz.:
I. Meanings based in the external described situation > meanings based in the internal (evaluative/perceptual/cognitive) described situation.
II. Meanings based in the external or internal described situation > meanings based in the textual or metalinguistic situation.
III. Meanings tend to become increasingly based in the speaker’s subjective belief state/attitude toward the proposition.
As will be shown in section 3, the first four of these tendencies are directly relevant to the evolution of the postverbal negative markers pas (< Latin passu(m) ‘step’), mie (< Latin mica(m) ‘crumb’), and point (< Latin punctu(m) ‘point’, ‘speck’) in Medieval French. Furthermore, much recent discussion in the field of diachronic semantics and pragmatics relates directly or indirectly to Traugott & Dasher’s approach, focusing on the key issues mentioned at the end of section 1 above.
2.1. Types of inferences, mechanisms, and contexts of change
Researchers have in the first instance been concerned with elucidating the nature of the pragmatic inferences that lead to semantic change, the cognitive mechanisms that underlie those inferences, and the contexts in which they are triggered. Traugott & Dasher’s (2002) Invited Inferencing Theory of Semantic Change (IITSC) assumes that conversational implicature³ is central to the rise of context-level meanings, and these authors follow Levinson (1995: 95, 2000: 263) in proposing that change proceeds from particularized conversational implicatures (PCI or [particularized] “invited inferences”, or IIN as they prefer to call them) through generalized implicatures/invited inferences to conventionalized implicatures, along the cline in (4):
(4) PCI(IIN) > GCI(GIIN) > coded meaning
Hansen & Waltereit (2006) argue that this sequence is actually rare, and that the role of different types of implicature in meaning change is a good deal more nuanced than the IITSC suggests. These authors (2006: 235) contend that particularized conversational implicatures are prototypically in the communicative foreground of messages, whereas generalized conversational implicatures are prototypically backgrounded, and that only foregrounded elements of meaning are liable to become conventionalized. They therefore posit the existence of two further, more frequently instantiated, clines as in (5)–(6):
(5) PCI → coded meaning
(6) GCI → PCI → coded meaning
3. As the name of their model suggests, Traugott & Dasher (2002) speak of “invited inferences” rather than of “conversational implicature”. The essential difference, however, appears to lie in the fact that the IITSC puts more explicit emphasis on speaker/hearer interaction and negotiation than do either Grice’s ([1975]1989) original model of implicature or Levinson’s (1995, 2000) theory of generalized implicatures.
An example of the change in (5) would be the conventionalization of the originally metaphorical use of mouse to denote a computer pointing device. The cline in (6), on the other hand, is instantiated by the evolution of the German particle selbst from intensifier (‘him-/herself’) to focus particle (‘even’) (Eckardt 2003: ch. 6), cf. (7)–(9) below:
(7) Der König selbst half den Palast zu bauen.
‘The King himself helped build the palace.’
(8) Der König selbst öffnete die Tür.
‘The King himself opened the door.’
(9) Selbst der König verstand den Witz.
‘Even the King understood the joke.’
((8)–(9) from Eckardt 2003: 163)
When the intensified entity is a central exemplar of some domain, the older, intensifying use frequently conveys an additive GCI, whereby not only that central exemplar, but also other lesser ones, took part in the denoted process, as in (7). As (8) shows, however, that implicature is cancelable, given that one would not usually assume that more than one person was needed to open a door. Now, Eckardt’s (2003: ch. 6) examples of the crucial onset contexts for the extension from intensifier to focus particle (e.g. (10)) suggest a foregrounding of the additive GCI as a PCI, in as much as there is no independently conceptualized core-periphery structure of which the selbst-marked entity can be seen as a central exemplar. Such a structure must therefore be the object of an ad-hoc inference on the part of addressees:
(10) Welch Jammer war nun da? Man sah auff allen Gassen / In höchster Einsamkeit die Häuser gantz verlassen: / Der Vatter ließ sein Kind / das Kind den Vatter stehn / Vnd dorffte sicherlich kein Mensch zusammen gehn. / Die Vögel machten selbst sich in die ferren Wüsten / Vnd wolten auß Gefahr nunmehr bey vns nicht nisten. (M. Opitz, 1624 – from Eckardt 2003: 170)
‘What misery was now there? One saw on all the streets / In the highest loneliness the houses entirely deserted: / The father had left the child / the child left the father standing / And certainly no men might go together. / [The birds themselves / Even the birds] left for the far deserts / And did no longer want to nest with us, out of danger.’
I will argue, in section 3 below, that the change of the Old French nouns pas, mie, and point into negative adverbs similarly instantiates the cline in (6). Cases where what must originally have been a PCI has subsequently evolved into a GCI are exemplified by conventionally indirect speech acts such as (11). Here, however, Hansen & Waltereit (2006) argue that the GCI will most likely not become fully semanticized, as demonstrated by the cancelability of the implicature ((12)).⁴ In other words, the third stage of the cline in (4) will typically not be instantiated:
(11) [A person vacuuming the floor to another who is sitting in an armchair] Can you raise your legs? >> ‘Please raise your legs.’
(12) [A physical therapist to a patient] Can you raise your legs? = ‘Are you able to raise your legs?’
4. Elizabeth Traugott (p.c.) suggests that the difference between (11) and (12) may not be due to the cancelation of an implicature, but to polysemy. In my view, that cannot be the case, because polysemy can only be established if either meaning can occur without the other being present (cf. the analysis of (13)–(15) below). As far as I can tell, the “request” interpretation of “Can you do X?” interrogatives will always necessarily imply the “ability” interpretation, but not vice versa.
Moving from types of inferences to the more specific inferential mechanisms of change, there is general agreement that diachronic sense change is based on patterns of association either between aspects of different linguistic signs or between linguistic signs and aspects of the cognitive or real-world entities that they represent, the two principal patterns of association identified in the literature being metaphor and metonymy. While the former has traditionally been regarded as the single most important mechanism, metonymy – in a broad sense of the word, which includes the conventionalization of all sorts of contextually invited inferences (Traugott & Dasher 2002: 78ff.) – has increasingly come to be recognized as equally important, and indeed perhaps conceptually more basic than metaphor (e.g. Hopper & Traugott 1993, Blank 1997, Barcelona, ed. 2000; see also Koch, this volume). Indeed, metonymy is of particular importance to semantic changes that come about as a result of contextual inferencing, for its effect lies precisely in a more or less obvious figure/ground shift of the type outlined above, i.e. a reversal of normally foregrounded and normally backgrounded elements of the interpretation of a given item (Koch 2001, Waltereit 2002,
Hansen & Waltereit 2006). For instance, when used “literally”, the imperative of the verb look and its equivalents in a number of other languages (e.g. Italian guarda!, French regarde(z)!, Spanish mira!) suggest that the object that the hearer is directed to look at is worthy of immediate attention (Waltereit 2002). Speakers may exploit that background suggestion by using such imperatives when there is, in fact, no particularly noteworthy object in the environment, because they wish instead to draw attention to the contents of their own upcoming discourse. If this happens with sufficient frequency, the erstwhile content-level imperative will change its meaning and function from that of a verbal imperative to that of a discourse marker. It will be argued in section 3 below that subsequent figure/ground shifts can account for the evolution of the French postverbal negative markers from nouns denoting minimal quantities through negative polarity items to markers of clausal negation.
Because of the Gestalt-structure of metonymical reinterpretation, the possibility of such changes seems to rely on a particular type of context, known as “bridging contexts” (cf. Evans & Wilkins 2000, Heine 2002). Bridging contexts are characterized by the fact that they allow for two different interpretations of an item: an established “source” interpretation, and an innovative “target” interpretation. The existence of bridging contexts does not, however, in and of itself constitute evidence that change is, in fact, taking place: while bridging contexts allow for innovative interpretations, such potential interpretations need not actually be brought to bear by hearers. Only at a subsequent stage, where only the target interpretation is possible, can we thus be certain that reinterpretation has taken place.
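The evidential logic of bridging contexts can be restated as a small sketch. The Python below is purely illustrative and forms no part of the analysis itself: the names `ContextType`, `classify_context` and `change_is_certain` are assumptions introduced here, as is the representation of each attested context as a pair of analyst judgments (is the source reading available? is the target reading available?). It encodes only the point made above, namely that bridging contexts alone do not demonstrate change, whereas a context admitting only the target interpretation does.

```python
from enum import Enum
from typing import Iterable, Tuple


class ContextType(Enum):
    SOURCE_ONLY = "source only"   # only the established reading is possible
    BRIDGING = "bridging"         # both readings are possible
    TARGET_ONLY = "target only"   # only the innovative reading is possible


def classify_context(source_possible: bool, target_possible: bool) -> ContextType:
    """Classify one attested context from two (hypothetical) analyst judgments."""
    if source_possible and target_possible:
        return ContextType.BRIDGING
    if target_possible:
        return ContextType.TARGET_ONLY
    if source_possible:
        return ContextType.SOURCE_ONLY
    raise ValueError("at least one interpretation must be available")


def change_is_certain(judgments: Iterable[Tuple[bool, bool]]) -> bool:
    """Reinterpretation is certain only if some context excludes the source reading."""
    return any(
        classify_context(source, target) is ContextType.TARGET_ONLY
        for source, target in judgments
    )


# Source-only and bridging contexts alone leave the question open.
print(change_is_certain([(True, False), (True, True)]))
# Adding a target-only context settles it.
print(change_is_certain([(True, False), (True, True), (False, True)]))
```

On this encoding, a corpus that yields nothing beyond bridging contexts leaves the question of change open, which is why the dating of a change depends on the first context that excludes the source reading.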
(13)–(15) illustrate three stages in the evolution of the French connective puisque (< Latin postquam), from its temporal source meaning via a bridging context to a newer justificational target meaning:
(13) Puis que il venent a la Tere Majur ; // Virent Guascuigne, la tere lur seignur ; (Chanson de Roland, ca. 1090, vv. 818–19)
‘After they arrived in the lands of their forefathers, they saw Gascony, the land of their lord.’
(14) Puis que il sunt a bataille justez, // Ben sunt cunfès e asols e seignez ; (Chanson de Roland, vv. 3858–3859)
‘Once/Seeing as they’re prepared for battle, they go to confession, are absolved and blessed;’
(15) Chertes, Marot, je le voeil bien, // Puis que vos volentés i est (Adam Le Bossu, dit Adam de la Halle, Le jeu de Robin et Marion, ca. 1285, vv. 701–702)
‘Certainly, Marion, I will, seeing as that’s your desire’
A number of recent studies have sought to further refine the characterization of the contexts of change, partly by taking structural constraints into account, but also increasingly by putting emphasis on interactional factors, such as dialogic contexts featuring multiple view-points, turn-taking and shared knowledge, as well as on genre and discourse traditions (e.g. Diewald 2002, Pons Bordería 2006, Torres Cacoullos & Schwenter 2007, Hansen & Visconti 2009). In particular, Detges & Waltereit (2009) have suggested that different types of source contexts may systematically give rise to different categories of target item. They compare the Spanish discourse marker bien with the cognate French bien, whose function is more akin to that of the modal particles found in the Continental Germanic languages, and hypothesize that discourse markers in general arise as a result of negotiations about the next move in a dialogic exchange, while modal particles are the result of negotiation of common ground in discourse. While this potentially important hypothesis is in need of further empirical substantiation⁵, it does serve to underscore the increasingly common assumption that linguistic items do not undergo meaning change in a vacuum; rather, they do so in the context of particular source constructions. That fact has two important consequences:
(a) Any given item may change its meaning in some of its source contexts, but not in others, leading to “layering” (Hopper 1991), i.e. the synchronic coexistence of several different uses of the same item, some of which will be diachronically more recent than others. Thus, for instance, English while currently functions as a noun, as a temporal conjunction originally derived from that noun, and as a concessive conjunction derived from the temporal conjunction (cf. Traugott 1982: 253–254). Similarly, with reference to the case study in section 3, the morphemes pas, mie and point all retained their original nominal uses in Medieval French, alongside their more recently developed uses as markers of clausal negation, and both pas and point continue to do so in Modern French.⁶
5. Some such support may be provided by the analysis in section 3 of the Medieval French negative markers.
6. Mie, being now obsolete as a negative marker, retains only a nominal use, whose meaning is related, but not identical, to that of its medieval forebear.
For this reason, we should expect items that have undergone meaning change to be synchronically polysemous. Moreover, this polysemy can, in conjunction with observed cross-linguistic tendencies of semantic-pragmatic change, be used as a basis for internal semantic reconstruction, as suggested by Traugott (1986). Thus, if we find, in a language X for which insufficient diachronic data is available, that the same conjunction is used with a temporal and with a concessive meaning, we will be justified in hypothesizing that the latter represents an extension of the former, rather than vice versa, as per tendencies 1 and 4 in section 2 above. Similarly, if we find in language Y that the marker of clausal negation is formally identical to a noun denoting a minimal quantity, we may hypothesize that that noun is the diachronic source of the negative marker, instantiating tendencies 2, 3, and 4. In so far as the tendencies in question, however strong they may appear to be, are simply tendencies, and not unidirectional laws of change, reconstructions based on them must, however, be proposed with all due caution (cf. Norde 2009: 52).
(b) When items change their meaning, traces of the meaning they had in the source construction will often persist and constrain both the direction and the extent of subsequent changes (cf. Visconti 2005, Hansen 2008). In other words, a given existing form will not be recruited to express any arbitrarily chosen new meaning. A corollary of this is that two source items may seem largely synonymous in a number of contexts, but may nevertheless differ in one salient respect, such that only one of them will lend itself to a particular (further) meaning extension. For instance, Hansen (2008: 170) argues that while the two French adverbs encore (‘yet’, ‘still’) and toujours (‘still’) appear largely synonymous in their so-called “phasal” aspectual uses illustrated in (16), they differ subtly along the dimension of prospectivity: whereas encore implies that a future change of state is conceivable, toujours is completely neutral on this point. This difference in meaning is attributed to the persistence of semantic features “inherited” from the respective etymological source constructions of the two adverbs, viz. Latin *hinc ad horam (‘from then until the hour’) or *hinc hac hora (‘from then until the hour’), both of which imply temporal evolution, vs. the inherently stative tous jours (‘all days’).
(16) Max est encore/toujours là.
‘Max is still there.’
Further, in section 3 below, I tentatively suggest that persistence of aspects of their original nominal semantics may have been responsible for
a perceived division of labor between the negative markers pas, mie, and point in medieval French.
2.2. Nature of the changes
Finally, there is some debate about the relationship between the diachronic clines posited by Traugott & Dasher (2002), a debate that ultimately leads to the question whether the development of non-truth-conditional and/or more procedural meanings out of truth-conditional and/or more contentful meanings qualifies as an instance of grammaticalization, or whether a different type of process – dubbed “pragmaticalization” by Erman & Kotsinas (1993) – is involved.
With respect to the second question, those who argue that the former types of expressions are grammaticalized focus on the fact that their evolution will typically feature the decategorialization of the (truth-conditional/contentful) source item, some degree of phonological reduction of that source item, as well as subjectification and increased procedurality of its content. Those scholars who prefer to describe such expressions as “pragmaticalized”, on the other hand, do so on the basis of Lehmann’s (1985) classic diagnostics of grammaticalization, viz. phonological and/or semantic attrition, paradigmatization, obligatorification, scope reduction, syntagmatic coalescence, and syntagmatic fixation. According to Lehmann’s model, grammaticalized items will exhibit a high degree of several, if not all, of these characteristics, a set of criteria which are in fact not fulfilled to any great extent by the class of non-truth-conditional procedural expressions that are of interest to us (cf. Waltereit 2002: 1005, Eckardt 2003: 42, Hansen 2008: 57f ). Rather, these expressions typically exhibit scope increase, greater syntactic freedom, optionality, and strengthening of their pragmatic import.
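The contrast between the two property sets just listed can be made explicit by tallying, for a given item, how many properties of each set it displays. The following Python sketch is only an illustration under stated assumptions: the set names, the function `profile` and the sample item are hypothetical, and a simple count is not a metric proposed by Lehmann or by any of the authors cited. Only the property labels themselves are taken from the paragraph above.

```python
from typing import Dict, Set

# Lehmann's (1985) diagnostics of grammaticalization, as listed in the text.
LEHMANN_DIAGNOSTICS = frozenset({
    "phonological and/or semantic attrition",
    "paradigmatization",
    "obligatorification",
    "scope reduction",
    "syntagmatic coalescence",
    "syntagmatic fixation",
})

# Properties the text reports as typical of non-truth-conditional
# procedural expressions.
PRAGMATICALIZATION_PROPERTIES = frozenset({
    "scope increase",
    "greater syntactic freedom",
    "optionality",
    "pragmatic strengthening",
})


def profile(observed: Set[str]) -> Dict[str, int]:
    """Count how many properties of each set an item displays (illustrative tally)."""
    unknown = observed - LEHMANN_DIAGNOSTICS - PRAGMATICALIZATION_PROPERTIES
    if unknown:
        raise ValueError(f"unrecognized property labels: {sorted(unknown)}")
    return {
        "grammaticalization": len(observed & LEHMANN_DIAGNOSTICS),
        "pragmaticalization": len(observed & PRAGMATICALIZATION_PROPERTIES),
    }


# Hypothetical property annotation for an invented discourse marker.
hypothetical_marker = {"scope increase", "optionality", "pragmatic strengthening"}
print(profile(hypothetical_marker))
```

Because the two sets are tallied independently, an item can in principle score on both, so the sketch does not presuppose that the two labels are mutually exclusive.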
The issue of the interrelation of Traugott & Dasher’s clines is part and parcel of this debate, and concerns in particular the interrelation between subjectification and scope, the central question being whether subjectification is necessarily part of, and unique to, the grammaticalization process, which – if true – raises questions about the scope of grammaticalized items. According to Traugott (1995) and Brinton & Traugott (2005), subjectification is at the very least a strong tendency in grammaticalization, particularly in its early stages (Traugott 1995: 47), but it is nevertheless an independent process (Brinton & Traugott 2005: 109, Traugott 2010). Company Company (2006) argues that subjectification is not just a semantic-pragmatic phenomenon, but actually constitutes a specific type of syntactic
change as well, namely one that is characterized by scope increase and syntactic isolation, and she conceives of subjectification as a subtype of grammaticalization. Interestingly, what this author calls subjectification would seem to be largely equivalent to what others describe as pragmaticalization, were it not for the fact that she deems the syntactic changes mentioned criterial, as opposed to just typical, of subjectification. However, the existence of items like the Germanic modal particles appears to provide a strong argument against such a view, for while modal particles are indubitably subjectified as compared to their source items and also exhibit scope increase, they are nevertheless syntactically highly constrained.
The debate about grammaticalization vs. pragmaticalization raises the issue of the exact understanding of what is implied by these notions: are they labels for specific processes of change, in which case they have independent theoretical status, or are they largely just convenient short-hands for the cumulative results of sets of frequently converging, but essentially independent, changes that linguistic items and constructions can undergo? If the former, we would expect there to be a rather strict separation between grammaticalized and pragmaticalized items. If the latter, we should not be surprised to find cases where either or both label(s) might seem appropriate. This would appear to be the case, for instance, with the Germanic modal particles mentioned above, which are characterized by scope increase, optionality, and pragmatic strengthening, but also – unlike discourse markers, for instance – by paradigmaticization and syntagmatic fixation.
3. Case study: the rise of negative reinforcers in medieval French
It is well-known, at least since Jespersen (1917), that negation in Standard French evolved from an originally preverbal construction (cf. (17)) to a bipartite construction embracing the finite verb, in which the preverbal marker is reinforced by a postverbal element of nominal origin (cf. (18)):⁷
(17) Je ne sais.
(18) Je ne sais pas.
‘I don’t know.’
7. In contemporary colloquial French, mainly of the spoken variety, the preverbal marker is increasingly deleted, as in Je sais pas. This more recent stage of evolution will not concern us here, however (but see Hansen fc for discussion).
The shift from (17) to (18) was, as one would expect, preceded by a period of variation between the two structures, which in this case lasted for several centuries throughout the Middle Ages. Moreover, whereas Modern French makes near-exclusive use of pas as the postverbal element, Medieval French speakers had a choice of several such elements, the most frequently used ones being pas, mie, and point. Since all of these expressions are nouns denoting minimal quantities of something, it is widely, and uncontroversially, assumed that they must originally have been added to make the negated utterance more informative, by asserting that the activity denoted by the verb “did not take place even to the smallest degree” (Detges & Waltereit 2002: 177). This assumption is supported by the existence of Latin examples such as (19):
(19) quinque dies aquam in os suum non coniecit, non micam panis (Petronius, Satyricon, 1st c. AD)
‘for five days he didn’t put any water in his mouth, nor a crumb of bread’
Such a stage is unattested in French, where there is strong indication even in the oldest texts that the postverbal markers were already perceived as inherently negative, in as much as they were able to co-occur with negative polarity items (e.g. (20)). Nevertheless, the fact that they can occasionally be found in medieval French in non-assertive (e.g. hypothetical) contexts without ne, as in (21), provides evidence, via layering, of an intermediate stage where pas, mie, and point must have functioned as negative-polarity items themselves (e.g. Marchello-Nizia 1997: 306, Eckardt 2003: ch. 4):
(20) Tuit vos Franceis ne valent pas meaille. (Li coronemenz Looïs, ca. 1130, v. 2433)
‘All your Frenchmen aren’t [pas] worth a dime.’
(21) Tuit seie fel, se jo mie l’otrei ! (La chanson de Roland, ca. 1090, v. 3897)
‘Let me be a complete traitor, if I grant that in the least!’
The initial change in these items from nouns denoting minimal quantities to minimizer NPIs would have involved a figure/ground shift whereby a backgrounded scalar GCI (cf. Israel 2001) became foregrounded and subsequently conventionalized as the new meaning of the markers. This change further involves subjectification, increasing procedurality of meaning, and scope increase. The second change, whereby the NPIs became
actual negative elements, would again have involved a figure/ground shift based on a probably preponderant occurrence of these NPIs in negative contexts (as compared to other types of non-assertive contexts), as well as a further increase in scope and procedurality.

Hansen (2009a, 2009b) and Hansen & Visconti (2009) have argued that, in medieval French, the choice between the plain preverbal negator and a bipartite construction was governed by discourse-functional constraints, such that the latter construction was only used if either the negative proposition itself or its underlying affirmative variant was Discourse-Old in the sense of Birner (2006), i.e. assumed to be either already activated in the short-term memory of the addressee or accessible to activation based on another proposition or discourse entity thus activated. In (22), for instance, the negated sentence represents a reformulation of the preceding sentence, while in (23) the underlying positive proposition represents a salient inference from the preceding text:

(22) « Je ne sui mie entrez en sa terre, mes en la nostre, car il n’a point de terre qu’il ne doie tenir de nos ; (La mort le roi Artu, ca. 1230, §160)
‘I have not entered his land, but ours, for he doesn’t [point] have (any) land for which he is not obligated to us;’

(23) [. . .] si lor avint si merveilleuse aventure qui tuit li huis dou palés ou il mengoient et les fenestres clostrent par eles en tel manière que nus n’i mist la main ; et neporquant la sale ne fu pas ennuble ; (La queste del Saint Graal, ca. 1220, p. 7)
‘[. . .] then an extraordinary event occurred: all the doors and windows of the palace where they were dining shut themselves without anyone having touched them; and yet, the hall didn’t [pas] become dark;’

The older preverbal negation, on the other hand, was pragmatically unmarked, and could thus be used in any context, Discourse-Old as well as Discourse-New.
Hansen (2009) does, however, adduce data that shows that in the course of the medieval period preverbal ne alone became increasingly confined to Discourse-New contexts, and that when it did occur in Discourse-Old contexts, these tended to be of a kind where the discourse-salience of the negation itself was downplayed in comparison to Discourse-Old contexts where bipartite negation was used. Thus, for instance, plain ne in Discourse-Old propositions occurs in non-referential contexts like (24), where the two negatives cancel one another out, such that the global interpretation of the sentence is positive (‘everyone came’):
(24) Si i acorrent li un et li autre en tel maniere qu’il ne remest chevalier en tot le palés qui la ne venist. (Graal, p. 11)
‘Then they all came running so that no knight remained in the entire palace who didn’t come there.’

Hansen (2009) and Hansen & Visconti (2009) further suggest that the loosening of discourse constraints hypothesized to have originally governed the use of bipartite negation in French may have taken place as a result of the relatively frequent occurrence of bipartite negation in what these authors call “Janus-faced” contexts such as (25):

(25) Et quant len li volt demander qui il estoit, il n’en tint onques plet a ax, ainz respondi tot pleinement qu’il ne lor diroit ore pas, car il le savroient bien a tens se il l’osoient demander. (Graal, p. 8)
‘And when they tried to ask him who he was, he never exchanged words with them, but replied quite plainly that he wouldn’t [pas] tell them now, for they’d soon find out if they dared to ask.’

In this excerpt, the use of bipartite negation can be interpreted both as backwards oriented (i.e. as marking a Discourse-Old proposition), in as much as the negated sentence constitutes a refusal to grant a request expressed in the preceding discourse, and as forwards oriented, i.e. as contrasting with contents of the immediately following sentence. In other words, such Janus-faced contexts are claimed to have constituted the crucial bridging contexts that allowed bipartite negation to be reinterpreted as neutral with respect to the information-structural properties of the discourse.

This analysis relies on the assumption that reinforcing markers were not introduced – as has traditionally been assumed following Jespersen (1917: 4) – simply in order to compensate for the reduced phonological substance of the preverbal ne as compared to its forebear, Latin non.
A phonologically-based explanation of the history of French negation cannot easily account for the centuries-long period of variation between the simple and the bipartite structure. But if we assume that the formal difference between those two basic structures reflects an underlying meaning difference, then we must by the same token ask ourselves if the variation that we find between the reinforcing markers pas, mie, and point themselves might not likewise correlate with a difference in the semantic and/or pragmatic status of their host utterances. Variation between pas and mie, as well as the decline and ultimate loss of mie, has been noted to reflect dialectal differences, such that mie was
preferred in Northern French dialects (e.g. Togeby 1974: §258, Buridant 2000: §609), while pas was the marker preferred by speakers in the Ile-de-France region, where the social elite was based. However, point arises at a later date than pas and mie, and remains significantly less frequent than the latter two throughout the medieval period, only to become very fashionable in Renaissance and particularly Classical French, following which its frequency of occurrence drops again sharply (cf. Togeby 1974: §258f.). Apart from the fact that the use of point is not accounted for, the dialectal explanation is rendered less than satisfying by the observation that many medieval texts contain both pas and mie with approximately equal frequency.

By the 13th century, the frequency of use of the bipartite constructions with mie and pas had increased significantly, and the use of point as the third principal marker of negative reinforcement had become established, if not as frequent as the other two (e.g. Togeby 1974: §258). An in-depth analysis was therefore carried out of the uses of ne . . . pas and ne . . . mie in two 13th cent. prose texts, the romances La queste del Saint Graal (ca. 1220) and its “sequel” La mort le roi Artu (ca. 1230). Given the lower frequency of ne . . . point, the uses of that construction were studied in a further three texts, also included in the corpus used by Hansen (2009a/b) and Hansen & Visconti (2009):

– Li coronemenz Looïs (ca. 1130)
– Jean de Joinville: La vie de Saint Louis (1298–1309)
– Miracle de l’enfant donné au diable (ca. 1339)

The results suggest that there was indeed a functional differentiation between the three constructions. I will first discuss what seems to characterize the contexts where point is chosen as the postverbal element, in opposition to those where either mie or pas is used, and subsequently look at differences between the latter two.

3.1. Ne . . . point vs. ne . . . mie/pas

Until the late 14th century, point occurs – as already noted above – far less frequently than the two competing markers, mie and pas. The data reveals that point was also – indeed, remains to this day – less grammaticalized than the latter two. Thus, in the large majority of its occurrences up until the late 14th cent., negative point heads a partitive construction with the preposition de (‘of’) or a corresponding pronoun (relative dont or the pronominal adverb en), cf. (26). Mie is only rarely, and pas hardly
ever, found in such partitive constructions, which indicates that they retained fewer nominal properties than point.

(26) Or ne me faut mes fors escu, dont je n’ai point. (Graal, p. 12)
‘Now I don’t need anything else except a shield, of which I have none.’

Secondly, point is more frequent than either mie or pas as an NPI, cf. (27):

(27) . . . si l’en porroit tost max avenir, se li chevaliers avoit en lui point de proesce. (Artu §83)
‘[. . .] so misfortune might soon befall her if the knight had the least bit of skill in him.’

Thirdly, even in Modern French point, unlike Modern French pas, can occur clause-initially, in emphatic position, as in (28), a constructional possibility that pas lost in the course of the medieval period.

(28) . . . et point n’est donc besoin pour le retour au plus haut du ciel de quelque rapport que ce soit à d’autres êtres de ce bas monde, (Y. Bonnefoy, La Tempête, 1997 – from Frantext)
‘[. . .] and thus there is no need in order to return to the highest heaven for any kind of relationship with other beings from this world below’

Fourthly, Modern French point can constitute an utterance on its own, as in (29). Although the medieval corpus used here contains no examples like (29), it seems reasonable to assume that such autonomous, i.e. comparatively more nominal, uses must also have been possible in the Middle Ages, given that they should, if anything, tend to become lost with increasing grammaticalization.

(29) Comme il entendait quelque bruit vers le bas du bois, il s’attendait à être salué par d’autres coups de feu. Mais point. (H. Pourrat, Les vaillances, farces et aventures de Gaspard des montagnes, 1930 – from Frantext)
‘Hearing a sound from the lower part of the woods, he expected to be met by further shots. But no(t a bit).’

Finally, point’s quantitative source meaning appears to persist to a greater degree than in the case of either mie or pas, in as much as point
is used with predicates that admit of degrees in close to 4/5 of the instances analyzed, whereas mie and pas tend to occur (in almost 3/4 of cases) with predicates that do not (cf. (30)–(31)):⁸

(30) . . . si que je leur mousterrai tout cler que je n’emporte point d’argent. (Joinville §611)
‘[. . .] so that I’ll show them quite clearly that I’m not taking any money with me.’

(31) Sire, fet la damoisele, son non ne vos dirai je mie ; (Artu §26)
‘My Lord, says the damsel, his name I’ll not tell you;’

Pragmatically, point appears in medieval French to be more “emphatic” than either mie or pas, in a sense that further research will need to make more precise. Thus, in reported speech, negation with point frequently occurs at moments that are emotionally and/or spiritually charged in some way (cf. (32)–(33)), while in narrative, it tends to mark turning points or facts that are of particular significance, cf. (34):

(32) La vostre gent ne puet il point amer. (Looïs, v. 830)
‘Your people he cannot love.’

(33) « Par foi, fet il, onques puis que vos venistes devant moi ne senti mal ne dolor, ne plus que se je onques n’eusse plaie ; ne encore tant come vos parlez a moi n’en sent je point, (Graal, p. 115)
‘By my troth, says he, at no time since you’ve come before me have I felt suffering or pain, no more than if I’d never had a wound; nor do I feel any as long as you’re speaking to me,’

(34) . . . et meintenant qu’ele aproucha de l’eve, il vit une main qui issi del lac et aparoit jusqu’au coute, mes del cors dont la mein estoit ne vit il point; (Artu §192)
‘[. . .] and now that it was approaching the water, he saw a hand that came out of the lake and appeared up to the elbow, but (of) the body that the hand belonged to, he didn’t see (anything);’
8. Indeed, Renaissance commentators proposed that ne . . . point was more specialized for the expression of what they called “quantitative” as opposed to “qualitative” negation, the latter being preferentially expressed by ne . . . pas (Catalani 2001: 163).
3.2. Ne . . . mie vs. ne . . . pas

In the two 13th century texts analyzed, ne . . . mie appears to have mitigating properties in comparison with ne . . . pas.⁹ Thus, in reported speech, mie tends to be used as the postverbal negator when the hearer is superior to the speaker in the social hierarchy, as in (35) where Bagdemagus, vassal king and knight of the Round Table, is contradicting King Arthur, and/or other markers of mitigation are present, as in both (35) and (36):

(35) « Vos esmeustes premierement ceste Queste, venez avant et si faites premiers le serrement que cil doivent faire qui en ceste Queste se metent. » – « Sire, fet li roi Baudemagus, salve vostre grace, il nel fera mie premiers, mes cil le fera avant nos toz que nos devons tenir a seignor et a mestre de la Table Reonde. . . (Graal, p. 23)
‘You were the first to call for this quest, come forward and be the first to take the oath that those who join this quest must take.’ – ‘Sire, says King Bagdemagus, by your leave, he shall not do it first, but he whom we must hold as the lord and master of the Round Table shall do it before all of us [. . .]’

(36) – Ce fu folie, fet Lancelos, de baer a moi en tel manière, meïsmement puis que ge vos dis que mes cuers n’estoit mie a moi, et que, se g’en peüsse fere ma volenté, je m’en tenisse a beneüre, se tel damoisele com estes me daignast amer ; (Artu §57)
‘It was madness, says Lancelot, to want me in that way, the more so as I told you that my heart wasn’t mine to give away, and that, if I could do as I wish, I’d consider myself blessed if such a damsel as yourself would deign to love me;’
Pas, on the other hand, tends to be preferred when the hearer is socially inferior to the speaker, as in (37), when the negative speech act appears to the speaker to be in the interest of the hearer or of somebody socially superior to both speaker and hearer, as in (38), and/or when boosting expressions are present, as in (39):
9. Interestingly, a cognate of OF mie, the Northern Italian postverbal negator mica, appears to have developed a mitigating function when appearing on its own in interrogatives (J. Visconti, p.c.):
(i) Hai mica una sigaretta ?
‘Do you by any chance have a cigarette?’
(37) Aprés ce que mestres Gautiers Map ot mis en escrit des Aventures del Saint Graal assez soufisanment si com li sembloit, si fu avis au roi Henri son seigneur que ce qu’il avoit fet ne devoit pas soufire, (Artu §1)
‘After Master Gautier Map had put into writing the Adventures of the Holy Grail very thoroughly as it seemed to him, it was the opinion of King Henry, his lord, that what he’d done wasn’t enough,’

(38) « Damoisele, ge nel vos puis pas del tout descouvrir, car ge me parjurroie et en porroie mon seigneur corroucier ; (Artu §13)
‘Damsel, I can’t tell you everything, for I’d perjure myself and might anger my lord thereby;’

(39) « Dame, fet il, si feré, se Dieu plest ; je reviendrai assez plus prochienement que vos ne cuidiez. » – « Ha ! Diex, fet ele, mes cuers nel me dit pas, qui me met en totes les mesaises dou monde et en totes les poors ou onques gentilz fame fust por home. » (Graal, p. 24)
‘My Lady, he says, of course I will, if it please the Lord; I’ll return much sooner than you think’. – ‘Ah God, she says, my heart doesn’t tell me so, it inspires in me all the unease in the world and all the fear that ever a noblewoman felt for a man.’

Furthermore, mie and pas are occasionally used in the same context, in which case they appear to contrast along the lines just sketched: thus, in (40), pas is initially used with the imperfect indicative of the verb vouloir (‘to want’), followed by mie used with the imperfect subjunctive of the same verb, a form which mitigates the expression of desire; moreover, while the pas-marked clause describes something that the subject can control, the mie-marked clause describes a possible event over which he can exert no direct control:

(40) Mes messier Gauvains ne porta pas icelui jor armes ne Gaheriez ses frères, einz leur avoit li rois desfendu, por ce qu’il savoit bien que Lancelos i vendroit ; se ne vouloit[imp.ind.] pas qu’il s’entreblaçassent, se au jouster venist, car il ne volsist[imp.subj.] mie que mellee sorsist entr’eus ne mautalenz. (Artu §16)
‘But Sir Gawain didn’t bear arms that day, nor Gaheris his brother, in fact the King had forbidden it because he knew that Lancelot would come; so he didn’t want them to wound one another, for he wouldn’t want a quarrel or bad blood between them.’
In sum, there seems to be prima facie evidence of a functional differentiation of the three postverbal negators at this stage of the French language. In addition to marking negation, the three markers appear to have had speech-act modifying properties not unlike those of the modal particles found in Continental Germanic languages. An evolution of this type is not at all surprising given the general avoidance of complete synonymy in language. It instantiates, on the one hand, the tendency towards greater subjectification, indeed, to the extent that the social relationship between speaker and hearer is indexed by the choice between mie and pas, arguably intersubjectification; and on the other hand, the tendency for meanings to increasingly come to make reference to the speech event itself, as opposed to the described event. Moreover, on the assumption that the use of bipartite negation in medieval French was constrained to Discourse-Old contexts, the development just sketched supports the hypothesis put forward by Detges & Waltereit (2009) that the diachronic rise of modal particles is triggered by negotiations of common ground.

The quantitative + strongly boosting vs. qualitative distinction that we find between point and pas/mie, and the mitigating vs. slightly boosting distinction that we find between mie and pas, might be due to persistence of the meanings of the three markers in emphatic source constructions like (19) above: point denotes the smallest conceivable, purely quantitative measure, and exists independently of any human endeavour. Mie and pas, on the other hand, have qualitative properties, but while mie denotes a static object of inherently negligible value, pas denotes the dynamic result of an action.
However, the suggested division of labor clearly did not take permanent hold in the language; rather, the frequency of mie soon started to decline, and although point does still exist and is intuitively felt by native speakers to express a “stronger” form of negation than pas (e.g. Grevisse 1988: §976), it is very rare indeed in Modern and Contemporary French, where it has become confined to extremely formal registers.¹⁰ Although it
10. A quick Frantext search for combinations of ne with pas and with point in 82 texts published after 2000 yielded a total of 210 examples of point and 19,925 examples of pas. While these figures include some instances of the nouns point and pas, they nevertheless provide a clear indication of the sharp difference in frequency between the two negators. Commonly cited studies of late-20th c. spoken French (Sankoff & Vincent 1977, Ashby 1981, Coveney 1996) show no examples at all of point as a postverbal negator.
does possess a handful of items which in some of their uses resemble the Germanic modal particles (cf. Hansen 1998, 2008), French does in general seem resistant to developing fully-fledged items of this type, for reasons which may or may not be connected to word-order typology (cf. Abraham 1991: 336). The survival of pas as the canonical negator in Modern French at the expense of mie and point may perhaps itself be attributable to persistence, the noun pas being more abstract in meaning than the other two (in as much as its referent has no existence independently of the specific action by which it is realized), hence lending itself better to the expression of an abstract notion like negation.
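The size of the frequency gap reported in footnote 10 can be made explicit with a short calculation. The sketch below is only illustrative: it uses the two raw Frantext counts given in the footnote (which, as noted there, include some nominal uses of point and pas), and the variable and function names are hypothetical, introduced here for the example.

```python
# Raw Frantext hits reported in footnote 10 (82 texts published after 2000).
# Assumption: the counts are taken at face value, although the footnote notes
# that they include some instances of the nouns "point" and "pas".
POINT_HITS = 210
PAS_HITS = 19_925


def share(part: int, total: int) -> float:
    """Return part/total expressed as a percentage."""
    return 100.0 * part / total


total_hits = POINT_HITS + PAS_HITS
point_share = share(POINT_HITS, total_hits)   # percentage of hits with point
pas_per_point = PAS_HITS / POINT_HITS         # pas hits for every point hit

print(f"total hits: {total_hits}")
print(f"share of point: {point_share:.2f}%")
print(f"pas hits per point hit: {pas_per_point:.1f}")
```

On these figures, point accounts for roughly one in a hundred of the combined hits, which is consistent with its description above as very rare in Modern and Contemporary French.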
4. Conclusion

This chapter has discussed a number of concepts and issues that are currently at the forefront of research on the role of pragmatics in semantic change. In sections 1 and 2, I adduced one or more representative examples of well-known changes in major European languages to illustrate the significance of each concept evoked. In section 3, a more in-depth analysis of one specific process of change, namely the rise of bipartite clausal negators in Medieval French, was offered, showing how a cluster of the theoretical concepts discussed in section 2 can contribute, at various levels, to the explanation of that change. The overall goal of the exposition has been to argue for pragmatics as a significant driver of semantic change, and thereby to encourage further exploration of what is currently a lively and growing field.
References

Abraham, Werner 1991 The grammaticization of the German modal particles. In: Elizabeth Closs Traugott and Bernd Heine (eds.), Approaches to Grammaticalization, vol. II. Amsterdam: John Benjamins, 331–380.
Ashby, William J. 1981 The loss of the negative particle ne in French: a syntactic change in progress. Language 57 (3): 674–687.
Barcelona, Antonio (ed.) 2000 Metaphor and Metonymy at the Crossroads. Berlin: Mouton de Gruyter.
Beeching, Kate 2009 Procatalepsis and the etymology of hedging and boosting particles. In: Maj-Britt Mosegaard Hansen and Jacqueline Visconti (eds.), Current Trends in Diachronic Semantics and Pragmatics (Studies in Pragmatics 7), 81–108. Bingley: Emerald.
Birner, Betty J. 2006 Semantic and pragmatic contributions to information status. In: Maj-Britt Mosegaard Hansen and Ken Turner (eds.), Explorations in the Semantics/Pragmatics Interface. Acta Lingvistica Hafniensia (Special issue) 38: 14–32.
Blakemore, Diane 1987 Semantic Constraints on Relevance. Oxford: Blackwell.
Blank, Andreas 1997 Prinzipien des lexikalischen Bedeutungswandels am Beispiel der romanischen Sprachen. Tübingen: Max Niemeyer.
Brinton, Laurel J. and Elizabeth Closs Traugott 2005 Lexicalization and Language Change. (Research Surveys in Linguistics.) Cambridge: Cambridge University Press.
Buridant, Claude 2000 Grammaire nouvelle de l’ancien français. Paris: SEDES.
Catalani, Luigi 2001 Die Negation im Mittelfranzösischen. Frankfurt am Main: Peter Lang.
Company Company, Concepción 2006 Subjectification of verbs into discourse markers: semantic-pragmatic change only? Belgian Journal of Linguistics 20: 97–121.
Coveney, Aidan 1996 Variability in Spoken French. A Sociolinguistic Study of Negation and Interrogation. Exeter: Elm Bank Publications.
Detges, Ulrich and Richard Waltereit 2002 Grammaticalization vs. reanalysis. A semantic-pragmatic account of functional change in grammar. Zeitschrift für Sprachwissenschaft 21 (2): 151–195.
Detges, Ulrich and Richard Waltereit 2009 Discourse pathways and pragmatic strategies: different types of pragmatic particles from a diachronic point of view. In: Maj-Britt Mosegaard Hansen and Jacqueline Visconti (eds.), Current Trends in Diachronic Semantics and Pragmatics (Studies in Pragmatics 7), 43–61. Bingley: Emerald.
Diewald, Gabriele 2002 A model for relevant types of contexts in grammaticalization. In: Ilse Wischer and Gabriele Diewald (eds.), New Reflections on Grammaticalization. Amsterdam: John Benjamins, 103–120.
Ducrot, Oswald et al. 1980 Les mots du discours. Paris: Editions de Minuit.
Eckardt, Regine 2003 The Structure of Change. Meaning Change Under Reanalysis. Habilitation-thesis. Berlin: Humboldt Universität. (Revised version published by Oxford University Press in 2006.)
Erman, Britt and Ulla-Britt Kotsinas 1993 Pragmaticalization: the case of ba’ and you know. Studier i Modern Språkvetenskap 10: 76–93.
Evans, Nicholas and David Wilkins 2000 In the mind’s ear: the semantic extensions of perception verbs in Australian languages. Language 76: 546–592.
Grevisse, Maurice 1988 Le bon usage. Grammaire française. 12th ed. revised by André Goosse. Paris: Duculot.
Grice, H. Paul 1989[1975] Logic and Conversation. In: H. Paul Grice, Studies in the Way of Words, 22–40. Cambridge, MA: Harvard University Press.
Hansen, Maj-Britt Mosegaard 1998 La grammaticalisation de l’interaction, ou, Pour une approche polysémique de l’adverbe bien. Revue de sémantique et pragmatique 4: 111–138.
Hansen, Maj-Britt Mosegaard 2008 Particles at the Semantics/Pragmatics Interface: Synchronic and Diachronic Issues. A Study with Special Reference to the French Phasal Adverbs. (Current Research in the Semantics-Pragmatics Interface 19.) Oxford/Bingley: Elsevier/Emerald.
Hansen, Maj-Britt Mosegaard 2009a Forms of sentence negation in a 14th-century French text: a cognitive-functional analysis. Quaderns de Filología, Estudis linguístics 14: 153–168.
Hansen, Maj-Britt Mosegaard 2009b The grammaticalization of negative reinforcers in Old and Middle French: a discourse-functional approach. In: Maj-Britt Mosegaard Hansen and Jacqueline Visconti (eds.), Current Trends in Diachronic Semantics and Pragmatics (Studies in Pragmatics 7), 231–255. Bingley: Emerald.
Hansen, Maj-Britt Mosegaard Forthcoming The negative cycle in French. In: Anne Breitbarth, Christopher Lucas and David Willis (eds.), The History of Negation in the Languages of Europe and the Mediterranean, vol. II: Case Studies. Oxford: Oxford University Press.
Hansen, Maj-Britt Mosegaard and Richard Waltereit 2006 GCI theory and language change. In: Maj-Britt Mosegaard Hansen and Ken Turner (eds.), Explorations in the Semantics/Pragmatics Interface. Acta Lingvistica Hafniensia (Special issue) 38: 235–268.
Hansen, Maj-Britt Mosegaard and Jacqueline Visconti (eds.) 2009 Current Trends in Diachronic Semantics and Pragmatics (Studies in Pragmatics 7.) Bingley: Emerald.
Heine, Bernd 2002 On the role of context in grammaticalization. In: Ilse Wischer and Gabriele Diewald (eds.), New Reflections on Grammaticalization, 83–111. Amsterdam: John Benjamins.
Hopper, Paul J. 1991 On some principles of grammaticization. In: Elizabeth Closs Traugott and Bernd Heine (eds.), Approaches to Grammaticalization, vol. I, 17–35. Amsterdam: John Benjamins.
Hopper, Paul J. and Elizabeth Closs Traugott 1993 Grammaticalization. Cambridge: Cambridge University Press.
Israel, Michael 2001 Minimizers, maximizers and the rhetoric of scalar reasoning. Journal of Semantics 18: 297–331.
Jespersen, Otto 1917 Negation in English and Other Languages (Det Kgl. Danske Videnskabernes Selskab. Historisk-Filologiske Meddelelser I, 5.) Copenhagen: Høst & Søn.
Koch, Peter 2001 Metonymy: unity in diversity. Journal of Historical Pragmatics 2: 201–244.
Koch, Peter 2004 Metonymy between pragmatics, reference, and diachrony. Metaphorik.de 7: 6–54.
Lehmann, Christian 1985 Grammaticalization: synchronic variation and diachronic change. Lingua e stile XX (3): 303–318.
Levinson, Stephen C. 1995 Three levels of meaning. In: Frank R. Palmer (ed.), Grammar and Meaning, 90–115. Cambridge: Cambridge University Press.
Levinson, Stephen C. 2000 Presumptive Meanings. The Theory of Generalized Conversational Implicature. Cambridge, MA: MIT Press.
Marchello-Nizia, Christiane 1997 La langue française aux XIVe et XVe siècles. Paris: Nathan.
Norde, Muriel 2009 Degrammaticalization. Oxford: Oxford University Press.
Pons Bordería, Salvador 2006 From pragmatics to semantics: esto es in formulaic expressions. In: Maj-Britt Mosegaard Hansen and Ken Turner (eds.), Explorations in the Semantics/Pragmatics Interface. Acta Lingvistica Hafniensia (Special issue) 38: 180–206.
Sankoff, Gillian and Diane Vincent 1977 L’emploi productif de ne dans le français parlé à Montréal. Le français moderne 45: 243–256.
Togeby, Knud 1974 Précis historique de grammaire française. Copenhagen: Akademisk Forlag.
Torres Cacoullos, Rena and Scott A. Schwenter 2007 Towards an operational notion of subjectification. Proceedings of the Annual Meeting of the Berkeley Linguistics Society 31: 347–358.
Traugott, Elizabeth Closs 1982 From propositional to textual and expressive meanings: some semantic-pragmatic aspects of grammaticalization. In: Winifred P. Lehmann and Yakov Malkiel (eds.), Perspectives on Historical Linguistics, 245–271. Amsterdam: John Benjamins.
Traugott, Elizabeth Closs 1986 From polysemy to internal semantic reconstruction. Proceedings of the Annual Meeting of the Berkeley Linguistics Society 12: 539–550.
Traugott, Elizabeth Closs 1995 Subjectification in grammaticalization. In: Dieter Stein (ed.), Subjectivity and Subjectivisation, 31–54. Cambridge: Cambridge University Press.
Traugott, Elizabeth Closs 1999 From subjectification to intersubjectification. Paper presented at the Workshop on Historical Pragmatics, XIV International Conference on Historical Linguistics. Vancouver, BC, August.
Traugott, Elizabeth Closs 2010 Revisiting subjectification and intersubjectification. In: Kristin Davidse, Lieven Vandelanotte and Hubert Cuyckens (eds.), Subjectification, Intersubjectification and Grammaticalization (Topics in English Linguistics, vol. 66.), 29–71. Berlin: Mouton de Gruyter.
Traugott, Elizabeth Closs and Richard B. Dasher 2002 Regularity in Semantic Change (Cambridge Studies in Linguistics, vol. 97.) Cambridge: Cambridge University Press.
Visconti, Jacqueline 2005 On the origins of scalar particles in Italian. In: Maj-Britt Mosegaard Hansen and Corinne Rossari (eds.), The Evolution of Pragmatic Markers. Journal of Historical Pragmatics (Special issue) 6 (2): 237–261.
Waltereit, Richard 2002 Imperatives, interruption in conversation, and the rise of discourse markers: a study of Italian guarda. Linguistics 40 (5): 987–1010.
The pervasiveness of contiguity and metonymy in semantic change

Peter Koch

Abstract

This article sets out to demonstrate the importance and omnipresence of the cognitive relation of contiguity in semantic and lexical change. Metonymy is a central issue in this context, but the scope of contiguity goes beyond metonymy, and this is discussed in detail. Thanks to phenomenological philosophy and to modern frame semantics, it is possible to endow the notion of ‘contiguity’ with a definition that is comprehensive and yet sufficiently distinctive with respect to other cognitive relations. This constitutes a good basis for a definition of metonymy in terms of figure-ground effects (highlighting, perspectivization) within conceptual frames, an approach that enables us to discuss the internal conceptual and referential typology of metonymy, to reconstruct its unity, and to delimit its range, especially in contrast to other types of lexical semantic change. The processes of ‘subjectification’ and ‘delocutive’ change further illustrate the wide range of metonymy; the distinction between speaker- vs. hearer-induced metonymies enriches its pragmatic typology. A look at the metonymic bases of the emergence of discourse markers, and of grammatical reanalysis and grammaticalization, further completes the picture. A section of the article is dedicated to different aspects of word-formation, at the interface of the lexicon and morphology. In this domain metonymy is prominent in (folk-etymological) remotivation and in the semantic change of word-formation devices. Contiguity (not only metonymy) reveals itself to be pervasive across the whole variety of formal devices of lexical innovation (including semantic change, conversion, suffixation, prefixation, composition, etc.), in a quantitative as well as in a qualitative respect.
The omnipresence of contiguity is explained by its cognitively fundamental and extremely simple character, in comparison especially to the taxonomic and the metaphorical principle, and, hence, by its extraordinary semantic and pragmatic flexibility.
1. Contiguity

1.1. Contiguity and frame

In his book De memoria et reminiscentia (451b, 18–22), Aristotle introduced the three basic associative relations, “similarity”, “contrast” and
‘‘contiguity’’, to occidental thought. It is mainly the latter, contiguity, as relatedness with something close, which has shaped the history of philosophy and associationist psychology (cf. Amin 1973: 19–94). Rid of its mechanistic inheritance, it reappears in phenomenological philosophy (see below) and in the context of gestaltist laws (cf. Wertheimer 1922/23: esp. 4, 304–311; Holenstein 1972: 307; Amin 1973: 97–202; cf. also Raible 1981: 5f.). By intersecting the strictly linguistic theory of the two axes by de Saussure with the rhetorical tradition, Jakobson rediscovers the cognitive dimension of similarity, the basis of metaphor, and of contiguity, the basis of metonymy (see 2.1.)1. In the course of the reception history of associative relations, it is the relation of contiguity which imposes itself the most immediately (it even tends to be sometimes identified erroneously with ‘‘association’’ tout court). Therefore, the notion of contiguity is often considered to be imprecise: ‘‘la contiguita` e` concetto abbastanza sfumato’’ [‘contiguity’ is a wishy-washy notion] (Eco 1984: 147). It is probably widely admitted that the notion of contiguity goes beyond the etymological meaning of ‘‘spatial proximity’’,2 but once it also recovers temporal succession, the relation of cause and e¤ect, the relation of a part to a whole (and the other way round), the relation of container to content (and the other way round), etc., it is legitimate to ask the question of where to stop. The present article addresses questions of this kind and considers the central importance of contiguity in semantic change. In the framework of cognitive linguistics the notion of contiguity appears only marginally – always in the context of metonymy3 – in competition with other notions which are a lot more central to this approach, such as: ‘‘domain’’, ICM ¼ (idealized) cognitive model, ‘‘scene’’, ‘‘scenario’’, ‘‘script’’ and ‘‘frame’’ (cf. e.g. 
Taylor 1995: 90, 125f.; Croft 1993: 348; Ungerer and Schmid 1996: 128; Radden and Ko¨vecses 1999: 21). This is certainly the right track, but the terminology must be clarified, since all these notions have been subjected, in the course of time, to considerable terminological inflation. Since the terms
1. For a more nuanced interpretation of this problem, see Happ 1985; Koch 1999a: 143; Koch 2005a: 162f.
2. Lat. contingere ‘to touch’ (which implies spatial proximity) → Lat. contiguus ‘touching together’ > Fr. contigu ‘adjoining, adjacent’ → E. (now obsolete) contigue ‘adjoining, adjacent’ → E. contiguous.
3. Cf. e.g. Taylor 1995: 122; Croft 1993: 347; Ungerer and Schmid 1996: 115f.; Radden and Kövecses 1999: 19. Slightly more explicitly stated observations may be found e.g. in Feyaerts 2000: 63–65; cf. also Dirven 1993: 14.
The pervasiveness of contiguity and metonymy in semantic change
“domain” and ICM prove to be too seriously ambiguous,⁴ we will stick to a natural, but clearly restrictive interpretation of the term frame:

A semantic frame [. . .] is a coherent structure of related concepts where the relations have to do with the way the concepts co-occur in real world situations (Geeraerts 2006: 16).

“Real world situations” is to be understood here in the sense of “real world situations as human beings perceive them”. In this sense the term “frame”⁵ has the advantage of expressing a notion perfectly compatible with the notion of contiguity. As represented in Figure 1, it can be said that there is contiguity between the elements of a frame, but also between the frame as a whole and each one of its elements.

In order to render this conception more precise, it is useful to go back to phenomenological philosophy. According to Husserl, who discovered the transcendental role of so-called associative relations for the constitution of any object (Gegenständlichkeit) in passive genesis (cf. Husserl 1973: 111–114; Holenstein 1972: 19–22; see also Koch 2007: 12–15), it is necessary to distinguish between what is perceived in the strict sense and a surplus which is not perceived, but is nevertheless accessible in some sense. So, every perception unites “presented” components and “appresented” components, not really perceived, but integrated in perception. And it is the latter components that open a “horizon” of contiguities (cf. Husserl
Figure 1. Frame and elements
4. For ‘domain’ cf. Koch 1999a: 152f.; Feyaerts 2000: 62f.; see also Croft and Cruse 2004: 261 n. 1. For ICM cf. Koch 2005: 167 n. 13 (see also below n. 11).
5. Cf. e.g. Fillmore 1977; 1985; Barsalou 1992; Taylor 1995: 87–92; Ungerer and Schmid 1996: 205–217; Croft and Cruse 2004: 7–14.
1950/52, I: 58–60; 1973: 150f.; Holenstein 1972: 41–43, 317f.). From this perspective, a frame constitutes a horizon of contiguities, i.e. our “encyclopedic” expectations which are grounded on the contiguities that connect concepts or constituents of more complex concepts, especially types of situations (cf. Koch 2007: 15f.; 18–20).

1.2. Contiguity and other cognitive relations

For the sake of terminological symmetry, let us call the principle of frames engynomy (cf. Koch 2001a: 216f.).⁶ Engynomy differs substantially from the other great existing principle of conceptual organization, i.e. taxonomy (cf. also the distinction between E-relations and C-relations made by Seto 1999). In conceptual taxonomies we have to distinguish two types of relations, namely taxonomic sub-/superordination (inclusion) and taxonomic similarity. If (1)(a) holds between two concepts P and Q and if, consequently, (1)(b) holds between the respective categories (C) defined by P and Q, then C_Q is taxonomically subordinated to C_P, or, vice versa, C_P is taxonomically superordinated to C_Q (cf. e.g. Lyons 1977, I: 291; Cruse 1986: 136–156). This can be exemplified by (2)(a) and (b).

(1) (a) Qs are Ps, but Ps are not necessarily Qs.
    (b) C_Q ⊂ C_P: i.e. C_P includes C_Q / C_Q is taxonomically subordinated to C_P / C_P is taxonomically superordinated to C_Q

(2) (a) dogs are animals, but animals are not necessarily dogs.
    (b) C_dog ⊂ C_animal: i.e. the category of animals includes the category of dogs / the category of dogs is taxonomically subordinated to the category of animals / the category of animals is taxonomically superordinated to the category of dogs.

According to the insights of prototype theory (cf. e.g. Rosch 1973; Taylor 1995: 38–46; Evans and Green 2006: 248–279), the rules in (1) are to be understood in a rather flexible way, in order to cope with prototypicality effects, as e.g. with P = bird and Q = penguin: Qs are more or less Ps, but Ps are not necessarily Qs; and: C_Q is more or less a subcategory of

6. The term ‘engynomy’ is derived from Ancient Greek engýs ‘near, close’ and evokes Aristotle’s term tò sýnengys ‘the contiguous (thing)’ (Aristotle, De memoria et reminiscentia, 451b: 18–22).
C_P. Furthermore, individuals and even categories cannot be assigned to objectively or logically preexisting taxonomic (superordinated) categories. Category assignment, though not totally arbitrary, is rather a matter of relevance. For instance the category flower can be taxonomically subordinated to different superordinated categories: plant, gift, decoration, etc.

The last point also applies to the other cognitive relation present in taxonomies, namely similarity, which always means “relevant similarities”. If C_Q and C_R are subcategories of C_P, then a relation of taxonomic similarity holds between instances of C_Q and C_R (3). This can be illustrated by (4).

(3) If C_Q and C_R are taxonomically subordinated to C_P (cf. (1)(b)), then Qs and Rs are taxonomically similar to each other.

(4) dogs and horses are animals, but animals are not necessarily dogs nor necessarily horses. Consequently dogs and horses are taxonomically similar to each other.

In contrast to this, the engynomic principle – to take up Husserl’s felicitous terminology – is based on the relation between “presented” and “appresented” conceptual/perceptual knowledge, i.e. on contiguity within frames (cf. 1.1). This rules out from the outset any relation of taxonomic sub- or superordination between two contiguous concepts P and Q (5)(a). The statements (5)(b) on the one hand and the positive counterpart of (5)(c) or (d) on the other are conceptually incompatible, which is exemplified in (6). This incompatibility is due to the fact that contiguity is an interconceptual relation, while taxonomic sub-/superordination is an intraconceptual one (cf. Klix 1984: 18–23). Engynomy is not a problem of categorization, like taxonomy, but a problem of the co-presence of concepts. This does not prevent us, however, from taking into consideration the classes of referents of contiguous concepts (see 2.2.3).

(5) (a) If P and Q are contiguous concepts, then neither a relation of taxonomic subordination nor a relation of taxonomic superordination holds between P and Q. (cf. (1)(a))
    (b) P and Q are contiguous.
    (c) C_P is not included in/not taxonomically subordinated to C_Q. (cf. (1)(b))
    (d) C_Q is not included in/not taxonomically subordinated to C_P. (cf. (1)(b))
(6) (a) If winter and snowflake are contiguous concepts, then neither a relation of taxonomic superordination nor a relation of taxonomic subordination holds between them.
    (b) winter and snowflake are contiguous.
    (c) C_winter is not included in/not taxonomically subordinated to C_snowflake.
    (d) C_snowflake is not included in/not taxonomically subordinated to C_winter.

Note that, just like the above taxonomic relations, contiguities are not objectively or logically given relations, but a matter of relevance (cf. Sperber and Wilson 1995). In fact, the “existence” (or not) of a contiguity relation depends on anthropological, social, cultural, and other parameters of relevance. Thus, the contiguity between diaphragm and mind is typical of ancient Greek thinking; even the contiguity relation used in example (6) is not necessarily universal. Contiguity depends on what is called “construal” in Cognitive Semantics (cf. Dirven 1993).

Since the second taxonomic relation, i.e. similarity, exemplified in (3) and (4), is an extraconceptual relation, it is not always conceptually incompatible with contiguity, even though in certain cases it is extremely difficult to imagine both relations for a given couple of concepts (7). In other cases we may hesitate, and in some rare cases contiguity and similarity are equally well applicable to a given couple of concepts. Thus, we can see chair and table as contiguous elements within a frame dining room (8)(a), but we can also see them as similar within the taxonomy piece of furniture (8)(b). Nevertheless, even if both relations are conceptually compatible in this case, they are incompatible in two other respects. First, they exclude each other logically, because they simply do not constitute the same way of linking the two concepts. Second, they exclude each other from the point of view of relevance, which compels us to choose either a construal related to the frame dining room (8)(a) or a construal related to the taxonomy piece of furniture (8)(b).

(7) (a) kitchen and (to) cook are (seen as) contiguous. (cf. (5))
    (b) *kitchen and (to) cook are (seen as) taxonomically similar. (cf. (3))

(8) (a) chair and table are (seen as) contiguous. (cf. (5))
    (b) chair and table are (seen as) taxonomically similar. (cf. (3))
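The formal core of (1), (3) and (5) can be made concrete in a small sketch. What follows is only an illustration under simplifying assumptions: categories are modelled as plain sets of referents, frames as sets of concept labels, and all data and function names (CATEGORIES, FRAMES, subordinated, taxonomically_similar, contiguous) are hypothetical and not part of the original argument. Prototypicality effects and construal are deliberately left out.

```python
# Toy data: extensions and frames are assumptions made for this sketch only.
CATEGORIES = {
    "animal": {"rex", "fido", "black_beauty", "tweety"},
    "dog": {"rex", "fido"},
    "horse": {"black_beauty"},
    "piece of furniture": {"chair_1", "table_1"},
    "chair": {"chair_1"},
    "table": {"table_1"},
}

# Each frame is named by a concept and lists (some of) its elements.
FRAMES = {
    "dining room": {"chair", "table"},
    "kitchen": {"to cook"},
    "winter": {"snowflake"},
}


def subordinated(q, p):
    """(1)(b): the category of q is properly included in the category of p."""
    return q in CATEGORIES and p in CATEGORIES and CATEGORIES[q] < CATEGORIES[p]


def taxonomically_similar(q, r):
    """(3): q and r are distinct and share a superordinate category."""
    return q != r and any(
        subordinated(q, p) and subordinated(r, p) for p in CATEGORIES
    )


def contiguous(p, q):
    """Contiguity holds between two elements of a frame, or between
    the frame as a whole and one of its elements."""
    if p == q:
        return False
    return any(
        p in elements | {frame} and q in elements | {frame}
        for frame, elements in FRAMES.items()
    )


for pair in [("dog", "animal"), ("winter", "snowflake"),
             ("kitchen", "to cook"), ("chair", "table")]:
    print(pair,
          "subordinated:", subordinated(*pair),
          "similar:", taxonomically_similar(*pair),
          "contiguous:", contiguous(*pair))
```

In this toy model the pair chair/table comes out as both contiguous and taxonomically similar, which corresponds to (8); the sketch does not decide between the two construals, since that choice is a matter of relevance.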
All in all, the two major principles of conceptual organization, engynomy and taxonomy, are clearly incompatible with each other. Taxonomic sub-/superordination is conceptually incompatible with contiguity ((5), (6)). (Taxonomic) similarity and contiguity, whether conceptually incompatible (7) or not, are always incompatible from the point of view of logic and relevance (8).

It may be useful to recall that neither the taxonomic nor the engynomic principle as such has a strictly prognostic value, because they are a matter of construal. When starting from a given concept P, we cannot necessarily predict to which concept Q taxonomic sub-/superordination or similarity will lead us, and even less so for contiguity. As Taylor puts it (using the term “domain”: cf. below 2.2.2):

[. . .] it would be an error to suppose that domains constitute strictly separated configurations of knowledge; typically domains overlap and interact in numerous and complex ways (Taylor 2002: 196f.).

Nevertheless, once we know that people establish a relation between P and Q, we can perfectly specify the relevant relation, because, as shown in this section, taxonomic and engynomic relations are incompatible, and hence, clearly distinguishable. (The diagnostic value of metonymies is discussed in 2.2.2; for metaphor see 2.2.4).

2. Metonymic lexical change

The engynomic principle proves to be particularly simple from a cognitive point of view. It allows us to produce efficient conceptual effects, but at a low cost. A central question in this article is whether such a simple cognitive principle as contiguity within a frame should not have a considerable impact on language change. Let us start with the most common case: metonymic lexical change.

2.1. The standard case: rhetoric and historical semantics

It is well known that the classical rhetorical doctrine defines the trope of metonymy by proximity or contiguity.

(9) Denominatio est, quae ab rebus propinquis et finitimis trahit orationem, qua possit intellegi res, quae non suo vocabulo sit appellata (Rhetorica ad Herennium 4, 32, 43; my italics).
    ‘Metonymy is a trope that takes its expression from near and close things and by which we can comprehend a thing that is not denominated by its proper word.’
The historical semantics of the 19th century (Reisig, Bréal, Paul, Darmesteter, Nyrop, etc.) essentially uses traditional rhetorical bases to construct, among others, a category of metonymic change (cf. Nerlich 1992; Blank 1997: 7–18; Geeraerts 2010: 27f., 31–33). It was Roudet (1921), Ullmann (1957: 231–234) and Jakobson (1971; see section 1) who definitively renewed the associationist tradition in order to bring back metonymy to the cognitive relation of contiguity. While Jakobson’s approach is rather focused on the rhetorical and literary trope of metonymy, Roudet and Ullmann target lexical metonymic change.

The traditional examples already show that the interpretation of the notion of contiguity goes beyond purely spatial or even temporal relations right from the beginning, including for instance cause–effect relations:

(10) [. . .] frigus ‘pigrum’, quia pigros efficit (Rhetorica ad Herennium 4, 32, 43).
     ‘a ⟨lazy⟩ cold, since it makes (people) lazy’

Even though the doctrine of traditional rhetoric is mainly concerned with metonymy as a trope in discourse, it also introduces the notion of “catachresis” based on metonymy (or other tropes), which prefigures in a certain way the modern notion of a completed metonymic lexical change:⁷

(11) I. CATACHRÈSE DE MÉTONYMIE [. . .] 3 Ces metonymies du contenant: La Cour, pour Les courtisans; [. . .] (Fontanier 1977: 214)
     ‘I. METONYMIC CATACHRESIS [. . .] 3 Those metonymies of the container: The Court, for The courtiers; [. . .]’

In the last fifteen years Cognitive Semantics has been discovering the fundamental importance and the impressive range of metonymy (cf. e.g. Nunberg 1995; Panther and Radden 1999; Panther and Thornburg 2003; 2007). A very influential definition of metonymy that has been proposed in this context is the following:

Metonymy is a cognitive process in which one conceptual entity [. . .] provides mental access to another conceptual entity [. . .] within the same cognitive model (Radden and Kövecses 1999: 21).
7. For the successive steps of lexicalization of ad hoc metonymies (as well as other ad hoc tropes in discourse) cf. Koch 2004: 15–19.
If we understand “cognitive model” in the sense of “frame” defined in section 1, we get close to the following definition that brings contiguity into play:

Metonymy is a semantic link between two readings of a lexical item that is based on a relationship of contiguity between the referents of the expression in each of those readings (Geeraerts 1997: 96).

The fundamental effect, which turns out to be indispensable for a dynamic approach to metonymy, is what Cognitive Semantics calls highlighting or perspectivization with relation to a conceptual domain (matrix) or, as we prefer to say in the following, to a frame (cf. Taylor 1995: 90, 107f.; 125f.; Croft 1993: 348; Panther and Thornburg 2007: 242). Metonymy can be regarded as a figure-ground effect between two elements E1 and E2 of a given frame (Figures 2a and 2b) or between the frame as a whole and one of its elements E1 (Figures 3a and 3b), and vice versa.

The above traditional examples illustrate this very clearly. To call the cold ((10): Lat. frigus) ‘lazy’ (pigrum), because it makes people lazy, is a metonymy that presupposes a frame which involves a natural force, the cold in that case, and a human patient who is subject to this force: thus,
Figure 2. Figure-ground effect between two elements of a frame
Figure 3. Figure-ground effect between a frame (a) and one of its elements (b)
from lazy, as the state of the patient (= E1), we shift – by a figure-ground effect, according to the schemas of Figure 2 – towards inactivating, as a quality of the natural force (= E2).

As to Fontanier’s example of a metonymic “catachresis”, i.e. of a completed metonymic lexical change, we can say that in the case of the French word cour, we can identify a metonymic change based on the fact that the courtiers (= E1) constitute the essential element of the frame court (= FRAME), as in the schemas in Figure 3. The OED entry for English court shows the same development (senses 5. and 7a.). A shift has taken place, through a figure-ground effect, from the entire FRAME of the court – from the container, if you will – to an element of the frame (or to the content), indeed the courtiers (= E1). This metonymy has been lexicalized and fixed in the polysemy of the French word cour (and similarly of E. court).

2.2. Metonymy: internal typology – unity – range

Up to now several already traditional problems have been discussed with respect to metonymy:

⟨1⟩ There are different subtypes of metonymy. What is its internal variety and typology? (2.2.1, 2.2.3, 2.5)
⟨2⟩ Is metonymy a unitary phenomenon despite its internal variety? How can its internal unity be accounted for? (2.2.2)
⟨3⟩ What are the external limits of the phenomenon of metonymy? What is the range of this notion? (2.2.4)

As we will see, these three problems are closely interrelated. The internal variety of metonymy ⟨1⟩ comprises different dimensions, which are treated separately here: a “conceptual” (2.2.1), a “referential” (2.2.3) and a “pragmatic” (2.5) dimension of the typology of metonymies.

2.2.1. The internal conceptual typology of metonymy

As for the internal “conceptual” variety of metonymies ⟨1⟩, inventories of different subtypes can be found already in traditional rhetoric (cf. Lausberg 1973: §§ 565–568; e.g. Fontanier 1977: 79–86) and, further on, from traditional historical semantics of the 19th century up until Cognitive Semantics (cf. Geeraerts 2010: 31–33; Peirsman and Geeraerts 2006a: 275–277; Radden and Kövecses 1999: 29–44). Well-known subtypes are (frequently in both directions):
– location–located, e.g., possibly, (11)
– container–contained, e.g., again, possibly, (11)
– cause–effect, e.g. (10)
– action–actor, e.g. (27)
– property–person, e.g. (24)
– part–whole (for the problems raised by this “synecdochic” type see 2.2.4).

In order to go beyond a mere listing of subtypes, efforts have been made to group together the different subtypes according to particular criteria. A binary, very general grouping of metonymies is reflected in our Figures 2 and 3: element–element relations within a frame vs. FRAME–element relations (in a somewhat different terminology: Radden and Kövecses 1999: 30–43). The part–whole type, if it is incorporated into metonymy (see below 2.2.4), is only one particular subtype of the FRAME–element schema. container–contained and possibly also location–located are other subtypes; cf. (11).

Another binary, very general grouping in time-related terms has been proposed by Blank (1999): co-presence (e.g. (11), (23), (24), (27)) vs. succession (10). A typology in terms of spatial vs. temporal vs. abstract domains is suggested by Seto (1999: 98).

Keen to go beyond unidimensional typologies, Peirsman and Geeraerts (2006a) put forward a prototypical definition of contiguity along three dimensions:

– strength of contact: part-whole containment – direct contact – mere adjacency
– boundedness – unboundedness
– domain: space vs. time vs. action/event/process/state vs. assemblies/collections

For example, (10) Lat. pigrum lazy → inactivating would be a case of direct contact, unbounded elements, and state (quality); (11) Fr. cour/E. court court → courtiers would be a case of part-whole containment, bounded whole/bounded part, and space; and so forth. In this way, we get a whole universe of types of contiguities and, hence, of conceptual types of metonymy, with part-whole containment, boundedness, and space representing the prototypical instance of contiguity and metonymy.
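The three-dimensional classification just described can be sketched as a small data structure. This is a minimal illustration, not Peirsman and Geeraerts’ own formalization: the class Contiguity, the constant PROTOTYPE and the function deviations are hypothetical names, and counting the dimensions on which a case departs from the prototypical core is an assumption introduced here purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contiguity:
    """One type of contiguity, described along the three dimensions."""
    strength: str   # "part-whole containment", "direct contact" or "mere adjacency"
    bounded: bool   # True = bounded, False = unbounded
    domain: str     # "space", "time", "action/event/process/state"
                    # or "assemblies/collections"


# Prototypical core as stated in the text:
# part-whole containment, boundedness, space.
PROTOTYPE = Contiguity("part-whole containment", True, "space")


def deviations(c):
    """Count the dimensions on which c differs from the prototype.
    (A hypothetical, deliberately crude measure.)"""
    return sum([
        c.strength != PROTOTYPE.strength,
        c.bounded != PROTOTYPE.bounded,
        c.domain != PROTOTYPE.domain,
    ])


# Example (10): Lat. pigrum 'lazy' -> 'inactivating'
pigrum = Contiguity("direct contact", False, "action/event/process/state")
# Example (11): Fr. cour / E. court 'court' -> 'courtiers'
cour = Contiguity("part-whole containment", True, "space")

for label, case in [("(10) pigrum", pigrum), ("(11) cour/court", cour)]:
    print(label, "deviates from the prototype on", deviations(case), "dimension(s)")
```

On this encoding, (11) coincides with the prototypical core, while (10) departs from it on all three dimensions; a more faithful model would treat each dimension as graded rather than as a simple match or mismatch.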
What is at stake here is not prototypicality as a psychological principle acting in the object language, but the – often undeniable – prototypicality of
metalinguistic notions.⁸ So we will have to examine how far the range of the prototypical category of contiguity goes in this case (2.2.4).

2.2.2. The unity of metonymy

Peirsman and Geeraerts’ prototype-based approach to metonymy (2006a) is an interesting attempt to reconcile at the same time internal diversity and unity of metonymy (cf. beginning of section 2.2, ⟨2⟩). In his criticism, Croft (2006: 320) underlines that the notions of “association”, “domain highlighting”, and “shift of reference” are more powerful (the last point will be taken up in section 2.2.3). “Association”, as we have seen in section 1.1, is certainly too broad a term, because it traditionally comprises “similarity”, “contrast”, and “contiguity”. So “contiguity”, if anything, would be much more precise. As for “domain”, this term is not free from imprecisions either, as underlined by Croft and Cruse themselves (2004: 216 n. 1). In practice it often assumes the character of a rag-bag. If it were really identified with “frame” (cf. Croft and Cruse 2004: 15; Croft 2006: 320), this would amount in the last resort to the frame approach sketched in section 1.

The notion of “highlighting” (or of perspectivization) is valuable and central to a unitary definition of metonymy, as already demonstrated in section 2.1. As shown there, it would be possible to bring together the contiguity approach and the highlighting approach, because in fact they “are not necessarily incompatible” (Peirsman and Geeraerts 2006b: 334). Note that “frame” and “highlighting” are not coextensive notions. First of all, “frame” (or “domain” = “frame”) characterizes the type of structure and “contiguity” the type of relation underlying metonymy, whereas “highlighting” (i.e. the figure-ground effect) insists on the type of process involved in metonymy. Furthermore, metonymic highlighting presupposes contiguity, but contiguity does not necessarily trigger highlighting (see 2.2.3 and 5.3).
Another problem is the explanatory power of contiguity for specific metonymies (and for the impossibility of others; cf. Croft 2006: 319). In 1.2 we had to acknowledge that the engynomic principle of frames and contiguities does not have a strictly prognostic value, because it is a matter
8. Cf. Peirsman and Geeraerts’ reply (2006b: 329f.) to Croft’s (2006: 319) criticism, which appeals for psychological evidence underpinning the prototypicality of the phenomenon of contiguity. As to the prototypicality and non-prototypicality of metalinguistic notions, cf. Koch 1998: 291–303.
of construal. This holds not only in general, but also for contiguities as a basis of metonymies. In fact, change of meaning with respect to a given signifier is mainly unpredictable. Peirsman and Geeraerts (2006a: 271) cite the striking example of Latin Monēta ‘Iuno’ (12)(a), where a whole chain of contiguities (activated in part through metonymies) is involved on its way to (12)(b): Iuno → Iuno’s temple → mint (located in the temple) → coin.

(12) (a) Lat. (Iuno) Monēta ‘goddess Iuno’; attested since the 1st century BC (surname derived from monēre ‘admonish’, because Iuno is said to have given several good admonitions to the Romans – another, previous contiguity effect; cf. ALDH, Monēta, I, B).
     (b) Lat. monēta ‘coin’; attested since the end of the 1st century BC (ALDH, Monēta, II, B, 1) [= etymon of It. moneta ‘coin’, of Fr. monnaie ‘small change’, and via French also of E. money]

As a rule, we identify contiguities “after the fact only” (Peirsman and Geeraerts 2006b: 333), i.e. after metonymic change or another type of contiguity-based change has taken place. In fact, the same holds for other types of semantic and/or lexical change and their relevant cognitive relations (2.2.4). But once the change has taken place, a frame-and-contiguity model like the one sketched in 1.2 is a powerful means to unify different subtypes of metonymy and to sharply distinguish them from other types of semantic change, such as specialization, generalization, metaphor, etc., which are based on other cognitive relations. This way, we can identify and understand different cognitive paths of change – an important goal in diachronic semantics. If contiguities and frames do not have a strictly prognostic value, metonymies (and other contiguity-based changes) have nevertheless a diagnostic value, because they tell us which contiguities were relevant for the people who “invented” the innovation (and which were not).

2.2.3. The internal referential typology of metonymy

In section 1.2 we noted that a contiguity relation between two concepts P and Q is conceptually incompatible with plain taxonomic subordination of P to Q or vice versa, i.e. in the case of contiguity the referent class of P is neither a subset of the referent class of Q nor vice versa ((5), (6)). This
does not mean however that referential considerations are completely out of place in the context of metonymy. In particular, metonymy has been straightforwardly characterized, in addition to domain highlighting, by shift of reference (cf. Nunberg 1995; Croft 1993; 2006: 320). In fact, in many metonymies the referent classes of the two concepts are completely disjunct (13), as for instance, with respect to (12): Iuno → Iuno’s temple or mint → coin.

(13) (a) P and Q are contiguous.
     (b) C_P ∩ C_Q = ∅ (referent-sensitive metonymy)

We can denominate the metonymies for which (13) holds as “referent-sensitive”, because they involve a complete shift of reference (cf. Koch 2001a: 219–224; 2004: 21–23). On closer inspection we even notice that in many cases the contiguity relation between P and Q only holds for a prototypical subset of C_P and/or for a prototypical subset of C_Q (cf. Geeraerts 1997: 68–75; Koch 1999a: 149–151; 2004: 23). During the lexicalisation process these restrictions to prototypical subsets often get lost so that, in the end, the metonymy holds for the total classes of referents of C_P and C_Q (“inductive generalization” according to Dik 1977).

The possible dependence of metonymy on prototypicality effects goes far beyond these cases and may even have repercussions on referent-sensitivity. The referent classes of the two readings of English child (14)(a)/(b) are not identical, because not every offspring (Q) of another person is a young person (P) and not every young person (P) is considered in its quality of offspring (Q).

(14) (a) child ‘young person below the age of puberty’; first attestation: Cunae, cild claðas (Corpus glossary, 623; cit. OED, clothes, n.pl., sense 1.b., a800).
     (b) child ‘offspring’; first attestation: Riche men . . þe habbeð . . feire wifes . and feire children (Lambeth homilies, 49; cit. OED, child, n., B., sense II.8.a., c1175)

But the referent classes of P and Q are not disjunct either, since there is a substantial subset of C_P as well as of C_Q for which P as well as Q applies (C_P ∩ C_Q). This is exactly the prototypical subset for which the contiguity of P and Q is relevant and which underlies the metonymy of child (cf. Koch 2001a: 223f.; 2004: 23–25; an example of a possible bridging
context can be found in OED, flint, n., sense III.8.a., c1175). For such “non-referent-sensitive” metonymies we must state, differently from (13):

(15) (a) P and Q are contiguous.
     (b) C_P ∩ C_Q ≠ ∅ (non-referent-sensitive metonymy)

Note that in these cases non-referent-sensitivity holds only for the innovative phase based on a bridging context. During lexicalization the new reading C_Q is extended, by inductive generalization, to referents to which the old reading C_P clearly does not apply, as in (14)(b). Since it would be absurd to disregard the metonymic figure-ground effect that produced cases like this one, complete reference shift can no longer be considered a necessary condition of metonymy. Consequently the referential typology of metonymy comprises two fundamentally different subtypes: referent-sensitive (e.g. (11)) and non-referent-sensitive ((14); another example will be (24)).

In the discussions of lexical semantics there is another referential problem that seems to be related to contiguity. The examples in (16) illustrate the phenomenon of “facets” (cf. Cruse 2000: 114–117; Croft and Cruse 2004: 116–125; Kleiber 1999: 87–101). The English word book can be understood either in the sense of tome (16)(a) or in the sense of text (16)(b). In (16)(c), however, both facets are present.

(16) (a) This book weighs ten pounds. [tome]
     (b) This book is a history of Great Britain. [text]
     (c) This book, which is a history of Great Britain, weighs ten pounds. [tome + text]
     (d) I like books. [?]

It is interesting to note that the facets tome and text are contiguous elements of the frame book. However, the relation between the two sense effects (a) and (b) does not correspond to the non-referent-sensitive type of metonymy (15) and even less to the referent-sensitive type (13). The referential problem does not even arise, because the referent classes of book as a tome and book as a text are identical (17).

(17) (a) P and Q are contiguous.
     (b) C_P = C_Q (facets)

This explains why co-presence of the two facets (16)(c) as well as indeterminacy (16)(d), i.e. a lack of any figure-ground profile, is possible in
completely ordinary contexts. In contrast to this, truly metonymically related readings of a given lexical item involve, as we have seen above, totally or partially disjunct referent classes and, hence, a clear figure-ground profile in most contexts. Moreover, facets do not seem to be very sensitive to lexical change.⁹ All this indicates that the phenomenon of facets has to be excluded from the category of metonymy (cf. Croft 1993: 349f.), although it is based on contiguity.

2.2.4. The range of metonymy

The conclusions above lead us directly to the question of the range of metonymy and of its external limits (cf. beginning of section 2.2, ⟨3⟩). Even in traditional rhetoric a controversial point is the relation between metonymy and synecdoche, which is often considered a trope of its own (cf. Lausberg 1973: §§ 572–577). On closer inspection synecdoche turns out to be a rather heterogeneous category (cf. the comprehensive survey in Koch and Winter-Froemel 2009). Among other things there are part-whole synecdoches on the one hand and species-genus synecdoches on the other. To begin with the first of these patterns, English bar displays a shift from part (a) to whole (b). (The well attested opposite process is whole to part, which we do not exemplify here in order to save space.)

(18) (a) bar ‘counter in a public house’; first attestation: He was acquainted with one of the seruants . . . of whom he could haue two pennyworth of Rose-water for a peny . . . wherefore he would step to the barre vnto him (Robert Greene, Art of conny catching = Notable discovery, iii. 20; cit. OED, bar, n.1, sense III.28.a., 1592).
     (b) bar ‘public house’; first attestation: He sees the girl in the bar (Frederick Marryat, Jacob Faithful, xii; cit. OED, bar, n.1, sense III.28.a., 1835).

In diachronic studies metonymy and synecdoche sometimes continue to be kept apart (cf. Hock 1991: 285f.; Campbell 2006: 257–260), but many

9. Note however that they may be subject to profound cultural change. Thus, the facet cluster book = tome + text did not exist at the epoch of scrolls and may disappear with the diffusion of electronic books.
theorists include the part-whole pattern in metonymy.10 In a very radical way Ruiz de Mendoza Iba´n˜ez (2000: 115f.) even chooses the solution of reducing all metonymies to part-whole processes. A little less radically, Peirsman and Geeraerts (2006a: 278f.) conceive (spatial bounded) partwhole relations as the prototypical core of contiguities underlying metonymies (cf. 2.2.1). Still more cautiously, we may confine ourselves to observing that the part-whole pattern is only a special case of the FRAME–element schema (cf. Figures 2 and 3 in 2.1 and once more 2.2.1). This is a strong argument for not separating the part-whole subtype from the other instances of FRAME–element (such as containercontent, person–property, etc.). An additional argument relies upon the fact that the border-line between part-whole and other contiguity patterns is sometimes blurred. Thus, in our example (18) the counter could also be conceived as the located and the public house as the location. Once we include the part-whole pattern in metonymy, the question arises where we want to place the most important remaining type of synecdoche, the species-genus (i.e. member-category) pattern. Let us underline at once that the cognitive relation underlying this pattern corresponds exactly to what we treated in 1.2 as taxonomic sub-/superordination. Thus, the new reading (20)(b) of English hound, canis used for hunting, stands in a relation of taxonomic subordination to the original reading (20)(a), canis. This type of semantic change is called semantic specialization or ‘narrowing’. (The well attested opposite process is generalization, or ‘broadening’, which we do not exemplify here in order to save space.) (20) (a) hound ‘quadruped of the genus Canis’; first attestation: Dumbe hundas ne magon beorcan (K. Ælfred, Gregory’s pastoral care tr., xv. 89; cit. OED, hound, n.1, sense 1., c897) [last authentic attestation of this original reading according to OED: 1508].
10. Cf. e.g. Ullmann 1957: 89, 203, 204, 222, 232, 234; 1964: 212; Jakobson 1971: 90f. (E. hut–thatch); Le Guern 1973: 29–38; Lakoff and Johnson 1980: 36; Croft 1993: 350; Blank 1997: 253–255; Koch 1999a: 153f.; 2001a: 216f.; Nerlich and Clarke 1999: 197–203; Radden and Kövecses 1999: 30–36; Seto 1999; Panther and Thornburg 2007: 238. – As for other solutions in the relevant literature, cf. Koch and Winter-Froemel 2009: col. 361–364.
Peter Koch
(b) E. hound ‘quadruped of the genus Canis used for hunting’; first attestation: Hundes and hauekes, and alle ðo þing ðe ȝeu hier gladien mai (Vices and Virtues, 69; cit. OED, hound, n.1, sense 2., c1200) [ordinary reading in Modern English].

For those who include the part-whole pattern in metonymy, there are two possible strategies with respect to the species-genus residue of former “synecdoche”. Either the species-genus pattern is considered to be close to the part-whole pattern and therefore incorporated into metonymy as well (“incorporation strategy”), or a rigorous split is introduced between the (henceforth metonymical) part-whole pattern and the taxonomic species-genus pattern (“split strategy”).

The “incorporation strategy” is visible in Lakoff 1987 (77–90, 287)11 and in Radden and Kövecses 1999 (34f.). It is discussed with an explicit proviso by Peirsman and Geeraerts (2006a: 307f.), who assign to the species-genus pattern at best a marginal position within the prototypical category of metonymy, and underline the possibility of an alternative interpretation in terms of specialization and generalization.

As already noted above, the species-genus pattern is based by definition on the relation of sub-/superordination of categories, i.e. on the taxonomic principle. In contrast to this, the part-whole pattern explained above is based on the relation element-frame, i.e. it corresponds to one aspect of the engynomic principle (cf. section 1.2). It is undeniable that there are interesting interfaces between the part-whole and the species-genus pattern in ontogenesis, with respect to semantic change in individual lexical items, and on the level of visual representation.12 The part-whole pattern may even be interpreted as a metaphor for the species-genus pattern (Radden and Kövecses 1999: 34). However, the passage from part-whole, understood as a case of engynomy, to species-genus taxonomy is not a simple shift, but rather a switch from something basic to something cognitively different and more complex. In fact, as we have seen in 1.2, (5) and (6), contiguity and taxonomic sub-/superordination, as such, are conceptually incompatible (hence the term “switch”). So we are entitled to settle upon the split strategy and to keep apart metonymic change, including part-whole, on the one hand and specialization/generalization, based on species-genus sub-/superordination, on the other hand. Interestingly, this decision concerning the internal diachrony of single lexical items is confirmed by an analogous split in the realm of synchronic relations between distinct lexical items, namely the distinction between “taxonomy” and “meronomy” (cf. Cruse 1986: 136–180; Croft and Cruse 2004: 141–163).

11. It is based, however, on an insufficient definition of ‘metonymy’ (due among other things to an ambiguous definition of ICM: see above 4): “one element of [an] ICM, B, may stand for another element A [sc. of the ICM].” This is just the reason why everything becomes possible, even the application of the notion of ‘metonymy’ to taxonomies.
12. For the building up of logical classes out of collections of contiguous objects in ontogenesis cf. Inhelder and Piaget 1970. For the passage from the engynomic to the taxonomic point of view in the change of individual lexical items designating collections, cf. Peirsman and Geeraerts 2006a: 304–307; Mihatsch 2006: 98–114. From the point of view of visual representation it is interesting that taxonomic classes and their inclusion are often designed in the form of Euler diagrams, i.e. in terms of wholes and parts.

So the traditional notion of synecdoche breaks apart.13 With regard to semantic change, we will henceforth distinguish the engynomic process of metonymy, including part-whole processes, from the taxonomic processes of specialization and generalization.

Furthermore, we have to distinguish metonymy, as contiguity-based, from similarity-based types of semantic change. For example, a rather rare type of semantic change that is based on taxonomic similarity is “co-hyponymic transfer” (cf. Blank 1997: 207–217). Its difference from metonymy follows from the incompatibility between contiguity and taxonomic similarity demonstrated in section 1.2, (3), (5), (7), (8).

A very frequent similarity-based type of semantic change is metaphor (cf. Black 1954; Lakoff and Johnson 1980; Croft 1993; Koch 2005: 171–174; Grady 2007). Metaphor is based on the interaction (Black) or mapping (Lakoff and Johnson) between two similar14 concepts P and Q that belong to two different frames and/or to two different taxonomies.
13. As for the term ‘synecdoche’, it may be dispensed with or be restricted either to part-whole processes within metonymy (cf. Le Guern 1973: 36; Waltereit 1998) or to taxonomic species-genus processes (cf. Nerlich and Clarke 1999; Seto 1999: 113–116).
14. Like the other cognitive relations (cf. Chapter 1.2), metaphorical similarities are not ontological, preexisting links, but a matter of construal. They are “discovered” in the metaphorical process (cf. the criticism in Lakoff and Johnson 1980: 112–114). With this proviso in mind and differently from Lakoff and Johnson, we do not hesitate here to use the term ‘similarity’.

So, what we can call “metaphorical similarity” (21)(a) is compatible neither with contiguity (21)(b) nor with taxonomic sub-/superordination or taxonomic similarity (21)(c). The negation of contiguity (21)(b) is a definitional element of the notion of “metaphor”. Taxonomic similarity (21)(c), even where it accidentally arises, does not account for the essence of metaphor.

(21) (a) P and Q are metaphorically similar.
(b) P and Q are not contiguous (are not part of the same frame). (5)
(c) P and Q do not stand in a taxonomic relation (sub-/superordination (1), taxonomic similarity (3)).

This can be illustrated by example (22). The concepts bag and belly (body part) emerge as metaphorically similar. They belong to two different frames (21)(b). There is no relation of taxonomic sub-/superordination between them (21)(c). Taxonomic similarity (21)(c) is not relevant either, because the fact that both bags and bellies (body part) are physical objects does not help us at all to grasp the metaphor.

(22) (a) Old English bælg ‘bag, purse’; first attestation: And wilnade gefylle womb his of bean-bælgum (Lindisfarne gospels, Luke xv. 16; cit. OED, belly, n., sense I.1., c950) [reading attested only in Old English].
(b) belly ‘body part between the breast and the thighs’; first attestation: Þe brest with þe bely (Richard Rolle of Hampole, The pricke of conscience (Stimulus conscientiæ); a Northumbrian poem, 679; cit. OED, belly, n., sense II.3.a., 1340) [ordinary reading in Modern English].

All in all, including the extremely rare contrast-based types of change, we get the following inventory of types of lexical semantic change, which can be clearly delimited from each other (cf. Blank 1997: 157–281; 2000; Koch 2005: 165–183; Gévaudan 2007: 77–113):

– metonymic change, including part-whole processes.
– specialization (cf. ex. (20)) and generalization (contrary of specialization).
– co-hyponymic transfer (see above) – rather rare.
– metaphorical change (cf. ex. (22)).
– contrast-based changes: antiphrasis, auto-antonymy (cf. Blank 1997: 217–229) – extremely rare.
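The inventory above can be read as a mapping from the cognitive relation holding between the old and the new reading to the type of change. The following Python sketch is only an illustration of that mapping, not part of the analysis itself: the names `Relation`, `CHANGE_TYPE`, `classify` and `ANNOTATED` are hypothetical, and the relation assigned to each word has to be supplied by the analyst, since it is a matter of construal and cannot be computed.

```python
from enum import Enum


class Relation(Enum):
    """Relation between old and new reading (illustrative labels only)."""
    CONTIGUITY = "contiguity within one frame"
    TAXONOMIC_SUBORDINATION = "new reading subordinate to old reading"
    TAXONOMIC_SUPERORDINATION = "new reading superordinate to old reading"
    TAXONOMIC_SIMILARITY = "co-hyponymy within one taxonomy"
    METAPHORICAL_SIMILARITY = "similarity across frames or taxonomies"
    CONTRAST = "contrast"


# One change type per relation, following the inventory in the text.
CHANGE_TYPE = {
    Relation.CONTIGUITY: "metonymic change",
    Relation.TAXONOMIC_SUBORDINATION: "specialization",
    Relation.TAXONOMIC_SUPERORDINATION: "generalization",
    Relation.TAXONOMIC_SIMILARITY: "co-hyponymic transfer",
    Relation.METAPHORICAL_SIMILARITY: "metaphorical change",
    Relation.CONTRAST: "contrast-based change",
}


def classify(relation: Relation) -> str:
    """Return the type of lexical semantic change for an annotated relation."""
    return CHANGE_TYPE[relation]


# Analyst-supplied annotations for examples (20) and (22).
ANNOTATED = {
    "hound": Relation.TAXONOMIC_SUBORDINATION,
    "belly": Relation.METAPHORICAL_SIMILARITY,
}

for word, relation in ANNOTATED.items():
    print(f"{word}: {classify(relation)}")
```

The one-to-one table reflects the claim that the types can be clearly delimited from each other; a relation that arises only accidentally, such as the taxonomic similarity of bags and bellies as physical objects, is simply not annotated.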
To sum up, the range of metonymic change can be delimited in different ways. From the conceptual point of view, it stands against the other types of lexical semantic change listed above. From the referential point of view, it is opposed to another contiguity-based phenomenon, namely facets, discussed above in 2.2.3. As we will finally see in section 5.3, within the lexicon it has to be differentiated from and situated with respect to other contiguity-based, but formally different processes of change.

In order to measure the whole extent of metonymy, the following sections 2.3–2.5 examine a certain number of subtypes that are particularly frequent and/or spectacular. Our conception of the cognitive relation of “contiguity”, being both precise and powerful, as well as the equally precise and powerful notion of “frame” (section 1), will allow us to trace much less banal types of examples back to the metonymic model just as effectively.

2.3. Subjectification

One notion in diachronic semantics which has been very successful in the last two decades is the notion of “subjectification”, a term used by Langacker as well as by Traugott. Although both use the same term, there are undeniable differences between the two authors concerning their notions of subjectification (underlined especially in Traugott 1999: 187–190, less so in Langacker 1999: 149–156; cf. also De Smet and Verstraete 2006). One important divergence can be stated with respect to the distinction between the level of the description of an event in the extralinguistic world (propositional meaning: conceptualized described event) and the level of the communicative event (textual or expressive meaning: conceptualized speech event).

In Traugott, “objectivity” (if we may call it this) corresponds to the conceptualized described event, “subjectivity” to the conceptualized speech event. The semantic change of the English word observe is a case of subjectification according to Traugott.
The starting point is the description of an event in the extralinguistic world (conceptualized described event: notice, perceive; first attestation in the OED: observe, v., sense 8., 1560). From there, the concept expressed shifts to the level of a communicative event (conceptualized speech event: state): (23).

(23) observe ‘say by way of remark, state’; first attestation: Your Majesty doth excellently well observe, that witchcraft is the height of idolatry (Francis Bacon, Of the advancement of learning, ii. xxv. §24; cit. OED, observe, v., sense 10., 1605).

In fact, our experience tells us that we often utter remarks that are based on perception of the world surrounding us. Consequently, a figure-ground effect within a frame of communicative activity (= FRAME in Figure 2) makes us shift from the concept notice, perceive (= E1) to the contiguous concept state (= E2). So, then, this type of subjectification is a type of metonymy.

For Langacker, the difference between described event and conceptualized speech event is not crucial as such. “Objectivity” and “subjectivity” may belong to the same level and, thus, remain for instance on the level of the described event. What really counts is the perspectivization process, whose starting point is the objective construal of an onstage object of conception (which, however, already comprises an element of additional subjective construal by the conceptualizer) and which brings about the total fading away of the objective construal and a reduction to the element of subjective construal. The semantic change of the English word boor, for example, is a case of subjectification according to Langacker (1999: 150) in the sense that a change of perspective makes the objective construal progressively fade away. From peasant (attestations in OED, boor, sense 1, going from 1430 to 1798), the concept expressed successively passed to rustic (24)(a) and to rude fellow (24)(b), where only the subjective construal survives.15

(24) (a) E. boor ‘rustic, country clown’; first attestation: I dull-sprighted fat Boetian Boore (John Marston, The metamorphosis of Pygmalions image, and certaine satyres, ii. 142; cit. OED, boor, sense 3.[a.], 1598).
(b) E. boor ‘rude, ill-bred fellow, clown’; first attestation: Grossolano, a lubber, a clown, a boore, a rude fellow (John Florio, A worlde of wordes, or most copious and exact dictionarie in Italian and English; cit. OED, boor, sense 3.b., 1598).

In fact, everyday experience told people of the 16th century that farmers/countrymen were prototypically rude and ill-bred.
Consequently, a figure-ground effect made them shift from the concept farmer (= FRAME in Figure 3) to the contiguous concept rude (fellow) as one element of the frame (= E1). This type of subjectification is therefore a type of metonymy, too.

15. From the point of view of attestation the two senses corresponding to (24)(a) and (b) are simultaneous, but from the logical point of view (b), of course, presupposes (a).

As our analyses of (23) and (24) have shown, the processes of subjectification, be it in Traugott’s or be it in Langacker’s sense, are clearly metonymical in nature (cf. also Traugott and Dasher 2002: 29; Marchello-Nizia 2006: 100). The converse relation, however, does not hold. I have cited and will cite other examples of metonymy without subjectification in one of the two senses ((10), (11), (12), (14), (27), (30)). The unidirectionality typical of subjectification does not necessarily apply to other metonymies. There are even typically bidirectional metonymies (cf. Koch 2008: 123–125).

2.4. Delocutive change

Let us take an additional look at a particular type of subjectification à la Traugott, which was discussed originally in a completely different theoretical context. Generalizing Benveniste’s (1966) notion of “delocutivity”, Anscombre shows the relevance of the phenomenon, not only for Speech Act Theory, but also for diachronic semantics (cf. Anscombre et al. 1987). An interesting example is English encore:

(25) encore ‘again, once more (used by spectators/auditors of artists’ performance)’; first attestation and explanation: Whenever any Gentlemen are particularly pleased with a Song, at their crying out Encore . . the Performer is so obliging as to sing it over again (Sir Richard Steele, The spectator, No. 314 39; cit. OED, encore, int. and n., A. int., 1712).

In accordance with Anscombre’s analyses, we could reconstruct the following semantic change for this word:

(26) Delocutive semantic change of English encore
I. existence of an English lexeme encore ‘once more’ (borrowed from French encore16);
II. frequent use of English encore ‘once more’ in a speech act {demand the repetition of an artist’s performance};
III. lexicalization of a new performative reading of encore that corresponds to the speech act accomplished in II.;
IV. reanalysis of the usage of English encore in II., in conformity with the new performative reading resulting from III. (for reanalysis cf. section 2.5).
16. According to DHLF, encore, and OED, encore, Fr. encore has never been used in the sense exemplified in (25). So it must have come into English in the sense of ‘again, once more’ that existed and still exists in French. – It is not impossible that the sense shown in (25) has been prefigured by E. ancora (borrowed from I. ancora ‘again, once more’), which has developed – but once more only in English (cf. GDLI, ancora) – the same sense, attested already in 1712, but nowadays obsolete (cf. OED, ancora); cf. also the hybrid form E. encora (OED, encore, Etymology and A., 1781).

In my view, this “delocutive” change is undeniably metonymic in nature. The starting point is a speech act (SA), in this case {demand the repetition of an artist’s performance}, constituting a frame that contains, among others, the following two elements: a concept C, corresponding to a contextual element that is indispensable for the conceptualization of the speech act SA (in this case: the nonverbal reaction of the hearer, the repetition, to be specific), and a verbal formula F that expresses C (or an aspect of C) and that, consequently, forms a frequent element and is thereby prototypical for the realization of the speech act SA (cf. Koch 1993: 268–272):

Figure 4. Delocutive frame

This triple contiguity of SA, C and F easily triggers a figure-ground effect which makes the main word of the formula F (whose meaning corresponds to C or to an aspect of C) shift towards a meaning which corresponds to the sense of the contiguous speech act SA (cf. (26), I. → III.).

Following Anscombre, we can reconstruct, in the same way, the type of semantic change that generates many speech act verbs (cf. Koch 2001a: 209f.; Blank 1997: 256–258).

Delocutive changes are certainly lexical changes, but they import metonymical effects based mainly on the contiguities between a lexical meaning and pragmatic elements, such as the sense of a speech act SA and the verbal formula F used for performing SA. It is not pure chance that Traugott and Dasher (2002) dedicate a whole chapter to the genesis of performative verbs and constructions. The delocutive genesis of a performative verb actually corresponds to a process of subjectification following the formula “conceptualized described event → conceptualized speech event” (see 2.3).

2.5. Speaker-induced metonymy vs. hearer-induced metonymy (reanalysis): the internal pragmatic typology of metonymy

In 2.2.1 we discussed important dimensions of the typology of metonymy. The pragmatic dimension was postponed there in order to be broached in this section, because it corresponds to the most fundamental divide within the field of metonymy.

In traditional rhetoric it seems to go without saying that the trope of metonymy – like all the tropes – is triggered by a speaker’s choice. The very term that describes metonymy in the Rhetorica ad Herennium (cf. (9): “denominatio”) already presupposes the speaker’s perspective (cf. also “trahit orationem”, “sit appellata”; the hearer’s receptive role is represented by “possit intellegi res”). From this perspective it is hence necessary to conceive the creation and the subsequent adoption of a semantic innovation as follows:
Figure 5. Innovation triggered by the speaker
If we apply this schema to metonymy, it can be said that it is the speaker S1 who creates a figure-ground effect in a frame. The hearer H1 takes note of this innovation and, as a speaker S2 in a subsequent communicative act, uses the same metonymy, etc. This is consequently a case of “speaker-induced” metonymy. This schema can actually be applied to all changes based on tropes (metaphors, species-genus synecdoches, etc.) and, thus, certainly also to a great number of metonymies, if not to the majority of them ((10), (11), (23), (24), 4.2., go-future).

There is however one type of metonymy whose genesis is characterized by a quite different pragmatic “punctuation”. This may be illustrated by example (27). In Old English the word witnes had the two readings, ‘testimony’ and ‘person giving testimony’, which have survived in Modern English witness:
(27) (a) witness ‘testimony’; first attestation: Falsa testimonia, leasa witnesa (Lindisfarne gospels, Matt. xv. 19; cit. OED, witness, n., sense 2.a., c950).
(b) witness ‘person giving testimony’; first attestation: Falsi testes, lease vel lycce witnesa (Lindisfarne gospels, Matt. xxvi. 60; cit. OED, witness, n., sense 4.a., c950).

With respect to the underlying word-formation, OE. wit-nes is clearly a nomen abstractum, meaning originally ‘knowledge, wisdom’ and hence ‘testimony’ (cf. also OED, witness, n., Etymology and 1.). Consequently, even if the – very early – first attestations of the two readings (27)(a) and (b) are simultaneous, the reading ‘person giving testimony’ must be due to a semantic change on the basis of ‘testimony’.17

Obviously the change described for witness is metonymic in nature. According to Figure 2, it is a figure-ground effect from testimony (= E1) to person giving testimony (= E2), which both belong to a common FRAME, say, trial (cf. Blank 1997: 246, 383f.). From the perspective of the speaker S1 (Figure 5) it would not be very natural to trigger such a metonymy. Expressing her/himself in the technical context of a trial, s/he would rather be inclined to differentiate linguistically the two distinct, though related, elements testimony and person giving testimony of the relevant frame.

So, in order to understand the pragmatic rationale of this change, it is more promising to take the hearer’s perspective and to start from utterances like – in modernized version – Let’s hear the next witness!, which could be realized in the context of a trial. If we situate such an utterance in its communicative context, the global pragmatic meaning of the utterance could be, for instance, ‘The trial hearing is going to continue’. If speaker S1 (e.g. a judge) uses witnes to express this, it certainly occurs through the concept testimony (= E1), because it corresponds to the conventional meaning of this word.
The hearer H1 is mainly interested in the global pragmatic interpretation of the utterance. Communication is successful, even if her/his personal reconstruction of the conceptual meaning of the linguistic elements used partly deviates from the construction of the meaning by S1. In the present case the global pragmatic interpretation ‘The trial hearing is going to continue’ would be compatible as well with the contiguous concept person giving testimony (= E2) – salva veritate, so to speak: listening to a testimony implies listening to a person who gives the testimony. From the moment when a hearer H1 conceives a (personal) conceptual analysis containing E2 instead of E1, a “hearer-induced” metonymy comes into being (cf. Koch 2001a: 225–228; 2004: 42–45; Gévaudan 2007: 57f.).

17. A similar, diachronically better documented change took place from ClassLat. tēstimōnium ‘testimony’ to OFr. tesmoin ‘testimony; person giving testimony’ (cf. DHLF, témoin). Note that, differently from ModE. witness, Mod.Fr. témoin has lost the sense ‘testimony’.
Figure 6. Innovation triggered by the hearer
Utterances like Let’s hear the next witness!, performed in a typical situation (a trial in the present case), represent what Heine (2002: 86–92) calls a “bridging context”, a notion that is useful, as we are seeing here, for both grammatical and lexical change. As shown in Figure 6, the speaker S1 uses the word in question according to traditional rules, without wanting to suggest any innovation whatsoever. It is hearer H1 who accomplishes a figure-ground effect within the relevant frame. The effect, however, remains entirely compatible with the global pragmatic meaning of the utterance. As a speaker S2 of a subsequent communicative act, s/he then actively transmits her/his innovation to a hearer H2, etc.

The course of this change corresponds exactly to what Detges and Waltereit (2002) identified as the mechanism of reanalysis. They showed that reanalysis presupposes two cognitive principles: the “principle of reference” and the “principle of transparency”. As far as the principle of reference is concerned, it must be the case that the “personal” conceptual interpretation of hearer H1 is compatible with the reference of the utterance at the moment of its being uttered. The – possibly extralinguistic – reaction of hearer H1 makes speaker S1 understand that H1 has grasped the global pragmatic meaning of the utterance, independently of the fact that H1 has given a “deviant” conceptual analysis to this utterance. With respect to the principle of transparency, it is, in this case, the described contiguity between testimony and person giving testimony which guarantees a semantic motivation.
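The two principles can be pictured as two conditions that a hearer’s reinterpretation must meet at once. The Python sketch below is a toy model under stated assumptions, not a formalization taken from Detges and Waltereit: the names `Frame`, `contiguous` and `hearer_reanalysis_possible` are hypothetical, transparency is reduced to contiguity within one frame (the case discussed here), and the set of concepts compatible with the global pragmatic meaning is supplied by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """A frame, reduced for illustration to a name and a set of element concepts."""
    name: str
    elements: frozenset


def contiguous(frame: Frame, e1: str, e2: str) -> bool:
    """Two distinct concepts count as contiguous if both are elements of the frame."""
    return e1 != e2 and e1 in frame.elements and e2 in frame.elements


def hearer_reanalysis_possible(frame: Frame, conventional: str,
                               reinterpreted: str, compatible: frozenset) -> bool:
    """Check both principles for a hearer's deviant conceptual analysis.

    Transparency: the reinterpreted concept is contiguous to the conventional one.
    Reference: the reinterpreted concept is compatible with the global pragmatic
    meaning of the utterance (given here as a hand-annotated set).
    """
    transparent = contiguous(frame, conventional, reinterpreted)
    preserves_reference = reinterpreted in compatible
    return transparent and preserves_reference


# Hand-built frame for the bridging context "Let's hear the next witness!".
TRIAL = Frame("TRIAL", frozenset({"TESTIMONY", "PERSON GIVING TESTIMONY", "JUDGE"}))
# Concepts under which 'The trial hearing is going to continue' still holds.
COMPATIBLE = frozenset({"TESTIMONY", "PERSON GIVING TESTIMONY"})

print(hearer_reanalysis_possible(TRIAL, "TESTIMONY", "PERSON GIVING TESTIMONY", COMPATIBLE))
print(hearer_reanalysis_possible(TRIAL, "TESTIMONY", "JUDGE", COMPATIBLE))
```

In this toy model the shift to person giving testimony passes both checks, whereas a shift to judge, although contiguous in the frame, fails the reference check for this particular utterance.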
Rejecting a conviction which is firmly anchored in certain linguistic milieux, Detges and Waltereit (2002) underline that reanalysis is in the first place a semantic process which operates on a given chain of speech (but which can also have repercussions on the grammatical level: see 4.1). There are only two semantic relations that lexical reanalysis can exploit without violating the principle of reference: firstly, taxonomic subordination, since it is always possible to assign a more precise concept to a referent in the taxonomic hierarchy18; and secondly, contiguity, since as we have seen a shift towards a contiguous concept does not necessarily jeopardize the global reference of the utterance. The majority of reanalyses seem to be metonymic in nature (cf. Detges and Waltereit 2002: 165; also Waltereit 1999). Given the extremely elementary character of the figure-ground effect between contiguous elements (cf. 1), this is not at all surprising.

It could be asked how it is possible to distinguish hearer-induced and speaker-induced metonymies, in general and above all a posteriori. We must bear in mind the importance of the bridging context and of the principle of reference for reanalysis and, hence, for hearer-induced metonymies. As we have seen, the conceptual figure-ground effect is pragmatically indifferent within the bridging context. This does not hold for speaker-induced metonymies, as intentional rhetorical tropes, which are pragmatically highly relevant within the bridging context. Thus, it makes a considerable difference, both conceptually and pragmatically, whether I use boor (24) to express simply peasant or to express mainly country clown, rustic. Some speaker-induced metonymies do not even involve any bridging context (cf. (10) Lat. frigus pigrum ‘lazy cold’ with respect to puer piger ‘lazy boy’). It is up to further research to pin down this difference between hearer-induced and speaker-induced metonymies more exactly.
18. Cf. Gévaudan 2007: 103; Koch 2004: 45 n. 31.

The distinction between speaker-induced and hearer-induced metonymies adds a further dimension to our classification. A closer look will reveal the impact of hearer-induced metonymy: step IV. of delocutive changes (2.4., (26)) is nothing other than a metonymical reanalysis on the hearer’s side, following the reasoning: If S1 uses the English lexeme encore to express the speech act SA {demand the repetition of an artist’s performance}, this lexeme must express the concept demand the repetition of an artist’s performance. As we will see, popular etymology, too, will not be understood without going back to a reanalysis on the hearer’s side (5.1).
3. Discourse markers

The diachronic processes considered in section 2 concern not only the lexicon, but also other levels of linguistic analysis. As we saw especially in section 2.5, pragmatics is of paramount interest to the production of metonymies. So it is not surprising to find metonymic effects that are not only triggered by discourse pragmatics, but are also relevant to its innermost functioning.

The emergence of discourse markers has been likened to the processes of grammaticalization as well as lexicalization; some prefer to speak of “pragmaticalization”. Without wanting to bring this discussion to a close here19, I want to underline that, in publications about diachronic evolutions within this field20, it is very difficult to find a case of change that is not metonymic in nature (even if that is not always made obvious by the authors). The example of the English discourse marker look you, analysed following Waltereit (2002), is helpful here.21 Originally, it is the imperative singular of the verb look, followed by the personal pronoun of the 2nd person singular, which simply expresses a request to the hearer to direct his/her sight to something, facing some important thing or event visible in the situation (a horsetrick in (28)):

(28) Look you, here’s your worship’s horsetrick, sir. (Gives a spring.) (P. Massinger et al., The excellent comedy called The old law . . , iii. ii; cit. OED, horse, n., IV.27.b., 1599).

This request to the hearer to direct his/her sight to something (= E1 in Figure 2a) implies, within the same pragmatic frame, an appeal to the attention of the hearer (= E2). If E2 passes, by a figure-ground effect, to the foreground (Figure 2b), look you can transform into a discourse marker, more exactly: an opening marker, expressing an appeal to listen, without any reference to a thing or an event visible in the situation:22

(29) Look you, she loved her kinsman Tybalt dearly, / And so did I (William Shakespeare, Romeo and Juliet, iii. iv. 3–4 [1594–96]; cit. Brinton 2001: 184).

Obviously, this metonymic step is at the same time a process of subjectification in Traugott’s sense following the formula “conceptualized described event → conceptualized speech/audition event” (see section 2.3).

19. Cf. the discussion in Erman and Kotsinas 1993; Dostie 2004: 22–33; Brinton and Traugott 2005: 136–140.
20. Cf. Brinton 1996; Hansen 1998; Waltereit 2002; Dostie 2004.
21. Waltereit (2002) analyzes the close Italian equivalent guarda. His approach is somewhat different from the one presented in Brinton 2001 and Brinton and Traugott 2005: 138f. with n. 28. Nevertheless the reliance on metonymy (in the form of subjectification according to Brinton and Traugott) is a common point.
4. Metonymic grammatical change Up until now we have only considered examples of metonymies which were of purely lexical nature (section 2) or which generated pragmatically relevant elements. Grammar itself is not our main concern here, but we will nevertheless hint at two types of grammatical phenomena frequently involving metonymic change: grammatical reanalysis and grammaticalization. 4.1. Grammatical reanalysis (hearer-induced metonymy) It is in the context of grammar that investigation into reanalysis started by paying special attention to the aspects of syntactic rebracketing and morphological recategorization.23 In recent years it has been shown that reanalysis is a pragmatically and semantically motivated phenomenon and that it may – and indeed very often does – involve semantic change. This not only holds for lexical reanalysis (section 2.5), but also for grammatical reanalysis. As on the lexical level, the majority of grammatical reanalyses seems to be metonymic in nature (examples in Detges and 22. The ‘‘literal’’ example (28) is slightly posterior to the metonymic one (29), but for our purpose it is su‰cient to see that they are roughly simultaneous, and that a use like in (28) must have existed as a base for a use like in (29). – The development into ModE. lock’ee (cf. Brinton 2001: 185; Brinton and Traugott 2005: 138) is a further step, not only from the formal, but also from the pragmatic point of view, because it conveys the speaker’s impatience – another metonymic shift, by the way. 23. Cf. Langacker 1977; overviews in Hopper and Traugott 2003: 50–63; Lang and Neumann-Holzschuh 1999a: 1–18; Marchello-Nizia 2006: 43–46.
The pervasiveness of contiguity and metonymy in semantic change
289
Waltereit 2002: 165–168; cf. also Waltereit 1999). Thus, the psych verb Old English lician/Middle English like(n) originally displayed the construction (30)(a) with respect to its two participants: experienced ¼ nominative subject (Nom/S); experiencer ¼ dative indirect object (Dat/ IO). From the point of view of information structure se#o ‰iefu is thematic and þam eorlam rhematic. (30)(b) represents a most frequent informational variant of (a), the experiencer ( þam eorlum) being thematic and the experienced (se#o ‰iefu) being rhematic. As shown in (c), during the Middle English period the original Dat/IO of the thematic experiencer (Þe erles) has been reanalysed, probably via a kind of dative subject (Dat/S), as an ordinary subject (S), which is the only possible interpretation in Modern English ((d): the nobles), whereas the rhematic experienced ( þe ‰ift) has been reanalysed as a direct object (DO), which is the only possible interpretation in Modern English ((d): the gift) (cf. Jespersen 1949, III: 208–210; Seefranz-Montag 1983: 104–144; Allen 1995). (30) (a) Old English Se#o ‰iefu licode þam eorlam Nom/S Dat/IO (for the construction cf. Allen 1995: 146, (103), 245, (169)). (b) Old English Þam eorlum licode se#o ‰iefu. Dat/IO Nom/S (for the construction cf. Seefranz-Montag 1983: 114) (c) Middle English Þe erles likede þe ‰ift. Dat/IO Nom/S ! Dat/S (?) ! (Nom/)S DO (for the construction cf. ibid. and Allen 1995) (d) Modern English The nobles liked the gift. S DO Verbs surrounded by their participants are an almost ideal example for the linguistic expression of frames (cf. e.g. Fillmore 1977). Now, the change illustrated in (30) is not merely a transformation of syntactic coding of the same frame. Since the constructions in (30)(a) and in (30)(d) correspond to two di¤erent perspectivizations of the same frame, the shift from (a) to (d) is metonymic in nature. In (a) the foregrounded element is the experienced, in (d) the experiencer of the psych frame (cf. 
Koch 2001b: 73–77; also Waltereit 1998: 79–83). The specific informational arrangement of (c) constitutes a “bridging context”, typical of reanalysis (2.5).²⁴ Sentence type (b/c), though being a variant of construction (a), which intrinsically foregrounds the experienced, is analysed as an instance of a construction that intrinsically foregrounds the experiencer (later on (d)). This semantic reperspectivization formally manifests itself in the interpretation of the Dat/IO Þe erles as a Dat/S and further on as an S, and of the Nom/S þe ȝift as a DO. The semantic and formal process of reanalysis with respect to (c) satisfies both the principle of reference, since the frame and its informational profile remain intact, and the principle of transparency. Indeed, since English is evolving towards a strict SVO language, it seems more adequate to express the first participant as an S.

4.2. Grammaticalization

As for the process of grammaticalization, it is no longer necessary to underline its importance for grammatical change. Quite rightly, there has been insistence upon the differences between the notions of “reanalysis” and “grammaticalization” in recent years. These two notions must not be put on a par with each other (cf. Hopper and Traugott 2003: 58ff.; Haspelmath 1998; Lang and Neumann-Holzschuh 1999a; Detges and Waltereit 2002; Marchello-Nizia 2006: 45ff.), since they do not belong to the same level of abstraction. Grammaticalization is a process with several stages which can involve steps of reanalysis. Grammaticalization is unidirectional; reanalysis as such is not. Reanalysis is triggered by the hearer (cf. 2.5 and 4.1), guided by the discreet character of contiguities, hence of metonymy. By contrast, the starting point of a long grammaticalization process is a choice made by a speaker, who uses an expressive “rhetorical” strategy which allows him to solve difficult but frequent communicative situations (cf. especially Detges 1999; 2003).
A particularly frequent rhetorical strategy chosen in these cases is metonymy, because the speaker takes advantage of the concrete and seemingly objective character of contiguities. A classical example is the go-future, which can be observed in English (e.g. I am going to help him) and a great number of other languages (cf. Bybee et al. 1994: 243–280). In this case the speaker who triggered the innovation relied on an action frame containing, among others, the elements MOVEMENT, INTENTION, and IMMINENCE. A succession of two figure-ground effects is used to suggest to the hearer that he virtually has proof of the truth of an affirmation about a future action. The speaker’s current movement towards a place where the action in question (help him) will take place reveals the sincere intention of the speaker to accomplish this same action; furthermore, the intention reveals the imminence of the action (cf. Detges 1999).

24. Both participants being in the singular, no problem of congruence arises. This is certainly another important precondition for this type of reanalysis.

5. Word-formation and semantic change

After having examined the semantic effects which intervene on the lexical and grammatical levels, we will take a look at an area that belongs at the same time to the lexicon and to morphology, namely word-formation. This area is of course an object of synchronic research, concerning especially internal relations within the lexicon, lexical motivation, morphological and semantic regularities in the lexicon, etc. We will distinguish in this respect the “base” (e.g. English teach-), the “device” (e.g. suffixation of -er leading from action to agent), and the “product” (e.g. teach-er). Word-formation is also an object of diachronic research in several respects:

i. The most obvious application of the diachronic perspective to word-formation concerns the production of new words in time. Any nonce application of a given device of word-formation to a new base can be the starting point for the lexicalization of the resulting word-formation product as a new word in the lexicon of the language. Thus, teacher was a new formation in the 14th century, replacing the former larþeow, lorðe(a)u, lár(é)ow, etc. (cf. OED, lorthew, larew; teacher, sense 2.a.). Word-formation products as a diachronic issue will be addressed in section 5.3.

ii. From the synchronic perspective a word-formation product is a motivated word of the lexicon. But lexical motivation is subject to language change as well and may, for instance, fade away in diachrony (e.g. ModE. dairy < ME. deie ‘dairy-maid’ + -erie (place); cf. OED, dairy, n.).
Conversely – and this is more important for metonymic innovation – language users may search for motivation in words that have been demotivated or never were motivated at all. Remotivation as a diachronic issue will be addressed in 5.1.

iii. It is not only products of word-formation that are subject to language change, but also the devices underlying them. We know for instance that ModE. -hood goes back to a noun OE. hád, able to form compounds with other nouns (31) and progressively transformed into a derivational suffix (cf. e.g. Marchand 1969: 293; Faiß 1978).
(31) OE. sácerd ‘priest’ + hád ‘state, condition, . . .’ → sácerdhád ‘condition of being a priest’ (cf. below (33)); etc.

It is mainly for the sake of completeness that we mention this problem of the emergence of new word-formation devices as a diachronic issue.²⁵

iv. Given that word-formation devices are subject to language change, we also have to take into account the semantic change of already existing devices. We will treat this kind of problem under the heading “semantic change of existing word-formation devices as a diachronic issue” in section 5.2.

5.1. Remotivation as a diachronic issue: folk-etymological semantic change and contiguity

Part of the synchronic linguistic consciousness of speaker-hearers is the desire to “motivate” lexical signs wherever this seems possible (cf. Radden and Panther 2004). Sometimes language users even try to motivate words that are either opaque (and have always been so etymologically) or seem to be opaque, because their status as products of word-formation has been obscured. This is what is called “folk-etymology” or “popular etymology” (cf. Ullmann 1957: 91f., 1964: 101–105; Hock 1991: 202f.; Campbell 2006: 114–116). Folk-etymology may or may not involve semantic change. We are concerned here only with folk-etymological semantic change. Blank (1997: 303–317; 2000: 70) has shown expertly that this kind of process is not only based on a similarity of the signifiers (cf. Ullmann 1957: 234–238; 1964: 211), but also on a cognitive relation which corresponds to contiguity, with only a few exceptions.

To take a well-known example of folk-etymological semantic change (cf. Ullmann 1964: 221), the English noun boon₁ and adjective boon₂ were etymologically unrelated homonyms. The noun ultimately went back to ON. bón ‘prayer’ and successively developed readings as ‘matter/thing prayed for’ and ‘favour’ (32)(a). The adjective came from OF. bon ‘good’ and, among others, developed the reading ‘advantageous’ (32)(b).
25. This has recently been discussed under different labels, including ‘grammaticalization’ (Chapter 4.2), ‘reanalysis’ (Chapter 2.5, Chapter 4.1) or even ‘lexicalization’: cf. Blank 2001: 1602 and the discussion in Brinton and Traugott 2005: 62–110.

The noun boon₁ (32)(a), which expressed the concept FAVOUR, clearly was lacking any lexical-morphological motivation. Nevertheless, language users had two reasons to bring this noun together with the adjective boon₂ (32)(b). Formally, the two words were homonymous, and semantically, the adjective expressed a contiguous concept in the same frame, namely ADVANTAGEOUS, since in general a favour or a gift graciously bestowed is advantageous for the receiver. Within the limits of the part of speech “noun” the most contiguous concept would be ADVANTAGE = ADVANTAGEOUS THING. So it was not surprising that for the noun boon₁ language users shifted from the concept FAVOUR (= E1 in Figure 2a) to the contiguous concept ADVANTAGE (= E2 in Figure 2b): (32)(a → c). In fact this is nothing but a metonymic figure-ground effect with respect to boon₁, which was facilitated by the semantic contiguity of boon₂ and boon₁ seemingly reflected in their formal identity, a process that has been reinterpreted as a conversion adjective → noun.

(32)
(a) boon₁ ‘favour, gift, thing freely or graciously bestowed’; first attestation: Send us, lord, this blissid bone (The Towneley Mysteries, 282; cit. OED, boon, n.1, sense 4., c1460) [according to OED, archaic reading in Modern English].
(b) boon₂ ‘advantageous, fortunate, favourable, prosperous’;²⁶ I may wish boone fortune to thy iourney (Robert Greene, Greenes neuer too late; cit. OED, boon, a., sense 2., 1590) [according to OED, obsolete reading in Modern English].
(c) boon₁ ‘blessing, advantage, thing to be thankful for’ [without the notion of asking or giving]; first attestation: The charter of Massachusets was not so great a boon (Thomas Hutchinson, The history of the Province of Massachusetts Bay (1628–1750), i; cit. OED, boon, n.1, sense 5., 1767).

Let us call the lexical units exemplified in (32)(a), (b), and (c) “Source Unit”, “Backing Unit”, and “Target Unit” respectively.
This material is well suited to show that we certainly cannot speak of “folk-etymology” without formal similarity of the signifiers of the Source and the Backing Unit, but that the formal aspect is insufficient to capture folk-etymology, whose very essence is a semantic improvement of the lexical motivation of the Source Unit. In the case of folk-etymological semantic change this improvement is, firstly, backed up by accidental formal similarity,²⁷ hand in hand with accidental semantic relatedness, mostly by contiguity, of the Source and the Backing Unit ((32)(a–b)), and it comes about, secondly, by the semantic change from the Source to the Target Unit, which is metonymic in nature in most cases ((32)(a → c)). Note that contiguity comes into play, in our example and many others, at several points: between the Source and the Backing Unit (“discovering” a new motivation), between the Backing and the Target Unit (bridging motivation and semantic change), and finally between the Source and the Target Unit (realizing, in fact, a metonymic change).

26. The most frequent collocation is boon voyage (since 1494, cf. OED, boon, a., sense 2.), clearly influenced by Fr. bon voyage.

One may wonder how it comes about that speakers who know their mother tongue well can trigger something as drastic as a folk-etymological semantic change by remotivating the Source Unit and by reinterpreting it into the Target Unit. Once more, it is useful to take the hearer’s perspective to understand the pragmatic rationale of this kind of process. Folk-etymological semantic change is nothing more than a particular type of lexical reanalysis, triggered by the hearer (see 2.5; cf. Blank 2001: 1600f.; Detges and Waltereit 2002: 160, 163; Gévaudan 2007: 158–162). A speaker S1 uses the Source Unit according to traditional rules, without wanting to suggest any innovation. It is a hearer H1 who “discovers” the Backing Unit and innovates into the Target Unit. As a speaker S2 of a subsequent communicative act, s/he then actively transmits her/his innovation to a hearer H2, etc. (cf. Figure 6).

As for the principle of reference characterizing reanalysis, the “personal” conceptual reinterpretation of the Source into the Target Unit by hearer H1 is compatible with the reference and the global pragmatic meaning of the utterance at the moment of its being uttered (within a so-called “bridging context”). As we already saw in 2.5, this is particularly easy with contiguity-based “deviations”, i.e.
with metonymies like (32)(a → c). The principle of transparency characterizing reanalysis assumes particular importance in the case of folk-etymology, because motivation is improved by remotivation of the Source Unit through the Backing Unit and ratified by semantic change from the Source to the Target Unit. After the change, only the motivation of the Source Unit through the Backing Unit as well as the innovative Target Unit, though representing a fallacy from the scientific point of view, are real in the synchronic linguistic consciousness of the speaker-hearers of the community.

27. In some cases formal similarity is improved by slight phonetic adaptations, as e.g. from OE. wilcuma (containing will) to ME./ModE. welcome (related to well); cf. Ullmann 1957: 91; OED, welcome, n.1, etymology.

5.2. Semantic change of existing word-formation devices as a diachronic issue: metonymy

Once a given word-formation device has come into being (section 5, iii), it may be subject to change (5, iv). Word-formation devices can undergo the main types of semantic change we already know from the diachrony of lexical words (2.2.4): mainly metaphorical, taxonomic and, of course, metonymic changes (cf. Mutz 2000: 243f.; Rainer 2005: 422–428).

To go back to example (31), i.e. the transformation of -hád into something like a suffixoid and hence into a suffix (a process that must have begun rather early: see n. 28), let us consider the further development of this word-formation device (cf. Marchand 1969: 293; Faiß 1992: 60). From the outset the word-formation products containing -hád represent nomina qualitatis, which, according to the etymological background of the suffix(oid), designate the quality of being X (31). Qualities can be either time-stable, like being human, a woman, a brother, an animal, normally also a priest, etc., or more or less transitory, like being young, a child, a widow, a nation, etc. For the latter type, the frame of the constitution of a person or a thing implies a strong contiguity between the quality of being X (= E1 in Figure 2a) and the period of being X (= E2 in Figure 2b). So language users easily slip through a metonymic figure-ground effect from E1 to E2.

Among the earliest attestations of -hád in Old English there are examples that clearly designate time-stable qualities, such as sácerdhád ‘priesthood’ (33). In fact, the etymological background (31) tells us that the quality reading of -hád/-hood must have been prior in diachrony, surviving nevertheless in Middle English and Modern English for time-stable qualities (cf. e.g. OED, manhood, sense 1.a., a1225; personhood, 1959).
The period reading, which is particularly natural for transitory qualities, is already present very early as well, e.g. cildhad = childhood (34).²⁸

(33) OE. sácerdhád ‘priesthood’ (from sacerd ‘priest’); first attestation: Ðylæs ænig unclænsod dorste on swa micelne haligdom fon ðære clænan ðegnenga ðæs sacerdhades (K. Ælfred, Gregory’s pastoral care tr., vii. 51; cit. OED, uncleansed, ppl. a., c897).

28. As it were, this early semantic change seems to presuppose that the transformation of the noun -hád into a suffix(oid) was already under way in Old English.
(34) E. childhood ‘time during which one is a child’; first attestation of childhood: Soð he cuoeð from cildhad (Lindisfarne gospels, Mark ix. 21; cit. OED, childhood, sense 1., c950).

(35) OE. gioguðhade ‘state of being young/time during which one is young’ (from gioguð ‘youth’); one of the first attestations: Bliðsa, cniht, on ðinum gioguðhade (K. Ælfred, Gregory’s pastoral care tr., xlix. 385; cit. OED, bliss, v., sense 1.a., c897).

Example (35), displaying the transitory quality YOUTH, shows a possible bridging context, where the original quality reading as well as the new period reading is possible. This suggests that in the beginning, when only the quality reading of -hád existed, a reanalysis must have come about (cf. Figure 6). Without any intention of innovating, a noun which contained -hád and by chance expressed a transitory quality was used by a speaker S1 according to traditional rules, i.e. with the quality reading. A hearer H1 accomplished a figure-ground effect within the relevant frame, shifting from the quality to the period reading, an effect that remained entirely compatible with the global meaning of the utterance, because the transitory quality expressed constituted an ideal bridging context. As a speaker S2 of a subsequent communicative act, the former H1 then actively transmitted her/his innovation to a hearer H2, etc.

We therefore have to take into account metonymic reanalysis as a process of semantic change in word-formation devices. But things go much further than that. In a systematic examination of different types of semantic change in word-formation devices, undertaken by Rainer (2005), metonymy is omnipresent, though not exclusive. As for reanalysis (“reinterpretation”), which represents one of the central processes of change in this domain, Rainer states: “The most important type of lexical change giving rise to reinterpretation, according to my sources, is metonymy [. . .]” (2005: 423).

5.3. Word-formation products as a diachronic issue . . . and much more: contiguity within a three-dimensional grid of lexical diachrony

Now let us come to the most obvious application of the diachronic perspective to word-formation, namely the emergence of new word-formation products as part of lexical diachrony (section 5 (i)). To begin with, we will consider the origin of expressions for the concept TREE WHICH PRODUCES PEACHES in three different languages: English, French, and Russian. English and French display products of word-formation in this domain.
In Middle English a compound designating the TREE WHICH PRODUCES PEACHES came into being (36), which combined the already existing nouns peach and tree (according to the OED, first attested in ?a1366 and c825 respectively). In Old French a derivative is attested (37), whose base was pesche (FRUIT OF THE PEACH-TREE; cf. DHLF, pêche [= ModFr. form]).

(36) E. peach-tree ‘tree which produces peaches’ (first attestation: OED, peach-tree, c1400).

(37) OFr. pesch(i)er ‘tree which produces peaches’ (> ModFr. pêcher) (first attestation in the form peskier: 1150; cf. DHLF, pêche).

At first sight, the situation is quite different in Russian, where the designation for TREE WHICH PRODUCES PEACHES (38)(b) is the product of a metonymic change that the word for FRUIT OF THE PEACH-TREE, i.e. pérsik (38)(a), has undergone.²⁹

(38)
(a) Russ. pérsik ‘fruit of the peach-tree’.
(b) Russ. pérsik ‘tree which produces peaches’.

Since the concept FRUIT OF THE PEACH-TREE is an element of the frame TREE WHICH PRODUCES PEACHES, this change is based on a figure-ground effect from the element (= E1 in Figure 3b) to the contiguous frame (= FRAME in Figure 3a).³⁰ In fact, the formal relationship between (38)(a) and (b), i.e. mere identity, is completely different from the one visible in (36), i.e. integration into a composition, or the one shown in (37), i.e. suffixation. Nevertheless, on the semantic level a common denominator is visible for all three examples: the contiguity relation between the element FRUIT and the frame TREE.

We have to conclude that one and the same contiguity relation may not only connect the source and the target concepts of processes of semantic change (38), but also the source and the target concepts of word-formation processes ((36), (37); cf. Koch 1999a: 158f.; 1999b: 340–342; 2001a: 231–233). I would not speak of “metonymy” nor of “figure-ground effect” in the case of word-formation, because metonymy as a trope based on a figure-ground effect presupposes a constant on the expression side, i.e. identity of the signifier and its grammatical form. But the principle of contiguity transcends the formal differences between lexical devices, such as semantic change, suffixation, composition, or others. Conversion, which constitutes a particularly important device in English (cf. Marchand 1969; Faiß 1992), may involve cognitive contiguity relations as well, as shown by example (39), where the source concept is SNOW (cf. OED, snow, n.1, sense I.1.a.) and the target concept of a conversion (the) snow → (to) snow corresponds to the whole frame of the situation of snowing:

(39) (a) E. snow ‘fall down (snow)’ (first attestation: OED, snow, v., sense 1., a13..).

Consequently, contiguity is a cognitive relation underlying a wide range of diachronic lexical devices comprising different types of word-formation as well as semantic change with respect to the constant signifier of a given word. Yet the possibility of transcending the boundaries between semantic change and word-formation is by no means a privilege of the relation of contiguity. It also holds for other cognitive relations, e.g. for taxonomic subordination, as shown by examples (20) and (40), the latter being a compound based on English dog (first attestation: OED, dog, n.1, I., sense 1., c1050).

(40) hunting(-)dog ‘quadruped of the genus Canis used for hunting game’ (first attestation: OED, hunting dog, sense 1., 1863).

The reading (20)(b) of hound, which is the result of a process of semantic specialization, stands in a relation of taxonomic subordination to the original reading (20)(a). In a similar way, though by completely different formal means, the compound hunting(-)dog (40) is taxonomically subordinated to its base dog (40).

29. The etymological background suggests the posteriority of the tree-reading. Russ. pérsik ‘peach’ goes back to either AncGr. (mêlon) persikón or Lat. (mālum) Persicum borrowed from Greek, both designating the fruit (cf. REW, пе́рсик; SEW, pérsik). Furthermore, parallel metonymies in the same direction exist for other fruit concepts: grúša ‘pear’ → ‘pear-tree’; čeréšn’a ‘cherry’ → ‘cherry-tree’, etc.

30. Note that even in English a tree-reading as a result of a metonymic change of pear is available (cf. OED, pear, n., sense 2.). But in English this solution has remained marginal, whereas it has been generalized in Russian.
All in all, cognitive relations and lexical devices do not simply constitute two different paths, but two dimensions of lexical change (cf. Koch 2000: 81–84; Blank 2003; Gévaudan 2007: 58–61, 165–177; Gévaudan and Koch 2010: 113–117). In principle, as shown in Figure 7, we can “multiply” with each other the different cognitive-relational categories (= dimension 1; cf. 2.2.4) and the devices of lexical change (= dimension 2).
Figure 7. Three-dimensional grid for lexical diachrony
Furthermore, both dimension 1 and dimension 2 may interact with processes of borrowing. Spanish sombrero ‘hat’ specialized its meaning when borrowed into English (41). With respect to English handy (cf. OED, handy, a., sense 2.a.), the borrowing German Handy (42) is the result of a conversion (dimension 2), which expresses a contiguous concept (dimension 1: READY TO HAND → MOBILE PHONE).

(41) English sombrero ‘broad-brimmed hat of a type common in Spain and Spanish America’; first attestation: A brown cap or silk net, with a large flatted hat called a sombrero over it (The Gentleman’s Magazine, XL. 530; cit. OED, sombrero, sense 2., 1770).

(42) German Handy ‘mobile (phone)’ (Gévaudan 2007: 180).

So, then, the aspect of lexical stratification (autochthonous vs. borrowed) constitutes a third dimension of lexical change (= dimension 3). As shown in Figure 7 and by the examples (37)–(42), the values of the dimensions 1, 2, and
3 can be combined with each other in a systematic way³¹ (cf. Koch 2000: 81–89; Blank 2003; Gévaudan 2007: 61–63, 177–183; Gévaudan and Koch 2010: 113–119).

From the quantitative point of view, it would be interesting to know the total impact of contiguity on lexical change across the three dimensions represented in Figure 7. This has been tested on a sample of 179 concepts out of the so-called “Swadesh list” (cf. Swadesh 1955) for five Romance languages, considering, for every concept, the last step of change leading to its modern lexical expression (cf. Koch, in press). By far the most frequent cognitive relation overall is contiguity, whose rate varies around two fifths: French 45%, Spanish 40%, Italian 39%, Romanian 43%, Logudorese Sardinian 40%.

From a qualitative point of view, it is important to realize that the three-dimensional grid shown in Figure 7 is, first of all, a heuristic tool. Not every theoretically possible combination of values of dimensions 1, 2, and 3 necessarily occurs in the reality of lexical change. However, no empirical limitation seems to exist for the cognitive relation of contiguity. Two large-scale investigations in lexical diachrony, one of them systematic (Gévaudan 2007) and one of them empirical in nature (Steinberg 2010), indicate that contiguity (dimension 1) is combinable with any kind of lexical device (dimension 2) and of course with autochthonous change as well as with borrowing (dimension 3). This multi-task profile does not apply to other cognitive relations (2.2.4). To sum up, the three-dimensional perspective of the lexicological grid of Figure 7 clearly underpins the omnipresence of contiguity in lexical change.
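The way the three dimensions combine can be made concrete with a small sketch. It is purely illustrative: the class name, the field names, and the category labels are our own shorthand and are not taken from Figure 7. It encodes examples (36)–(40) and (42) as points in the grid, using only the classifications stated in the running text; example (41) is left out because its lexical device is not named explicitly above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalInnovation:
    """One lexical innovation located in the three-dimensional grid."""
    example: str    # example number used in the text
    form: str       # the resulting lexical item
    relation: str   # dimension 1: cognitive relation
    device: str     # dimension 2: lexical device
    stratum: str    # dimension 3: autochthonous vs. borrowed


# Classifications as given in the text (labels are illustrative shorthand).
INNOVATIONS = [
    LexicalInnovation("(36)", "E. peach-tree", "contiguity", "composition", "autochthonous"),
    LexicalInnovation("(37)", "OFr. pesch(i)er", "contiguity", "suffixation", "autochthonous"),
    LexicalInnovation("(38)", "Russ. persik", "contiguity", "semantic change", "autochthonous"),
    LexicalInnovation("(39)", "E. (to) snow", "contiguity", "conversion", "autochthonous"),
    LexicalInnovation("(40)", "E. hunting(-)dog", "taxonomic subordination", "composition", "autochthonous"),
    LexicalInnovation("(42)", "G. Handy", "contiguity", "conversion", "borrowed"),
]


def occupied_cells(items):
    """Count how many innovations fall into each (relation, device, stratum) cell."""
    return Counter((i.relation, i.device, i.stratum) for i in items)


def devices_for(relation, items):
    """List the lexical devices attested for one cognitive relation."""
    return sorted({i.device for i in items if i.relation == relation})


if __name__ == "__main__":
    for cell, count in occupied_cells(INNOVATIONS).items():
        print(cell, count)
    print(devices_for("contiguity", INNOVATIONS))
```

Even in this tiny sample, contiguity occurs with four different devices and with both strata, whereas taxonomic subordination is represented by a single device; the sample is of course far too small to support any quantitative claim.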
31. As far as they are concerned here, the examples (36)–(40) are all ‘autochthonous’ with respect to dimension 3.

6. Conclusion and further perspectives

We have seen that the relation of contiguity constitutes a fundamental cognitive principle that reappears everywhere in human language, typically but not only in the form of metonymy. The omnipresence of contiguity in language (see also Schifko 1979) has been illustrated here in the area of linguistic units endowed with meaning, and, more specifically, for their diachronic evolution. On the level of lexical-semantic change, we have encountered, besides the standard examples of metonymy (section 2.1), lexical subjectification (2.3), delocutive change (2.4), and the phenomenon of hearer-metonymy which translates into lexical reanalysis (2.5). Concerning the genesis of discourse markers (3), it seems almost inconceivable to find here anything but metonymies. We could only allude in passing to grammatical-semantic change and to its frequently metonymic bases (cf. section 4). By including word-formation in our considerations, we discovered not only further metonymic effects (5.2), but also a wide range of types of lexical change involving contiguity (5.1 and 5.3).

No doubt one can really speak of an omnipresence of metonymy and/or contiguity. Indeed, in comparison to the more complex taxonomic relations (1.2) and to the highly complex trope of metaphor, which brings together distant concepts belonging to different frames (2.2.4), contiguity and the metonymic figure-ground effect within frames represent particularly simple principles that are extraordinarily flexible from the semantic and the pragmatic point of view.

In order to complete this picture, research would have to be continued in two directions that the editorial limitations do not permit us to develop here. On the one hand, it would be useful to compare the semantic and pragmatic productiveness of metonymy with that of other types of semantic change (especially metaphor, generalization and specialization). It would appear that it is only metonymy which comprises speaker and hearer perspective at the same time, alongside efficiency and imprecision, expressivity, euphemism and dysphemism, etc. (cf. Koch 2004: 48ff.). On the other hand, the area of diachrony would have to be left in order to measure the impact of the relation of contiguity in the entire field of the lexicon, including synchronic motivation (cf. Koch 2001c: 1156–1168; Koch and Marzo 2007, esp. 279–281).
No doubt: it will become even clearer that the relation of contiguity, in its fundamental nature and in its pragmatic flexibility, is decidedly a ‘‘Jack of all trades. . . , master of some’’. References ALDH ¼ Georges, Karl Ernst and Heinrich Georges 1913 Ausfu¨hrliches lateinisch-deutsches Handwo¨rterbuch. Hannover: Hahnsche Buchhandlung [reprint 1983]. Allen, Cynthia L. 1995 Case Marking and Reanalysis. Oxford: Clarendon. Amin, Ismail 1973 Assoziationspsychologie und Gestaltpsychologie. Eine problemgeschichtliche Studie mit besonderer Beru¨cksichtigung der Berliner
302
Peter Koch
Schule. (Europa¨ische Hochschulschriften 6, 9.) Bern/Frankfurt a. M.: Lang. Anscombre, Jean-Claude, Franc¸oise Le´toublon, and Alain Pierrot 1987 Speech act verbs, linguistic action verbs, and delocutivity. In: Jef Verschueren (ed.), Linguistic Action. Empirical-conceptual Studies, 45–67. (Advances in Discourse Processes 23.) Norwood, N.J.: Ablex. Barcelona, Antonio (ed.) 2000 Metaphor and Metonymy at the Crossroads. A Cognitive Perspective. (Topics in English Linguistics 30.) Berlin/New York: Mouton de Gruyter. Barsalou, Lawrence W. 1992 Frames, concepts, and conceptual fields. In: Adrienne Lehrer and Eva F. Kittay (eds.), Frames, Fields, and Contrasts. New Essays in Semantic and Lexical Organization, 21–74. Hillsdale (N.J.)/London: Lawrence Erlbaum Associates. Benveniste, Emile 1966 Les verbes de´locutifs. In: Emile Benveniste, Proble`mes de linguistique ge´ne´rale, Volume 1, 277–285. Paris: Gallimard. Blank, Andreas 1997 Prinzipien des lexikalischen Bedeutungswandels am Beispiel der romanischen Sprachen. (Beihefte zur Zeitschrift fu¨r romanische Philologie 285.) Tu¨bingen: Niemeyer. Blank, Andreas 1999 Co-presence and succession. A cognitive typology of metonymy. In: Klaus-Uwe Panther and Gu¨nter Radden (eds.), Metonymy in Language and Thought, 169–192. (Human Cognitive Processing 4.) Amsterdam/Philadelphia: Benjamins. Blank, Andreas 2000 Pour une approche cognitive du changement se´mantique lexical: aspect se´masiologique. In: Franc¸ois, Jacques (ed.), Theories contemporaines du changement se´mantique, 59–73. (Me´moires de la Socie´te´ de Linguistique de Paris 9.) Louvain: Peeters. Blank, Andreas 2001 Pathways of lexicalization. In: Martin Haspelmath, Ekkehard Ko¨nig, Wulf Oesterreicher, and Wolfgang Raible (eds.), Language Typology and Language Universals/Sprachtypologie und sprachliche Universalien/La typologie des langues et les universaux linguistiques. An International Handbook/Ein internationales Handbuch/Manuel international, Volume 2, 1596–1608. 
(Handbu¨cher der Sprach- und Kommunikationswissenschaft 20.) Berlin/New York: de Gruyter). Blank, Andreas 2003 Words and concepts in time: towards diachronic cognitive onomasiology. In: Regine Eckardt, Klaus von Heusinger, and Christoph Schwarze (eds.), Words in Time. Diachronic Semantics
The pervasiveness of contiguity and metonymy in semantic change
303
from Di¤erent Points of View, 37–65. (Trends in Linguistics, Studies and Monographs 143.) Berlin/New York: Mouton de Gruyter. Blank, Andreas and Peter Koch (eds.) 1999 Historical Semantics and Cognition. (Cognitive Linguistics Research 13.) Berlin/New York: Mouton de Gruyter. Brinton, Laurel J. 1996 Pragmatic Markers in English. Grammaticalization and Discourse Function. (Topics in English Linguistics 19.) Berlin/New York: Mouton de Gruyter. Brinton, Laurel J. 2001 From matrix clause to pragmatic markers. The history of lookforms. Journal of Historical Pragmatics 2: 177–199. Brinton, Laurel J. and Elizabeth C. Traugott 2005 Lexicalization and Language Change. Cambridge: Cambridge University Press. Bybee, Joan, Revere Perkins, and William Pagliuca (eds.) 1994 The Evolution of Grammar. Tense, Aspect and Modality in the Languages of the World. Chicago/London: University of Chicago Press. Campbell, Lyle 2006 Historical Linguistics. An Introduction. Cambridge (Mass.): MIT Press. Croft, William 1993 The role of domains in the interpretation of metaphors and metonymies. Cognitive Linguistics 4: 335–370. Croft, William 2006 On explaining metonymy: Comment on Peirsman and Geeraerts, ‘Metonymy as a prototypical category’. Cognitive Linguistics 17: 317–326. Croft, William and Alan D. Cruse 2004 Cognitive Linguistics. Cambridge: Cambridge University Press. Cruse, D. Alan 1986 Lexical Semantics. Cambridge: Cambridge University Press. Cruse, D. Alan 2000 Meaning in Language. An Introduction to Semantics and Pragmatics. Oxford: Oxford University Press. De Smet, Hendrik and Jean-Christophe Verstraete 2006 Coming to terms with subjectivity. Cognitive Linguistics 17: 365– 392. Detges, Ulrich 1999 Wie entsteht Grammatik? Kognitive und pragmatische Grundlagen der Grammatikalisierung von Tempusmorphemen. In: Ju¨rgen Lang and Ingrid Neumann-Holzschuh (eds.), Reanalyse und Grammatikalisierung in den romanischen Sprachen, 31–52. (Linguistische Arbeiten 410.) Tu¨bingen: Narr.
Detges, Ulrich 2003 La grammaticalisation des constructions de négation dans une perspective onomasiologique, ou: la déconstruction d’une illusion d’optique. In: Andreas Blank and Peter Koch (eds.), Kognitive romanische Onomasiologie und Semasiologie, 213–233. (Linguistische Arbeiten 467.) Tübingen: Niemeyer.
Detges, Ulrich and Richard Waltereit 2002 Grammaticalization vs. reanalysis: a semantic-pragmatic account of functional change in grammar. Zeitschrift für Sprachwissenschaft 21: 151–195.
DHLF = Alain Rey 1992 Dictionnaire historique de la langue française, Volume 2. Paris: Dictionnaires le Robert.
Dik, Simon C. 1977 Inductive generalization in semantic change. In: Paul J. Hopper (ed.), Studies in Descriptive and Historical Linguistics. Festschrift for Winfried P. Lehmann, 283–300. (Amsterdam Studies in the Theory and History of Linguistic Science 4.) Amsterdam: Benjamins.
Dirven, René 1993 Metonymy and metaphor. Different mental strategies of conceptualisation. Leuvense Bijdragen 82: 1–28.
Dostie, Gaétane 2004 Pragmaticalisation et marqueurs discursifs. Analyse sémantique et traitement lexicographique. Bruxelles: De Boeck/Duculot.
Eco, Umberto 1984 Semiotica e filosofia del linguaggio. Turin: Einaudi.
Erman, Britt and Ulla-Britt Kotsinas 1993 Pragmaticalization: the case of ba’ and you know. Studier i modern språkvetenskap 10: 76–93.
Evans, Vyvyan and Melanie Green 2006 Cognitive Linguistics. An Introduction. Edinburgh: Edinburgh University Press.
Faiß, Klaus 1978 Verdunkelte Compounds im Englischen. Ein Beitrag zu Theorie und Praxis der Wortbildung. (Tübinger Beiträge zur Linguistik 104.) Tübingen: Narr.
Faiß, Klaus 1992 English Historical Morphology and Word-Formation: Loss versus Enrichment. (FOKUS. Linguistisch-Philologische Studien 8.) Trier: Wissenschaftlicher Verlag Trier.
Feyaerts, Kurt 2000 Refining the inheritance hypothesis. Interaction between metaphoric and metonymic hierarchies. In: Antonio Barcelona (ed.), Metaphor and Metonymy at the Crossroads. A Cognitive Perspective, 59–78. (Topics in English Linguistics 30.) Berlin/New York: Mouton de Gruyter.
Fillmore, Charles J. 1977 Scenes-and-frames-semantics. In: Antonio Zampolli (ed.), Linguistic Structures Processing, 55–81. (Fundamental Studies in Computer Science 5.) Amsterdam: Benjamins.
Fillmore, Charles J. 1985 Frames and the semantics of understanding. Quaderni di Semantica 6: 222–254.
Fontanier, Pierre 1977 Les figures du discours. Introduction par Gérard Genette. Paris: Flammarion.
François, Jacques (ed.) 2000 Theories contemporaines du changement sémantique. (Mémoires de la Société de Linguistique de Paris 9.) Louvain: Peeters.
GDLI = Salvatore Battaglia 1961–2004 Grande dizionario della lingua italiana, 30 volumes. Turin: UTET.
Geeraerts, Dirk 1997 Diachronic Prototype Semantics. A Contribution to Historical Lexicology. Oxford: Clarendon.
Geeraerts, Dirk 2006 A rough guide to Cognitive Linguistics. In: Dirk Geeraerts (ed.), Cognitive Linguistics. Basic Readings, 1–28. (Cognitive Linguistics Research 34.) Berlin/New York: Mouton de Gruyter.
Geeraerts, Dirk 2010 Theories of Lexical Semantics. Oxford: Oxford University Press.
Geeraerts, Dirk and Hubert Cuyckens (eds.) 2007 Cognitive Linguistics. Oxford: Oxford University Press.
Gévaudan, Paul 2007 Typologie des lexikalischen Wandels. Bedeutungswandel, Wortbildung und Entlehnung am Beispiel der romanischen Sprachen. (Linguistik 45.) Tübingen: Stauffenburg.
Gévaudan, Paul and Peter Koch 2010 Sémantique cognitive et changement lexical. In: Jacques François (ed.), Grandes voies et chemins de traverse de la sémantique cognitive, 103–145. (Mémoires de la Société de Linguistique de Paris, N.S., 18.) Leuven: Peeters.
Grady, Joseph E. 2007 Metaphor. In: Dirk Geeraerts and Hubert Cuyckens (eds.), Cognitive Linguistics, 188–213. Oxford: Oxford University Press.
Happ, Heinz 1985 ‘Paradigmatisch’ – ‘syntagmatisch’: Zur Bestimmung und Klärung zweier Grundbegriffe der Sprachwissenschaft. (Reihe Siegen 55.) Heidelberg: Winter.
Haspelmath, Martin 1998 Does grammaticalization need reanalysis? Studies in Language 22: 315–351.
Heine, Bernd 2002 On the role of context in grammaticalization. In: Ilse Wischer and Gabriele Diewald (eds.), New Reflections on Grammaticalization, 83–101. (Typological Studies in Language 49.) Amsterdam/Philadelphia: Benjamins.
Hock, Hans Heinrich 1991 Principles of Historical Linguistics. Berlin/New York: Mouton de Gruyter.
Holenstein, Elmar 1972 Phänomenologie der Assoziation. Zur Struktur und Funktion eines Grundprinzips der passiven Genesis bei E. Husserl. (Phaenomenologica 44.) Den Haag: Nijhoff.
Hopper, Paul J. and Elizabeth C. Traugott 2003 Grammaticalization, 2nd ed. Cambridge: Cambridge University Press.
Husserl, Edmund 1950/52 Ideen zu einer reinen Phänomenologie und phänomenologischen Philosophie. Edited by Walter Biemel/Marly Biemel. 3 volumes. (Husserliana 3–5.) Den Haag: Nijhoff.
Husserl, Edmund 1973 Cartesianische Meditationen und Pariser Vorträge, 2nd ed. Edited by Stephan Strasser. (Husserliana 1.) Den Haag: Nijhoff.
Inhelder, Bärbel and Jean Piaget 1970 The Early Growth of Logic in the Child. Classification and Seriation. London: Routledge & Kegan Paul.
Jakobson, Roman 1971 Two aspects of language and two types of aphasic disturbances. In: Roman Jakobson and Morris Halle (eds.), Fundamentals of Language, 43–67. Den Haag/Paris: Mouton.
Jespersen, Otto 1949 A Modern English Grammar on Historical Principles. Copenhague: Munksgaard/London: Allen & Unwin.
Kleiber, Georges 1999 Problèmes de sémantique. La polysémie en questions. Villeneuve d’Ascq: Presses Universitaires du Septentrion.
Klix, Friedhart 1984 Über Wissensrepräsentation im menschlichen Gehirn. In: Friedhart Klix (ed.), Gedächtnis, Wissen, Wissensnutzung, 9–73. Berlin: Deutscher Verlag der Wissenschaften.
Koch, Peter 1993 Kyenbé – tyonbo. Wurzeln kreolischer Lexik. Neue Romania 14: 259–287.
Koch, Peter 1998 Prototypikalität: konzeptuell – grammatisch – linguistisch. In: Udo L. Figge, Franz-Josef Klein, and Annette Martínez Moreno (eds.), Grammatische Strukturen und grammatischer Wandel im Französischen. Festschrift für Klaus Hunnius zum 65. Geburtstag, 281–308. (Abhandlungen zur Sprache und Literatur 117.) Bonn: Romanistischer Verlag.
Koch, Peter 1999a Frame and contiguity. On the cognitive bases of metonymy and certain types of word formation. In: Klaus-Uwe Panther and Günter Radden (eds.), Metonymy in Language and Thought, 139–167. (Human Cognitive Processing 4.) Amsterdam/Philadelphia: Benjamins.
Koch, Peter 1999b tree and fruit: A cognitive-onomasiological approach. Studi Italiani di Linguistica Teorica e Applicata 28/2: 331–347.
Koch, Peter 2000 Pour une approche cognitive du changement sémantique lexical: aspect onomasiologique. In: Jacques François (ed.), Theories contemporaines du changement sémantique, 75–95. (Mémoires de la Société de Linguistique de Paris 9.) Louvain: Peeters.
Koch, Peter 2001a Metonymy: unity in diversity. Journal of Historical Pragmatics 2: 201–244.
Koch, Peter 2001b As you like it. Les métataxes actantielles entre expérient et phenomena. In: Lene Schøsler (ed.), La Valence, perspectives romanes et diachroniques, 59–81. (Beihefte zur Zeitschrift für französische Sprache und Literatur, N.F. 30.) Stuttgart: Steiner.
Koch, Peter 2001c Lexical typology from a cognitive and linguistic point of view. In: Martin Haspelmath, Ekkehard König, Wulf Oesterreicher, and Wolfgang Raible (eds.), Language Typology and Language Universals/Sprachtypologie und sprachliche Universalien/La typologie des langues et les universaux linguistiques. An International Handbook/Ein internationales Handbuch/Manuel international, Volume 2, 1142–1178. Berlin/New York: de Gruyter.
Koch, Peter 2004 Metonymy between pragmatics, reference and diachrony. Metaphorik.de 07: 6–54. Available via http://www.metaphorik.de.
Koch, Peter 2005 Taxinomie et relations associatives. In: Adolfo Murguía (ed.), Sens et Références. Mélanges Georges Kleiber/Sinn und Referenz. Festschrift für Georges Kleiber, 159–191. Tübingen: Narr.
Koch, Peter 2007 Assoziation – Zeichen – Schrift. In: Daniel Jacob and Thomas Krefeld (eds.), Sprachgeschichte und Geschichte der Sprachwissenschaft, 11–52. Tübingen: Narr.
Koch, Peter 2008 Cognitive onomasiology and lexical change: around the eye. In: Martine Vanhove (ed.), From Polysemy to Semantic Change, 107–137. (Studies in Language Companion Series 106.) Amsterdam/Philadelphia: Benjamins.
Koch, Peter (in press) Divergencias y semejanzas de designación en el vocabulario central de las lenguas románicas. In: Alicia Puigvert (ed.), Corrientes de Estudio de Semántica y Pragmática Histórica.
Koch, Peter and Daniela Marzo 2007 A two-dimensional approach to the study of motivation in lexical typology and its first application to French high-frequency vocabulary. Studies in Language 31: 259–291.
Koch, Peter and Esme Winter-Froemel 2009 Synekdoche. In: Gerd Ueding (ed.), Historisches Wörterbuch der Rhetorik, Volume 9, 356–366. Tübingen: Niemeyer.
Lakoff, George 1987 Women, Fire, and Dangerous Things. What Categories Reveal about the Mind. Chicago/London: University of Chicago Press.
Lakoff, George and Mark Johnson 1980 Metaphors We Live By. Chicago: University of Chicago Press.
Lang, Jürgen and Ingrid Neumann-Holzschuh (eds.) 1999 Reanalyse und Grammatikalisierung in den romanischen Sprachen. (Linguistische Arbeiten 410.) Tübingen: Narr.
Langacker, Ronald W. 1977 Syntactic reanalysis. In: Charles N. Li (ed.), Mechanisms of Syntactic Change, 57–139. Austin: University of Texas Press.
Langacker, Ronald W. 1999 Losing control: grammaticization, subjectification, and transparency. In: Andreas Blank and Peter Koch (eds.), Historical Semantics and Cognition, 147–175. (Cognitive Linguistics Research 13.) Berlin/New York: Mouton de Gruyter.
Lausberg, Heinrich 1973 Handbuch der literarischen Rhetorik. Eine Grundlegung der Literaturwissenschaft, 2nd ed. München: Hueber.
Le Guern, Michel 1973 Sémantique de la métaphore et de la métonymie. Paris: Larousse.
Lyons, John 1977 Semantics, 2 volumes. Cambridge: Cambridge University Press.
Marchand, Hans 1969 The Categories and Types of Present-Day English Word-Formation. A Synchronic-Diachronic Approach, 2nd ed. München: Beck.
Marchello-Nizia, Christiane 2006 Grammaticalisation et changement linguistique. Bruxelles: De Boeck/Duculot.
Mihatsch, Wiltrud 2006 Kognitive Grundlagen lexikalischer Hierarchien, untersucht am Beispiel des Französischen und Spanischen. (Linguistische Arbeiten 506.) Tübingen: Niemeyer.
Hansen, Maj-Britt Mosegaard 1998 The Function of Discourse Particles. (Pragmatics & Beyond, N.S. 53.) Amsterdam/Philadelphia: Benjamins.
Mutz, Katrin 2000 Die italienischen Modifikationssuffixe. Synchronie und Diachronie. (Europäische Hochschulschriften 9, 33.) Frankfurt a.M.: Lang.
Nerlich, Brigitte 1992 Semantic Theories in Europe 1830–1930. From Etymology to Contextuality. (Amsterdam Studies in the Theory and History of Linguistic Science III, 59.) Amsterdam/Philadelphia: Benjamins.
Nerlich, Brigitte and David D. Clarke 1999 Synecdoche as a cognitive and communicative strategy. In: Andreas Blank and Peter Koch (eds.), Historical Semantics and Cognition, 197–213. (Cognitive Linguistics Research 13.) Berlin/New York: Mouton de Gruyter.
Nunberg, Geoffrey 1995 Transfers of meaning. Journal of Semantics 17: 109–132.
OED 2002 Oxford English Dictionary. CD-ROM version 3.00. Oxford: Oxford University Press.
Panther, Klaus-Uwe and Günter Radden (eds.) 1999 Metonymy in Language and Thought. (Human Cognitive Processing 4.) Amsterdam/Philadelphia: Benjamins.
Panther, Klaus-Uwe and Linda L. Thornburg (eds.) 2003 Metonymy and Pragmatic Inferencing. (Pragmatics and Beyond 113.) Amsterdam: Benjamins.
Panther, Klaus-Uwe and Linda L. Thornburg 2007 Metonymy. In: Dirk Geeraerts and Hubert Cuyckens (eds.), Cognitive Linguistics, 236–263. Oxford: Oxford University Press.
Peirsman, Yves and Dirk Geeraerts 2006a Metonymy as a prototypical category. Cognitive Linguistics 17: 269–316.
Peirsman, Yves and Dirk Geeraerts 2006b Don’t let metonymy be misunderstood: An answer to Croft. Cognitive Linguistics 17: 327–335.
Radden, Günter and Zoltán Kövecses 1999 Towards a theory of metonymy. In: Klaus-Uwe Panther and Günter Radden (eds.), Metonymy in Language and Thought, 17–59. (Human Cognitive Processing 4.) Amsterdam/Philadelphia: Benjamins.
Radden, Günter and Klaus-Uwe Panther 2004 Introduction: Reflections on motivation. In: Klaus-Uwe Panther and Günter Radden (eds.), Studies in Linguistic Motivation, 1–46. (Cognitive Linguistics Research 28.) Berlin/New York: Mouton de Gruyter.
Raible, Wolfgang 1981 Von der Allgegenwart des Gegensinnes (und einiger anderer Relationen). Strategien zur Einordnung semantischer Information. Zeitschrift für romanische Philologie 97: 1–40.
Rainer, Franz 2005 Semantic Change in Word Formation. Linguistics 43: 415–441.
REW = Max Vasmer 1953–58 Russisches etymologisches Wörterbuch, 3 volumes. Heidelberg: Winter.
Rosch, Eleanor 1973 On the internal structure of perceptual and semantic categories. In: Timothy E. Moore (ed.), Cognitive Development and the Acquisition of Language, 111–144. New York: Academic Press.
Roudet, Léonce 1921 Sur la classification psychologique des changements sémantiques. Journal de psychologie 18: 676–692.
Ruiz de Mendoza Ibáñez, Francisco José 2000 The role of mapping and domains in understanding metonymy. In: Antonio Barcelona (ed.), Metaphor and Metonymy at the Crossroads. A Cognitive Perspective, 109–132. (Topics in English Linguistics 30.) Berlin/New York: Mouton de Gruyter.
Schifko, Peter 1979 Die Metonymie als universales sprachliches Strukturprinzip. Grazer Linguistische Studien 10: 240–264.
Seefranz-Montag, Ariane von 1983 Syntaktische Funktionen und Wortstellungsveränderung. Die Entwicklung “subjektloser” Konstruktionen in einigen Sprachen. (Studien zur Theoretischen Linguistik 3.) München: Fink.
Seto, Ken-ichi 1999 Distinguishing metonymy from synecdoche. In: Klaus-Uwe Panther and Günter Radden (eds.), Metonymy in Language and Thought, 91–120. (Human Cognitive Processing 4.) Amsterdam/Philadelphia: Benjamins.
SEW = Erich Berneker 1913/14 Slavisches etymologisches Wörterbuch, Volume 2. (Indogermanische Bibliothek. Abt. 1, Lehr- und Handbücher. Reihe 2, Wörterbücher, 2 – Sammlung slavischer Lehr- und Handbücher. II, Reihe, Wörterbücher 1.) Heidelberg: Winter.
Sperber, Dan and Deirdre Wilson 1995 Relevance. Communication and Cognition, 2nd ed. Oxford/Cambridge (Mass.): Blackwell.
Steinberg, Reinhild 2010 Lexikalische Polygenese im Konzeptbereich KOPF. PhD dissertation. University of Tübingen.
Swadesh, Morris 1955 Towards greater accuracy in lexicostatistic dating. International Journal of American Linguistics 21: 121–137.
Taylor, John R. 1995 Linguistic Categorization. Prototypes in Linguistic Theory, 2nd ed. Oxford: Clarendon.
Taylor, John R. 2002 Cognitive Grammar. Oxford/New York: Oxford University Press.
Traugott, Elizabeth C. 1999 The rhetoric of counter-expectation in semantic change: a study in subjectification. In: Andreas Blank and Peter Koch (eds.), Historical Semantics and Cognition, 177–196. (Cognitive Linguistics Research 13.) Berlin/New York: Mouton de Gruyter.
Traugott, Elizabeth C. and Richard B. Dasher 2002 Regularity in Semantic Change. (Cambridge Studies in Linguistics 97.) Cambridge: Cambridge University Press.
Ullmann, Stephen 1957 The Principles of Semantics. A Linguistic Approach to Meaning, 2nd ed. Oxford: Blackwell.
Ullmann, Stephen 1964 Semantics. An Introduction to the Science of Meaning, 2nd ed. Oxford: Blackwell.
Ungerer, Friedrich and Hans-Jörg Schmid 1996 An Introduction to Cognitive Linguistics. London/New York: Longman.
Waltereit, Richard 1998 Metonymie und Grammatik. Kontiguitätsphänomene in der französischen Satzsemantik. (Linguistische Arbeiten 385.) Tübingen: Niemeyer.
Waltereit, Richard 1999 Reanalyse als metonymischer Prozeß. In: Jürgen Lang and Ingrid Neumann-Holzschuh (eds.), Reanalyse und Grammatikalisierung in den romanischen Sprachen, 19–29. (Linguistische Arbeiten 410.) Tübingen: Narr.
Waltereit, Richard 2002 Imperatives, interruption in conversation, and the rise of discourse markers. A study of Italian guarda. Linguistics 40: 987–1010.
Wertheimer, Max 1922/23 Untersuchungen zur Lehre von der Gestalt. Psychologische Forschungen 1: 47–58; 4: 301–350.
A cognitive approach to the methodology of semantic reconstruction: The case of Eng. chin and knee
Gábor Gyori and Irén Hegedus
Abstract
In etymological research the uncertainties of semantic reconstruction derive mainly from the lack of regularity in semantic change. Looking at the cognitive background of changes can reveal general tendencies and sometimes even universal paths of change, but for the concrete identification of cognates a more particular approach is needed. The paper suggests that applying the image schemas developed by cognitive linguistics may make it possible to establish cognates in cases where the formal (phonological) correspondence is visible but the semantic connection has so far been covert, as in the case of English chin and knee. This methodological approach also has the advantage of eliminating cases of multiple homonymy resulting from reconstructions.
1. Introduction
In historical reconstruction the postulation of possible cognate forms has two basic requirements. One is the existence of regular sound correspondences between the forms, while the other is the supposition of some kind of semantic relationship between the putative cognates on the basis of which we have to be able to supply a plausible explanation for the semantic development from some earlier underlying meaning. The first requirement is by far the more compelling evidence for the reconstruction of a common etymon. The theoretical primacy of this requirement derives from our established knowledge of the regularity of sound change as opposed to the lack of regularity in semantic change, on which a postulation of semantic correspondences could be based. This assumed ad hoc character of semantic change has always posed a problem for semantic reconstruction (cf. e.g. Sweetser 1990: 26).

For the sake of achieving a rigor similar to phonological reconstruction in the reconstruction of meanings, the best available method appears to be a feature analysis of the putative related meanings, from which some kind
of a lowest common denominator is derived. The bundle of features yielded in this way can then be posited as the original meaning. However, as Sweetser (1990: 24) has clearly shown, these meanings do not seem to be realistic at all, since such a procedure yields a proto-vocabulary full of abstract meanings, which contradicts our knowledge of semantic change running from concrete to abstract in the vast majority of cases.

The problem with the above method of semantic reconstruction does not lie in its imprecision but in the view it maintains about the nature of meaning. Langacker (1987: 157) has pointed out that semantic extension – among other everyday semantic phenomena – cannot be handled by an autonomous feature-based approach but only in an encyclopaedic view of meaning. This is because the understanding of any new meaning on the basis of the original meaning involves associative processes rather than simple algorithmic operations. In Sweetser’s (1990: 24) opinion, a cognitive theory of meaning cannot subscribe to the idea that the basic mechanisms of semantic change can be reduced to loss, addition and recombination of semantic features.

In any case, no semantic reconstruction can be initiated unless we can find some semantic relationship between our tentatively cognate forms and supply a plausible explanation for the given development. An encyclopaedic approach to meaning incorporating the view of the prototypical nature of conceptual categories is especially suited to explain semantic extension and change because it is based on natural cognitive capacities which underlie meanings. In phonological change, for instance, naturalness of a reconstructed phonemic inventory and naturalness of the processes by which it is possible to derive the attested phonemic inventories from the proto-system is an important governing principle. 
That is, one of the tests of the validity of particular reconstructions has been mostly the existence of typological parallels across languages (see Fox 1995: 253f. on problems of the applicability of typological considerations in phonological and syntactic reconstruction). Joseph and Karnitis (1999) state that research on change in components of grammar other than semantics has always benefited from work on naturalness constraints, whereas the study of semantic change and the search for cognates have at best “the traditional methodology of looking for parallels to get a handle on the wide range of semantic extension” (Joseph and Karnitis 1999: 152).

A more practical problem of etymological research caused by the uncertainties of semantic reconstruction is that there are cases in which the formal (phonological) correspondence is visible, yet a semantic connection seems impossible or rather difficult to make. As a result several cases
of reconstructed etyma are considered to be homophonous on the protolinguistic level because their connection is semantically not feasible (cf. Hegedus 2008). As an illustration, consider the following examples in which some kind of semantic connection could intuitively be made:

1. PIE *bhergh-¹ ‘to hide, protect’ and PIE *bhergh-² ‘high (with derivatives referring to hills and hill-forts)’:
1.1. PIE *bhergh-¹ ‘to hide, protect’, in zero-grade form *bhr̥gh- > PGmc. *burg-jan > OE byrgan > Mod.E. bury,
1.2. PIE *bhergh-² ‘high (with derivatives referring to hills and hill-forts)’, in zero-grade form *bhr̥gh- > PGmc. *burgs ‘hill-fort’ > OE burg, burh, byrig ‘(fortified) town’ > Mod.E. borough, reflected also in a grammaticalized form as -bury (Watkins 2000: 11).

2. PIE *wer- has seven homonymic entries in Watkins 2000, at least two of which can be semantically connected:
2.1. PIE *wer-⁴ ‘to perceive, watch out for’, in o-grade form with a suffix *wor-o- > PGmc. *wara- > OE wær ‘watchful’ > Mod.E. wary (Watkins 2000: 99–100),
2.2. PIE *wer-⁵ ‘to cover’, in o-grade form with a suffix *wor-no- > PGmc. *war-nōn > Pre-OE *war(e)nian ‘to take heed, warn’ > Mod.E. warn (Watkins 2000: 100).
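
The bookkeeping behind such cases of reconstructed homonymy can be made explicit with a small sketch. The Python fragment below is only an illustration under stated assumptions, not a procedure proposed in this chapter: the record type `Etymon` and the helper `homonym_sets` are hypothetical names, the root of both entries in example 1 is normalized to *bhergh-, and the data are limited to the four entries cited above. It groups dictionary entries by root form and returns those forms that are listed more than once, i.e. the candidates for which a unifying semantic explanation might be sought.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Etymon:
    """One dictionary entry for a reconstructed root (hypothetical record type)."""
    root: str    # reconstructed PIE root form, without the homonym index
    index: int   # homonym number of the entry as cited from Watkins 2000
    gloss: str   # reconstructed meaning
    reflex: str  # Modern English reflex cited in the examples


# The four entries from examples 1 and 2; nothing beyond the text is assumed here
# except the normalized spelling of the root in example 1.
ETYMA = [
    Etymon("*bhergh-", 1, "to hide, protect", "bury"),
    Etymon("*bhergh-", 2, "high", "borough"),
    Etymon("*wer-", 4, "to perceive, watch out for", "wary"),
    Etymon("*wer-", 5, "to cover", "warn"),
]


def homonym_sets(etyma):
    """Group entries by root form and keep only forms with more than one entry."""
    by_root = defaultdict(list)
    for entry in etyma:
        by_root[entry.root].append(entry)
    return {root: entries for root, entries in by_root.items() if len(entries) > 1}


if __name__ == "__main__":
    # List each homophonous root with the reflexes and glosses of its entries.
    for root, entries in homonym_sets(ETYMA).items():
        listing = ", ".join(f"{e.reflex} ({e.gloss})" for e in entries)
        print(f"{root}: {listing}")
```

Such a grouping only identifies where homonymy has been posited; whether two entries can in fact be reduced to one etymon remains a matter of semantic argument of the kind developed in the following sections.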
In this paper we will take a closer look at one such case, which is represented by the English words knee < PIE *ĝónu-/*ĝnu- and chin < PIE *ĝenu-, where the PIE forms can be considered different ablaut grades of one and the same root. Mallory and Adams (1997: 336) suggest that these two forms may be semantically related on the basis of shape, namely “both being sharply angled parts of the body.” Below we will try to provide evidence for this cognate relationship and explain the semantic connection between these two lexemes on cognitive grounds. As the problem of semantic reconstruction largely depends on finding general and regular mechanisms and processes in semantic change, we will turn to this issue first.

2. Is there regularity in semantic change?
Semantic change has been considered to be essentially sporadic, devoid of regularities in the form of systematic changes (cf. e.g. Anttila 1989: 147; Hock and Joseph 1996: 244). Thus, the investigation of generalizability in this area of language change has always been a problematic issue (McMahon 1994: 175). Traugott and Dasher (2002: 3–4) remark that lexemes in the nominal domain are especially susceptible to irregular changes due to the strong influence of extralinguistic factors. In spite of this, certain types of regularities have been found also in the case of semantic change, for instance in the unidirectional relationships in certain types of semantic structure (Traugott and Dasher 2002: 26).

Most of the generalizing work on semantic change has been concerned with classifications of the changes according to various mechanisms, results, attitudes, causes, etc. (e.g. Algeo 1990; Blank 1997; Campbell 1998: 256–266; Fortson 2003; Hock 1991: 284–305; McMahon 1994: 178–184). It is important to note, however, that it would be wrong to suggest that certain changes can be classified by causes, others by mechanisms and still others by range, results or attitude, etc. These aspects do not exclude each other; rather, a single change can and should be described from several perspectives in order to give a complete and precise characterization of the change in question (cf. Gyori 2004: 29).

Despite the progress made in research on general and regular characteristics of semantic change, the question to what extent these are comparable to the regularity and systematicity of the changes found at other levels of linguistic analysis has remained rather controversial. Historical changes at the levels of phonology, morphology and syntax have been found to exhibit regular and systematic effects on which generalizations can be based and laws can be established. In many cases these even constitute major events in the history of a language, affecting the whole language system. Such effects are unthinkable for semantic change.
One of the reasons why semantic change has been found to be mostly irregular as compared to changes at other levels of linguistic analysis is that there is an essential structural difference between these levels and semantics. This difference is manifest in the fact that in the case of semantics the change operates on an open-ended set of linguistic elements, namely lexical items, while changes at the levels of phonology, morphology and syntax concern closed-system items (i.e., restricted sets of elements) (cf. McMahon 1994: 185). For this reason, we think that it is important to make a terminological difference between generalizability and regularity. Semantic change is generalizable rather than regular because various established general aspects can be applied in the characterization of any single change. However, semantic change is not regular in the sense that the change of the meaning of one lexeme will have specific definable effects on the whole semantic
system of a language. This kind of regularity is characteristic of sound change, where most changes will affect the complete phonological system. Such effects will alter the constellation of the phonological inventory, especially in the case of sound shifts which totally recast the distribution of phonemes in the system. The reason for this is the existence of a phonological space, which physically and physiologically limits the possibilities for the changes.

Conversely, the semantic structure of a language does not have the same system characteristics as does the phonological structure. In the case of lexemes and their meanings, the status of individual elements of the system does not formally and systematically determine the status of other elements, i.e., they do not condition each other’s properties and position in the system, as is the case with phonemes. However, where parts of the semantic structure appear to be closed or at least semiclosed, as with homonymy, synonymy and lexical fields, the (relative) system characteristics of a semantic space will condition the changes. These special cases of systematic changes of meaning are the elimination of homonymic clash (Hock 1991: 297–298), the differentiation of synonyms (Berndt 1989: 98–102), and chain shifts within lexical fields (Hock and Joseph 1996: 245ff.; cf. also Anttila 1989: 146–147; McMahon 1994: 186). However, they represent only a relatively small minority of meaning changes in languages. The reason for this is that apart from these cases of systematic changes, semantic change provides solutions for problems of efficiency in communication and mental representation (cf. Geeraerts 1997: 102–108), while phonological, morphological and syntactic changes solve structural problems in the system.

Another reason for semantic change to be different from other types of linguistic change pertains exactly to this last point, i.e., the function of semantic change. 
Since, as we have seen, meaning can be characterized only partly through aspects of linguistic structure, it is even more important to take this reason into consideration in semantic reconstruction. Research in cognitive semantics has shown that semantic knowledge is far from autonomous. Meanings are based on encyclopaedic knowledge with specific constraints (Langacker 1987: 153), and represent socially shared and culturally valid conceptualizations. In other words, semantic structure is conventionalized conceptual structure (Langacker 1987: 99). Armed with this theory, cognitively oriented historical semantics has made considerable progress in the theoretical account of meaning change (e.g. Geeraerts 1997), as opposed to traditional logic-based semantics, which has failed to give any explanation of such change. However, much as a semantic theory founded on the open-ended nature of meaning can give a solid
account of why and how semantic change happens, the fact still remains that most semantic changes require individual explanations based on our knowledge of the socio-cultural history of the speakers of a language (cf. Anttila 1989: 137, and Campbell 1998: 267). And this is no wonder in view of the fact that meanings represent parts of a speech community’s conventionalized mental model of their natural and socio-cultural environment.

Despite the difficulties, semantic change is not completely ungeneralizable. König and Siemund (1999: 237) claim that recent studies have seriously questioned the irregularity of semantic change because “all semantic changes are instances of a very limited set of possible processes, such as metaphor, metonymy, ellipsis, narrowing, broadening, etc.” However, while ellipsis is a linguistic device promoting economy of expression, metaphor, metonymy, meaning restriction and extension (in fact category restriction and extension) are the basic cognitive mechanisms that yield novel conceptualizations of the world, and on which therefore semantic extension is based (Gyori 2002: 134, 143; Koch 2008: 115; see also Koch, this volume). In fact, these are the four well-established mechanisms of semantic change that almost all investigated languages make use of (cf. Traugott 1985; Traugott and Dasher 2002: 56–57). When speakers perform these cognitive operations on entrenched meanings for the sake of enhancing communicative efficiency, the linguistic manifestations of these operations may become conventionalized and new concepts of cultural relevance may become established. When this happens, semantic change has taken place.

Another area of research into the generalizability of semantic change has been concerned with the direction of the changes. It has been claimed that this direction follows general principles. 
Thus, Traugott (1990) points out three tendencies of change in which later meanings increasingly reflect the way speakers subjectively view the world (cf. also Traugott and Dasher 2002: 99; see also Hansen, this volume). Wilkins (1996), discussing the semantic domain which he calls “parts of a person,” argues that synecdochic change is unidirectional because normally “a term referring to a visible part [. . .] [will] come to refer to the visible whole of which it is an intermediate, and a spatially and/or functionally integral part” (Wilkins 1996: 275). A more general tendency, which embraces the above two, has also been established: meaning changes usually follow a concrete to abstract development (e.g. Hock 1991: 290, and Sweetser 1990: 18). However, Campbell (1998: 273) draws attention to the fact that the unidirectionality principle does not always hold because semantic restrictions
‘‘often involve change toward more concreteness,’’ as in the case of Eng. fowl ‘domestic bird’ < Old English fugol ‘bird’, or Eng. deer < Old English dēor ‘animal’.
When studying the generalizability of semantic change, one of the most controversial questions is whether, in addition to describing general mechanisms and general directions of change, generalizations concerning the content of meaning changes can also be made. It appears to be self-evident that this aspect of semantic change is the most culture-dependent, and the influence of the specific socio-cultural environment of the speech community is an important factor in the change. In spite of the culture-dependent character of the content side of semantic change, certain general tendencies have been found here as well. These general tendencies reveal themselves in similar conceptualizations across languages. Onomasiological change in particular may reflect certain universal directions of human thought. When looking at whole lexical fields, we may find that the various mechanisms of change lead to similar conceptual avenues in referring to particular phenomena (cf. Anttila 1989: 147). Campbell (1998: 270–272) provides a good overview of the general tendencies that have been observed in certain kinds of changes. Haser (2000) has looked at a large body of data for semantic change in various lexical fields in a large number of historically unrelated languages and compiled a long list of the similar trends of development between source and target concepts (see also Zalizniak 2008). Such tendencies can also be identified in grammaticalization (Traugott and Dasher 2002: 86). Győri (1998) examined the lexicalization processes of basic emotion terms in different languages and found that the historical data show significant similarities of onomasiological change.
The most typical changes are metonymic extensions of terms for the physiological symptoms and other effects accompanying the expression of the various emotions. Within this general frame many of the cases reveal similar or even identical conceptualizations, but a considerable number of the conceptualizations are different. These latter cases do not seem to manifest culture-dependence only, but also the freedom and creativity of human thought within the above general constraints.
3. The cognitive foundations of generalities in semantic change
Describing the natural tendencies of semantic change attested in various languages may provide considerable help in looking for semantic connections in etymological research. If we can even provide an explanation for the occurrence of these tendencies, the accuracy of semantic reconstruction may be further increased. In our view, the causes behind the generalities found in the linguistic manifestations of semantic change are of a cognitive nature. In other words, they do not derive from the structural properties of linguistic systems, but from the way the human mind operates in perceiving and understanding the world. Cross-linguistic universal tendencies in semantic change originate therefore in the universal cognitive processes of the human mind in an effort to solve the problem of efficient communication. A better understanding of the cognitive processes involved in semantic extensions may be of great help in finding hidden or obscure semantic connections between putative cognate forms.
When trying to get new ideas, views, attitudes, etc. about the world across to the hearer, the speaker faces certain communicative challenges. This raises a cognitive problem, since the speaker’s particular conceptualization of phenomena (s)he wishes to communicate about must be made accessible to the hearer in order to ensure mutual intelligibility. Due to the analogical nature of the human mind relying on the transposition of gestalts (Anttila 2003: 429), this is most obviously done through the exploitation of familiar knowledge (Holyoak and Thagard 1997), and the application of the fundamental cognitive mechanisms serving the exploitation of such knowledge. These mechanisms are category extension and restriction, and – due to our cognitive disposition to perceive similarity and contiguity (Anttila 2003: 431) – especially metaphor and metonymy (Traugott and Dasher 2002: 27), by which the human mind makes sense of the world in general (Dirven 1993; Johnson 1987: xx, 100; Lakoff and Johnson 1980: 36; Lakoff 1987: 77).
The necessary familiar knowledge shared by both speaker and hearer resides basically in the already-coded category system of the language. Since new knowledge is derived by imposing new structure on familiar knowledge (Langacker 1987: 105), the modification of conventional semantic structures is the primary way for sharing new perspectives and conceptualizations. An effective utilization of familiar knowledge through the imposition of structure can be achieved only if the most suitable expression is selected for semantic modification and if the most appropriate cognitive mechanism is applied. Győri (2002: 143–147) gives an account of the special cognitive factors governing this selection. These factors constitute a significant influence on universal tendencies in semantic extension (Győri 2004; cf. also Koch 2008).
As we have seen above, one of the levels at which semantic change can be generalized is that of the well-established linguistic mechanisms of metaphor, metonymy, and semantic broadening and narrowing. The universality of these mechanisms in the modification of meanings seems to be fairly self-evident considering the fact that they originate in universal human cognitive mechanisms. Győri (2004: 30) has called these ‘‘universals of form’’ due to the fact that these mechanisms pertain to the mode of conceptualizations inherent in the innovative usage of conventional expressions.
In contrast to the general mechanisms of human cognition that shape our conceptualizations and inevitably lead to universal patterns in the modification of meanings, the content of these conceptualizations may be influenced to a considerable degree by the cultural context. Thus, a much more challenging undertaking appears to be the search for universals that relate to the content of the conceptualizations on which the innovative usage of conventional expressions is based. Such universals, called content universals of semantic change in Győri (2004: 31), are due to cognitive factors which reduce or even cancel the relativistic effects of the cultural context and bring about similar conceptualizations in various languages under different cultural conditions. One such obvious factor lies in the mechanisms of change themselves (cf. Anttila 1989: 148). They may lead to universal conceptualizations because familiar knowledge can be utilized only by way of the cognitive processes underlying these mechanisms. Thus, the reason why these mechanisms may universally induce certain specific conceptualizations is the generality of particular types of knowledge, which count as familiar independent of the cultural context.
This should be the most obvious in the case of metonymy, which probably requires the least cognitive effort due to the often explicit contiguity of the referents of the source and the target domains of the semantic extension (see Koch, this volume). This may also be the reason for the dominance of metonymic onomasiological changes in certain lexical fields (Győri 1998). All this appears to be deeply rooted in Rosch’s (1978) two basic principles of categorization: cognitive economy and perceived world structure. The cognitive salience of contiguity derives from the perception of the close correlational structure of the world, which is mostly independent of the cultural context. Conceptualizing certain phenomena in terms of these universally perceived correlations will also satisfy the principle of cognitive economy by providing ‘‘maximum information with the least cognitive effort’’ (Rosch 1978: 28).
Thus, in general it could be said that the more specific similarities in the content of conceptualizations might be accounted for by a causal chain of cognitive operations in which universally perceived world structure is inherited by more abstract levels of conceptualizations. The categorization principle ‘‘perceived world structure’’ influences the topological structure of conceptual domains that may eventually serve as the source domains of metaphorical and metonymical extensions, and this structure is further preserved in mappings onto a target domain (cf. Lakoff 1990). Since perceived world structure also influences our taxonomical view of the world, category extension and restriction, as manifest in semantic broadening and narrowing, may also yield universal conceptualizations by way of a similar chain.
General tendencies appear to be further determined by the fact that the human conceptualizing capacity may function at various levels of specificity (Lakoff 1987: 281). Thus, universal avenues of change are expected to appear at various degrees of conceptual abstraction. Of very low conceptual specificity is the universal tendency of the unidirectional development from concrete to abstract in semantic extensions. This tendency is not simply a linguistic phenomenon but derives from a fundamental characteristic of human cognition. Harnad (1990) has shown that all symbolic representations must be empirically grounded. Our direct perceptual experience will produce two kinds of mental representations of the concrete world, iconic ones, which are mental analogues of concrete objects and events, and categorical ones generated by innate and learned feature detectors. Symbolic representations acquire their grounding indirectly by being composed of these directly grounded ones.
Categorical representations appear to be akin to basic level categories, which are categories of concrete objects at an intermediate level of our conceptual hierarchy in being ‘‘the most abstract categories for which an image could be reasonably representative of the class as a whole’’ (Rosch 1978: 34). These categories have principal psychological salience in human cognition because they function as cognitive reference points: objects are first recognized at this level and they are also the first ones to be learned by children (Rosch 1978: 35). The notion of embodiment in cognitive semantics is also based on the claim that meaning and abstract reason in general originate in concrete and direct perceptual experience and bodily interaction with our environment. Johnson (1987) claims that it is through this behavioral activity that we recognize recurrent patterns which generate pre-conceptual mental structures called image schemata. These are gestalt structures whose
internal organization makes our experience of the world meaningful by lending it ‘‘regularity, coherence, and comprehensibility’’ (Johnson 1987: 62). Furthermore, the abstract domains of our experience are understood via metaphorical projections from these image schematic gestalts.
As we have seen, the universal tendency of the concrete to abstract development in semantic change originates in what is probably one of the most fundamental principles of human cognition, the general disposition of the mind to proceed from concrete to abstract, i.e., to understand abstract domains in terms of concrete ones. Correspondingly, the universal tendency of semantic change to proceed in this direction is of very low specificity. However, the concrete to abstract trajectory of human thought is manifest in the functioning of image schemata only in a very general way. The various image schemata also function in much more specific ways corresponding to the particular recurrent patterns of bodily experience, which further constrains the generality of the content of conceptualizations in concordance with the gestalt structures of the various schemata (cf. Johnson 1987: 112ff.). Since the various image schemata are expected to arise in all cultures due to the basic ways of human interaction with the environment, the specific conceptualizations emanating from them may still lead to universal semantic extensions.
It could thus be argued that if semantic extension is rooted in cognitive processes of the human mind, then general tendencies of change should derive from what is universal about human conceptualization, and further that the levels of specificity of these universal semantic extensions will be determined by what level of specificity universal conceptualizations can reach (cf. Geeraerts 1994).
In other words, the more specific the content of universal human cognitive processes, for instance conceptualizations on the basis of image schematic projections, the more specific are the semantic changes found cross-linguistically. Image schematic projections seem to account for a relatively high level of specificity in universal avenues of semantic change, though Haser (2000: 187) remarks that ‘‘[a] more elaborate description of the underlying pattern [. . .] [than offered by image schemata] seems necessary’’ for a more accurate explication of cross-linguistically similar extensions.
4. The applicability of image schemata for semantic reconstruction
In an earlier study (Győri and Hegedűs 1999) we looked at a special type of semantic change where parallel extensions from the same original form
resulted in conceptually opposed lexemes. There we claimed that this development into semantically antonymous cognates could be found cross-linguistically with some generality. We explained this phenomenon as the result of a bipolar lexicalization of an underlying image schema on the grounds that pairs of basic oppositions form a gestalt and thus cannot by the nature of the matter be conceptualized individually but only in correlation. Haser (2000: 187) has also called attention to the image schematic character of semantic extensions and implied that the observed universal tendencies in lexical developments might derive from the exploitation of similar source domains for similar target domains. Given the fact of the image schematic structuring of human experience in general, this appears to be a very likely explanation. Semantic extensions may develop in a parallel fashion because their inherent conceptualizations are most probably triggered by underlying image schemata. We also completely agree with Haser’s claim that not all semantic extensions originate in ‘‘a more or less fixed inventory of image schemata’’ but rather ‘‘[a] comparative analysis of similar metaphorization processes may help discover relevant ‘structures’ triggering the extensions in the first place’’ (Haser 2000: 186). Among others, this is one of the insights which emerges from the etymological analysis below. Johnson himself remarks that ‘‘[i]f one understands ‘schema’ more loosely than [he does] [. . .], it might be possible to extend [. . .] [the] list [of schemata] [. . .] at length’’ (Johnson 1987: 126). Wilkins (1996) also points out that the uncertainties and controversies in semantic reconstruction derive mainly from our lack of knowledge of natural semantic shifts, and quite correctly claims that by identifying natural tendencies in semantic change, reconstruction could be made more precise. 
He distinguishes five main types of natural tendencies in the naming of body parts, among which the third tendency is formulated as follows: ‘‘[w]here the waist provides a midline, it is a natural tendency for terms referring to parts of the upper body to shift to refer to parts of the lower body and vice versa (e.g. ‘elbow’ ↔ ‘knee’; ‘uvula’ → ‘clitoris’; ‘anus’ → ‘mouth’)’’ (Wilkins 1996: 273–274). An inherent problem of the formulated tendency appears to be that the direction of the exemplified developments, as well as their unidirectionality, is seemingly arbitrary and inexplicable, since none of the senses could be considered the more basic one. In our view, the tendency illustrated by the examples can be given a more economical explanation based on universal cognitive operations of the human mind, which also solves the above problem by demonstrating that in such cases both (all) senses are extensions from a more basic original sense.
Table 1. The diachronic interconnection of ‘knee’ and ‘chin/jaw’

| Language | ‘knee’ | ‘chin/jaw’ |
|---|---|---|
| Old English | cnēo(w) | cin(n) ‘chin’, cinbān ‘jawbone’ |
| Old Norse | knē ‘knee, limb’ | kinn ‘cheek’ |
| Gothic | kniu | kinnus ‘cheek’ |
| O.H.G. | kniu, kneo | kinni ‘chin’, chinne ‘jaws’ |
| Greek | gónu (cf. gonía ‘corner, angle’ < pre-Gk. *gōnwia) | génus ‘jaw, cheek’ |
| Latin | genū | gena ‘cheek’ |
| Welsh | glin (< *glū-nes < *gnū-nes) | gen ‘jaw, chin’ (< ? Latin) |
| Hittite | gēnu- | – |
| Tokharian | A kanwem, B kenīne | A çanwem ‘jaws’ (dual.) |
| Armenian | cunr (dial. cnui) | cnawt ‘jaw’ |
| Sanskrit | jānu- | hanu- ‘jaw’ (h < *j) |
| Avestan | žnu- | zanauua ‘both jaws’ |
| Lithuanian | – | žándas ‘jaw, cheek’ |
| Latvian | – | zuôds ‘chin’ |
| Proto-IE | *ĝónu-/*ĝnu- | *ĝenu- |

Note: Lithuanian žándas and Latvian zuôds < PIE *ĝón-hₐdh-o-s.
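As a rough cross-check on the distribution in Table 1, the rows can be encoded as data and counted. The sketch below is only an illustration in Python: the branch labels are standard genealogical assignments added here and are not part of the table, the names `TABLE_1` and `summarize` are hypothetical, and a language is counted under a sense whenever the table lists any form for it (so Welsh gen, marked as a possible Latin loan, is still counted).

```python
# Table 1 as data: (language, branch, has a 'knee' form, has a 'chin/jaw' form).
# Branch labels are an assumption added for illustration; the table gives only languages.
TABLE_1 = [
    ("Old English", "Germanic", True, True),
    ("Old Norse", "Germanic", True, True),
    ("Gothic", "Germanic", True, True),
    ("Old High German", "Germanic", True, True),
    ("Greek", "Hellenic", True, True),
    ("Latin", "Italic", True, True),
    ("Welsh", "Celtic", True, True),       # 'chin/jaw' form possibly borrowed from Latin
    ("Hittite", "Anatolian", True, False),
    ("Tokharian", "Tocharian", True, True),
    ("Armenian", "Armenian", True, True),
    ("Sanskrit", "Indo-Iranian", True, True),
    ("Avestan", "Indo-Iranian", True, True),
    ("Lithuanian", "Baltic", False, True),
    ("Latvian", "Baltic", False, True),
]


def summarize(rows):
    """Count languages and branches, and list languages attesting both senses."""
    languages = [lang for lang, _branch, _knee, _jaw in rows]
    branches = sorted({branch for _lang, branch, _knee, _jaw in rows})
    both = [lang for lang, _branch, knee, jaw in rows if knee and jaw]
    return len(languages), len(branches), both


n_languages, n_branches, both_senses = summarize(TABLE_1)
print(n_languages, n_branches)   # 14 9
print(len(both_senses))          # 11
```

Under these assumptions the tally gives 14 languages in 9 branches, with 11 languages attesting forms for both senses; only Hittite, Lithuanian and Latvian have a gap in one column.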
In the following we would like to demonstrate that the explication of similar semantic developments across different languages in terms of the same underlying image schema may help verify particular semantic reconstructions and yield new insights in the search for possible cognates (cf. Joseph and Karnitis 1999: 154). Table 1 shows the distribution of two groups of words with the senses ‘knee’ and ‘chin/jaw’ in Indo-European languages. (The quoted forms represent 14 languages from 9 branches of the Indo-European language family.) The two sets of forms can thus be referred to an identical root form. We maintain that the possibility of separate homonymic roots can be rejected here, and that it is precisely our semantic arguments grounded in cognitive linguistic approaches that enable us to do this. Thus, we argue that the two groups of words are ablauting forms of one
and the same ancient Indo-European root with different root extensions. So the word meaning ‘knee’ is a u-stem noun with o-grade or zero-grade of the same root, while the one meaning ‘chin/jaw’ is an e-grade form of the same root. There does not seem to be a primacy of any of the senses from a cognitive point of view, and the semantic extensions are probably based on the same underlying perceptual pattern. In this cognate group the resemblance in shape, presumably the notion of ‘angle’, is the prominent characteristic feature associated with both of the senses ‘knee’ and ‘chin/jaw’, which provides the semantic connection between the names of these body parts (cf. Buck 1949: 221). This supposition is reinforced by the fact that in some languages we can find cognate words that still have the meaning ‘angle’, e.g. Greek gonía ‘angle’. Pokorny gives the reconstructed meaning for PIE *ĝenu-/*ĝonu-/*ĝnu- as ‘Knie, Ecke, Winkel’ [‘knee, corner, angle’] and ‘Kinnbacke’ [‘jaw(bone)’] (Pokorny 1959: 380). Furthermore, the meanings ‘jaw’, ‘chin’ and ‘cheek’ can interchange. Under these circumstances it is possible to find the common concept of similarity in shape, which was formulated by Buck as ‘something projecting’ or a ‘hook’ (Buck 1949: 224).
Parallel semantic developments are observable in the data of several other language families. In Uralic languages there are two etyma reconstructed for the notion of ‘knee’: Proto-Uralic *polwe (UEW: 393) and Proto-Uralic (or Proto-Finno-Ugric) *säncfi (the reconstruction of *säncfi is perhaps not really as deep as the PU stage, since the inclusion of the Samoyedic data is questionable) (UEW: 471). In the case of PU *polwe ‘knee’ the individual languages have not only the semantic component of the body part but also the meaning ‘curve, bend’, e.g. Finnish polvi ‘knee; extremity; curve, bend’. The question is of course which meaning component is primary and diachronically earlier?
If we suppose that the notion of ‘bend’ is a frequently occurring underlying concept in the names of the body parts – as in the case of Indo-European ‘knee’ – then the polysemy found in Finnish seems to maintain an archaic state. The other etymon for ‘knee’, *säncfi could be derivationally connected to the etymon *siÐe ‘bend, curve’ (UEW: 480) if they can be proven to be morphologically complex, but the UEW fails – or does not dare – to make the connection. Besides these two etyma, the UEW also lists PFU *picfi (pücfi) ‘bend(ing) of a body part (e.g. of the knee, elbow)’ from which we have reflexes both for ‘elbow’ and for ‘knee’: Vogul pisi/päs ‘elbow’, and Votyak pîd’es ‘knee’ (UEW: 376). This is a most illustrative example of the linguistic representation of the underlying conceptual relationship.
A further semantic parallel is provided by the data from South Caucasian (Kartvelian) languages: Georgian muxl- ‘knee’ and Megrel muxur- ‘corner, angle, edge’ are derived from the Proto-Kartvelian stem *muql- ‘knee, corner’ (cf. Klimov 1964: 138). The perceptual pattern of ‘curve/bend’ has served as a cognitive basis for the semantic extension also in the case of terms denoting other body parts with this shape. Thus, we can adduce evidence from several language groups all over the world for the meaning ‘elbow’ deriving from the notion of ‘bend,’ e.g.:
1. English elbow < Old English elnboga < Proto-Germanic *alino-bugōn ‘bend of the forearm’,
2. Greenlandic Inuit piRniq ‘joint, elbow’ < Proto-Eskimo *pRnR ‘joint or bend’ analyzable as the PEsk. base *pR- ‘bend’ + the nominalizer postbase *-nR (Fortescue et al. 1994: 256, 414),
3. Nanaj maja ‘elbow, left hand’ < Proto-Tungusic *majî-/mäji- ‘bend’ (Dybo 1988: 119),
4. The Uralic Etymological Dictionary also suggests the comparison of Proto-Finno-Ugric *kinä (künä) ‘elbow’ with Proto-Indo-European *ĝenu- ‘knee’ and several Altaic languages, such as Tungusic *k cünce(n) ‘elbow’ (cf. UEW: 158), or perhaps more exactly Proto-Tungusic *xüncen (Dybo 1988: 118, 128).
An interesting example is the above mentioned PFU *picfi (pücfi) ‘bend(ing) of a body part (e.g. that of the knee or the elbow)’, from which both Vogul pisi/päs ‘elbow’ and Votyak pîd’es ‘knee’ derive (UEW: 376).
Yet another body part term where the curved shape served as the basis for the semantic extension is ‘hump,’ which can be found in various members of the Dravidian language family. The following examples have been taken from Burrow and Emeneau (1961: 131): Tamil kūn ‘curve, hump’, kūnu ‘to bend’; Malayalam kūn ‘hump’; Kota kūn- ‘to bend (intransitive)’; Toda kūn ‘hunchback’; Telugu gūnu ‘hump’, etc.
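The cross-family evidence listed above lends itself to a simple tally: for each body-part sense, how many unrelated families attest a term traced back to ‘bend/curve’? The following Python sketch is only one way of organizing such data. The record layout and the names `BEND_SHIFTS` and `families_per_sense` are hypothetical, the family labels follow the proto-languages named in the text (with ‘‘Indo-European’’ added for English), and only cases explicitly derived from ‘bend’ in the text are included.

```python
from collections import defaultdict

# (family, language, body-part sense) for terms the text derives from 'bend/curve'.
# Forms are omitted on purpose; several are garbled in the available transcription.
BEND_SHIFTS = [
    ("Indo-European", "English", "elbow"),
    ("Eskimo", "Greenlandic Inuit", "elbow"),
    ("Tungusic", "Nanaj", "elbow"),
    ("Uralic", "Vogul", "elbow"),
    ("Uralic", "Votyak", "knee"),
    ("Dravidian", "Tamil", "hump"),
    ("Dravidian", "Malayalam", "hump"),
    ("Dravidian", "Telugu", "hump"),
]


def families_per_sense(records):
    """Map each target sense to the sorted list of families attesting the shift."""
    by_sense = defaultdict(set)
    for family, _language, sense in records:
        by_sense[sense].add(family)
    return {sense: sorted(families) for sense, families in by_sense.items()}


support = families_per_sense(BEND_SHIFTS)
for sense, families in sorted(support.items()):
    print(f"{sense}: {len(families)} -> {', '.join(families)}")
```

Counting families rather than languages is a deliberate choice here: several Dravidian reflexes of one etymon count as a single independent witness, whereas ‘elbow’ is supported by four separate families.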
These historical semantic data prompt us to postulate a BEND/CURVE image schema, from which metaphorical and metonymical projections give rise to various senses. We are of course aware that the proper verification of the existence of a BEND/CURVE schema needs a more detailed analysis and the collection of a large amount of evidence also from synchronic linguistic data in a similarly precise fashion as, for instance, Cienki (1998) has done for the STRAIGHT schema. However, the semantic extensions in the historical developments documented in the above paragraphs appear to have their cognitive foundations in an experiential
schema emerging from a relatively well delineated, recurring perceptual and kinesthetic pattern, which figures in many types of bodily interaction with our environment. First this pre-conceptual BEND/CURVE schema leads to a more elaborate structuring of conceptual space, from which the lexicalized concepts then derive at the linguistic level. We have tried to illustrate this process in a schematic diagram (see Figure 1).
Figure 1. The linguistic manifestation of the postulated BEND/CURVE schema
The above considerations also encourage us to hypothesize about a possible cognate relationship of PIE *ĝenu-/*ĝonu-/*ĝnu- ‘knee, corner, angle’ and ‘chin, jaw(bone)’ (see above), which may otherwise remain concealed due, among other things, to the traditionally established divergent semantic reconstructions of the forms in question. We propose a further connection with the root *gen-, the semantics of which was reconstructed by Pokorny (1959: 370–373) as ‘zusammenknicken; Zusammengedrücktes, Geballtes’ [‘to bend or fold; something pressed together, something formed into a ball’]. The proto-form *gen- has numerous reflexes in languages of the western area, such as Greek, Celtic and especially Germanic, but its occurrence is very rare in the eastern groups; in fact there are reflexes only in the Balto-Slavic branch (where they are likely to be borrowings from Germanic). In a traditional approach this uneven distribution does not allow for interpreting the form *gen- as a PIE reconstructed root, but
a proto-form reconstructable only for the western dialects. However, an underlying BEND/CURVE schema may rather plausibly connect the semantics of this root with PIE *ĝenu-/*ĝonu-/*ĝnu- via metaphorical projections. As the western languages depalatalized ĝ and merged it with g, the velar stop of the western languages may derive from an earlier, PIE palatal stop (although for a strong case of reconstruction we would need evidence from the eastern branches to demonstrate that we are indeed dealing with a PIE palatal stop ĝ).
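The phonological side of this argument can be made concrete with a one-rule model. The Python sketch below is purely illustrative: the function `western_outcome` is a hypothetical helper that rewrites the palatal stop ĝ as plain g, which is all that the merger described above amounts to at the level of notation. It shows why western reflexes alone cannot tell the two roots apart.

```python
# Illustrative only: a one-rule model of the western merger of palatal and plain velar.
PALATAL_G = "\u011d"  # the letter g with circumflex, used for the PIE palatal stop


def western_outcome(root: str) -> str:
    """Return the root as written after the palatal stop merges with plain g."""
    return root.replace(PALATAL_G, "g")


knee_root = f"*{PALATAL_G}enu-"  # 'knee, corner, angle; chin, jaw(bone)'
bend_root = "*gen-"              # 'to bend or fold' (Pokorny)

for root in (knee_root, bend_root):
    print(root, "->", western_outcome(root))

# After the merger both roots share the same initial sequence.
same_initial = western_outcome(knee_root)[:4] == western_outcome(bend_root)[:4]
print(same_initial)  # True
```

Because the rule is many-to-one, it cannot be run backwards: a western g is compatible with either PIE stop, which is why eastern evidence would be needed to settle the question.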
5. Conclusion In this paper we tried to show that the similar conceptualizations manifest in semantic changes found commonly across various languages derive from the specific universal cognitive mechanisms the mind employs to comprehend the world by making use of already familiar knowledge. We propose that many resemblances in the directions of semantic change arise through metaphorical and metonymical projections from image schemata. An account in terms of image schemata may have practical applications in the methodology of historical-comparative linguistics and can be used to facilitate semantic reconstruction and the identification of cognates.
References
Algeo, John 1990 Semantic change. In: Edgar C. Polomé (ed.), Research Guide on Language Change, 399–408. Berlin/New York: Mouton de Gruyter.
Anttila, Raimo 1989 Historical and Comparative Linguistics. Amsterdam: Benjamins.
Anttila, Raimo 2003 Analogy: The warp and woof of cognition. In: Brian D. Joseph and Richard D. Janda (eds.), The Handbook of Historical Linguistics, 425–440. Oxford: Blackwell.
Berndt, Rolf 1989 A History of the English Language. Leipzig: Verlag Enzyklopädie.
Blank, Andreas 1997 Prinzipien des lexikalischen Bedeutungswandels am Beispiel der romanischen Sprachen. Tübingen: Max Niemeyer.
Buck, Carl Darling 1949 A Dictionary of Selected Synonyms in the Principal Indo-European Languages. Chicago/London: University of Chicago Press.
Burrow, Thomas and Murray B. Emeneau 1961 A Dravidian Etymological Dictionary. Oxford: Oxford University Press.
Campbell, Lyle 1998 Historical Linguistics: An Introduction. Edinburgh: Edinburgh University Press.
Cienki, Alan 1998 STRAIGHT: An image schema and its metaphorical extension. Cognitive Linguistics 9: 107–149.
Dirven, René 1993 Metonymy and metaphor: Different mental strategies of conceptualisation. Leuvense Bijdragen 82: 1–25.
Dybo, Anna V. 1988 Etimologicheskij material k rekonstrukciji pratunguso-man’chzhurskih nazvanij chastej tela [Etymological material for the reconstruction of Proto-Tungusic names of body parts]. In: Sinhronija i diahronija v lingvisticheskih issledovanijah. Part 1: 108–127. Moscow: Nauka.
Fortescue, Michael, Steven Jacobson and Lawrence Kaplan 1994 Comparative Eskimo Dictionary, with Aleut Cognates. Fairbanks: University of Alaska, Alaska Native Language Center.
Fortson, Benjamin W. IV 2003 An approach to semantic change. In: Brian D. Joseph and Richard D. Janda (eds.), The Handbook of Historical Linguistics, 648–666. Oxford: Blackwell.
Fox, Anthony 1995 Linguistic Reconstruction: An Introduction to Theory and Method. Oxford: Oxford University Press.
Geeraerts, Dirk 1994 Semantic change (Laws of). In: Robert E. Asher (ed.), The Encyclopedia of Language and Linguistics. Vol. 7: 3800. Oxford/New York: Pergamon Press.
Geeraerts, Dirk 1997 Diachronic Prototype Semantics: A Contribution to Historical Lexicology. Oxford: Clarendon Press.
Győri, Gábor 1998 Cultural variation in the conceptualisation of emotions: A historical study. In: Angeliki Athanasiadou and Elżbieta Tabakowska (eds.), Speaking of Emotions. Conceptualisation and Expression (Cognitive Linguistics Research 10.), 99–124. Berlin/New York: Mouton de Gruyter.
Győri, Gábor 2002 Semantic change and cognition. Cognitive Linguistics 13: 123–166.
Győri, Gábor 2004 Semantic-lexical change at the crossroads between universals and linguistic relativity: A perspective from cognition and evolution. In: Wiltrud Mihatsch and Reinhild Steinberg (eds.), Lexikalische Daten und Universalien des semantischen Wandels / Lexical Data and Universals of Semantic Change, 19–37. Tübingen: Stauffenburg.
Győri, Gábor and Irén Hegedűs 1999 Is everything black and white in conceptual oppositions? In: Leon de Stadler and Christoph Eyrich (eds.), Issues in Cognitive Linguistics (Cognitive Linguistics Research 12.), 57–74. Berlin/New York: Mouton de Gruyter.
Harnad, Stevan 1990 The symbol grounding problem. Physica D 42: 335–346.
Haser, Verena 2000 Metaphor in semantic change. In: Antonio Barcelona (ed.), Metaphor and Metonymy at the Crossroads. A Cognitive Perspective (Topics in English Linguistics 30.), 171–194. Berlin/New York: Mouton de Gruyter.
Hegedűs, Irén 2008 A note on the pre-protolinguistic background of Proto-Uralic homonyms. Mother Tongue. Journal of the Association for the Study of Language in Prehistory (Boston) XIII: 191–195.
Hock, Hans Heinrich 1991 Principles of Historical Linguistics. Berlin/New York: Mouton de Gruyter.
Hock, Hans H. and Brian Joseph 1996 Language History, Language Change, and Language Relationship. Berlin/New York: Mouton de Gruyter.
Holyoak, Keith J. and Paul Thagard 1997 The analogical mind. American Psychologist 52: 35–44.
Johnson, Mark 1987 The Body in the Mind. The Bodily Basis of Meaning, Reason and Imagination. Chicago: University of Chicago Press.
Joseph, Brian D. and Cathrine S. Karnitis 1999 Evaluating semantic shifts: The case of Indo-European *(s)meuk- and Indo-Iranian *muc-. OSU Working Papers in Linguistics 52: 151–158.
Klimov, Georgij A. 1964 Etimologicheskij slovar’ kartvel’skih jazykov [An Etymological Dictionary of Kartvelian Languages]. Moscow: Nauka.
Koch, Peter 2008 Cognitive onomasiology and lexical change: Around the eye. In: Martine Vanhove (ed.), From Polysemy to Semantic Change: Towards a Typology of Lexical Semantic Associations, 107–137. Amsterdam: Benjamins.
König, Ekkehard and Peter Siemund 1999 Intensifiers as targets and sources of semantic change. In: Andreas Blank and Peter Koch (eds.), Historical Semantics and Cognition, 237–257. Berlin/New York: Mouton de Gruyter.
Lakoff, George 1987 Women, Fire and Dangerous Things. What Categories Reveal about the Mind. Chicago: The University of Chicago Press.
Lakoff, George 1990 The Invariance Hypothesis: Is abstract reason based on image-schemas? Cognitive Linguistics 1: 39–74.
Lakoff, George and Mark Johnson 1980 Metaphors We Live By. Chicago: The University of Chicago Press.
Langacker, Ronald W. 1987 Foundations of Cognitive Grammar. Volume 1: Theoretical Prerequisites. Stanford, CA: Stanford University Press.
Mallory, James P. and Douglas Q. Adams (eds.) 1997 The Encyclopedia of Indo-European Culture. London: Fitzroy Dearborn.
McMahon, April M.S. 1994 Understanding Language Change. Cambridge: Cambridge University Press.
Pokorny, Julius 1959–1969 Indogermanisches etymologisches Wörterbuch. Bern: Francke.
Rosch, Eleanor 1978 Principles of categorization. In: Eleanor Rosch and Barbara B. Lloyd (eds.), Cognition and Categorization, 27–48. Hillsdale, NJ: Lawrence Erlbaum Associates.
Sweetser, Eve 1990 From Etymology to Pragmatics: Metaphorical and Cultural Aspects of Semantic Structure. Cambridge: Cambridge University Press.
Traugott, Elizabeth C. 1985 On regularity in semantic change. Journal of Literary Semantics 14: 155–173.
Traugott, Elizabeth C. 1990 From less to more situated in language: The unidirectionality of semantic change. In: Sylvia Adamson, Vivien Law, Nigel Vincent and Susan Wright (eds.), Papers from the 5th International Conference on English Historical Linguistics, Cambridge, 6–9 April, 1987, 497–517. Amsterdam/Philadelphia: John Benjamins.
Traugott, Elizabeth C. and Richard B. Dasher 2002 Regularity in Semantic Change. Cambridge: Cambridge University Press.
UEW: Uralisches Etymologisches Wörterbuch. Band I–III. Edited by Károly Rédei. 1986–1991. Budapest: Akadémiai Kiadó.
Watkins, Calvert 2000 The American Heritage Dictionary of Indo-European Roots, 2nd ed. Boston/New York: Houghton Mifflin.
Wilkins, David P. 1996 Natural tendencies of semantic change and the search for cognates. In: Mark Durie and Malcolm Ross (eds.), The Comparative Method Reviewed. Regularity and Irregularity in Language Change, 264–304. New York/Oxford: Oxford University Press.
Zalizniak, Anna A. 2008 A catalogue of semantic shifts: Towards a typology of semantic derivation. In: Martine Vanhove (ed.), From Polysemy to Semantic Change: Towards a Typology of Lexical Semantic Associations, 217–232. Amsterdam: Benjamins.
Commentary: Theoretical Approaches
Terttu Nevalainen
1. Approaching semantic change
Semantics is no different from other fields of science and scholarship in that it diversifies and undergoes paradigm shifts over time. For example, a remarkable boost to the study of semantics was given by Lakoff and Johnson’s 1980 bestseller on conceptual metaphors. Cognitive approaches have since gained a central position in semantics research and been extended to the study of semantic change. At the same time, diversification of the field has taken place with pragmatic approaches, in particular, gaining momentum over the last couple of decades. A degree of rapprochement is also in evidence across paradigms, especially as far as the role of contextualization in semantic change is concerned.
The four contributions to this section nicely illustrate the current versatility of approaches and concern for uncovering the range of social, functional and cognitive factors that influence the progress and outcome of semantic change. Robinson’s study is informed by empirical sociolinguistics, and Hansen’s by pragmatics, while cognitive linguistics provides the frameworks employed both by Koch and by Győri and Hegedűs.
Apart from their concern for contextualizing semantic change, these various approaches share a number of basic semantic concepts. From their different vantage points, the authors refer to metaphor and metonymy, generalization/broadening and specialization/narrowing of meaning. Borrowing is similarly considered by most of them. The more recent concept of (inter)subjectification, a tendency towards speaker-oriented meaning, has also come to be shared across frameworks, and is considered in the contributions by Hansen and Koch.
It is a historical commonplace that the older a word is, the larger the number of semantic extensions it has.
The chapters in this section illustrate that, while metaphors are indeed pervasive in human communication, they are not the sole or even the major principle governing lexical meaning making. Koch observes that, in lexical change, metaphors are in fact less common than metonymic meaning transfers. Both of them form
part of the cognitive dimension of his three-dimensional model of lexical change, on a par with the lexical devices dimension (e.g. affixation, compounding) and the dimension of stratification (borrowing).
The social embedding of ongoing semantic change emerges from Robinson’s case study of the incoming and outgoing senses of the adjective skinny, based on sociolinguistic interviews. Although this level of social detail cannot be recovered for the more distant past, the documentation of earlier times proves no less intricate. Hansen discusses the subtle discourse effects that affected the choice between Medieval French negative constructions and notes that the pragmatically more constrained expressions were superseded in the course of time. Finally, moving on to proto-languages, Győri and Hegedűs explore the limits of semantic reconstruction and ask whether it is possible to reconstruct, at a general level, conventionalized cognitive connections between concepts that seem unrelated to modern speakers.
The approaches adopted in the four chapters all shed light on fundamental issues in lexical semantics. Robinson adopts a semasiological perspective and analyses the ebb and flow of polysemy in the speech community. As she notes, her work could be extended by taking account of the onomasiological dimension of synonymy in the lexical field. Semantic sameness is indeed the key issue informing the variationist sociolinguistic paradigm, focused as it is on “alternative ways of saying the same thing”. As a result of lexicalization and grammaticalization, the layering and persistence of older meanings typically produces polysemous homonymy. Hansen approaches these processes in pragmatic terms and considers semantic change at the level of pragmatic inferences. Her chapter relates to the variationist concern of synonymy in that it details the pragmatic contexts of the use of alternative negative expressions.
Koch models the rise of new meanings and their semantic relations from the cognitive perspective, showing the ease with which metonymic meaning extensions, based on figure–ground effects, can be made not only in the lexicon but also in grammar. In their effort to reduce homonymy, Győri and Hegedűs apply a cognitive framework to explore the potential meaning connections of cognate forms such as knee and chin. They find that these items may be linked through shared familiar knowledge, an image-schema, which can give rise to various metaphoric and metonymic projections. They agree with Koch that, due to the often explicit contiguity of the referents of the source and the target domains, metonymy requires less cognitive effort than metaphor.
2. Models and hypotheses
The theoretical approaches of the four contributions differ but they all agree that semantic change cannot be predicted in the same way as, for example, phonological change.
Justyna Robinson’s innovative study explicitly extends the variationist sociolinguistic paradigm to recent semantic change. Her aim is to explore whether the apparent-time model that has proved its value in the study of sound change in progress can be applied to ongoing semantic change. This model assumes that language change can be traced by analysing the linguistic behaviour of successive generations and that generational differences are maintained throughout their lifetimes. This generational pattern can be contrasted with the hypothesis of communal change, according to which all members of the community acquire new forms simultaneously.
Maj-Britt Mosegaard Hansen contrasts the traditional truth-conditional view of semantic change with the broader pragmatic one based on conversational implicatures. She is interested not only in the development of conventional non-truth-conditional expressions from truth-conditional meanings but also in the rise of new non-truth-conditional meanings associated with these items. Following the semantic/pragmatic tendencies proposed by Elizabeth Traugott and Richard Dasher, Hansen shifts the focus of research from static to procedural meanings which can (but need not) become fully semanticized and hence part of the coded meaning of the items they are associated with. These tendencies include increased subjectification of meanings, broadening scope, and progressively more extended reference. As is the case in semantics more generally, these tendencies do not represent exceptionless or unidirectional laws but the older meanings of the source constructions typically persist and give rise to semantic layering in the lexicon. These source meanings commonly constrain the direction and extent of subsequent changes.
However, when it happens, the conventionalization of contextually invited inferences draws on the mechanism of metonymy.
In his contribution, Peter Koch develops a comprehensive model for what he observes is the pervasiveness of contiguity and metonymy in semantic change. He outlines the semantic relations that are needed to differentiate and define metonymy, and contrasts two principles of conceptual organization: taxonomies consist of sub- and superordinate relations such as hyponyms, which are linked through cognitively relevant similarities and relate to hierarchies of categorization, whereas what he calls engynomy refers to concepts that are simultaneously present conceptually
or perceptually. It is modelled in terms of the contiguity of concepts within semantic frames and typified by metonymy. Taxonomy and engynomy are mutually incompatible, but both kinds of relation are based on relevance. Contiguities derive from various cultural, social and anthropological parameters of relevance but appear to be relatively simple in cognitive terms.
Gábor Győri and Irén Hegedűs extend the cognitive approach to semantic reconstruction. They show how this time-honoured field of study can benefit from advances in cognitive semantics. Language historians are used to identifying cognate words on the basis of sound correspondences and relying on the regularity of sound changes. However, unlike sound changes, semantic changes cannot be expected to be regular, and when overt semantic connections are lacking, phonological correspondences are not sufficient to suggest a shared earlier meaning. The result is the multiplication of homonyms. Where a shared meaning can be suggested, the outcome of traditional semantic reconstruction can be a bundle of shared abstract features. Győri and Hegedűs point out that a proto-vocabulary full of abstract meanings is not realistic because “the understanding of any new meaning on the basis of the original meaning involves associative processes rather than simple algorithmic operations”. The cognitive approach they adopt to overcome these obstacles relies on human cognitive capacities that underlie meanings.
3. Methods and case studies
Interested in finding the social loci of both incoming and recessive changes, Robinson tests the apparent-time model by charting the range of meanings and meaning changes of a polysemous lexical item in a socially stratified speech community. Her sociolinguistic survey provides a welcome addition to the information given by dictionaries and standard text corpora on recent change in general and regional usage in particular. Her data came from 72 South Yorkshire speakers, aged between 11 and 94, and she employed advanced statistical methods to assess the significance of her quantitative findings. The quantitative information was supplemented with the speakers’ metalinguistic comments, which proved helpful in interpreting the observed processes and their social evaluation. Robinson illustrates her approach by focusing on the polysemy of the adjective skinny.
Hansen notes that Jespersen’s traditional explanation for the introduction of a reinforcing negative, the phonological weakening of ne, does not
account well for the long period of variation between the simple and the bipartite structures. Instead, she focuses on communicative interaction, which gives rise to pragmatic inferences and creates potential contexts for semantic change. For example, discourse markers can be the result of negotiations about the next move in a dialogic exchange while modal particles arise from those about common ground. Hansen discusses the subtle roles that different types of inferences can assume in semantic change and foregrounds the relevance of “bridging contexts” that can be interpreted in two different ways by the addressee. She applies these pragmatic principles to account for variation and change in the Medieval French negative particles, ne followed by pas, mie and point in a corpus of 13th-century prose texts.
Modelling contiguity and other cognitive relations, Koch concentrates on determining the subtypes of metonymy, its limits, and its internal unity vs. variability. The cognitive notion of “domain highlighting” proves relevant to the unity of metonymy, although a metonymic lexical change can be identified only after it has occurred. In delimiting the concept of metonymy, he considers its relation to metaphor. While metonymy is based on contiguity within a single frame, metaphoric relations are similarity-based and belong to two different frames. Metonymy and metaphor are hence conceptually incompatible and also distinct from taxonomies. Part–whole relations present a potential problem in delimiting the notion of metonymy. Cases such as bar (‘counter in a public house’), which has metonymically come to stand for the public house itself, can be interpreted both in terms of a part–whole relationship and in terms of location. However, the extension of the traditional notion of synecdoche to a member–category or species–genus relation can blur the distinction between engynomy and taxonomy.
For example, the narrowing of the meaning of hound, ‘dog’ in Old English, to ‘a dog used for hunting’ can be understood as a relation of taxonomic subordination. For this reason Koch excludes the species–genus relation from his discussion of metonymic change but retains the part–whole relation as one of its subtypes.
To compare and contrast the open-ended nature of the lexicon with the closed system of phonology, Győri and Hegedűs distinguish between generalizability and regularity. Whereas phonological reconstruction is premised on the regularity of sound change and its influence on the phonological system as a whole, individual meaning changes do not as a rule have a similar impact on other lexical elements. However, semantic change is generalizable because various general aspects that represent socially shared, culturally valid conceptualizations can characterize any
single change. The basic mechanisms of semantic change (metaphor, metonymy, meaning restriction and extension) correspond to the “basic cognitive mechanisms that yield novel conceptualizations of the world”. Győri and Hegedűs argue that generalizations can be made with respect to mechanisms and directions of change, and even regarding the content of changes. The latter can be achieved by comparing conceptualizations across languages. Győri and Hegedűs subscribe to the idea that human behavioural activity results in recurrent patterns that give rise to pre-conceptual mental structures or image schemas. If it is generally the case that semantic extension is rooted in human cognitive processes, the argument goes, the general tendencies of change should derive from what is universal about human conceptualization. Cross-linguistic tendencies may hence inform the quest for semantic connections between putative cognates. The case study of Győri and Hegedűs concerns the relation between the English words knee and chin, and related forms in other Indo-European languages, which can be traced back to different ablaut grades of the same PIE root.
4. Results and conclusions
Robinson demonstrates how the traditional regional sense of skinny ‘mean’, ‘tight-fisted’, is losing ground; how the new senses, ‘showing skin’ (in skinny dipping) and ‘low fat’, borrowed from American English, first made their way into the speech community; and how an ameliorative change is under way in the cultural evaluation of the primary sense of skinny, ‘thin’. Her starting hypothesis of the relevance of the apparent-time construct based on generational differences is borne out. The age of the speaker is shown to account for most of the observed variation; for example, the recessive regional sense ‘mean’ is largely restricted to the oldest speakers. It is, however, noteworthy that the youngest members are not necessarily the ones who introduce new senses into the community, but innovations can be introduced by other age groups as well. Moreover, important though the age factor is, it is rarely the sole speaker variable accounting for the observed variation as combinations of age and other factors such as gender, education and profession also need to be considered. This proves to be the case with the ‘showing skin’ and ‘low fat’ senses of skinny. Connected with knowledge of the speakers’ social practices, this combined information can give even more accurate indications of ongoing change.
Hansen reports that the preverbal negator ne was originally neutral between discourse-new and discourse-old propositions but came to be associated with discourse-new contexts in the medieval period. Bipartite negation was introduced into discourse-old contexts, and Janus-faced bridging contexts allowed their reinterpretation as neutral. Hansen suggests that the introduction of bipartite negation may hence have been triggered by negotiations of common ground. The three negators pas, mie and point are not functionally interchangeable either, but can be differentiated in terms of both their structural and mitigating properties. Unlike the other two, point typically occurs with partitive constructions. Reflecting its quantitative source meaning, point thus prefers predicates that can be graded. It can also be characterized as more emphatic pragmatically than the other two. Ne . . . mie is found to have mitigating properties in that it is used when addressing a social superior, while ne . . . pas is favoured when the addressee is a social inferior. In contrast to mie, pas is used to describe circumstances over which the subject can exert direct control. Hansen accounts for these aspects of selection of the three postverbal negators in terms of speech-act modification and (inter)subjectification. The fact that pas has survived in Modern French is attributable to persistence, its source noun (< L passum ‘step’) being more abstract in meaning than those of the other two (mie < L mica(m) ‘crumb’, point < L punctum ‘point’, ‘speck’).
Koch shows that metonymy is a highly productive principle governing semantic change. The subtypes he discusses include processes of subjectification, delocutive change, and speaker- as opposed to hearer-induced change. As also shown by Hansen, metonymy goes beyond the lexicon giving rise to processes of pragmaticalization and grammatical change.
Koch argues that it is in fact difficult to find discourse markers that are not the result of a metonymic process. The ubiquity of contiguity relations in the lexicon is further illustrated by word-formation processes, such as the rise of the suffix -hood (OE had ‘condition’, ‘state’) through compounding (e.g. priesthood). With transitory qualities, the frame implies contiguity between the quality of being x and the period of being x (e.g. childhood). Koch is also interested in assessing the role of contiguity relations in lexical change compared to other kinds of lexical process, e.g. metaphorical similarity and taxonomic subordination. His study of 179 concepts in the Swadesh list (vocabulary with basic meanings) in five Romance languages suggests that contiguity is the single most frequent cognitive relation leading to their modern lexical expressions. Overall, Koch concludes that contiguity and the metonymic figure–ground effect within frames emerge as both simple and flexible principles in semantic change. Taxonomic relations are more complex, and metaphors, involving distant concepts which belong to different frames, are considerably more so.
Győri and Hegedűs relate the ‘knee’, ‘corner’, ‘angle’ and ‘chin’, ‘jaw’ meanings of the cognates they study to a pre-conceptual BEND/CURVE schema, which is recurrent in many types of human bodily interactions with their environment. Importantly, parallel semantic correspondences are found in several other language families, including Proto-Uralic. The fact that the modern Finnish word polvi ‘knee’ (< PU *polwe) also has the meaning of ‘bend’ e.g. in joen polvi ‘river bend’ suggests the archaicness of the language – or raises the question which meaning component was diachronically earlier. The authors conclude by suggesting a more general etymological connection of the BEND/CURVE schema with the IE root *gen-, which has meanings related to bending and folding. In this context, a modern Finnish speaker finds it interesting that Pokorny also lists an IE *ĝen- root, which relates to bearing and generating. Its reflexes include generation ‘line of descent’, ‘people born at about the same time period’. The Finnish for ‘generation’ is sukupolvi or simply polvi, and the two polvi words, ‘knee’ and ‘generation’, are listed under the same head word in dictionaries. The extent to which speakers conceive them as homonyms is another matter. One cannot help wondering whether this is a culture-specific case of metonymic productivity or whether the comparative method could bring to light other cross-linguistic connections, say, through calquing. The semantician’s work continues.
5. Envoy
It is not every day that substantial funding is offered for basic research on semantics. The Metaphor Program announced by the US Intelligence Advanced Research Projects Activity (IARPA) is a multimillion-dollar venture which aims to “exploit the fact that metaphors are pervasive in everyday talk and reveal the underlying beliefs and worldviews of members of a culture”. Automated methods are first to be developed “for recognizing, defining and categorizing linguistic metaphors associated with target concepts and found in large amounts of native-language text”. In the second phase, these techniques will be used to “identify the
conceptual metaphors used by the various protagonists, organizing and structuring them to reveal the contrastive stances”.1 One thing that is not made clear in the call is the extent to which the methods to be developed for the recognition of conceptual metaphors are to be based on words. As Lakoff and Johnson (2003: 244) emphasize in the new afterword to their 1980 book, conceptual metaphor is about concepts, not about words, and hence not a simple lexical matter. Equating the two is one of the major historical barriers to understanding the nature of conceptual metaphors. As far as words are concerned, the four chapters in this section discuss the much broader range of principles and mechanisms that speakers can exploit for lexical creativity.
References
Lakoff, George and Mark Johnson 2003 [1980] Metaphors We Live By. Chicago and London: University of Chicago Press.
Pokorny, Julius 1989 [1959] Indogermanisches etymologisches Wörterbuch. Bern: Francke.
Traugott, Elizabeth Closs and Richard B. Dasher 2002 Regularity in Semantic Change. Cambridge: Cambridge University Press.
1. The IARPA Metaphor Program Broad Agency Announcement (May 20, 2011) is available online at: https://www.fbo.gov/index?s=opportunity&mode=form&tab=core&id=20ff241cdc2146dc147b4014730fc807. (Accessed 27 June, 2011.)
Subject index
age-related variation see variation
amelioration 178, 225
analogy 35, 142, 322
antonymy 50, 178, 324
apparent-time construct 200–201, 209–210
back-formation 29
basic-level categories 51, 322
bivariate analysis 118–123 see also multivariate analysis
bleaching 50, 162
body part names 324–325
borrowing/borrowed words 29–32, 36, 221
bottom-up (vs. top-down) approaches to data 51, 150, 185–189, 193
bridging contexts 239, 285
British National Corpus (BNC) 140–142, 205–207
broadening 171–172, 178, 276, 318
category/categorization 46–50, 322
central senses see prototypes
change in progress 60, 206–207, 224
cluster analysis 150–152, 169, 185 see also Variability-based Neighbor Clustering
cognates, cognate forms 313
cognitive semantics 2, 112, 266, 317, 334
collocations/collocates 133–134
collostructional analysis 139–141, 156–157 see also distinctive collexeme analysis
concept/conceptual field 46
conceptualization 279–283, 317–324
context 60, 114, 139, 171–172, 236–241, 334
co-text 60, 92
context vectors see vectors
contiguity 260–265, 320–321
conventionalization 81, 236
conversion 298–299
corpus data 4, 61–62, 111–112, 135–136, 162, 184, 205–207 see also Corpus of Contemporary American English, diachronic corpus analysis, Helsinki Corpus, N-gram Corpus, Oxford English Corpus
Corpus of Contemporary American English (COCA) 135–136, 148–150 see also corpus data
cross-linguistic universal tendencies see universal tendencies in semantic change
cultural change see extra-linguistic/cultural change
decision tree analysis 211–212 see also multivariate analysis
delocutive change 281–283
density see semantic density analysis
diachronic corpus analysis 149–150 see also corpus data
diffusion 81–85, 60, 223
dimensions of lexical change 298–300
directionality (in semantic change) 23, 241, 281, 318–319, 322–323
discourse markers 239–240, 287–288
distinctive collexeme analysis 156–157 see also collostructional analysis
domain 22–23, 237, 260–261, 321–322
embodiment 322
encyclopaedic view of meaning 314, 317 see also extra-linguistic/cultural change
encyclopaedic and linguistic knowledge 52
engynomy 262–265
Enroller project 55–56
etymology/etymons 98–99, 314–315 see also folk etymology
extra-linguistic/cultural change 3, 48, 84–85, 100, 223–224, 318–321 see also encyclopaedic view of meaning
extension see semantic extension
facets (of meaning) 273–274
familiar/unfamiliar lexis 28–29, 178–179
feature analysis 46, 51
figurative language/senses 28–29, 35–36
figure/ground shift 238–239, 267–268
folk etymology 218, 292–295
folk taxonomy 51
formality/informality 89–90, 217–218
frame 261–262
gender-related variation see variation
generalization see broadening
generalities (in semantic change), generalizability 315–323
genre 154–156
grammatical change see metonymic grammatical change
grammatical reanalysis see reanalysis
grammaticalization 133, 158, 242–243, 283–285, 290–291, 319
granularity 188–189
Helsinki Corpus 173 see also corpus data
Historical Thesaurus of the OED (HTOED) 41, 97–98, 112
homonymy 171, 292, 317, 335
hyponymy 43, 277
image schemas 322–329
implicature 235–238
inference, invited inference see pragmatic inferencing
innovation (lexical/semantic) 60, 209–210, 224, 283
institutionalization 81, 89–91
Iterative Sequential Interval Estimation 186
internet see world wide web
Latent Semantic Analysis (LSA) 164–170
lectal variation see variation
lexical/semantic fields 31, 99, 110, 317
lexicalisation 50, 60, 81, 85–89, 272–273, 287
lexical networks 91–92
lexicography 97, 102–103
logistic regression 123, 192 see also multifactorial analysis
metalinguistic comments 76–78, 90–91, 203
metaphor 23, 48, 277–278, 318, 320–322, 341–342
– concrete > abstract mapping 23, 35–36
metonymy 238–239, 265–279, 318–322
– metonymic grammatical change 288–291
– speaker-induced vs. hearer-induced 283–288
– types of metonymy 268–270
mitigation 250–252
motivation (of semantic change) 21–22, 224–225, 291–292 see also remotivation
multi-dimensional scaling (MDS) 175–177
multifactorial analysis see multivariate analysis
multivariate analysis 123–127, 189–192, 211–212 see also logistic regression, regression with breakpoints, decision tree analysis, Poisson regression
N-gram Corpus 80 see also corpus data
narrowing 175–177, 275–276, 318
negation 243–247
neologisms 60
onomasiology, onomasiological change 111, 319
Oxford English Corpus (OEC) 61, 205 see also corpus data
Oxford English Dictionary 18–21, 24–25, 44–45, 51, 103, 150, 204
partitioning of diachronic data 149–153, 185–188
pejoration 178, 204, 225
persistence 241
Poisson regression 190–192 see also multifactorial analysis
polysemy 44, 85–87, 171, 225, 241
pragmaticalization 242–243, 287
pragmatic inferencing 235–242
predictability (in semantic change) see regularity/predictability (in semantic change)
procedurality, procedural meaning 242
prototypicality, central/prototypical senses 44, 50–51, 110, 208–209, 262–263
reanalysis 285–290
reconstruction see semantic reconstruction
regional variation see variation
register 111, 189 see also text types
regression with breakpoints 187–188 see also multivariate analysis
regularity/predictability (in semantic change) 2–3, 22, 315–319, 336–337
remotivation 292–295
resolution see granularity
Roget’s Thesaurus 44–45, 47
salience 50–51, 202–203, 210, 321–322 see also prototypicality
– discourse-salience 245
– onomasiological 120
– salient word senses 202
search engines see web crawlers
semantic density analysis 169–170
semantic extension 23, 171, 213–214, 314, 318–324, 336
semantic fields see semantic/lexical fields
semantic reconstruction 313–315
semasiology 111, 118
similarity 263–265, 277–278
sociolinguistic variation see variation
social variation see variation
specialization see narrowing
stative vs. non-stative verbs 146–147
subjectification 243, 252, 279–287
synecdoche 274
synonyms/synonymy 43–44, 115–117, 317, 335
taxonomy 51, 134, 262–265, 322
tendencies in semantic change 235, 318–323
text type 2, 75–77, 117, 153–156, 166, 252
thesauri 42, 103 see also Historical Thesaurus of the Oxford English Dictionary
Thesaurus of Old English 46, 49
transparency 294
truth-conditional vs. non-truth-conditional meaning 233
Variability-based Neighbor Clustering (VNC) 127–129 see also cluster analyses
variation 98–99, 200–202, 246–247
– age-related 207–210, 220
– gender-related 210–211
– lectal 128, 193
– regional 77, 98–99, 190, 216–218, 246–247
– social 76, 127, 207–212, 222–223
vectors 169, 189
– context vectors 168–172
– word vectors 165–168
von Wartburg, Walther 98–101
web see world wide web
web crawlers 61–65
widening see broadening
word-formation 291–300
world wide web 61–62, 75–77
Index of word forms and concepts
ancora 282
anger 109–111, 127–128
bar 274
belly < bælg 278
*bhergh-, *bhr̥gh- > *burgjan > byrgan > bury 315
*bhergh-, *bhr̥gh- > *burgs > burg, burh, byrig > borough, -bury 315
bien 240
book 273
boon, bon, bón 292–293
boor 280, 286
but 234
child 272
chin 315
cour, court 268
dairy 291
deer < dēor 171–179, 319
detweet 81–92, 103
do 174–175
dog 171–177, 298
dull 32–35, 102
elbow < elnboga < *alino-bugōn 327
elpendban 43
encore 241, 281–282, 286
error 52–55
faithful 55
fowl < fugol 319
*gen- 328
*ĝenu-, *ĝonu-, *ĝnu- 326, 328
go-future (be going to) 134–135, 290–291
gónia 326
gūnu 327
handy 299
have 48
-hood < hád 291–292, 295–296
hound 171–177, 275–276, 298
hunting-dog 298
ire 110
keep V-ing 135
*kina (künä) 327
knee 315
kūn, kūnu, kūn- 327
*k c ünce(n) 327
lician > like 289–290
lie 53
look, look you 287–288
loyal 55
maja < *majǐ-/mäji- 327
marriage 20
mie 244–253
milksop 26–29, 101
monēta 271
mouse 237
*muql- > muxl-, muxur- 327
ne 243–253
observe 279
pas 243–253
peach, peach-tree 297
pérsik 297
pesche, pêche 297
*picfi *pücfi 326–327
pîd’es 326–327
piger, pigrum 267–269, 286
piRniq < *pRnR < pR- þ *-nR 327
pisi/päs 326–327
point 244–253
polvi 326, 341
*polwe 326
pregnant 29–32, 102
puis que 239–240, 249–250
retweet 86, 91
*säncfi 326
say 140–141
science 174, 179
selbst 237
sharp 35, 102
*siÐe 326
skinny 204–207
sooth, soothsayer < soþ 54–55
snow 298
sombrero 299
teacher 291
témoin ( *war-nōn > *war(e)nian > warn 315
*wer, *wor-o- > *wara- > wær > wary 315
while 240
white 43, 49
wilcuma > welcome 294
witness 283–284
will 140
wrath 110
*xüncen 327