English Pages [270] Year 2024
Marion Neubauer English Nouns since 1150
Topics in English Linguistics
Editors Susan M. Fitzmaurice Bernd Kortmann
Volume 115
Marion Neubauer
English Nouns since 1150 A Typological Study
ISBN 978-3-11-131747-2 e-ISBN (PDF) 978-3-11-131771-7 e-ISBN (EPUB) 978-3-11-131790-8 ISSN 1434-3452 Library of Congress Control Number: 2023951320 Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de. © 2024 Walter de Gruyter GmbH, Berlin/Boston Cover image: Brian Stablyk/Photographer’s Choice RF/Getty Images Typesetting: Integra Software Services Pvt. Ltd. Printing and binding: CPI books GmbH, Leck www.degruyter.com
Acknowledgments This book goes beyond the scope of studies of selected processes in diachronic word formation, putting into perspective all means used to enlarge the lexicon since 1150 and, ultimately, providing a basis for comparison of lexical and grammatical morphology. As such, it will be of interest to a range of linguists involved in historical word formation, borrowing, word structure, diachronic morphology, language change or language use and cognition, hopefully stimulating new research ideas in these areas. The present work is a thoroughly revised version of my doctoral dissertation, which would not have been possible without the contribution of many people over the years. The colleagues, students and friends I owe thanks to are simply too many to name here; so, for reasons of space, I focus on those who have had the greatest impact on this book. First and foremost, my heartfelt thanks go to my supervisor Thomas Berg, who inspired my fascination with language and whose knowledge and advice were instrumental in shaping my research. He always found time to discuss my ideas, offering valuable suggestions, and words of encouragement when needed, for which I am immensely grateful. Deepest thanks are also due to Stefan Hartmann, my second thesis advisor, for his thoughtful comments that prompted me to reconsider the effects of usage frequencies and helped to improve the book in more ways than one. Likewise, I am greatly indebted to Günter Radden, not only for his enduring confidence in me but also for his close reading of entire chapters of the book; his much-appreciated remarks entailed substantial revisions to enhance coherence and readability. Many thanks go to Eva Berlage for intriguing discussions and for sharing with me her recent research, some of which has found its way into this book. 
Of my fellow doctoral colleagues, Daniela Schröder needs to be singled out for always being willing to discuss new ideas, theoretical approaches, methodological issues, mundane concerns and personal problems, just as a good friend would. Moreover, I am extremely grateful to Benedikt Szmrecsanyi and Bernd Kortmann, who suggested and encouraged the publication of this work in the TiEL series, which seems to me the perfect home for the book. A word of thanks also goes to Natalie Fecher and Barbara Karlson at De Gruyter Mouton for their wonderful support in the publishing process. Further, Daniel Ross deserves credit for his excellent work as proofreader; all remaining errors are, of course, entirely my own responsibility.
https://doi.org/10.1515/9783111317717-202
Finally, I would like to thank all my friends, especially Hardy, whose love and support have maintained, and at times restored, my sanity – without their supply of comfort and joy this book would have been unthinkable. Marion Neubauer Hamburg, November 2023
Contents

Acknowledgments
List of figures
List of tables
Abbreviations and symbols

1 Introduction
1.1 The typological profile of English: Limited to grammar?
1.2 Previous research on typological shifts in the lexicon
1.3 A twofold approach to the typological development of the nominal lexicon
1.4 Structure of the book

Part 1: Morphological typology and the English lexicon

2 Morphological typology
2.1 Basic concepts
2.1.1 Analyticity – syntheticity
2.1.2 Isolating – agglutinating – fusional
2.1.3 Correlation of the typological concepts
2.1.4 Gradience in morphological typology
2.2 Diachronic development
2.2.1 Drift in typology
2.2.2 Cyclic movement

3 Typological shifts in the English lexicon
3.1 Typological parameters in the lexical domain
3.2 Analytic tendencies in Old English
3.3 Resurgence of syntheticity in Middle English
3.3.1 Contact situation between English and French
3.3.2 Influence of French borrowings
3.4 Typological classification of the means of lexicon extension

Part 2: Means to extend the nominal lexicon since 1150

4 The database
4.1 Setup of corpora
4.2 Data collection

5 New additions to the lexicon
5.1 Determining date and process of origin
5.2 Classifying the new nouns
5.2.1 Borrowing
5.2.2 Conversion
5.2.3 Compounding
5.2.4 Germanic affixation
5.2.5 Romance affixation
5.3 Distribution of new and old lexemes
5.4 Processes used to add new nouns
5.4.1 Overview of the relative distributions
5.4.2 Borrowing
5.4.3 Conversion
5.4.4 Compounding
5.4.5 Germanic affixation
5.4.6 Romance affixation
5.4.7 Excursus: Approaches to the integration of Romance affixes
5.5 Means employed to extend the lexicon from a typological perspective

6 Word formation patterns
6.1 Analogy in word formation
6.2 Classifying the model nouns
6.2.1 Transparency as a prerequisite
6.2.2 Conversion
6.2.3 Compounding
6.2.4 Germanic affixation
6.2.5 Romance affixation
6.2.6 Multi-model nouns
6.3 Distribution of model and non-model nouns
6.4 Patterns for noun formation in terms of model lexemes
6.4.1 Overview of the relative distributions
6.4.2 Conversion
6.4.3 Compounding
6.4.4 Germanic affixation
6.4.5 Romance affixation
6.5 Word formation patterns from a typological perspective
6.6 Typological techniques in lexicon extension

Part 3: Typological profile of the nominal data since 1150

7 Overall development of syntheticity
7.1 Preliminary remarks on classification aspects
7.2 The synthetic index of the nominal data

8 Typological subtypes: Between isolation and fusion
8.1 Assessment of fusion
8.1.1 Previously suggested parameters of fusion
8.1.2 Fusion quantified in terms of four parameters
8.1.3 Development of fusion since 1150
8.2 The slow move toward fusion: Four minutes in 850 years

9 Changes in syntheticity and analyticity

10 Typological shifts in lexical structure and word formation

Part 4: Discussion and conclusion

11 Typological trends in English morphology and beyond
11.1 Corresponding developments in the lexicon?
11.2 The verbal domain
11.3 Parallel shifts in derivational and inflectional morphology
11.4 Beyond morphology: Tendencies in syntax and semantics
11.5 The global typological profile of the English language

12 Typology and change: Cognitive and sociocultural roots
12.1 Key factors in the use and extension of the nominal lexicon
12.1.1 Physiological and cognitive mechanisms
12.1.2 Sociocultural forces
12.2 Broadening the view
12.3 A cognitive account of the typological development since 1150

13 Conclusion

References
Appendices
Index
List of figures

Figure 1 Typological cycle
Figure 2 Dixon’s clock
Figure 3 Analyticity and syntheticity in English grammar since 1150
Figure 4 Means of lexicon extension in a typological perspective
Figure 5 Distribution of old and new noun tokens across centuries
Figure 6 Distribution of old and new noun types across centuries
Figure 7 Proportional distribution of new types by process of origin
Figure 8 Proportional distribution of new tokens by process of origin
Figure 9 First use of the four most frequent Romance affixes in the data
Figure 10 Type and token distributions of Romance nouns affixed with -ance and -ation
Figure 11 Type and token distributions of Romance nouns affixed with -ment and -ity
Figure 12 Proportional distribution of new types by typological technique
Figure 13 Proportional distribution of new tokens by typological technique
Figure 14 Relative distributions of model types by potential pattern
Figure 15 Distribution of model words for conversion (relative and raw frequencies)
Figure 16 Model nouns as potential and actual instances of conversion
Figure 17 Distribution of model words for compounding (relative and raw frequencies)
Figure 18 Model nouns as potential and actual instances of compounding
Figure 19 Distribution of model words for Germanic affixation (relative and raw frequencies)
Figure 20 Model nouns as potential and actual instances of Germanic affixation
Figure 21 Distribution of model words for Romance affixation (relative and raw frequencies)
Figure 22 Model nouns as potential and actual instances of Romance affixation
Figure 23 Relative type distribution of model lexemes by typological technique
Figure 24 Relative distribution of new nouns and model nouns by typological technique
Figure 25 Proportion of fusional techniques in new additions and pattern inventory
Figure 26 Proportion of conversion in new additions and pattern inventory
Figure 27 Proportion of the different agglutinating techniques in new additions and pattern inventory
Figure 28 Development of the synthetic index in the nominal data
Figure 29 The four fusional parameters in language use since 1150
Figure 30 Fusion indices 0 to 5 arranged on Dixon’s clock
Figure 31 Typological stages of the nominal usage data since 1150
Figure 32 Development of the grammatical and lexical syntheticity indices
Figure 33 Development of the grammatical and lexical analyticity indices
Figure 34 Development of the coordinated mean lexical and grammatical indices
Figure 35 Relative type distribution by typological technique/subtype in new additions and the nominal lexicon
https://doi.org/10.1515/9783111317717-204
List of tables

Table 1 Morphological typology: Descriptive levels
Table 2 Distribution of nouns from 1150 to 2000
Table 3 Distribution of old and new lexemes across centuries
Table 4 New types and tokens as a result of borrowing
Table 5 New types adopted by borrowing according to source language group
Table 6 New types and tokens derived by conversion
Table 7 New types derived by conversion according to original word class
Table 8 New types and tokens formed by compounding
Table 9 Distribution of new compound types by word class of constituents
Table 10 New types and tokens derived by Germanic affixation
Table 11 Germanic prefixation and suffixation manifest in new types
Table 12 New types and tokens derived by Romance affixation
Table 13 Romance prefixation and suffixation manifest in new types
Table 14 Transparent and intransparent nouns ending in -ation, -ity, -ance and -ment
Table 15 Distribution of nouns suffixed with -ity and -ance by status of their base
Table 16 Distribution of nouns exhibiting allomorphic variants of -ity and -ation
Table 17 Distribution of medians, maximum values and outliers per period
Table 18 Transparent and intransparent types ending in -ation, -ity, -ance and -ment (adjusted for extremely high-frequency types)
Table 19 Distribution of new nominal types and tokens by typological technique
Table 20 Germanic nominal affixes documented by the data in this study
Table 21 Romance nominal affixes documented by the data in this study
Table 22 Model and non-model nouns (absolute and relative frequencies)
Table 23 Type and token distributions of model nouns by potential pattern
Table 24 Model nouns for conversion by word class
Table 25 Model nouns for compounding by word class of constituents
Table 26 Germanic prefixation and suffixation manifest in model nouns
Table 27 Romance prefixation and suffixation manifest in model nouns
Table 28 Type and token distributions of model nouns by typological technique
Table 29 Synthetic index: Measures of central tendency and dispersion per period
Table 30 Fusion index calculated for sample nouns
Table 31 Development of fusion based on the computed fusion index per century
Table 32 Mean fusion index of the nominal usage data per century
Table 33 Syntheticity index: Mean and dispersion across texts per century
Table 34 Analyticity index: Mean and dispersion across texts per century
Table 35 Relative distribution of new verbs according to origin based on the OED

https://doi.org/10.1515/9783111317717-205
Abbreviations and symbols

<        derived from
>        developed into
ADJ      adjective
ADV      adverb
AND      Anglo-Norman Dictionary
BT       Bosworth-Toller Anglo-Saxon Dictionary
c.       circa
CF       combining form
eModE    early Modern English
Gmc      Germanic
IE       Indo-European
INTERJ   interjection
lModE    late Modern English
ME       Middle English
MED      Middle English Dictionary
N        noun
O        object
OE       Old English
OED      Oxford English Dictionary
ON       Old Norse
PDE      Present-Day English
PN       proper noun
PP       prepositional phrase
PREP     preposition
Rom      Romance
TTR      type–token ratio
V        verb
https://doi.org/10.1515/9783111317717-206
1 Introduction 1.1 The typological profile of English: Limited to grammar? This book describes the developments of the English nominal lexicon in use since 1150 from a typological perspective. Its point of departure is the linguistic commonplace that English has developed from a synthetic into an analytic language: Old English (OE), so the story goes, was a synthetic language but lost most of its inflections during the Middle English (ME) period, thereby gradually evolving into the analytic type evident in Present-Day English (PDE). This narrative, perpetuated by handbook articles (e.g., Lass 1992; Dietz 2015b), apparently implies that the language’s movement toward analyticity has continued unabated, though a recent survey by Szmrecsanyi (2012), based on empirical data from early ME to the 20th century, suggests that the historical trajectory has been less straightforward. The sweeping statement about the analyticity of the English language harks back to typological classifications established in the 19th century when languages were classified holistically along a single parameter, namely their morphological structure (e.g., Croft 2003: 46). While today’s typologists acknowledge additional parameters, such as word order, the traditional basis for classification is still deemed justified since morphology is “of fundamental importance to the over-all characterization of a language” (Greenberg 1960: 180; see also Comrie 1989: 42). Crucially, morphology in typological research is usually limited to inflectional morphology, raising the question whether analytic trends in English are, in principle, restricted to the grammatical domain or whether they may be expected to surface in lexical morphology as well. Depending on how we conceptualize the relationship between inflection and derivation – or between grammar and lexicon, for that matter – the answers to this question will differ greatly. 
In the structuralist tradition, grammar and lexicon are construed as discrete categories; correspondingly, inflection, considered part of grammar, and derivation, located in the lexicon, are disparate domains (e.g., Quirk et al. 1985: 13; Huddleston & Pullum 2002: 28; see also Anderson’s (1982) account from a generativist perspective). Since such approaches take a modular view of language, drawing a clear distinction between grammatical and lexical morphology with no room for overlap, scholars working within these frameworks would not anticipate analytic trends in English grammar/inflection to surface in the lexicon/derivation – any observation contrary to this expectation would be due to chance. By contrast, the cognitive approach adopted in this book would regard corresponding typological developments in the lexical and grammatical domains as neither accidental nor unexpected. Rejecting a categorical distinction between inflection and derivation, cognitive linguists consider differences between inflectional and derivational morphology to be gradual as “all morphological categories belong on a continuum that ranges from lexical to inflectional” (Bybee 1985: 85). Since they allow for overlap between grammatical and lexical morphology, analytic trends observed in English inflection may be expected to materialize in derivation as well. But what could be the unifying factors? Broadening the view, cognitively motivated approaches “start from the assumption that grammar and lexicon are part of the same continuum” (Divjak 2019: 118), namely language, which emerges and changes through usage. Crucially, language use itself is governed by experience, physiological and cognitive mechanisms as well as sociocultural conditions (e.g., Beckner et al. 2009). Since these factors are domain-general, affecting basically every aspect of human behavior, it stands to reason that they underlie the use of all linguistic units, whether grammatical or lexical. In short, I would anticipate similar typological developments in grammar and lexicon due to the unifying factors anchored in the human agent.
https://doi.org/10.1515/9783111317717-001
1.2 Previous research on typological shifts in the lexicon So far, typological developments in the lexical domain have been largely neglected (Haselow 2012a), and the scattered papers in this respect (e.g., Kastovsky 2006a, 2006b) seem rather impressionistic. The first (and to my knowledge only) to break ground is Haselow’s (2011) investigation of nominal derivation in OE and early ME based on language use. He observes a general reduction of suffixed noun types and tokens, interpreted as “a decline in the frequency of use of bound morphemes” (Haselow 2011: 188), and a change from stem-based to word-based affixation, attended by a shift from variant to invariant bases, which he considers “an indicator for the shift towards a higher degree of analyticity” (Haselow 2011: 233). These systematic changes in derivation accompanied those in inflection and, consequently, “contributed to the shift of English from a predominantly synthetic towards a predominantly analytic language” (Haselow 2011: 240). Still, the shift toward analyticity in the lexical domain is supposed to have been reversed since ME due to the huge influx of Romance loanwords (e.g., Haselow 2012a; Kastovsky 2006b).1 Importantly, claims about a reversed trend are not well-grounded empirically as they rely entirely on observations about structural
properties of affixed words and lack any kind of quantification. The object of the present study, then, is to remedy these deficits, providing quantified data for the time from 1150 to 2000 in order to trace the typological development of the English nominal lexicon, understood as the vocabulary used by the language community. Designed as a sequel to Haselow’s (2011) work, the investigation focuses on nouns at the expense of other lexical word classes, which may be perceived as a weakness of the study. However, the choice of nouns as research object is justified insofar as this category constitutes the largest word class in terms of types and tokens (e.g., Berg 2014); thus, the ultimate rationale for focusing on nouns is their quantitative representativeness of the lexicon.
1 In this book, the term ‘Romance’ chiefly refers to French and Latin; other Romance languages, such as Spanish and Italian, feature far less prominently in my data.
1.3 A twofold approach to the typological development of the nominal lexicon Grounded in a usage-based framework, cognitive linguistics presumes that language – and its structure – emerges and changes through usage (e.g., Bybee 2015: 10; Divjak 2019: 5), vividly described in metaphorical terms by Ellis (2017: 91): “Language and usage are like the shoreline and the sea.” In order to trace typological changes in nominal language use, the empirical investigation addresses two issues: (i) the processes employed by speakers to extend their nominal word-stock (ii) the structure of the entire nominal lexicon in usage This approach differs notably from previous research by explicitly distinguishing between the processes used to extend the lexicon and the structure of the words included in the lexicon. Unfortunately, this distinction is usually neglected in (diachronic) studies, word structure being more or less tacitly equated with word formation. It stands to reason that a lexeme’s internal makeup is a direct result of the process used to derive it; still, not infrequently, the structure of a word is not a reliable indicator of its origin, which is particularly true for complex loanwords and nouns derived by conversion. Consequently, claims raised in the literature about the use (and productivity) of Romance affixes in English, based on the structure of borrowed words, have antedated the respective affixation processes (see also Dalton-Puffer 1996: 220).
Processes to extend the lexicon Since changes in language use are most directly represented by linguistic innovations, my primary focus is on the means employed by speakers to enlarge their nominal word-stock since 1150, asking if language users chose more analytic or more synthetic means to extend the English lexicon at different points in time. While Haselow’s (2011) investigation serves as a good starting point for this endeavor, its exclusive focus on (nominal) suffixation is too limited to substantiate claims about typological trends in derivation. Therefore, the present study casts the net wider to include affixation, compounding and conversion, i.e., word formation processes that constitute the “central morphology” (Bauer 2019: 84), as well as borrowing. Only if all processes that have been used to extend the nominal lexicon are correlated will it be possible to evaluate typological developments in this domain. To the best of my knowledge, no correlations of this kind have so far been established either in synchronic or in diachronic usage-based studies, which have usually focused on individual processes. Accordingly, diachronic research in the nominal field consists of isolated investigations, such as the valuable long-term study of conversion by Biese (1941), the in-depth analysis of compounding in (early) ME by Sauer (1992) or the richly detailed work on borrowing since early OE by Durkin (2014). Additionally, noun suffixation has been investigated in multiple studies, especially since the advent of modern corpus linguistics. Hence, the classical paper on the long-term development of Romance suffixes by Gadde (1910) has been supplemented by corpus data for ME (Lloyd 2011) and from the 14th to the 17th centuries (Palmer 2009), and several studies have compared the usage of Romance and Germanic suffixes in ME (Dalton-Puffer 1996;2 Gardner 2014) and since the end of ME (Cowie 1999), to name but a few; for a comprehensive overview see Dietz (2015b). 
All these works with their focus on specific processes – or particular suffixes, for that matter – provide valuable details on certain aspects, e.g., semantic development, productivity or register distribution. By contrast, the present study advocates a global approach, taking all means of lexicon extension into account, which, by necessity, comes at the expense of such detailed information. Still, the rigorous quantification and comparison of empirical data will definitely offer more details than can be obtained from listings in handbook articles (e.g., Burnley 1992; Nevalainen 1999) or treatises on word formation (e.g., Marchand 1969; Bauer 1983; Dixon 2014). Crucially, the correlation of the various means used to extend the nominal lexicon enables us for the first time to evaluate the quantitative importance of the respective methods, thereby providing grounds to critically reflect on previous claims. As the focus is on all word formation processes as well as borrowing, the notion ‘morphological typology’ is understood in its broadest sense, encompassing not only derivation by bound morphemes, but also compounding, conversion and borrowing.
2 It should be noted that Dalton-Puffer’s (1996) investigation is not restricted to nominal suffixes but includes derivational verbal and adjectival suffixes as well.
Structure of the lexicon
While we would expect that human agency and the structure of the lexicon are somehow interrelated, the impact of speakers’ linguistic behavior, as reflected by new lexical additions, turns out to have been fairly modest: Correlating the numbers of new and old nouns in usage for each century reveals that the proportion of tokens instantiating new lexemes has not even reached 5% on average per period. Thus, the typological techniques used to extend the nominal lexicon have barely affected the typological profile of the contemporary word-stock. On the other hand, language users are thought to have been influenced in their choice of means for extending their vocabulary by the typological profile of the lexicon (e.g., Kastovsky 1992b). Obviously, this view cannot be taken literally within a cognitive framework, which would regard a language’s typological profile as an epiphenomenon, well suited to describe the language’s structure, but not as a force impacting on speech behavior. Still, we could provisionally interpret the term ‘typological profile’ in this context as a shortcut to denote the linguistic behavior of previous generations that generated the language’s profile. Viewed this way, we may assume that speakers have chosen techniques to enlarge the lexicon in accordance with the typological profile of their word-stock, but this connection can only be validated if the morphological type of their vocabulary is first determined.
In order to establish the structure of the word-stock in each century and the attendant typological shifts since 1150, all nouns are analyzed in detail for their morphological makeup, adopting token-based approaches that highlight different aspects. Besides providing insights as to whether the lexicon in use has become more analytic or synthetic, this kind of data is best suited for comparison with grammatical data to evaluate if the typological developments in grammar and lexicon have paralleled each other, thereby reflecting a global tendency in English morphology.
1.4 Structure of the book Part 1 sets the stage by reviewing previous research relevant to the subsequent investigations. Chapter 2 summarizes the basic typological concepts together with the parameters suggested to differentiate between language types, rounded off by a short presentation of how typological developments have been modeled. In Chapter 3, the focus is narrowed to derivation, starting with proposals concerning the transfer of typological parameters from the inflectional to the derivational domain. A short summary of analytic tendencies in OE is followed by an overview on how typological shifts in the lexical area since that time have been presented in the literature. In this context, special attention is given to the contact situation in the ME period because of the strong claims about the influence of Romance borrowings on the English language type. The chapter concludes with suggestions on how the means available for extending the lexicon are best classified typologically. Part 2 addresses the first key issue raised in the previous section, i.e., the processes by which language users have, or could have, enlarged their nominal word-stock. Chapter 4 details the database for the investigation, specifies the extraction of the data and gives a first impression of the distribution of nominal types and tokens in the corpora. Subsequently, Chapter 5 focuses on new additions to the lexicon. After dealing with methodological issues, the distribution of new and old lexemes is briefly outlined, before I turn to the processes actually used to add new nouns to the lexicon, correlating the different means in terms of types and tokens. Due to the persistent difficulties in assessing the status of Romance affixation in historical English, the chapter includes a rather lengthy excursus that probes into the proposals concerning the emergence of Romance bound morphemes offered so far. 
In conclusion, the actual processes evidenced by the new lexemes are presented from a typological perspective. While the origin of a word is a static historical fact, more or less reliably professed by dictionaries, it does not preclude the possibility that the lexeme may be derived by other means in (later) language use. Therefore, Chapter 6 investigates the distribution of word formation patterns that have been available for language users to coin new nouns, starting from the premise that analogy is a powerful cognitive mechanism operative in word formation. In order to assess the possibilities for analogical derivations, all new and old nouns were classified as to their potential for promoting models for derivation by conversion, compounding and Germanic and Romance affixation in each century. After establishing the specific classification criteria, the chapter correlates all noun formation patterns that speakers had at their disposal during the respective centuries, followed by a depiction in typological terms. Finally, the typological techniques employed in actual language use and those exhibited by theoretically available patterns are
compared to determine the typological development of the means to extend the nominal lexicon. Part 3 tackles the second key issue introduced in the preceding section, investigating the structure of the entire nominal word-stock in usage to establish the typological profile of the data at different points in time. To this end, the nominal data are subjected to three separate quantitative analyses. Chapter 7 draws on Greenberg’s (1960) index to measure the degree of syntheticity, based on the number of morphemes per word. Subsequently, Chapter 8 scrutinizes the extent of word-internal fusion, advancing a relatively simple, novel approach to its calculation, particularly suited for historical data. The degree of fusion evident in the nominal vocabulary in each century, then, allows for the historical stages of nominal English to be arranged around a clock face, as proposed by Dixon (1997), to illustrate the typological development in this respect. Next, Chapter 9 traces the development of analyticity and syntheticity in the nominal language, following the procedure adopted by Szmrecsanyi (2012), which provides a perfectly sound basis for comparing the findings to the typological trends recently observed in grammar. Finally, Chapter 10 discusses the typological shifts in the structure of the nominal usage data in relation to developments apparent in the use of typological techniques to extend the nominal lexicon. Part 4 places the findings of the empirical parts in a broader context, considering them from two distinct angles. Chapter 11 takes up the results from Part 3 and tentatively relates them to developments in other linguistic areas. 
Besides shifts in inflectional morphology, we observe an expanded use of periphrasis, the replacement of prefixed verbs by phrasal verbs, word groups attaining word status and an increased semantic underspecification of English words, which raises the question whether these changes have converged so that English can be said to have developed harmonically with regard to its typological profile. A different perspective is adopted in Chapter 12, embedding the findings of Part 2 in a cognitive framework with language users as the central focus. After specifying the physiological and cognitive mechanisms as well as the sociocultural factors that are presumed to govern linguistic behavior, I propose a cognitive account of the typological trends in the nominal lexicon since OE, specifically focusing on why speakers have preferred, or dispreferred, certain processes for enlarging their nominal word-stock at different stages. In conclusion, Chapter 13 reviews the main findings of the book with regard to their theoretical impact and offers suggestions for future work, some deriving from the global design of the study, which is necessarily limited in many aspects.
Part 1: Morphological typology and the English lexicon
Chapter 2 summarizes the basic concepts of morphological typology, delineating the terms ‘analytic – synthetic’ and ‘isolating – agglutinating – fusional’ as well as the parameters introduced to define a certain language type. This general overview is rounded off with a description of how typologists have modeled diachronic developments so far. Subsequently, the scope is limited in two respects: The focus is on the lexical domain exclusively, and the perspective shifts from crosslinguistic typology to the typological profile of an individual language. More precisely, Chapter 3 reviews how the English lexical morphology is supposed to have changed typologically since early OE, giving special emphasis to the contact situation with French during the ME period, which is assumed to have reversed the language’s analytic development. Against this background, I finally devise a scheme to typologically classify the processes of lexicon extension; this classification system is essential for the data analyses presented in Part 2.
https://doi.org/10.1515/9783111317717-002
2 Morphological typology
2.1 Basic concepts
Language types may be classified as analytic or synthetic, or they may be categorized as isolating, agglutinating or fusional. Given the terminological mix-up noticeable even in typological literature (e.g., Comrie 1989: 46; Tauli 1945/49: 84–85), it is worth stressing that both sets represent distinct concepts illuminating different aspects.
2.1.1 Analyticity – syntheticity
The concept denoted by ‘analyticity’ and ‘syntheticity’ refers to the internal complexity of a word measured in the number of its constituent morphemes. It was introduced by Sapir (1921) to differentiate between analytic, synthetic and polysynthetic languages based on the number of concepts encoded in one word. In an analytic language, one concept is expressed by a single word so that “the sentence is always of prime importance, the word is of minor interest” (Sapir 1921: 135). In a synthetic language, by contrast, concepts are combined into one word; however, the number of concepts per lexeme is smaller than in polysynthetic languages, which are vaguely defined as “more than ordinarily synthetic” (Sapir 1921: 135). To avoid possible confusion, it should be noted that, in this context, Sapir uses the term ‘concept’ in the sense of ‘morpheme’ (Sapir 1921: 24–42), which was not firmly established as a linguistic notion in his day. Accordingly, the parameter to assess the degree of internal complexity, “the index of synthesis” (Comrie 1989: 46), is now defined as the number of morphemes per word (Aikhenvald 2007b; Comrie 1989: 46; Croft 2003: 46). Hence, analyticity is characterized by a one-to-one correspondence between words and morphemes, whereas syntheticity is marked by words that consist of more than one morpheme. Sapir’s third category, polysyntheticity, may be considered a special case of syntheticity since the parameter is quantifiable, although no threshold values have been proposed to demarcate polysynthetic from synthetic languages. Obviously, the category is dispensable and has been largely abandoned in current research, mentioned only in passing, if at all.
https://doi.org/10.1515/9783111317717-003
2.1.2 Isolating – agglutinating – fusional
The triplet isolating, agglutinating and fusional denotes the “three canonical types of language” (Comrie 1989: 43), introduced in the 19th century (Croft 2003: 45–46).³ The concept loosely referred to the technique of joining morphemes and was tightened by Sapir because “it [had] been generally obscured by a number of irrelevancies” (Sapir 1921: 136). He proposes to distinguish between isolating, agglutinative, fusional and symbolic language types depending on how strongly fused the constituent morphemes are. At one extreme, we find isolating languages that consist of simplexes; in the absence of morphemes to combine, the degree of fusion is zero.⁴ Situated at the other extreme are symbolic languages with words that resemble monomorphemes but denote an additional (grammatical) category by internal change, such as the umlaut plural (goose – geese) or the ablaut past tense (sing – sang). In these cases, then, the two ‘morphemes’, the lexical base and the inflection, can be considered maximally fused, but symbolism as a fusional language type has been criticized (Greenberg 1960) and is largely ignored in today’s typological literature. Between these extremes, the agglutinating type is differentiated from the fusional one, based on the kind of boundary between the morphemes in the word: In agglutinating languages, complex words can be easily segmented, whereas the word-internal boundary in fusional languages is blurred, so that the morphemes are more fused. This distinction seems straightforward, but the devil is in the detail or – more precisely – in the delimitation of fusion. The parameter ‘fusion’ is ambiguous with respect to the fields covered: In its broadest definition, fusion refers to cumulative exponence and formal alteration; more narrowly defined, the parameter comprises formal aspects only.
Cumulative exponence, or simply cumulation (e.g., Haspelmath 2009a), occurs when more than one category is expressed by one form. In fusional languages, affixes may “fuse together several grammatical categories (such as number, gender and case) into a single morpheme” (Croft 2003: 46; see also Comrie 1989: 44; Aikhenvald 2007b), whereas in agglutinating languages, each category is realized by separate exponents.
³ Instead of ‘fusional’, some authors use the term ‘flectional’ or ‘inflectional’ in line with 19th-century practice. Since inflection is expressed in both agglutinating and fusional languages, the use of the term ‘flectional’ or ‘inflectional’ to denote fusional properties is misleading and eschewed in this book (see also Comrie 1989: 45; Aikhenvald 2007b).
⁴ It has been argued that the parameter is, by definition, irrelevant for the isolating type (Comrie 1989: 46; see also Greenberg 1960), which would eliminate isolating languages from the triplet presented here. I nevertheless retain this language type, as is usually done in current research.
Sapir (1921) does not discuss cumulation but focuses on formal alteration exclusively, equating fusion with the extent of phonological alterations induced by affixation (see also Greenberg 1960). Accordingly, he postulates: “A word like goodness illustrates ‘agglutination’, books ‘regular fusion’, depth ‘irregular fusion’, geese ‘symbolic fusion’ or ‘symbolism’” (Sapir 1921: 140). Complex words in agglutinating languages generally exhibit transparent word-internal boundaries and the shape of their morphemes is invariant, whereas in fusional languages morpheme boundaries are obscured, usually indicated by “allomorphic changes conditioned in the stem by the affix, or the reverse, allomorphic changes conditioned in the affix by the stem, as well as actual phonological fusion at the boundary between the two” (Bybee 1996: 252). Allomorphy, classified as regular fusion by Sapir (see above), is, however, not universally accepted as an indicator of fusion. It is argued that, if formal variation is predictable from general phonological processes affecting the language, the variant form is the result of “automatic alternation” (Greenberg 1960: 185). Such complex words are thus considered to instantiate agglutination, not fusion, which is reflected by relaxed criteria for the agglutinating type: The constituent morphemes need to have “a reasonably invariant shape” (Comrie 1989: 43, emphasis added), and the affixation process may induce some, though “relatively little phonological alteration” (Croft 2003: 46, emphasis added). The vagueness of the wording is due to the circumstance that processes prompting automatic alteration have not been demarcated unanimously. Doubtlessly, syllabification is a general phonological process in English, which “can work across the morpheme boundary, especially if the suffix is vowel-initial” (Bauer, Lieber & Plag 2013: 168); thus, resyllabification caused by affixation is a case in point. 
Yet this is only one of the many processes that may influence the morphemes’ form and, concomitantly, blur the word-internal boundary; other phenomena are stress shifts (e.g., ex’pect > expec’tation), alternations of the base vowel (e.g., ser[i:]ne > sere[ɛ]nity),⁵ changes in the base-final segment, such as velar softening (e.g., electri[k] > electri[s]ity) or palatalization (e.g., eras[s|z]e > era[ʒ]ure), to name but a few. Which of these alterations might have been caused automatically by general phonological processes pertaining to a specific language? Greenberg (1960: 185), for one, illustrates automatic alternation with an instance of base allomorphy, namely the voicing of the base-final consonant leaf > leaves, although general automaticity might be doubted in view of plural forms that preserve the base’s shape, as exemplified by roof > roofs.⁶ Going still further, Comrie (1989: 50) claims that some cases of base vowel alternation, such as div[ʌɪ]ne > div[ɪ]nity, are “essentially predictable”, thereby qualifying as automatic alternation. In short, we can quite easily differentiate agglutinating and fusional language types based on phonological alteration as long as the parameter also includes those segmental changes that are supposed to arise automatically. Phenomena adduced to illustrate automatic alternation, and thus agglutination, such as base allomorphy or base vowel change, are better considered instances of fusion, albeit to different degrees.
⁵ Phonological transcription is based on British English, as indicated by the Oxford English Dictionary (OED).
2.1.3 Correlation of the typological concepts
It should have become clear at this point that the typological concepts analyticity and syntheticity, on the one hand, and isolating, agglutinating and fusional, on the other, are based on independently defined parameters, namely the number of morphemes and the degree of phonological alteration. Nevertheless, both parameters overlap to a certain extent, and the respective classifications may properly be regarded as “intercrossing schemes” (Sapir 1921: 145; see also Haselow 2011: 18), displayed in summary form in Table 1.⁷

Table 1: Morphological typology: Descriptive levels.

Language type   No. of morphemes   Subtype (technique)   Fusion
analytic        1                  isolating             Ø
synthetic       > 1                agglutinating         clear-cut boundary, invariant morpheme forms
synthetic       > 1                fusional              obscure boundary, variant morpheme forms
As mentioned above, analyticity and the isolating type show a one-to-one correspondence since monomorphemic words are inherently isolated morphemes, i.e., combinations at the word level are logically excluded. Against this background, the terms ‘analytic’ and ‘isolating’ can be used interchangeably.
⁶ For the sake of readability, I refrain from giving phonetic transcriptions if the phonological change is mirrored by graphic alteration.
⁷ The parameter ‘fusion’ in Table 1 is used in its narrow definition (see above) because cumulation is irrelevant in the context of this book.
By contrast, syntheticity cannot be equated with either agglutination or fusion as the morphemes of a complex word can, in principle, be linked by both techniques. Depending on the juncture and shape of the morphemes, agglutinating and fusional languages need to be distinguished as subtypes of the synthetic language type.
2.1.4 Gradience in morphological typology
The language types introduced in this chapter are idealizations insofar as no language manifests a certain type “with absolute purity and consistency” (Jespersen [1922] 1969: 421). Even analytic languages, such as Vietnamese and Chinese, use agglutinating techniques, namely compounding (Aikhenvald 2007b; Tauli 1958: 83), and Turkish, the textbook example of an agglutinating language, displays instances of fricativization or devoicing of the base finals, thus fusion at the morpheme boundary (Hagège 1990). Accordingly, morphological typology must be understood as “a continuous typology, i.e., for a given language we can assign that language a place along the continua defined by the index of synthesis and the index of fusion” (Comrie 1989: 47). Due to this gradualness, statements about a language’s type are couched in cautious terms, such as when English is described as a predominantly analytic language. The crucial methodological question surrounding gradualness, then, is how to capture gradience and how to determine the place of a language along the cline. In his seminal paper, Greenberg (1960) proposes a quantificational approach to morphological typology, thereby moving beyond earlier impressionistic claims about language types. Proceeding from Sapir’s typological concepts, he introduces ten classification features in total, each of which is strictly quantifiable in terms of relative text frequencies. The principle is as ingenious as it is simple: Each index is the ratio of two units calculated by adding their occurrences in a given text. Along these lines, syntheticity is determined by the synthetic index, defined as the result of dividing the total number of morphemes in a text by the number of its word tokens (Greenberg 1960: 185).
By necessity, the synthetic index of a language cannot be lower than 1.00 as each word consists of at least one morpheme; at the upper end, there is theoretically no limit, although, according to Greenberg, an index value greater than 3.00 appears to be rare. By way of illustration, he calculates indices for several languages, finding that the synthetic index for English decreased over the last millennium (see Chapter 7). Similarly, agglutinating and fusional languages are differentiated on the basis of the index of agglutination (Greenberg 1960: 185). Here, the number of agglutinating constructions, i.e., complex words without phonological alteration unless automatically induced, is divided by the number of morpheme junctures occurring in a given text. High index values are characteristic of agglutinating languages, whereas low values indicate fusional languages; the threshold seems to be arbitrarily set at 0.50 (Greenberg 1960: 194). Based on the respective calculations for OE (0.11) and PDE (0.30), the agglutination index for English has increased, or, conversely, the language has become less fusional. In this approach, fusion is assessed indirectly by the absence of agglutination at the morpheme boundary, and the binary variable, agglutinating vs. non-agglutinating, does not incorporate different degrees of fusion.⁸ Greenberg’s methodological approach has contributed enormously to usage-based accounts of typology because it enables us to capture the degree to which languages manifest a certain language type. Instead of intuitively determining a language’s typological profile, classifications to this effect are based on token frequencies in strictly quantifiable terms. Gauging the extent to which specific features are distributed across a given language accommodates a gradient view of typological traits, in line with linguistic reality.
2.2 Diachronic development
So far, I have summarized synchronic concepts of morphological typology without exploring the diachronic dimension, namely the “commonplace fact [. . .] that languages can change type” (Croft 2003: 233). The supposed truism of this statement hinges on its loose wording: Besides the ambiguity of the term ‘languages’ in this context, the phrase ‘change type’ seems to remain deliberately vague as to whether the typological change is nondirectional or whether it takes a specific direction. Specified in these respects, the above statement would be far from trivial in view of dissenting opinions that are briefly outlined in the following paragraphs.
⁸ The remaining eight indices can be summarized as follows: Besides introducing prefixing and suffixing indices, Greenberg (1960) tries to capture Sapir’s (1921: 86–126) conceptual typology, proposing six additional indices. The indices of compounding, derivation and inflection indicate the distribution of root, derivational and inflectional morphemes, i.e., the substance of Sapir’s concepts; the indices of word order, concord and non-concord inflection specify how the means of relating words are distributed in the language, thus elaborating on Sapir’s relational concepts.
2.2.1 Drift in typology
Based on the distinction between arbitrary and (relatively) motivated signs, Saussure ([1916] 1959: 133–134) differentiates languages according to the degree of their motivation: At one extreme, he locates analytic languages like Chinese; at the other extreme, he situates highly synthetic languages like Sanskrit.⁹ With respect to the development of a language, he contends that the entire system continuously passes “from motivation to arbitrariness and from arbitrariness to motivation [in a] seesaw motion” (Saussure 1959: 134). In a similar but more explicit manner, Sapir (1921: 183) maintains that language change is comprehensive, involving “[e]very word, every grammatical element, every locution, every sound and accent”. This change, labeled ‘linguistic drift’, is supposed to proceed in “a certain consistent direction” (Sapir 1921: 183), even if individual domains develop at different speeds. Proposals to conceptualize typological change as a drift have been variously criticized. As an example, Haspelmath (2009a) may be cited: Having investigated typological trends in noun declension and verb conjugation across 30 languages, he does not find enough correlation between the two inflectional domains to convince him that languages can be uniformly classified. However, research seems to disagree on the question whether typological tendencies in different linguistic areas converge, and this issue is discussed more thoroughly in Chapter 11. While Saussure (1959) and Sapir (1921) recognize that language change is directional, no crosslinguistically uniform direction of the development is assumed.
By contrast, Jespersen (1969: 425) claims that typological shifts are unidirectional, namely toward increasing morphological simplification and “freely combinable elements”.¹⁰ He even contends that the tendency toward more analyticity “is a universal fact of linguistic history” (Jespersen 1969: 366), a claim which is untenable in view of languages, such as Finnish, that manifest an increase in morphological complexity (Gelderen 2016).
⁹ Although the terms ‘analytic’ and ‘synthetic’ are not used by Saussure, these correspond closely to what is termed “ultra-lexicological” and “ultra-grammatical” (Saussure 1959: 134) and are thus employed here.
¹⁰ In fact, Jespersen’s (1969: 366) label is ‘grammatical simplification’, but since his concept of ‘grammatical’ includes derivational morphemes as well (Jespersen 1969: 214), I use the term ‘morphological’ instead.
2.2.2 Cyclic movement
Conceptualizing typological change as movement along a cline between two poles, be it bidirectional (Saussure) or unidirectional (Jespersen), is seriously deficient because the model does not provide any juncture where languages may pass directly from most synthetic (fusional) to extremely analytic (isolating) types (and vice versa, at least in theory). Yet such transitions are observable in the history of individual words: Two separate words that frequently occur in a fixed sequence, say cup and board, are likely to lose their phonetic and semantic autonomy, resulting in a single-stressed compound, cupboard, semantically distinct from the combined meaning of its parts. Having passed from isolation to agglutination, the unit may undergo further phonological fusion, as evident in PDE /kʌbəd/, and continue to diverge semantically. Arguably, cupboard is still lingering at the fusional stage, moderately motivated by its transparent spelling, as opposed to PDE lord, which similarly originates in a compound, namely OE hlāfweard ‘loaf warden’, but has lost any motivation and, consequently, passed on to the isolating stage (see also Tauli 1958: 82–83; Gelderen 2016). Diachronically then, the three subtypes isolating, agglutinating and fusional stand in “a cyclical relationship” (Gelderen 2016: 6; see also Gabelentz 1891: 251; Hodge 1970); therefore, typological change is better conceptualized as movement along a cycle, illustrated by Figure 1.
Figure 1: Typological cycle.
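The cyclical relationship between the three subtypes can be sketched as a simple successor function. This is only an illustration of the idealized sequence isolating > agglutinating > fusional > isolating described above; the names `STAGES`, `next_stage` and `trace` are hypothetical, and the sketch deliberately ignores the fact that actual words and languages need not pass through every stage.

```python
# Idealized order of stages in the typological cycle (see the text above).
STAGES = ("isolating", "agglutinating", "fusional")


def next_stage(stage: str) -> str:
    """Return the stage that follows `stage` in the idealized cycle."""
    position = STAGES.index(stage)
    return STAGES[(position + 1) % len(STAGES)]


def trace(start: str, steps: int) -> list[str]:
    """List the stages passed through in `steps` moves, starting at `start`."""
    path = [start]
    for _ in range(steps):
        path.append(next_stage(path[-1]))
    return path


# Three moves from the isolating stage lead back to the isolating stage,
# as in the idealized path from cup + board to a fully opaque simplex.
print(trace("isolating", 3))
```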
The cyclic concept has been criticized insofar as complex words need not pass through the fusional stage but may simply be replaced by monomorphemes (Haspelmath 2018), which is perfectly true at the level of words.¹¹ As such, the above examples only serve to illuminate the rationale which the typological cycle is based on and should be taken with a grain of salt. At the level of languages, however, “[c]ycles involve the disappearance of a particular word and its renewal by another” (Gelderen 2016: 3). Thus, typological stages of a language have to be determined by taking into account the internal structure of all word tokens at different points in time; comparison of the results, then, enables us to track the language’s typological development, cyclic or otherwise. The image of a typological cycle is slightly inaccurate considering that languages never return to exactly the same position they occupied previously; their development is not reversed but runs parallel to earlier ones. Thus, typological movements are more appropriately accounted for by the notion of a spiral, “Spirale” (Gabelentz 1891: 251; see also Gelderen 2016; Haspelmath 2018). While the point is well taken, I retain the concept of the cycle for expository convenience. The cycle theory is nicely elucidated by Dixon’s (1997) proposal to envision the cycle as a clock for descriptive purposes: “If we place the isolating type at the four o’clock position, agglutinative at eight o’clock and fusional at twelve o’clock, around a clock-face, it is possible to describe recent movements in various language families” (Dixon 1997: 42). Accordingly, he makes some suggestions on where to locate Chinese and Dravidian languages, among others, which I incorporated into the depiction in Figure 2.
¹¹ In fact, I might add that, instead of unidirectionally moving along the cycle, individual words can even take the opposite direction; a case in point are loanwords, which are first adopted as unmotivated simplex words and may later be reanalyzed as complex, often fused units.
Figure 2: Dixon’s clock.
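The clock-face placement quoted from Dixon (1997: 42) can be sketched as a small lookup with a circular distance measure. The three anchor positions (four, eight and twelve o’clock) come from the quotation; the idea of labeling an arbitrary clock position by its nearest anchor is an assumption added here purely for illustration, and the names `CLOCK`, `circular_distance` and `nearest_type` are hypothetical.

```python
# Anchor positions of the three canonical types on the clock face,
# as given in the quotation from Dixon (1997: 42).
CLOCK = {"isolating": 4, "agglutinating": 8, "fusional": 12}


def circular_distance(a: float, b: float) -> float:
    """Shortest distance in hours between two positions on a 12-hour dial."""
    difference = abs(a - b) % 12
    return min(difference, 12 - difference)


def nearest_type(hour: float) -> str:
    """Label a clock position by its closest canonical type (an assumption).

    Positions exactly halfway between two anchors (2, 6 and 10 o'clock)
    are ties; min() then simply returns the first type in CLOCK.
    """
    return min(CLOCK, key=lambda t: circular_distance(hour, CLOCK[t]))


print(nearest_type(3))   # one hour before the isolating anchor
print(nearest_type(9))   # one hour past the agglutinating anchor
print(nearest_type(1))   # one hour past the fusional anchor
```

The circular distance matters because twelve o’clock and one o’clock are neighbors on the dial, although their numeric difference is eleven.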
Figure 2 illustrates the movement of two language families around what is termed Dixon’s clock in this book: Early Chinese, at three o’clock, still displayed some fusion, developing into Classical Chinese, at four o’clock, which is thought to have been “a fairly pure isolating type” (Dixon 1997: 42), whereas the modern Chinese languages have moved toward five o’clock, “acquiring a mildly agglutinative structure” (Dixon 1997: 42). Dravidian, on the other hand, is located near the purely agglutinating type with Proto-Dravidian, at seven o’clock, still somewhat isolating, while modern Dravidian languages, at nine o’clock, have become more fusional. Proto-Indo-European (Proto-IE) is placed at twelve o’clock, the quintessential fusional stage, and its descendants are assumed to “have moved, at different rates, towards a more isolating position (some to one or two o’clock, others towards three o’clock)” (Dixon 1997: 42). Doubtlessly, Dixon’s clock is a nice descriptive tool and lends itself well to tracking the development of the English language (see Chapter 8). While none of the language families outlined in Figure 2 has, so far, rounded the cycle fully, Egyptian is supposed to have completed the cycle in the course of 3,000 years. Based on his observations about verbal inflection for three periods, Hodge (1970) argues that Old Egyptian predominantly used morphological means, whereas Late Egyptian chiefly relied on syntactic constructions, and Coptic, finally, primarily employed morphology again. With respect to the development of Egyptian, Hodge (1970: 5) thus concludes: “Our cycle is complete”. Intriguingly, a quite similar result has been obtained for the grammatical domain of the English language. Adopting Greenberg’s (1960) approach in a slightly modified fashion, Szmrecsanyi (2012, 2016) calculates an index of grammatical analyticity and grammatical syntheticity based on the token frequencies of free grammatical words, on the one hand, and inflected words, on the other, from the 12th to the 20th centuries. Combining both indices into a two-dimensional array (see Figure 3 below), Szmrecsanyi (2016: 93) observes “a cyclical merry-go-round”: While syntheticity initially declined, reaching its nadir in the 15th century, it has subsequently increased slowly but steadily, most markedly since the 17th century. By contrast, analyticity rose until the 14th century, remaining at high levels until the end of early Modern English (eModE), and has continuously decreased since. As might have been expected, analytic and synthetic tendencies have complemented each other, but the true surprise is “that in terms of analyticity-syntheticity coordinates, the 20th century has come almost full circle back to where we started in the 12th century” (Szmrecsanyi 2016: 103). The interesting question to be addressed is, of course, whether typological developments in the lexical domain follow a comparable cyclic trajectory (see Chapter 9).
Figure 3: Analyticity and syntheticity in English grammar since 1150. (Source: Szmrecsanyi (2016: 102), reproduced with kind permission of John Benjamins Publishing Company.)
To sum up, language types can be described either in terms of internal complexity or with respect to the technique of joining morphemes. Regardless of the descriptive level, diachronic changes in morphological typology are better conceptualized as cyclic movements than as developments along a cline limited by two poles. Some of the methodological ideas presented in this chapter (Greenberg’s synthetic index, Dixon’s clock, Szmrecsanyi’s analyticity and syntheticity indices) can, in principle, be transferred to the lexical domain and are adopted in Part 3 of the book to track typological changes in the lexicon.
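The analyticity and syntheticity indices attributed to Szmrecsanyi above can be sketched as two token-frequency ratios. The text only states that the indices rest on the token frequencies of free grammatical words and of inflected words; the normalization per 1,000 tokens, the function name `grammar_indices` and the representation of each token as a pair of flags are assumptions made for this illustration and should not be read as the author’s or Szmrecsanyi’s exact procedure.

```python
def grammar_indices(tokens, per=1000):
    """Return (analyticity, syntheticity) for a sample of word tokens.

    `tokens` is a sequence of (is_free_grammatical_word, is_inflected)
    pairs; a single token may carry both flags. Each index is the number
    of flagged tokens per `per` tokens of running text (assumed scaling).
    """
    tokens = list(tokens)
    if not tokens:
        raise ValueError("the sample must contain at least one token")
    analytic = sum(1 for free, _ in tokens if free)
    synthetic = sum(1 for _, inflected in tokens if inflected)
    return per * analytic / len(tokens), per * synthetic / len(tokens)


# Hypothetical sample of ten tokens: four free grammatical words,
# two inflected words and four tokens that are neither.
toy_sample = (
    [(True, False)] * 4
    + [(False, True)] * 2
    + [(False, False)] * 4
)

analyticity, syntheticity = grammar_indices(toy_sample)
print(analyticity, syntheticity)
```

Computing the pair for each century would yield one point per period in a two-dimensional space, which is the kind of coordinate plot referred to in the Figure 3 caption.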
3 Typological shifts in the English lexicon
3.1 Typological parameters in the lexical domain
While the basic distinctions between analyticity and syntheticity, on the one hand, and isolating, agglutinating and fusional language types, on the other, have been applied to the lexical domain, the parameters, introduced above, have not been systematically transferred. Instead of assessing the degree of syntheticity in terms of the number of morphemes, researchers seem to have regressed to pre-Sapirian times. Rooted in concepts from the 19th century, analytic languages are differentiated from synthetic ones based on the place of encoding (Haselow 2011: 31), namely whether information is expressed “word-externally by particles or word-internally by morphological means” (Kastovsky 2006b: 155). Thus, the quantifiable parameter to calculate degrees of syntheticity is replaced by a binary one, externally vs. internally, entailing the loss of detail for no good reason. By contrast, the parameter ‘fusion’ to distinguish between isolating, agglutinating and fusional types is retained in the manner previously described, i.e., it focuses on formal alteration caused by derivation and does not include cumulation, which Haselow (2011: 283) considers “probably irrelevant for derivation”. Accordingly, in agglutinating languages morphemes combine without phonological changes, whereas complex words in fusional languages defy a clear-cut segmentation into their constituent morphemes due to the “presence of base or affix alternations” (Haselow 2011: 31). Kastovsky (2006b: 156) contends that the traditional parameters, though useful for the grammatical domain, are insufficient to capture “the typological properties of derivational morphology and its interdependence with inflectional morphology”; thus, he introduces four additional parameters “which apply to both domains, which also play an important role in the historical development” (Kastovsky 2006b: 156).
The morphological status of the base denotes whether the input is a free lexeme or a bound stem or root. In PDE, regular inflection and Germanic word formation are word-based, as opposed to non-native derivation, which is described as “partly word-based, partly stem-based” (Kastovsky 2006b: 158–159). The second parameter refers to the existence of lexical strata that may have emerged within a language. In that event, the means of word formation and inflection are systematically stratified across different levels with particular phonological and morphological properties. If the etymological origin is to be stressed, the layers are distinguished into native and non-native ones; if emphasis is laid on structural features, the distinction is drawn between Level 1 and Level 2, or Stratum 1 and Stratum 2 (Kastovsky 2006b: 158). The next two parameters concern morphophonemic alternations generating allomorphy and affix position. While the first of these seems a bit redundant in view of the parameter ‘fusion’, affix position, certainly an interesting issue in its own right, is only mentioned here to complete the picture, as this aspect is of minor importance in the present study.
https://doi.org/10.1515/9783111317717-004
3.2 Analytic tendencies in Old English
OE witnessed a shift from stem-based to word-based inflection to such an extent that, in noun morphology, “the majority pattern was word-based, the minority patterns were still stem-based” (Kastovsky 2006a: 67; see also Kastovsky 2006b). Besides derivational morphology, the change affected “different domains of the same level, e.g. nominal inflection influencing verbal inflection and vice versa” (Kastovsky 2006a: 71). Obviously, word-based morphology is the precondition for a language to develop into an isolating type since stems cannot be used as independent units without morphological completion. The changed status of the base, then, “can be taken as a first indicator for a typological change into the direction of isolating encoding techniques in both grammar and the lexicon” (Haselow 2011: 36). Further indications in this respect are reported by Haselow (2011), who studied the development of noun derivation in five conceptual categories, as reflected in texts ranging from 700 to 1250.¹² Quantitatively, Haselow (2011: 187) observes “a clear reduction of the inventory of suffixes” in four of the five categories. These findings are mirrored in the figures for suffixed nouns showing an overall decline of the type and token frequencies, hence a decreasing use of bound morphemes in nominalization. These results, then, testify to “a weakening of the OE suffixation system, which surfaced in full extent in early ME” (Haselow 2011: 188). Moreover, Haselow discusses changes toward word-based morphology as well as phonological alternation caused by suffixation, which may affect the base and/or the affix. Stem and suffix variations were both amply evident in OE, but these are considered “remnants of a fusional language type” (Haselow 2011: 45) because the alternations had been caused by processes no longer operative.
¹² In fact, Haselow’s (2011) investigation is semantically driven, analyzing the means of expressing the concepts ‘Person’, ‘Object’, ‘Location’, ‘Action’ and ‘Abstract’ and the respective developments. As my own focus is on formal aspects, his findings as to semantics are ignored in the following remarks.
Hence, OE is supposed to have undergone “a transition from a base-variant to a base-invariant language in its earliest stage” (Haselow 2011: 45), followed by the shift toward “word-based morphology, by which all kinds of formal alternation of bases and affixes were lost” (Haselow 2011: 45). Accordingly, all suffixes that persisted into ME times were invariant, forming complex words of the agglutinating type. The lack of fusion caused by invariant affixes, together with the general loss of suffixes, is regarded as “a clear trend towards isolating encoding strategies of lexical information” (Haselow 2011: 226; see also Haselow 2012b). Likewise, the change toward word-based morphology is considered “an indicator for the shift towards a higher degree of analyticity in English derivation since it represents the progressive isolation of morpheme boundaries and thus the loss of fusional structures” (Haselow 2011: 233–234). As a result, the development of nominal morphology in OE “contributed to the shift of English from a predominantly synthetic towards a predominantly analytic language” (Haselow 2011: 240), suggesting that changes in inflection and derivation “drifted into the same direction” (Haselow 2011: 240). This is reminiscent of Sapir, who briefly remarks in a footnote that “English was moving fast toward a more analytic structure long before the French influence set in” (Sapir 1921: 206). Yet while Sapir (1921: 216–217) argues that contact with French strengthened analytic tendencies in English, the common narrative maintains that the influx of French loanwords in ME partly reversed the trend toward analyticity. Along these lines, Haselow states that the suffix inventory, depleted at the beginning of ME, was refilled by new derivation patterns, and, concomitantly, “stem-based morphology and stem-variance were (re-)introduced” (Haselow 2012a: 215).
3.3 Resurgence of syntheticity in Middle English
While the developments in OE, caused in particular by phonological erosion, uniformly affected the language, the changing conditions during the ME period drastically altered the “very homogeneous morphophonemic system” (Kastovsky 2006a: 77). The enormous impact on the language “again can be regarded as a typological change, this time however triggered by extra-linguistic rather than intra-linguistic factors” (Kastovsky 2006a: 77). It is well known that language contact with French during the High and Late Middle Ages acted as the extralinguistic trigger for the supposed typological shift. What seems to be less known, however, is that traditional accounts of the contact situation, perpetuated in linguistic circles, need to be seriously revised in view of textual evidence that has been accumulated over the past decades. Therefore, I
first want to set the record straight before turning to the impact of French on English morphology, which cannot “be dealt with satisfactorily without a proper understanding of the nature of contact” (Timofeeva & Ingham 2018: 199).13
3.3.1 Contact situation between English and French
In order to abandon long-cherished beliefs about the status of French dialects in England, their life span and diffusion through the English medieval society, we first have to acknowledge that the traditional accounts are deeply rooted in the nationalist historiography prevalent in 19th century Europe (Watson 2013). Just like other sciences, linguistics became an instrument of English nationalism, construing “the eventual triumph” (Burnley 1992: 428) of the English language over Anglo-Norman “as a matter of naturalized and inevitable ‘English’ emancipation and self-definition against the French” (Wogan-Browne 2013: 2; see also Watson 2013). Hence, French was interpreted “as an oppositional partner in the development of English traditions and institutions” (Wogan-Browne 2013: 2), its early and steady decline in England a foregone conclusion. During the past two decades, however, scholars have begun to reinterpret the textual evidence, which has continuously increased and become more accessible. In view of these findings, then, several myths need to be deconstructed.
Myth 1: Decline of Anglo-Norman following the loss of Normandy
The label ‘Anglo-Norman’ is commonly applied to the French variety spoken by the Normans who settled in England in the second half of the 11th century and, as such, designates French texts produced in the British Isles from 1066 onwards. Traditionally, Anglo-Norman is assumed to have died out two centuries later; Thomason & Kaufman (1988: 126) speculate that “French speakers had all shifted to English by about 1265” and subsequently acquired the Central French variety as a second language (see also Dietz 2015a).14 The demise of Anglo-Norman, so the story goes, started with the loss of Normandy by the English crown in 1204, prompting king and nobles “to look upon England as their first concern” (Baugh & Cable 2002: 128). Against this backdrop, Baugh & Cable (2002: 130) claim “that
The urgency to address this issue became manifest in 2018 when the prestigious journal English Language and Linguistics published a Special issue on mechanisms of French contact influence in Middle English, edited by Olga Timofeeva and Richard Ingham.
Instead of the label ‘Central French’, the designations ‘Parisian French’ or ‘Continental French’ are also found in the literature.
after 1250 there was no reason for the nobility of England to consider itself anything but English. The most valid reason for its use of French was gone.” This simplistic view grossly distorts historical facts in that it restricts the links between England and France to Normandy exclusively, ignoring the vast territorial possessions of the English crown in France since the 12th century: As of 1154, England’s rule extended over virtually the whole west of France, and, while Anjou, besides Normandy, was lost in 1204/05, England retained the large southwestern duchies of Gascony and Aquitaine for another three centuries (Sarnowsky 2002: 110). Thus, England maintained various relations with French territories after 1204, whether in trade or in dealings with its French-speaking subjects, who frequently addressed petitions to the king (Rothwell 2006; Ormrod 2013). Unsurprisingly then, French did not fall into disuse in England but rather expanded “its range of written-text functions after 1250” (Timofeeva & Ingham 2018: 199; see also Ingham 2013). The narrative about the extinction of Anglo-Norman and its eventual substitution by Central French in England suggests a discontinuity of the use of French, which fosters “a model in which English lies dormant after the Conquest, having been overwhelmed and replaced by Anglo-Norman, but rises again in a late fourteenth century efflorescence” (Wogan-Browne 2013: 1). Missing from this narrative, however, is the written evidence testifying that Anglo-Norman texts were still composed in the Late Middle Ages, on the one hand, and that Central French texts were distributed as early as the 12th century, on the other (Wogan-Browne 2013). Instead of being used in neat succession, the two varieties were “evidently everyday realities in England” (Rothwell 2001: 550), at least in Chaucer’s time, employed in different contexts, i.e., domestic dealings or correspondence and business with France. 
Despite this difference, however, the varieties did “not [display] a stark, universal black–and–white distinction, but a grading from one to the other” (Rothwell 2001: 550). Consequently, researchers have increasingly replaced the terms ‘Anglo-Norman’ and ‘Central French’ by the more generic names ‘Anglo-French’ (e.g., Durkin 2014: 230) or ‘French of England’ (e.g., Rothwell 2006) to capture the linguistic diversity in medieval England and the continuous contact situation. The fact that Anglo-French kept close contact with continental French is also evident in grammatical developments that first occurred in France and were subsequently adopted in England. Ingham (2013) examines the movement of atonic object pronouns from postverbal to preverbal position (e.g., Modern French pour le voir), replacement of nul by aucun in negative sentences and cases of subject–verb inversion between 1250 and 1350, finding that Anglo-French “participated in the
mainstream of language evolution in French” (Ingham 2013: 51).15 Thus, the French in England is best considered “part of a dialectal continuum in which innovations spread outward from an innovative central zone” (Ingham 2013: 53), presupposing sustained contact between native Anglo-French speakers and/or writers and continental French natives.
Myth 2: French as the language of the enemy and ‘the triumph of English’
The common narrative construes French as “the language of an enemy country” (Baugh & Cable 2002: 142) during the Hundred Years’ War (1337–1453), which is supposed to have contributed to the abandonment of French in England. This neatly coincides with the tale about the triumphal march of the English language in the 14th century: After the centuries-long banishment, English reemerged in schools after 1349, and Parliament was opened in English for the first time in 1362. The same year saw the passage of the Statute of Pleading, stipulating that legal proceedings should be held in English because the French language had become “very much unknown in the said realm [i.e., England]” (translated by Ormrod 2003: 755). Interpreted as “a symbolic assertion of Englishness at a specific point in the history of the Hundred Years War” (Ormrod 2003: 781), the Statute has been taken as a milestone in the resurgence of the English language. However, the impact of the Statute of Pleading and the realities of the Hundred Years’ War are unduly simplified. First of all, the developments during the Hundred Years’ War in France cast doubt on the hypothesis that French was ‘the language of an enemy country’ since much of this ‘enemy country’ was, in fact, ruled by the English crown for long periods. Admittedly, after territorial gains at the beginning of the war, England incurred important losses in the 14th century.
These developments were not influenced by tendencies in (Middle) English: The shift of atonic object pronouns to preverbal position was exactly the opposite of the English trend, the introduction of any in negative clauses occurred markedly later in English, and the requirements for subject–verb inversion differed substantially between the two languages (Ingham 2013).
But with the advent of the 15th century, the British continuously enlarged their continental possessions; the magnitude of their influence manifested itself in the Treaty of Troyes (1420), which determined that England and France were to be ruled in personal union by the English king. The tide ultimately turned in 1450, and, within a few years, England lost all its continental holdings with the exception of Calais (Sarnowsky 2002: 177–184). Not only were boundaries constantly changing, but “[w]arfare, moreover, was accompanied by cross-cultural exchange that transcended geographic or national identity” (Driver 2013: 420). This exchange, conducted in French, was not limited to administration and business dealings but included private transactions by wealthier English natives who settled in large cities, such as Paris and Rouen, that were under English rule for long periods. A case in point is the commissioning of “books from local scribes and artists, some of whom later travelled to England to continue their careers” (Driver 2013: 420), thereby fostering the French of England. Hence, instead of strengthening national demarcations, the Hundred Years’ War increased contact between English and French natives on the Continent, reinforcing French language use by English speakers. In England, on the other hand, the number of speakers proficient in French declined. In this context, the Statute of Pleading (1362) is considered to have encouraged the use of the English language in political and legal affairs at the expense of French, which has, however, turned out to be an undue overgeneralization (Rothwell 2001; Ormrod 2003). For a start, the Statute was expressly limited in its application “to the courts of law, not to ‘government’ or ‘the legal system’ in general” (Rothwell 2001: 541); but even more importantly, the prescribed use of English was restricted to oral exchanges, the pleading in court. In this respect, a plausible scenario of spoken courtroom interactions has been proposed by Rothwell (2001): Embedded in English syntax, English function words and common verbs would accompany French legal terms, which were retained due to their established use to denote specific meanings (see also Ormrod 2003). While pleadings were conducted in English, recordings of these pleadings were written in Latin or French.
Latin had been the primary language of record in legislation, jurisdiction and administration but was replaced by French during the 14th century: Statutes in French outnumbered those in Latin as early as the beginning of the century (Lusignan 2013), and many collections of state and legal papers from c.1350 to c.1450 likewise display a large proportion of French words (Rothwell 2001). The replacement of Latin by French marks the transition from a learned language with a large inflectional system to a vernacular used without, or with incongruous, flectional endings.16 French inflection became increasingly redundant, its function largely performed by the syntax in the later Middle Ages (Rothwell 2001). Thus, the frequently lamented errors during the later 14th and 15th centuries did not impair comprehensibility, but they seem to indicate that “French was progressively treated more and more like the native English of the scribes” (Rothwell 2001: 555).
The term ‘vernacular’ describes the position of Anglo-French with respect to Latin and ME dialects; it certainly does not imply that Anglo-French was the language of the majority of the English people (Rothwell 2001).
Unlike oral proceedings with monolingual English participants, written records did not address the general public but were designed for professional elites who “had no reason to abandon a long-established practice of recording” (Rothwell 2001: 542). Administrators could conveniently draw on an established repertoire of French terminology and abbreviations, particularly useful for taking notes and dictations. Unsurprisingly then, the French language continued to be used well into the 15th century before being replaced by English. Depending on the target audience, English sooner or later superseded French in official documents: Government proclamations in French, read out in medieval marketplaces, peaked between 1350 and 1415 but were soon replaced by announcements in English as the number of regulations intended for the general public multiplied and “the use of French became increasingly onerous” (Britnell 2013: 89). By contrast, French statutes, directed at professionals, flourished throughout the 15th century, “and it was not until 1488 that [. . .] [English] replaced French entirely in its function as a legislative language” (Lusignan 2013: 21).17 The distinction between public and professional audiences to account for the language(s) used in state and legal documents, however, captures only part of the picture. In general, readership seems to have been linguistically diverse even toward the end of ME times, as reflected by the history of manuscript transmission of the chronicle Brut, which relates the history of Britain since the fall of Troy. The oldest version, ending with the death of Henry III in 1272, is assumed to have been composed at the beginning of the 14th century (Marvin 2013). 
Even though the narrative, originally written in Anglo-French, had been translated into English (and Latin) and nearly all continuations after 1330 were written in English, the “Anglo-Norman Brut manuscripts continued to be read (and annotated) and written through the fifteenth century” (Marvin 2013: 314) – in terms of numbers, approximately a quarter of the surviving French texts were copied during the 15th century. Hence, the chronicle circulated “simultaneously [. . .] in the three main literary languages of England, serving an array of audiences” (Marvin 2013: 304). In short, claims about the ‘triumph of English’ in the 14th century, or “the death of Anglo-French c.1400” (Miller 2012: 148), are unreasonably strong, in wording and in substance. Rather, the shift from French to English in language use should be considered a gradual process that was not completed before the end of the ME period; again, “[l]inguistic divides were not, in practice, as sharp as current disciplinary divides might make them seem” (Marvin 2013: 313).
It is worth noting that documents issued by the royal chancery were drafted in Latin and/or French during the 15th century; thus, “the term ‘Chancery English’ can be rather misleading” (Ormrod 2003: 785).
Myth 3: The eliteness of French
The French of England is commonly assumed to have been the prestige language of the upper classes (e.g., Baugh & Cable 2002: 114). Thomason & Kaufman maintain that the ruling classes of Norman descent “began giving up French by 1235 at the latest” (Thomason & Kaufman 1988: 308) and had adopted English as their first language by 1265. Consequently, the upper stratum of society was no longer competent in the French variety brought in by the Normans and “shifted to the by then more prestigious Central dialect” (Thomason & Kaufman 1988: 309). Therefore, the use of French after 1265 is claimed to have been “artificial for all concerned, high and low alike” (Thomason & Kaufman 1988: 275), serving “primarily as class marker” (Thomason & Kaufman 1988: 269). This narrative is based partly on the ostensibly low number of native French speakers in England and partly on the semantic domains of French loanwords, deemed characteristic of upper-class language, but it ignores textual evidence that “attest[s] to the use of French in a wider range of social contexts” (Timofeeva & Ingham 2018: 199). The expansion of institutions in medieval England entailed the proliferation of professional administrative classes in all domains, requiring an ever-increasing number of legal, political and administrative professionals to have a certain command of French for their daily interactions. Against this background, the diffusion of French loanwords may well have followed the scenario proposed by Lusignan (2013: 25): “The Anglo-French words that the above individuals found particularly efficient or pleasing were doubtless used by them in their conversations with their English peers, thereby beginning to percolate into their spoken language.” Accordingly, the spread of loanwords is primarily ascribed to “the urban bourgeoisie and the gentry” (Lusignan 2013: 26) rather than to the upper classes.
Moreover, perfunctory acquaintance with spoken French was obviously widespread throughout society due to “strong cultural habits and associations” (Britnell 2013: 88). Although most townspeople, for instance, had no active knowledge of the language, public announcements were read out in French; some, such as urban ordinances, were only framed by French phrases, others, such as oaths of allegiance, were entirely drafted in French. In these contexts, French was favored “as a spoken language even though it was not the mother tongue of its writers or hearers” (Britnell 2013: 87). At the other end of the spectrum, a significant number of English sailors had enough knowledge of the language to communicate their basic concerns in foreign ports; thus, “the French of England was for them a practical, living (oral) vernacular” (Kowaleski 2013: 117). In ecclesiastical contexts, French was not only the language of speech but also increased in writing, especially following the reforms initiated by the Lateran Council in 1215, which opened new avenues of religious teachings (Deeming 2013). In this spirit, English religious texts were preferentially based on French
rather than Latin sources “either because translation from French was easier or, perhaps, because a work’s existence in the French vernacular legitimated its rendering into English” (Watson 2013: 336). French was even preferred over English in the countrywide dissemination of ecclesiastical writings, as observed by Watson (2013: 336): “French, not English, [was] the vernacular medium in which religious works travel[ed] from one part of England to another.” Finally, the status of French as a vernacular is evident in the domain of medical writings. While Latin continued to be “the lingua franca of medical theory” (Green 2013: 230), French was the medium for “empirical remedies [. . .] that could be used for self-help by laypersons, and works of praxis that could aid the general practitioner or surgeon” (Green 2013: 229). Medical treatises written in English, on the other hand, did apparently not materialize before the Late Middle Ages, as may be inferred from the corpus of Middle English Medical Texts, which begins with samples from 1375 (Taavitsainen & Pahta 1997). These necessarily selective observations should suffice to demonstrate that “French competence was not limited to the highest social classes but spread some way down the social scale” (Timofeeva & Ingham 2018: 199). Hence, medieval agents of borrowing from French were not only members of the upper classes but could be found in societal segments that may be compared to today’s (upper-)middle classes. This conception, then, would be fairly consistent with the “curvilinear hypothesis” (Labov 2001: 32) that ascribes agentivity neither to the highest nor to the lowest stratum but identifies the central classes as the leaders of language change. To conclude this longish excursion into some historical background, I would like to return to the introductory remarks to this section. 
As Watson (2013: 334–335) so aptly notes, “our deepest scholarly narratives – and ‘the triumph of English’ is a deep narrative indeed – do not live by evidence alone. Core national, religious and literary historical beliefs are bound up in their survival.” While I cannot possibly dismantle extralinguistic beliefs, I hope to have provided enough evidence to dispel some linguistic myths surrounding the contact situation between English and French. The emergent, more realistic picture recognizes the uninterrupted continuity of Anglo-French, albeit dialectally varied, well into the 15th century and thus a longer coexistence of French and English in medieval England than traditionally assumed. Although the overwhelming majority of the population were undeniably monolingual English speakers, French was more widely diffused throughout society than previously claimed; in view of its pervasiveness in the important administrative classes, especially, Anglo-French needs to be conceived as a vernacular rather than an artificial language in medieval England. In this light, the suggestions by Rothwell (2001) and Lusignan (2013) about the mixed use of French
and English in various contexts describe perfectly plausible scenarios that set the stage for the process of borrowing, a key factor in this study.
3.3.2 Influence of French borrowings
Extensive borrowing from French during the Middle Ages undoubtedly had a huge impact on the English lexicon, both in terms of quantity and quality (e.g., Durkin 2014: 254–264). While the majority of French loanwords are found among nouns (Dekeyser 1986), members of other word classes were borrowed as well, resulting in a dissociation of the lexicon, so that semantically associated words, such as the noun hand and the corresponding adjective manual, are not transparently morphologically related (Durkin 2014: 224; Haselow 2012b). Apart from significant additions to the word-stock, French borrowings affected other linguistic domains to varying degrees. In segmental phonology, the introduction of words with initial voiced fricatives, e.g., very or zeal, advanced the phonemicization of former allophones, giving rise to PDE minimal pairs, such as ferry – very or seal – zeal (e.g., Lass 1992; Aikhenvald 2007a). On the suprasegmental level, the absorption of French words with their particular prosodic structure established a new stress pattern, the Romance Stress Rule (e.g., Lass 1992). Unlike the Germanic pattern, which invariably fixes the stress on the initial syllable of the root, the Romance pattern allows for moveable stress from the word-final to the antepenultimate syllable depending on syllable weight (e.g., Kastovsky 2006a). In the morphological domain, borrowing from French had repercussions for derivational morphology, though only indirectly. As in most contact situations, whole words were adopted into the English lexicon; still, all loanwords are, in principle, susceptible to later structural analyses by speakers of the receiving language. Hence, “the affixes figuring in complex words may be recognized and subsequently ‘isolated’ from the borrowed word in question” (Koefoed & Marle 2004: 1586). In the present case, such reanalyses resulted in the well-known enlargement of the English affix inventory.
Thus, neither French affixes nor “foreign rules of word-formation” (Dietz 2015b: 1921) were borrowed, yet the impact of French borrowings in this domain is considered dramatic. According to Durkin (2014: 224) “[t]he derivational morphology of English was [. . .] completely transformed”; similarly, Miller (2012: 14) notes “a reshuffling of the English morphological system”. These widely held perceptions are grounded in two important characteristics that distinguish Romance from Germanic affixation: the morphological status of the base and phonological variation. As mentioned before, Germanic derivation
had become increasingly word-based since OE, and, although some dissent exists as to whether this development was completed in ME or continued into eModE (Dalton-Puffer 1996: 30–31), Germanic affixation is unquestionably word-based in PDE. In French, however, morphology operates on bound bases, with the result that “stem-based morphology [. . .] [was] reintroduced with the adoption of Romance-based morphological material” (Haselow 2012b: 649; see also Kastovsky 2006b; Dalton-Puffer 1996: 30–31). Moreover, French loanwords, and their subsequent reanalysis, reintroduced phonological variation in derivationally related words (e.g., Haselow 2012b). While Germanic derivation in OE had progressively shed affix allomorphs and left bases unimpaired by affixation, Romance derivation reestablished affix allomorphy and stem variance, not least due to the aforementioned movable stress, “especially since suffixes could bear stress themselves or determined the position of stress” (Kastovsky 2006a: 76). Accordingly, PDE displays stress-induced variation in word pairs, such as the base vowel change between adjectival ˈform[ə]l and the derived noun forˈm[a]lity. In view of the particular characteristics of Romance derivation, borrowing from French is generally considered to have added “a second morphological type [. . .] to the native one” (Haselow 2011: 266), resulting in today’s dual-stratum derivational system. Thus, PDE “distinguishes word formation on a non-native basis from word-formation on a native basis” (Kastovsky 2006b: 159), usually based on the etymological origin of the affix (e.g., Bauer, Lieber & Plag 2013: 583; Nevalainen 1999; for an alternative approach see Marchand 1969: 215–216).
Given their different properties, derivational affixes are thought to be systematically stratified across two levels, the native and non-native ones: Romance affixes, having originally been separated from foreign bases in loanwords, combine with non-native bases (Nevalainen 1999) and attach to bound bases, prompting base variance, whereas Germanic affixes are indifferent to the etymological origin of their base and combine with free morphemes without creating phonological variation. This division into two systems, however, seems overly rigid since none of the three features unambiguously defines Romance affixation against Germanic derivation. For a start, neither prefixes nor suffixes of Romance origin exclusively attach to foreign bases, which would have created “a loan morphology” (Aikhenvald 2007a: 21), but are affixed to Germanic bases as well. So, “while there is a tendency for non-native suffixes to favour non-native bases, there is far more mixing of the two systems of derivation than such theoretical claims might have led us to expect” (Bauer, Lieber & Plag 2013: 614). Moreover, Romance affixation does not necessarily operate on bound bases but is partly stem-based, partly word-based, and, similarly, not all Romance affixes produce formal variation, such as base variability or stress shift.
In short, “[t]he boundary between the two systems is not clear-cut” (Haselow 2012b: 649). In view of the lack of systematicity, Bauer, Lieber & Plag (2013: 612) even assert “the myth of stratification”: Though acknowledging two structural types in PDE morphology, they argue that for language users, commonly unaware of the morphemes’ etymology, the difference between Romance and Germanic affixation may well boil down to “phonological characteristics of native versus nonnative derived words [. . .] to the extent that both prefixes and native suffixes are more likely to be less phonologically fused to their bases than non-native suffixes” (Bauer, Lieber & Plag 2013: 615). Regardless of the researcher’s stance on lexical strata, then, borrowing from French undeniably reintroduced phonological variation in derivationally related patterns, and possibly bound bases, once the complex loanwords had been analyzed by English language users. Since base variance and/or affix allomorphy create fusion at word-internal boundaries, which had been eliminated in Germanic derived words since OE times, the adoption of Romance loanwords partly reversed the typological development toward isolation in the English lexicon (Haselow 2012a). Against this backdrop, Sapir’s assessment of the impact of Romance loanwords is quite astonishing as he claims the exact opposite, namely that the contact with French “stimulated [English] in its general analytic drift” (Sapir 1921: 206). He argues that English was exceptionally receptive to French loanwords as a consequence of its own trend toward “the completely unified, unanalyzed word” (Sapir 1921: 208). Therefore, French complex words were eagerly adopted “because each represent[ed] a unitary, well-nuanced idea and because their formal analysis [. . .] [was] not a necessary act of the unconscious mind” (Sapir 1921: 208), which is certainly true for the initial process of borrowing. 
Ensuing analyses of borrowed words merely increased the affix inventory but did not influence the English morphology, so that the language “remained [. . .] true to its own type and historic drift” (Sapir 1921: 217). Unfortunately, Sapir neglects to reconcile his claim about English’s general drift toward analyticity with his concept of fusion, outlined in the preceding chapter, and much of the respective discussion (Sapir 1921: 215–217) seems forced and fails to persuade. In summary, English lexical morphology had become less fusional at the beginning of the period under investigation: On the one hand, the loss of affixes and the decline of complex nouns in late OE testify to isolating tendencies; on the other, increasingly transparent morpheme boundaries reflect the growth of agglutination at the expense of fusion in word formation. The trend toward isolation in the lexicon was reinforced by large-scale borrowing from French in ME,
but subsequent reanalyses of complex Romance loanwords reintroduced fusional tendencies in lexical word formation. While the analytic trends in late OE suggested by Haselow (2011, 2012a, 2012b) are inferred from quantitative data, assertions about the typological development since ME derive from qualitative observations exclusively. Such observations may serve as a starting point but are insufficient to support claims about typological tendencies in the lexical domain, which need to be based on usage frequencies as typological morphology is inherently gradient. However, even frequency-based analyses of the morphological makeup of words used since 1150 are, to some extent, misleading if we want to assess the degree of analyticity and/or syntheticity in language. Unlike grammar with inflected word forms unambiguously derived by Germanic affixation, the lexicon is much more versatile in its derivational possibilities, so that a complex Romance word, for instance, may be analyzed as a case of borrowing, strengthening isolating trends, or as manifesting Romance affixation, thus reinforcing fusional tendencies. While decisions in this regard ultimately depend on the researcher’s approach, the ambiguity of words as to their typological impact can be reduced by focusing on the processes that introduced them into the English lexicon in the respective centuries.
3.4 Typological classification of the means of lexicon extension
In order to evaluate the typological influence of the processes chosen by language users to extend their lexicon, these means need to be classified typologically. As noted in Chapter 1, the processes under investigation comprise the central morphology, i.e., word formation by affixation, compounding and conversion, as well as borrowing. The previous discussion of contact scenarios as well as Sapir’s (1921: 206–208) remarks about the adoption of French words should leave no doubt that borrowing is an analytic means to enlarge the word-stock. Similarly, the review of Germanic and Romance affixation should suffice for their typological classification: Both processes pertain to the synthetic language type, but, while Germanic affixation is agglutinating (transparent word-internal boundary due to free bases and phonological invariance), Romance affixation, in general, is fusional (obscured word-internal boundary due to phonological variation and, possibly, bound bases). The differences within Romance affixation may be captured by stipulating that “Romance derivation ranges from being weakly agglutinative to being entirely fusional” (Dalton-Puffer 1996: 62; emphasis added), but as long as research has not agreed on a unified approach to and delimitation of fusion (see Chapter 2), it seems most unfortunate to introduce further demarcation problems. Therefore, Romance affixation processes are uniformly classified as fusional techniques, and it remains to determine the status of conversion and compounding with respect to their typological impact.
Conversion broadly denotes the transposition of a word from one syntactic category to another one without formal modification, regardless of the item’s morphological complexity (e.g., Huddleston & Pullum 2002: 1640).18 Such category changes mostly concern the three largest word classes; thus, shifts occur between nouns and verbs, e.g., statement (N) > statement (V) or run (V) > run (N), between nouns and adjectives, e.g., fun (N) > fun (ADJ) or poor (ADJ) > poor (N), and between adjectives and verbs. Frequently, elements undergoing conversion are phonologically unmarked for their word class (run, fun, poor), and their function has to be determined by their syntactic position (e.g., Biese 1941: 391). Synsemantic means, such as word order or sentence stress, may even outweigh autosemantics as exemplified by statement, which is morphologically marked as a noun by the suffix -ment but may be used as a verb. Hence, conversion is an isolating process since no material is added to indicate the transposed element’s function, which is derived from the sentence context exclusively. Analytic languages with their strong reliance on synsemantic means are best suited for this word formation process (e.g., Biese 1941: 391; Vachek 1961). Accordingly, conversion in fusional languages, such as the Slavonic ones, “is all but negligible” (Vachek 1961: 21) while figuring prominently in analytic languages, such as Chinese (e.g., Gabelentz 1891: 419; Tauli 1958: 178).
Compounding, on the other hand, is the process by which free morphemes are concatenated to form complex words with highly transparent internal boundaries.
While this technique is often accompanied by prosodic shifts in that one primary stress is imposed on the first element, thereby unifying the compound constituents phonetically (e.g., Berg 2012), these suprasegmental changes do not blur the morpheme boundary.19 Boundary transparency in English is also not affected by processes such as segment elision operating at the morpheme juncture. Consequently, compounding is an agglutinating word formation process, but its products are less cohesive than words derived by Germanic affixation since the constituent morphemes largely retain their autonomy (Berg 2009: 9; Berg 2012).
18 I do not subscribe to the theoretical construct of zero-derivation that claims the existence of an unrealized morpheme (zero) in parallel with overtly realized derivational affixes (e.g., Marchand 1969: 359–361; for criticism see also Huddleston & Pullum 2002: 1641).
19 This does, of course, not preclude that compound-internal boundaries may become obscure if the constituents further coalesce in later usage of the complex word; compare the earlier remarks on OE hlāfweard and PDE cupboard.
Due to the constituents’ relative independence, the demarcation between English compounds and phrases is often fuzzy (e.g., Berg 2012); therefore, compounding is characterized as “the most syntax-like of all word formation” (Giegerich 2009: 199). In terms of typology, compounding is regarded as “Zwitterding zwischen Wortbildung und Syntax, zwischen isolirendem und agglutinativem Verfahren” [hybrid between word formation and syntax, between isolating and agglutinating processes] (Gabelentz 1891: 341). Unsurprisingly then, compounding, while available in all language types, prevails in isolating languages (Aikhenvald 2007b; Sapir 1921: 67; Tauli 1958: 83), which is only to be expected in view of the limited possibilities to form words in analytic languages.
Figure 4: Means of lexicon extension in a typological perspective.
In the light of the above considerations, the means of lexicon extension can now be ordered typologically along a continuum that builds on the descriptive levels displayed in Table 1. Based on the degree of fusion, Figure 4 depicts a cline that extends from borrowing and conversion, i.e., non-fusional processes operating on words perceived as monomorphemic, to Romance affixation, the most fusional of the word formation processes under investigation. Located between these poles are the agglutinating techniques with Germanic affixation placed toward Romance affixation and compounding situated toward isolating means of lexicon extension. This classification, then, provides the basis for the analyses presented in Part 2.
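The ordering described for Figure 4 can be written down as a small data structure. The following is only an illustrative sketch: the class name, the technique labels and, in particular, the numeric ranks are assumptions introduced here to encode relative position on the cline; they are not measurements taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    name: str
    technique: str  # "isolating", "agglutinating" or "fusional"
    rank: int       # hypothetical position on the cline; 0 = least fusional


# Order follows the prose description of the cline; ranks are assumed.
CLINE = [
    Process("borrowing", "isolating", 0),
    Process("conversion", "isolating", 0),
    Process("compounding", "agglutinating", 1),
    Process("Germanic affixation", "agglutinating", 2),
    Process("Romance affixation", "fusional", 3),
]

RANKS = {p.name: p.rank for p in CLINE}


def more_fusional(a: str, b: str) -> bool:
    """Return True if process a lies closer to the fusional pole than b."""
    return RANKS[a] > RANKS[b]
```

Borrowing and conversion share a rank because the text places both at the non-fusional pole without ordering them relative to each other.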
Part 2: Means to extend the nominal lexicon since 1150
This part focuses on the processes by which language users have or could have incorporated new nouns into their vocabulary. Before addressing these issues, however, it is necessary to expand on the data that provide the basis for the empirical parts of the book. To this end, Chapter 4 describes the preparation of the corpora used in this study and the collection of the data, providing a first overview of the distribution of the noun types and tokens across the centuries. The following chapters concentrate on methods for lexicon extension: Chapter 5 presents the means actually used in the subsequent centuries since 1150 by concentrating exclusively on words newly adopted in the respective epochs, while Chapter 6 examines the patterns that, at least in theory, have been available for language users to derive nouns since 1150. Starting from the premise that analogy is operative in word formation, all types, old and new, are assigned to the word formation patterns they may have strengthened in each century. Based on the typological techniques used in new additions to the lexicon and those reflected by potential patterns for noun formation, I finally propose some preliminary conclusions on the typological development of the means to extend the nominal lexicon since early ME.
https://doi.org/10.1515/9783111317717-005
4 The database
4.1 Setup of corpora
The principal source of data is the part-of-speech annotated Penn Parsed Corpora of Historical English series consisting of the Penn-Helsinki Parsed Corpus of Middle English (PPCME2), the Penn-Helsinki Parsed Corpus of Early Modern English (PPCEME) and the Penn Parsed Corpus of Modern British English (PPCMBE). The corpora comprise 605 British English texts, totaling roughly 4 million tokens, which cover the time from 1150 to 1913. Each corpus is subdivided into shorter periods ranging from 70 to 100 years, yielding a total of ten subperiods, which is unnecessarily fine-grained for the present purpose. In order to improve comparability, the texts were reallocated to 100-year slices based on the manuscript date, which, unlike the composition date, is suitable to capture contemporary trends since scribes often changed the text considerably in copying from the exemplar (e.g., Stenroos 2017; Durkin 2014: 260). Due to the skewed distribution of early texts, however, the first period has to encompass 150 years, hence from 1150 to 1300.
The corpus texts represent text types as diverse as diaries and letters, religious writings, drama, scientific articles and historical narratives, amounting to a total of 26 registers manifest in one century or the other.20 However, none of the text types is attested in all periods: Some registers were completely lost (e.g., Biography: life of Saints), others only emerged over time (e.g., Fiction), and still others changed so dramatically as to constitute a new text type (e.g., Handbook: medicine > Science: medicine) (see also Biber & Conrad 2009: 157). While I did not perform a register analysis of noun derivation and usage, I wanted to base the study on a corpus balanced for text type to ensure maximum representativeness; thus, the database had to be abridged in such a way that all registers of the respective century exhibit equal numbers of tokens if possible.
(In case of insufficient amount of data in one or more registers, true text-type balance within the respective century was, of course, impossible to achieve.) As the focus of the investigation is on nouns, composing the largest word class in terms of types and tokens (e.g., Berg 2014), I decided that a corpus of 100,000 tokens per century would provide a sufficiently large basis – for the period from 1150 to 1300, the corpus size was proportionally increased to 150,000 tokens.
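The reallocation of texts to periods and the target corpus sizes can be made concrete with a short sketch. The function names are hypothetical, and the exact boundary convention (1301–1400, 1401–1500 and so on up to 1901–2000) is an assumption, since the text only fixes the first period (1150 to 1300) and the last one (1901 to 2000).

```python
from typing import List, Tuple

Period = Tuple[int, int]

# One 150-year period followed by seven 100-year slices (assumed boundaries).
PERIODS: List[Period] = [(1150, 1300)] + [(s, s + 99) for s in range(1301, 2000, 100)]


def period_for(manuscript_year: int) -> Period:
    """Map a manuscript date to its period; composition dates are not used."""
    if not 1150 <= manuscript_year <= 2000:
        raise ValueError("year outside the time span under investigation")
    if manuscript_year <= 1300:
        return (1150, 1300)
    start = ((manuscript_year - 1) // 100) * 100 + 1
    return (start, start + 99)


def target_tokens(period: Period) -> int:
    """100,000 tokens per century; 150,000 for the longer first period."""
    return 150_000 if period == (1150, 1300) else 100_000
```

Summing the targets for the seven periods up to 1900 gives 750,000 tokens, which agrees with the target number stated for the time from 1150 to 1900.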
20 The term ‘text-type’ refers to linguistic features that define one type of text against another one, whereas ‘register’ is the superordinate term denoting such differences on a more abstract level (Taavitsainen 2016); in this book, the two designations are used interchangeably.
https://doi.org/10.1515/9783111317717-006
To maintain the textual variety of the Penn collection, all texts were included in the adapted corpora, except for the supplements to the original Helsinki files in eModE21 and the few corpus texts (mis)representing the 20th century (see below). In sum, 299 files were retained, amounting to roughly two and a half million tokens. Since the number of words exceeded the target number of 750,000 tokens for the time from 1150 to 1900, the texts were truncated, if necessary, to create files of equal size for each register in the particular centuries. To this effect, word counts were calculated for all 299 texts using Microsoft Word, the average target length per text extract was determined, and the files were shortened accordingly; for a detailed listing of registers, number of texts and original and reduced numbers of tokens across all periods see electronic Appendix E1. As noted above, the PPCMBE files for the 20th century were excluded since a mere five texts ranging from 1901 to 1913 cannot be considered representative for the century. The data for this period, then, come from the British part of the ARCHER, A Representative Corpus of Historical English Registers (version 3.1), which compares best with the Penn series in terms of register distribution (for corpus description see Biber, Finegan & Atkinson 1994). For the time from 1901 to 2000, ARCHER contains 191 British texts, corresponding to roughly 350,000 words, but only 90 files were included in the present study; otherwise, the number of texts would have deviated too much from those of the previous centuries, resulting in unduly short extracts for the 20th century. Again, I tried to maintain the representativeness, selecting files more or less equally distributed across the time period; the rather convoluted procedure is detailed in electronic Appendix E2. 
As before, the texts, totaling nearly 150,000 tokens, were shortened to obtain a corpus balanced for register, with each text type being ideally represented by text extracts of the same length. Unlike the Penn series, the ARCHER data are not part-of-speech annotated; therefore, all files were prepared for automatic text processing using the interface WebLicht, which provides the necessary linguistic tools.22 Prior to the actual tagging process, each text had to be converted and tokenized, i.e., individual words and punctuation marks needed to be delimited. WebLicht provides various instruments for tokenizing and tagging to experiment with; after several trials, I decided on the SfS:Stanford Tokenizer and the IMS Tree Tagger since these produced the smallest error rate in processing the corpus texts.23
21 The supplements consist of 301 texts with roughly one million tokens, assembled in the Penn1 and Penn2 directories. The additions mostly comprise texts by the same authors and from the same editions included in the Helsinki Corpus so that the textual diversity is sufficiently illustrated by the Helsinki directory of the PPCEME, which contains 147 texts with approximately 500,000 tokens.
22 The site has been developed and maintained by the research infrastructure CLARIN-D, located at the University of Tübingen, Germany (weblicht.sfs.uni-tuebingen.de).
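The balancing step described in this section (equal token numbers per register within a century, with texts truncated to a common extract length) can be sketched as follows. This is a simplified illustration, not the procedure documented in the electronic appendices: the function name and the input layout are assumptions, and texts shorter than the target extract length are simply kept whole, so a register with too little material ends up below its quota.

```python
from typing import Dict, List


def extract_lengths(register_texts: Dict[str, List[int]],
                    target_total: int) -> Dict[str, List[int]]:
    """Return truncated token counts per text for one century.

    register_texts maps a register label to the token counts of its texts;
    target_total is the corpus size aimed at for the century.
    """
    quota = target_total // len(register_texts)   # equal share per register
    result: Dict[str, List[int]] = {}
    for register, lengths in register_texts.items():
        per_text = quota // len(lengths)          # average target length per extract
        result[register] = [min(n, per_text) for n in lengths]
    return result


# Invented toy figures, for illustration only.
toy = {"letters": [5000, 3000], "sermon": [10000]}
balanced = extract_lengths(toy, target_total=6000)
```

In the toy example both registers contribute 3,000 tokens, the two letters as extracts of 1,500 tokens each and the sermon as a single extract of 3,000 tokens.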
4.2 Data collection
All nominal tokens were automatically retrieved using the concordance program MonoConc Pro 2.2 (Barlow 2000), performing two batch searches, the results of which were manually inspected. The first search process collected all elements tagged as common noun, whether in base form or inflected for number and/or (possessive) case, as well as all common nouns preceded by any kind of modifier but spelled as one word or hyphenated. Besides noun–noun (N + N) compounds, e.g., sealmboc ‘psalm book’ (cmtrinit.mx1), this search routine turned up various sequences requiring close inspection. To begin with, I excluded adjective–noun (Adj + N) strings that, despite their nominal heads, instantiate categories other than nouns, such as adverbial dru-fot ‘dryfoot’ (cmjulia.m1). By contrast, Adj + N compounds, such as aȝen-will ‘ownwill, self-will’ (cmvices1.m1), were retained in the data, unless written in two words. The workload would have increased drastically if all separately spelled and tagged Adj + N sequences had been collected and sifted through for pertinent compounds. It follows that figures for Adj + N compounds are unreliable: A cursory check of cmvices1.m1, for example, reveals that this text displays three hyphenated tokens of aȝen-will and six instances spelled in two words.
Sequences consisting of article and noun, e.g., þarrke ‘the ark’ (cmorm.m1) or afolc ‘a folk’ (cmancriw-2.m1) were simply stripped of the determiner, whereas cases of nouns preceded by other determiners and/or prepositions had to be decided individually, depending on the word class realized by the composite: Adverbs, e.g., oðerhwile ‘otherwhile’ (cmancriw-2.m1), adjectives, e.g., oliue ‘alive’ (cmjulia.m1), and compound pronouns, such as him-seolf ‘himself’ (cmkathe.m1) or nanðing ‘nothing’ (cmvices1.m1), were deleted, while strings like oðermonne ‘other man’ (cmancriw-2.m1), iplace ‘in place’ (cmkathe.m1), þispeche ‘your speech’ (cmancriw-1.m1) or namon ‘no man’ (cmlambx1.mx1) were reduced to the respective nouns to be included in the data. Removing all non-nominal items as well as incorrectly tagged elements led to the exclusion of 5,170 tokens, amounting to 0.6% of the data.24
23 SfS is the abbreviation for ‘Seminar für Sprachwissenschaft’ [department of linguistics] at the University of Tübingen, Germany, which hosts this application; IMS stands for ‘Institut für Maschinelle Sprachverarbeitung’ [Institute for Natural Language Processing] at the University of Stuttgart, Germany.
A further search was performed for compounds with nominal constituents written in two (or more) words, the prevalent spelling of N + N compounds (Bauer, Lieber & Plag 2013: 450). As before, the rightmost position was set to be filled by a common noun, base or inflected form, but this time it needed to be preceded by one or more nouns. The search query allowed for common and proper nouns, possibly inflected, to occur in the modifier slot, thereby also retrieving genitive compounds, such as brudlakes dei ‘bridelock’s day’ (cmhali.m1), and compounds with proper noun modifiers, such as lady sawter ‘Lady psalter, rosary’ (mowntayne-e1-h). Only sequences with classifying modifiers, exemplified by brudlakes dei and lady sawter, were retained, while strings with specifying modifiers, e.g., Goddes sone ‘God’s son’ (cmvices.m34) were ignored.25 Moreover, the search routine unearthed phrasal compounds, such as man-of-the-world touch (1920clarj7b) or floor-to-ceiling panelling (1961evanj8b), where the head noun is modified by a phrase that may or may not be lexicalized. In line with the treatment of (lexical) phrases as syntactic units, premodifying phrases were dismantled, so that only the nominal constituents (here: man, world, floor and ceiling) were included for further investigation.
Before lemmatizing the collected tokens, abbreviations, curtailments and contractions were expanded into full forms. Such spelling reductions were especially fashionable in eModE, where the omission of graphemes was usually indicated by tildes and/or superscript letters, e.g., Maie ‘majesty’ (gawdy-e2-h), s~vante ‘servant’ (stat-1500-e1-h) or argumt ‘argument’ (commiss-e3-h) (Petti 1977: 22–25).
Subsequently, all spelling variants and grammatical word forms were assigned to the respective lexemes, following the OED lemmas and subentries as far as possible.26 In a few instances only, I decided to deviate from the dictionary’s approach in order not to artificially inflate the number of types. With respect to the agent morpheme -er, the OED lists several (graphic) variants, such as advisor – adviser or commissionar – commissioner, as separate entries without differences in meaning; these were subsumed under one type ending in -er (see also Huddleston & Pullum 2002: 1698; Durkin 2014: 324). Similarly, complex words suffixed by -head and -hood were pooled if semantically identical, e.g., maidenhead – maidenhood, despite having different entries in the OED. Also, I did not follow the OED’s practice of redundantly including lexemes prefixed by OE ʒe-, which developed into y- or i- in ME before being lost completely. Thus, while the dictionary provides distinct headwords, such as i-bedde and bedde ‘bedfellow’ or i-witness and witness, with no semantic differences, I regarded both forms as variant attestations of the nonprefixed type.
24 It is worth noting that the Penn corpora tags are much more reliable than the annotations allocated by the IMS Tree Tagger – the error rate for the 20th century data is twice as large as that for the preceding centuries.
25 Obviously, nominal tokens that instantiated compound constituents were removed from the files of the previously collected nominal tokens to avoid double counts.
26 Less than 6% of all types are not listed in the OED; as may be expected, most of these, namely 80%, have been derived by compounding.
As mentioned above, word forms were lemmatized since inflectional endings are immaterial to word formation (Marchand 1969: 4); however, inflected words may acquire particular semantics so that related word forms become dissociated (Bybee 1985: 88–95). Such inflectional splits may occur between nouns inflected for number, e.g., cloth (singular) and clothes (plural), or between lexemes marked for different cases, e.g., OE sceadu ‘shade’ (nominative) and OE sceaduwe, sceadwe ‘shadow’ (oblique). Moreover, graphic splits may arise (Durkin 2009: 83–84), indicating that previously polysemous words, such as flour or business, ‘outsourced’ some of their meaning components to new forms, such as flower and busyness. Although such splits do not belong to word formation proper, they establish new lexemes and cannot be treated as spelling variants and/or grammatical word forms; instead, they were included as fully fledged types, following the OED, which grants them the status of headwords.
Table 2 provides an overview of the distribution of nominal types and tokens throughout the centuries. From 1150 to 2000 the total number of noun tokens amounts to 145,222, equaling 17.1% of all corpus tokens, distributed across a total of 12,083 types. On the whole, the percentage of nominal tokens fluctuates moderately between 15.9 and 18.2; so, the use of nouns seems to have been relatively stable, and language users have apparently not resorted to more verbal constructions.
A global chi-square test shows significant differences across the centuries (χ²(7) = 296.44, p < .001), which is only to be expected in view of the large samples.27 The effect size, however, shows no association between token numbers and time periods (Cramér’s V = .02, thus less than small according to Cohen’s convention; see Döring & Bortz 2016: 821), so that we may safely assume that the usage of nouns has been fairly constant since 1150. Concerning the type distribution, a continuous rise in numbers can be observed. While the type total for 1150 to 1300 is higher than that of the subsequent periods due to the larger corpus base, the steady growth of types is evident in the increasing type–token ratios (TTR) since early ME. The TTR has more than doubled during the time under investigation, signaling that language use has become more varied. Although the differences in terms of types and tokens between the 20th and previous centuries may, to some extent, be artifacts of the corpus and/or the tagging software, on the whole, the dataset looks encouraging with respect to size and distribution.
27 All chi-square statistics were carried out with Yates’ correction to decrease the risk of Type I errors.
Table 2: Distribution of nouns from 1150 to 2000. Columns: Period; Corpus tokens; Noun tokens (with percentage of corpus tokens); Noun types; Type–token ratio. Rows: eight periods and a total.
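The relation between the reported chi-square value and the effect size can be retraced with a few lines of arithmetic. The sketch below assumes an 8 × 2 contingency table (eight periods by noun versus non-noun tokens), which is what seven degrees of freedom suggest, and it estimates the total number of corpus tokens from the two figures given in the text (145,222 noun tokens equaling 17.1% of all tokens); neither assumption is stated in this form in the source.

```python
import math


def cramers_v(chi2: float, n: int, n_rows: int, n_cols: int) -> float:
    """Cramér's V = sqrt(chi2 / (N * min(rows - 1, cols - 1)))."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


noun_tokens = 145_222
noun_types = 12_083
noun_share = 0.171                          # 17.1% of all corpus tokens

n_est = round(noun_tokens / noun_share)     # estimated total of corpus tokens
v = cramers_v(296.44, n_est, n_rows=8, n_cols=2)

# Pooled type-token ratio over the whole time span; the per-period
# ratios in Table 2 are computed the same way on each subcorpus.
pooled_ttr = noun_types / noun_tokens
```

Under these assumptions V comes out just below .02, in line with the rounded value reported above, which shows how a highly significant chi-square can coexist with a negligible effect when the sample is very large.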
5 New additions to the lexicon
The focus of this chapter is on the means actually employed to extend the lexicon in subsequent time periods since 1150; consequently, my primary interest is in words newly added to the word-stock in the respective centuries. To this end, determining the date and the etymological origin of nominal types on reasonable grounds is of utmost importance. Hence, Section 5.1 first details how the date and process of origin of the nouns were established, followed by remarks concerning the classification of the data (Section 5.2). Subsequently, the distribution of old and new lexemes is depicted (Section 5.3) before turning to the central issue of the chapter – the description of the processes used to add new nouns (Section 5.4). In conclusion, Section 5.5 presents the results from a typological angle.
5.1 Determining date and process of origin
As stated above, the study of additions to the nominal lexicon crucially depends on carefully determining when and how the types entered the English lexicon. Until the advent of large searchable databases such decisions were based on printed dictionaries with small, often incomplete samples of attestations; as a result, lexemes were often postdated (see also Cowie & Dalton-Puffer 2002). To avoid such problems, historical morphologists (e.g., Cowie 1999; Haselow 2011) have begun to work with a starting lexicon, following the same logic underlying hapax-based measures of productivity (Baayen 2009): Corpus tokens that newly surface in a certain subperiod, i.e., those not attested in previous subcorpora, are considered to stand in for neologisms because “words which are ‘new’ within the universe of the corpus have a certain probability of encompassing words which are also new in the language” (Cowie & Dalton-Puffer 2002: 431). Obviously, for this estimation to work out, the corpus base needs to be sufficiently large, which presents serious challenges for historical corpus linguistics: Haselow’s (2011) complete corpus consists of about 200,000 tokens, Cowie’s (1999) starting lexicon comprises corpora totaling roughly 550,000 tokens – the likelihood that elements appearing for the first time in these small corpora are, in fact, newly introduced lexemes is fairly slim (see also Cowie & Dalton-Puffer 2002). In view of today’s large electronic dictionaries, such as the OED comprising more than three million attestations, the determination of neologisms in historical data based on a starting lexicon is apparently not a methodological improvement but rather exacerbates the dating problem.
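The starting-lexicon logic discussed here amounts to a running set difference over chronologically ordered subcorpora. The following sketch illustrates that logic only; the function name and the toy lemma lists are invented for the purpose and do not reproduce any of the cited studies.

```python
from typing import Dict, Iterable, List, Set, Tuple


def new_types_by_period(
    subcorpora: List[Tuple[str, Iterable[str]]]
) -> Dict[str, Set[str]]:
    """For each period, return the types not attested in any earlier subcorpus.

    subcorpora must be ordered chronologically; the first period acts as the
    starting lexicon, so all of its types count as 'new' by construction.
    """
    seen: Set[str] = set()
    result: Dict[str, Set[str]] = {}
    for period, types in subcorpora:
        current = set(types)
        result[period] = current - seen
        seen |= current
    return result


# Invented toy data, for illustration only.
toy = [
    ("P1", ["witness", "cloth"]),
    ("P2", ["witness", "prudence"]),
    ("P3", ["cloth", "prudence", "statement"]),
]
first_seen = new_types_by_period(toy)
```

The sketch also makes the weakness visible: a type counts as new merely because the earlier subcorpora happen not to contain it, so the smaller the corpus, the more old words are misread as neologisms.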
https://doi.org/10.1515/9783111317717-007
Consequently, I decided to determine the date and source of origin of the approximately 12,000 nominal types on the basis of lexicographical information, mindful of the fact that no dictionary can provide absolute coverage but that drawing on several reference works would complete the picture (for a similar approach see Kempf & Hartmann 2018). Of primary importance for the entire time span was the OED, which is under constant revision with updates published every three months (Durkin 2016); hence, some of the roughly 700 lexemes not listed in the dictionary during the time of data collection may meanwhile have been added.28 Still, most of these types have probably not been included given that the dictionary displays systematic gaps in documentation: For a start, nonce words, i.e., “ad hoc coinages by individuals in particular circumstances” (Durkin 2009: 49), also termed “occasionalisms” (Berg 2009: 85), are excluded (see also Nevalainen 1999).29 Moreover, transparent complex lexemes are omitted from the OED because, as explained by its first primary editor, “[t]he number of these combinations is practically unlimited, since they can be formed at will” (Murray 1888: vii). In other words, extremely prolific processes, such as compounding, are underrepresented in the OED, or any dictionary for that matter (see also Cowie & Dalton-Puffer 2002; Berg 2012). While my primary concern is new additions to the lexicon in any given period, I did not want to artificially inflate the number of neologisms, attributing more creativity to the respective language community than warranted; therefore, I took a conservative approach to dating the nouns by assigning the earliest date permissible. The point of departure for all dating was the OED; more specifically, I recorded the first attested use of the specific noun unless the lexeme appeared in foreign language contexts or the quotation was an isolated, metalinguistic comment on the word. 
If the OED provided both the manuscript date and the composition date, my reference point was the date of the manuscript, in line with the dating of the corpus texts.
The OED’s documentation of OE and ME usage, in particular, is fragmentary, and OE words that did not continue to be used later than early ME are systematically excluded from the dictionary (Durkin 2014: 23). Consequently, it needs to be supplemented by information gleaned from further dictionaries. In this respect, the Middle English Dictionary (MED) proves to be indispensable.30 Due to the wealth of material, the MED often allows antedating types that had been preliminarily dated on the basis of the OED. Frequently, the earliest MED dates attest to a lexeme being used ‘as a name’, which I did not universally adopt: Proper nouns only counted as first quotations if the name had been derived in English, such as gabber ‘liar, mocker’, recorded in 1230 as “Willelmus le Gabber”. By contrast, first attestations of foreign names, such as prudence, noted in 1203 as “Adam Prudence”, were ignored because these proper nouns may well have had personal reference solely, without any semantic relation to the word borrowed into English a century later (see also Durkin 2016).
28 I accessed the OED between November 2014 and March 2019 in order to identify and classify the collected nouns. Due to the large number of types, I did not record the precise access dates for each lexeme, especially since many types were repeatedly checked against the dictionary during these years.
29 In this study, nonce words that do not gain currency and neologisms that become established in wider use are not differentiated; I treat both phenomena as neologisms (see also Bauer, Lieber & Plag 2013: 30).
Dating was further improved by consulting additional dictionaries. For OE nouns that are not attested before ME in the OED or the MED, I relied on the online edition of Bosworth-Toller Anglo-Saxon Dictionary (BT), which occasionally needed to be supplemented by Sweet’s (1897) Student’s Dictionary of Anglo-Saxon.31 In general, both OE dictionaries only sparsely document loanwords from Old Norse (ON), partly due to gaps in textual evidence (Durkin 2014: 187), partly because the type of words borrowed from ON were used in spoken language rather than written registers (Serjeantson 1936: 64). Nevertheless, I assigned ON loanwords to the OE period on extralinguistic grounds: The contact between English speakers and Scandinavian settlers in England peaked in the 10th century, followed by the decline of the ON variety spoken in the British Isles; therefore, we can safely assume that borrowing from ON ended before 1150 (Durkin 2014: 187; see also Burnley 1992; Dietz 2015a; Kastovsky 1992a).
Not only the lexemes’ date of origin but also the process from which they originated was decided on the basis of etymological information provided by the dictionaries.
Since dictionary etymologies are often tentative proposals (Durkin 2009: ix), I determined the process of origin in accordance with concurring accounts by at least two sources whenever possible. In a few instances, conflicting accounts of a word’s etymology could not be resolved. A case in point is ME misdoer: The OED stipulates suffixation of verbal misdo, while the MED suggests prefixation of nominal doer. According to the OE dictionaries, the verb misdo as well as the noun doer existed in OE, allowing for both affixation processes; so, lacking further evidence, the type was classified as ‘unclear’ in this regard. In total, only 337 nouns, i.e., less than 2.8% of all nominal types, had to be annotated as ‘unclear’.
30 The online searchable MED was regularly accessed between November 2014 and March 2019.
31 I had no access to the Dictionary of Old English (DOE), published by the University of Toronto, but in recent years, the OED and the MED have started to establish links between their headwords and the respective entry in the DOE, which I took as additional evidence for the type’s attestation in OE.
Most importantly, however, recourse to various dictionaries is necessary in order to ascertain whether a Romance-flavored noun is the result of borrowing or word formation in English. In this context, the OED often prematurely assumed derivation in English, which has already led to some types being reclassified as loanwords in the course of the revision process (Cowie 2012). Besides consulting the aforementioned dictionaries as well as Johnson’s ([1755] 1792) Dictionary of the English Language and Bailey’s ([1721] 1763) Universal Etymological English Dictionary, etymologies were substantiated using French reference works. For the ME period, valuable information is supplied by the Anglo-Norman Dictionary (AND), a glossary supplemented by quotations from Anglo-French literary texts and nonliterary documents produced in medieval England.32 For the eModE period, I relied on Hollyband’s (1593) A Dictionary French and English, Cotgrave’s (1611) Dictionarie of the French and English Tongues and Miège’s (1677) New Dictionary French and English, with another English and French.33 Finally, lexemes originating in the 19th and 20th centuries required recourse to still further dictionaries, as large parts of the OED date from its first edition (1884–1928) and have not yet been revised (Durkin 2016). To this effect, I additionally consulted the Chambers Dictionary of Etymology (1999) and the Merriam-Webster Online Dictionary.34
To sum up, a total of twelve dictionaries, in addition to the occasional access to the LEME, were used to obtain and verify information as to when and how the nominal types under investigation originated.
Drawing on the wealth of knowledge accumulated in these reference works improves the accuracy in determining the etymology of the lexemes and, ultimately, allows for a more realistic approximation of the means used to extend the nominal lexicon.
32 Despite its critical stance toward the concept ‘Anglo-Norman’, the dictionary has retained the designation in continuity with its first edition (Rothwell 2006). I accessed the revised edition (2001–), electronically searchable on the Internet, between November 2014 and May 2019.
33 Some instances, analyzed as borrowings from French or Latin by Johnson (1792) and/or Bailey (1763), needed to be checked against further dictionaries. Here, I resorted to the online platform Lexicons of Early Modern English (LEME), trying to ascertain the word’s existence in French and/or Latin.
34 The online edition of Merriam-Webster (1996–) was accessed between January and February 2019.
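The agreement rule for determining the process of origin can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the procedure actually used for the annotation: the function name `process_of_origin`, the source labels and the handling of ties are hypothetical. A lone or unanimous proposal is accepted, whereas conflicting proposals require at least two concurring sources.

```python
from collections import Counter


def process_of_origin(proposals: dict[str, str]) -> str:
    """Pick a process of origin from per-dictionary proposals.

    `proposals` maps a dictionary label (e.g. "OED") to the process it
    suggests. Assumed rule: a lone or unanimous proposal is accepted;
    conflicting proposals need at least two concurring sources and a
    strict lead over the runner-up; anything else is 'unclear'.
    """
    counts = Counter(proposals.values())
    if not counts:
        return "unclear"
    if len(counts) == 1:
        return next(iter(counts))
    (best, n_best), (_, n_second) = counts.most_common(2)
    if n_best >= 2 and n_best > n_second:
        return best
    return "unclear"


# misdoer: the OED and the MED disagree, so the type stays 'unclear'.
misdoer = process_of_origin({"OED": "suffixation", "MED": "prefixation"})
```

With the two conflicting accounts of misdoer, the sketch yields 'unclear', in line with the classification reported above.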
5.2 Classifying the new nouns
5.2.1 Borrowing
Determining whether a particular lexeme was borrowed from Romance or formed in English is inherently difficult because French and Latin loanwords gave rise to new affixes to be used in English (see also Cowie 2012). In an attempt to gauge the productivity of Romance affixes in English, researchers often interpret hybrids, “words which mix elements from the native Germanic part of the vocabulary with elements from the recently borrowed Romance part of the vocabulary” (Cowie & Dalton-Puffer 2002: 420), as evidence of word formation in English (e.g., Dalton-Puffer 1996: 221–222). In principle, hybrids denote Romance bases combined with Germanic affixes, on the one hand, and Germanic bases affixed with Romance morphemes, on the other. Obviously, the first alternative is the result of word formation in English, not requiring further discussion here; the second option, however, needs closer attention. Not infrequently, a complex word made up of a Germanic base and a Romance affix, such as husbandry (1) or bondage (2), is claimed to have been coined in English “simply because it has an English word as its base” (Durkin 2014: 329; see also Palmer 2009: 9). This simplistic approach has been refuted on a word-by-word basis, demonstrating that husbandry, bondage and many others most probably originated in the donor language (Gadde 1910: 22, 51; see also Dietz 2002).
(1) Marie was not distracte aboute husbondrye (cmaelr3.m23)35
(2) men and women & childryn, þe wheche weren all holden in thraldom and bondage (cmbrut3.m3)
The claims about the English origin on account of Germanic bases completely ignore the fact that English words had been borrowed into Anglo-French: The contact situation, outlined in Chapter 3, paved the way for mutual borrowings, and the AND clearly testifies to words borrowed from ME (see also Gadde 1910: 26–27; Dietz 2002; Wogan-Browne 2013).
To stipulate that Romance suffixes, such as -ry or -age, were deliberately affixed to English nouns as early as 1300 (husbandry) or 1330 (bondage) would undoubtedly overstate the case. In short, hybrid formations cannot be adduced as conclusive evidence of English word formation, although
35 For the sake of readability, the numbered examples provide the immediate context only, the original sentences being far too long and convoluted.
they may well have contributed to the identification and subsequent use of Romance affixes in English. Similarly, caution is advisable when approaching complex words made up of Romance affixes attached to Romance bases that had already been admitted to the English lexicon: chastisement (cmayenbi.m2), for instance, is described as an English deverbal derivation by the OED and the MED, apparently because the verbal base chastise is attested around the same time as the complex noun in English. But, as Dalton-Puffer (1996: 210) observes, “in many cases, if not in most, it is most likely that the derived noun was borrowed first”, which is certainly true for chastisement, first attested in English in 1303. Again, it would be misleading to contend that early ME speakers formed the lexeme by appending a string of phonemes such as -ment, probably still meaningless to them in the early 1300s, to a recently borrowed verb. (The emergence of Romance affixes in English is discussed in more detail in an excursus; see Section 5.4.7.) Since I did not want to risk antedating word formation by Romance affixation in (Middle) English, I classified lexemes as loanwords if listed as Anglo-French words in the aforementioned AND. For the eModE period, I resorted to the bilingual dictionaries named in Section 5.1: If a given type was listed as a French headword, annotated for grammatical gender, and not used in the English rendition, the lexeme was classified as a loanword. All loanwords were categorized in terms of their origin, but only the final step in a possible series of borrowings was considered; thus, a Greek lexeme, for example, entering English via Latin/French was classified as a borrowing from Romance, regardless of the loanword’s ultimate origin. As a rule, I did not classify loanwords according to the specific donor language but allocated the types to the respective language groups, namely Romance, West Germanic, Celtic and Greek. 
While the latter three groups are negligible in terms of type and token frequencies, the high numbers of Romance borrowings during the Middle Ages would certainly justify further subdivisions, specifically into French and Latin. Still, it has been repeatedly stated that it is difficult or even impossible to decide whether a word borrowed into ME, in particular, originated in French or Latin (e.g., Dietz 2015a; Durkin 2014: 236–251; Grant 2009); therefore, these loanwords are best treated indiscriminately (Dalton-Puffer 1996: 11). Against this backdrop, I use ‘Romance’ as an umbrella term to comprise both donor languages for the ME period and beyond.36
36 It would have impaired comparability of the data to first merge French and Latin and later differentiate between the two donor languages.
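The principle that only the final step in a series of borrowings determines the classification can be sketched as follows. This is an illustration only: the four language groups are those named above, but the individual language entries in `DONOR_GROUP` and the function name `donor_group` are assumptions introduced for the example.

```python
# Illustrative mapping from immediate donor language to language group.
# The four groups come from the text; the individual language entries
# are assumptions chosen for the example.
DONOR_GROUP = {
    "French": "Romance",
    "Anglo-French": "Romance",
    "Latin": "Romance",
    "Dutch": "West Germanic",
    "German": "West Germanic",
    "Welsh": "Celtic",
    "Irish": "Celtic",
    "Greek": "Greek",
}


def donor_group(chain: list[str]) -> str:
    """Return the language group of the last step in a borrowing chain.

    `chain` lists the languages a word passed through, ordered from its
    ultimate origin to the language English borrowed it from.
    """
    if not chain:
        raise ValueError("borrowing chain must not be empty")
    return DONOR_GROUP.get(chain[-1], "unclassified")
```

A Greek lexeme entering English via Latin and French, i.e., `donor_group(["Greek", "Latin", "French"])`, is thus assigned to Romance, whereas a direct borrowing from Greek remains in the Greek group.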
An equally pragmatic approach was adopted to deal with instances of reborrowing, such as nominal absolution, first recorded in OE as a loanword from Latin and subsequently attested in ME as a borrowing from French in the OED’s entry for this noun. Cases of reborrowing usually served to reinforce the older type; if anything, they involved modifications of the word’s semantics, such as the various meanings added to OE cāsus ‘case’, which had been restricted to solely denote grammatical case until the lexeme was newly borrowed into ME. Since semantic developments are not immediately relevant to my investigation, I decided to simply ignore processes of reborrowing. Also, it would have been difficult to distinguish reborrowed nouns from types undergoing respelling or remodeling (Durkin 2014: 325–327). In eModE, French loanwords were often formally modified in accordance with Latin patterns, possibly due to “the predominating fashion for Latinate word forms” (Durkin 2014: 326). Latinized reshapings systematically affected French affixes (Dietz 2015b); thus, the ME noun avaunce, listed as a French loanword in the MED, became advance in eModE, the French prefix a- being replaced by the Latin morpheme ad-. Since neither French a- nor Latin ad- enjoy affix status in English noun formation, I did not classify such replacements as morphological operations but rather as formal alterations. As such, they have no bearing on the present study and were disregarded. Similarly, phonological developments resulting in the loss of one or more phonemes are not considered to constitute word formation (Durkin 2009: 117–118). More specifically, processes such as aphesis, e.g., eclipse > clips(e), aphaeresis, e.g., ambushment > bushment, or syncope, e.g., fantasy > fancy, all documented in the data, are purely phonological operations, which I subsumed under the category ‘alterations’, mostly following the OED’s etymology. 
If, however, the reduced and the full form are both attested in the donor language, I assumed two borrowing processes; hence, I occasionally deviated from the OED unless its analysis is corroborated by the MED.
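The decision between ‘alteration’ and two separate borrowings amounts to a simple rule, sketched below in Python. The sketch rests on assumptions: the function name, the representation of the donor-language evidence as a set of attested forms, and the Boolean flag for the MED corroborating the OED are all hypothetical simplifications, and the toy inventory in the usage line is not dictionary evidence.

```python
def classify_reduced_form(
    full: str,
    reduced: str,
    donor_forms: set[str],
    oed_alteration_corroborated_by_med: bool = False,
) -> str:
    """Classify a reduced form such as clips(e) next to its full form.

    Assumed rule: if both forms are attested in the donor language, two
    borrowing processes are posited, unless the OED's analysis as an
    alteration is corroborated by the MED. Otherwise the reduced form
    counts as a purely phonological alteration.
    """
    both_in_donor = full in donor_forms and reduced in donor_forms
    if both_in_donor and not oed_alteration_corroborated_by_med:
        return "two borrowings"
    return "alteration"


# Toy inventory for illustration: only the full form is listed,
# so the reduced form is treated as an alteration.
clipse = classify_reduced_form("eclipse", "clipse", {"eclipse"})
```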
5.2.2 Conversion
As noted toward the end of Chapter 3, conversion is understood as the process whereby a word is transposed from one primary syntactic category to another one without affecting its formal shape. In principle, all word classes may be involved, but, evidently, the possibilities for closed classes are fairly restricted (see also Dixon 2014: 36). Accordingly, the focus is on open classes, i.e., conversions between verbs and nouns as well as category shifts between adjectives and nouns, besides the occasional conversion from a closed-class member.
Although conversion operates without formal modification, resulting in two words identical in shape, I classified instances such as rebuke (cmmirk.m34), derived from the ME verb rebuken, as V > N conversion. While formal identity may be considered impaired by inflectional endings, such as the early ME infinitive suffix -en, the system had already been substantially reduced by 1150. The loss of infinitive final -n started in early ME, followed by the drop of inflectional -e, with the result that infinitive inflections were completely lost by c.1400 (Dietz 2015b). Against this background, Marchand (1969: 363) concludes that the derivational relations between weak verbs and nouns “were fully established around 1200”. Considering that the inflectional markers were lost progressively and that this loss affected individual words at different rates, I decided to treat all nouns and their verbal and/or adjectival counterparts as bare forms, regardless of any morphological residue. Along these lines, nominal hindene ‘posteriors’ (3), the genitive plural of the unattested ME noun ✶hinde, was classified as derived from the ME adverb hinde ‘hind, behind’ by ADV > N conversion.
(3) þet is þes deofles hindene (cmlambx1.mx1)
(4) And that Electricks display their virtue more faintly by night than by day (boyle-e3-h)
While transformations from verb to noun and vice versa are unanimously acknowledged to constitute word formation processes, shifts between adjective and noun, illustrated in (4), elicit conflicting views. Frequently, adjectives are elliptically used as nouns, i.e., the head noun of a nominal phrase is suppressed and the premodifying adjective acquires head status, e.g., the poor people > the poor (Biese 1941: 335; Dixon 2014: 40). Marchand (1969: 361) treats these elliptic phenomena as belonging “to speech (parole), but not to language (langue)”; as such, they do not qualify as word formation by zero-derivation because the omitted noun can always be provided.
If elliptic expressions, such as musical < musical comedy, achieve independence from their phrasal source, the new nouns are regarded as unmotivated signs, which “do not belong in word-formation” (Marchand 1969: 361). In a similar vein, Quirk et al. exclude such instances from word formation, especially since adjectives elliptically used as nouns do not display “inflectional evidence of the word’s status as a noun” (Quirk et al. 1985: 1559). However, nouns derived from adjectives often denote collectives or abstractions (Nielsen 2005: 78), therefore disfavoring morphological pluralization on semantic grounds like other collective or uncountable nouns. Moreover, the pluralization of adjectives converted into nouns, while highly restricted in OE and ME, has steadily increased
since; today, about 80% of the nouns derived from adjectives take the plural -s (Biese 1941: 336); see also (4) above. In short, I do not regard nominal inflection as a conclusive criterion but rely on the syntactic function instead: If adjectives occur as heads of nominal phrases, they instantiate ADJ > N conversion (see also Dixon 2014: 40).
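The decision to compare nouns and their verbal counterparts as bare forms, regardless of residual infinitive endings, may be illustrated by the following sketch. It is a deliberately crude, spelling-based simplification: the function names and the length threshold are assumptions, and the stripping of -en and -e does not model the gradual loss of inflection described above.

```python
def bare_form(word: str) -> str:
    """Strip a residual inflectional ending (-en or -e) from a ME form.

    Spelling-based approximation: the ending is removed only if at
    least three letters remain, so that very short words stay intact.
    """
    for ending in ("en", "e"):
        if word.endswith(ending) and len(word) > len(ending) + 2:
            return word[: -len(ending)]
    return word


def same_bare_form(noun: str, counterpart: str) -> bool:
    """Check whether a noun and its counterpart coincide as bare forms."""
    return bare_form(noun) == bare_form(counterpart)
```

On this approximation, the noun rebuke and the ME infinitive rebuken reduce to the same bare form, which is the precondition for treating the pair as a candidate for V > N conversion; the direction of derivation still has to be established from the dictionaries.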
5.2.3 Compounding
Determining the origin of compounds on the basis of lexicographical information is often difficult. Resulting from prolific processes, many compounds are deliberately omitted from dictionaries, although they may be found in the OED’s extensive attestations, which, for want of better sources, were used as reference point for the date of origin of types not otherwise specified. With respect to the process of origin, the dictionaries are even less reliable: Frequently, unambiguous classification is avoided by listing the respective type in unspecific categories such as ‘compounds and (miscellaneous) combinations’ (MED) or by simply cross-referring the reader to the entry of one of the constituent words (OED). Given the syntactic nature of compounding, it is inherently difficult to determine whether an N + N sequence instantiates a compound or a phrase. Theoretically, compounds, as opposed to phrases, are listed in dictionaries and have developed idiomatic meaning; they are solidly spelled, carry primary stress on the first constituent and defy syntactic operations since the constituents, unlike phrasal elements, are not available to the syntax (Bauer 1998). In practice, however, either the criteria are not met by a large number of compounds, e.g., listedness or solid spelling, or they may lead to inconsistent results as illustrated by mankind, which is spelled as one word but stressed on the second constituent (Bauer 1998). At most, we may conclude that criteria like phrasal stress or semantic compositionality unambiguously apply to phrases, whereas compound words display characteristics of both compounds and phrases (Giegerich 2009).
From a diachronic perspective, this ambiguity is hardly surprising considering that compound stress patterns or semantic idiosyncrasies gradually emerge after two free lexemes have been juxtaposed or undergone univerbation (Sauer 1992: 116–123).37 Synchronically then, a certain overlap between compounds and phrases is only to be expected and, in fact, attested by compounds with phrase-like properties. Against this backdrop, I followed Bauer, Lieber & Plag (2013: 434), who “consider NN constructs
37 Univerbation is defined as “the merging of two (or more) words due to their frequent adjacent co-occurrence” (Bauer, Lieber & Plag 2013: 442).
as compounds, unless there is clear evidence to the contrary” (see also Bauer 1998). Accordingly, complex titles, such as lord cheyffe justes ‘Lord Chief Justice’ (machyn-e1-h), are included as instances of word formation by compounding, while terms of address, like Mr. Attorney (raleigh-e2-h), are obviously phrases and thus ignored. Similarly, we observe an overlap between N + N sequences and the formally distinct classifying genitive constructions N’s + N that do not specify the referent but restrict the class denoted by the head noun. Serving the same function as N + N compounds, classifying genitives can be used interchangeably with N + N formations. Variation between N + N and N’s + N sequences is not only noticeable in PDE (Rosenbach 2006) but also existed in early ME (Sauer 1992: 156–158). Consequently, N’s + N sequences as exemplified by tanner’s hair (5), denoting material used to tan skins and hides, are classified as instances of N + N compounding.
(5) mix thoroughly together a quantity of strong, fat loam, [. . .] a little tanner’s hair or straw, cut very small, with a little salt (grafting-1780)
The vast majority of English compounds are N + N structures, but head nouns may also be modified by members of other word classes, notably adjectives, e.g., riht half ‘right-hand side’ (cmtrinit.mx1), and verbs, e.g., wagtale ‘wagtail’ (merrytal-e1-h). These compounds were not systematically collected because the search routine only retrieved tokens written as one word or hyphenated and no additional search was carried out to collect nouns preceded by non-nouns spelled separately (see Chapter 4). Still, considering that ADJ + N compounds and phrases, for instance, existed side by side, at least in early ME (Sauer 1992: 66), I included these comparatively rare occurrences as instances of ADJ + N and/or V + N compounding.
The loss of inflection in ME not only blurred the distinction between compounds and phrases but also obliterated formal differences between nouns, adjectives and verbs in general, resulting in today’s category-ambiguous premodifiers (see also Berg et al. 2012; Sauer 1992: 75–76). Since I focused on nouns in premodifying position, ambiguous premodifiers were classified as nouns if semantically appropriate. Accordingly, rebel soldier (towley-1746) was categorized as an example of N + N compounding, even though the OED lists this instance as an attestation of the adjectival use of the premodifier. Since compounding is a process that can be applied recursively, English compounds, especially, may well consist of more than two words. Nevertheless, the structure of all compounds is fundamentally a binary one, and, in case of more than two constituents, “one can find binary structures embedded in binary structures” (Bauer, Lieber & Plag 2013: 443). Accordingly, I determined the immediate
constituents of the types on the basis of lexicographical information.38 As an example, consider glass fibre filter paper (1975gibb): While the compound itself is not listed in the OED, the units glass fibre and filter paper are both attested as nouns, in contrast to ✶fibre filter; consequently, the type was classified as an instance of N (glass fibre) + N (filter paper) compounding. Finally, a fairly small class, totaling less than 0.7% of all types, comprises the so-called neoclassical compounds. Composed of bound morphemes, these complex words run counter to the essence of compounding, namely the concatenation of free morphemes, which is rather embarrassing since compounding has been argued to represent an agglutinating word formation process (see Chapter 3). Worse still, neoclassical compounds are “not a particularly stable or well-defined class” (Bauer 2017: 150), probably due to the dubious status of the constituent morphemes (e.g., Aikhenvald 2007b). Defined negatively, the bound morphemes cannot be affixes because affixes do not combine solely with each other; therefore, the constituents must be analyzed as obligatorily bound bases (Bauer 2017: 150). More positively, Bauer, Lieber & Plag (2013: 486) suggest a cline from bound bases to affixes on semantic grounds: “At the more contentful end we have what we would call neoclassical combining elements [. . .] At the less contentful end we have what we would tend to call affixes.” This proposal seems reasonable but only captures instances situated toward the extremes such as tracheostomy (1985smit), indisputably composed of two combining forms, tracheo- and -stomy, which approach free lexemes in terms of meaning (compare trachea and stoma).
With regard to the numerous forms intermediate between the two end points, however, we soon encounter difficulties in determining the degree of semantic content; decisions to this effect are ultimately subjective, so that researchers frequently disagree on whether a morpheme like hyper-, e.g., hyperatrophy (6), is best considered a combining form (Bauer 2017: 155) or a prefix (Marchand 1969: 167–168; Dixon 2014: 126–127).
(6) hyperatrophy [. . .], an atrophy which transcends the normal amount (1905oliv)
From the users’ perspective, it is immaterial whether we classify a bound morpheme as a combining form or affix; however, their familiarity with affixation is certainly greater than their experience with neoclassical compounding because affixes are used more widely due to their semantic versatility, or vagueness for that matter. Moreover, affixes attach to free lexemes, which are reputedly more
38 For the transfer of the immediate constituent analysis from syntax to word formation see Huddleston & Pullum (2002: 1625–1626).
transparent than bound morphemes, so that I analyzed nominal types, such as hyperatrophy (6), as instances of affixation if the bound morpheme qualified as affix. As a result, I occasionally deviated from the OED’s analysis – and the dictionary’s inflationary use of the term ‘combining form’.39 Bound morphemes that do not realize affixes were interpreted as combining forms. Like their affixal counterparts, combining forms attach to free morphemes either in prenominal or in postnominal position (e.g., Bauer 1983: 214): If prefixed to the noun, e.g., biodegradation (1975gibb), the type was classified as an instance of CF + N compounding; if appended to the free base, e.g., genome (1975hoge), the complex word was included as the result of N + CF compounding.40 Very few new types instantiate quintessentially neoclassical compounds, such as the aforementioned tracheostomy composed of two bound morphemes; accordingly, these instances were classified as originating from CF + CF compounding.
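The determination of immediate constituents in compounds of more than two words, as exemplified by glass fibre filter paper earlier in this section, can be sketched as a search for a binary split whose multi-word halves are both attested. The sketch is illustrative only: the function name, the representation of the lexicographical evidence as a set of attested units and the assumption that single words always count as attested are mine.

```python
def immediate_constituents(
    words: list[str], attested: set[str]
) -> list[tuple[str, str]]:
    """Return every binary split whose halves are both attested.

    Single words are assumed to be attested; multi-word halves must be
    listed in `attested`, a stand-in for the dictionary evidence.
    """
    splits = []
    for i in range(1, len(words)):
        left = " ".join(words[:i])
        right = " ".join(words[i:])
        left_ok = i == 1 or left in attested
        right_ok = i == len(words) - 1 or right in attested
        if left_ok and right_ok:
            splits.append((left, right))
    return splits


# glass fibre and filter paper are attested units; *fibre filter is not.
analysis = immediate_constituents(
    ["glass", "fibre", "filter", "paper"],
    {"glass fibre", "filter paper"},
)
```

For the example, the only admissible split is (glass fibre) + (filter paper). If the search returns more than one split or none, the case would have to be decided by hand.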
5.2.4 Germanic affixation
While the word formation processes conversion and compounding are categorized in terms of word classes, e.g., V > N, N + N, CF + N, the classification of the affixation processes is based, more specifically, on the individual affix. Depending on their etymological origin, the bound morphemes are considered to instantiate either Germanic or Romance affixation. This said, three affixes, namely -er, mis- and possibly -ard, originated in both Germanic and Romance. Since nouns with the given affixes of Romance origin were incorporated into the English lexicon as a result of later borrowings, I regarded them as reinforcing the older Germanic affix types under which they were subsumed. The point of departure for allocating the new nouns to the specific derivational process was the OED. I did, however, not rely solely on this source but cross-checked the information obtained from the OED against that provided by the MED. Comparison of the respective entries in the different lexicographical works, then, caused me to deviate from the OED’s classification in some instances.
39 As early as 1884 the precursor of the OED introduced the term ‘combining form’ to deal with such elements, but the dictionary’s generous application to recurring sequences produces some surprising results: The ADJ + N compound rear admiral and the V + N compound pickpocket, for instance, are interpreted as consisting of a combining form (rear, pick) attached to the respective head noun; for criticism see Marchand (1969: 132).
40 According to the OED, -ome is used, inter alia, to form nouns meaning “all of the specified constituents of a cell, considered collectively or in total”; hence, genome refers to the set of genes.
To begin with, I followed the MED in subsuming the form -laik under the suffix type -lock, whereas the OED posits two distinct affixes, obviously due to the category of the base: Denominal fearlac ‘fear, terror’ (cmkathe.m1) is supposed to manifest the suffix -lock, while deverbal schendlac ‘disgrace’ (cmkathe.m1) is thought to exhibit the affix -laik, although both endings are spelled identically. More importantly, however, we may doubt that the word class of the base mattered to ME speakers because fear and shend could be used as both nouns and verbs in ME, allowing for fearlac and schendlac to be interpreted as deverbal or denominal derivations. Hence, instead of advancing speculations in this respect, -lock and -laik were conflated (see also Dalton-Puffer 1996: 81). Similarly, I did not adopt the OED’s categorization of -hood and -head as separate suffixes but followed the MED, which lists both forms as variants of a single affix. In ME, especially, spelling seems to have been erratic; thus, the data include forms like childhade (cmvices1.m1), childhede (cmkentse.m2), childhode (cmpolych. m3) ‘childhood’ and evince variation between manhode and manhede ‘manhood’ even within the same manuscript (cmwyser.m3). Despite the formal differences, the complex words convey the same meaning; by implication, then, the forms -hood and -head do not differ semantically and were thus subsumed under the suffix type -hood (see also Marchand 1969: 293; Dalton-Puffer 1996: 77–78). Further deviations from the OED’s classification relate to the distinction between compounding and affixal derivation, which is no trivial matter, especially in a longitudinal study beginning in early ME. The boundary between these word formation processes is blurred because Germanic affixes regularly developed from compound constituents; more specifically, compound heads gradually evolved into suffixes, whereas compound modifiers slowly grew into prefixes (e.g., Berg et al. 2012). 
The gradualness of this change makes it inherently difficult to determine for a given morpheme when the transition from compound constituent to affix can be considered complete. Differentiating between compounding and prefixation is particularly problematic because many Germanic prefixes emerged in ME, replacing the OE obligatory bound morphemes, which had been lost as a result of phonological erosion (Marchand 1969: 130; Bauer 2003). Due to their comparatively young age, these prefixes have preserved not only much of their semantic content but also the formal shape of their freely occurring counterparts. Along these lines, the OED treats complex nouns such as bystighe ‘byway, by-path’ (7) and inmate ‘inhabitant, tenant’ (8) as compounds premodified by a preposition and an adverb, respectively. This view seems to be shared by Marchand, who classifies such complex lexemes as ‘preparticle compounds’, considering by-, in-, etc. locative particles (Marchand 1969: 113–121). By contrast, the MED posits prefixation in these cases. In line with the MED, I decided to categorize potential compound modifiers as prefixes, following Dixon (2014: 164–165, 146), Bauer, Lieber & Plag (2013: 340) and Dietz (2015a) among others.
(7) He lad me vp þe bistiʒes of riʒtfulnes (cmearlps.m2)
(8) the pestering of Houses with diṽse [diverse] Famylies, Harboringe of Inmates (stat-1590-e2-h)
As mentioned previously, the OED is overly generous in assigning bound morphemes to the fuzzy category ‘combining form’. Thus, the dictionary classifies kine- ‘kingly, royal’, e.g., kine-ring (cmkathe.m1), as a combining form, whereas the MED and BT ascribe prefix status to this form. While the OED’s attribution may have been motivated by the morpheme’s high semantic content, this rationale cannot account for the dictionary’s differentiation between again- (combining form) and gain- (prefix) since both forms denote ‘against, in opposition’. In consonance with the MED, I regarded both forms as prefixes instead of following the OED’s seemingly haphazard approach. Overall, I classified bound morphemes as affixes whenever possible, relying on the lexicographical sources as well as the most recent comprehensive work by Dixon (2014) and Marchand’s (1969) authoritative reference book. (For further discussion and a comprehensive list of all Germanic nominal affixes attested in the data see Chapter 6.)
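The conflation of spelling variants under a single suffix type, as described for -lock/-laik and -hood/-head, can be sketched as a lookup from attested endings to suffix types. This is a minimal illustration: the table `SUFFIX_TYPE` only lists endings that occur in the examples cited in this section plus the two type labels, and matching on word-final strings is an assumption that would overgenerate on real data (any word that merely ends in one of these strings would match).

```python
# Endings taken from the forms cited in this section; the table is
# illustrative and far from exhaustive.
SUFFIX_TYPE = {
    "lac": "-lock",
    "laik": "-lock",
    "lock": "-lock",
    "hade": "-hood",
    "hede": "-hood",
    "hode": "-hood",
    "head": "-hood",
    "hood": "-hood",
}


def suffix_type(word: str):
    """Map a word to its suffix type via its ending, or None if no match.

    Longer endings are tried first so that a longer variant is never
    shadowed by a shorter one.
    """
    for variant in sorted(SUFFIX_TYPE, key=len, reverse=True):
        if word.endswith(variant):
            return SUFFIX_TYPE[variant]
    return None
```

Under this mapping, childhade, childhede and childhode are all counted as instances of -hood, and fearlac and schendlac as instances of -lock, irrespective of the word class of the base.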
5.2.5 Romance affixation
As before, the starting point for establishing whether a new noun originated from Romance affixation was the OED. Comparison of the dictionary’s etymological information with that supplied by the MED and the AND, in particular, led to the exclusion of several types, such as chastisement, mentioned before. Vice versa, the nouns semicircle (record-e1-h) and confinement (stat-1690-e3-h), classified as loanwords by the OED, appear to have been derived in English (see also Johnson 1792); accordingly, these types were included as instances of Romance affixation. All Romance affixes were cross-checked against the reference works by Dixon (2014) and Marchand (1969), which are, however, less comprehensive in their coverage than the OED. In particular, affixes manifest in scientific words, such as -osis, realized in myxomatosis (1954sackx8b), are solely listed in the dictionary, so that I had to rely on the OED exclusively. Moreover, I decided to follow the dictionary’s treatment of the morpheme -ier, ascribing suffix status to what is regularly interpreted as a mere by-form of the agent morpheme -er (e.g., Dixon 2014: 307; Marchand 1969: 278–279). The decision to classify the form as an affix in its own right rests on both semantic and formal
grounds. On the one hand, the meaning of the suffix is more restricted to ‘profession’ than that of -er; thus, a lawyer (cmwycser.3) is professionally connected to the denotatum of the base. On the other hand, the two suffixes differ markedly in terms of their phonological shape, with -ier having a distinct Romance flavor. As noted in Chapter 3, complex Romance lexemes reestablished affix allomorphy in derivation, most impressively displayed by the suffix type -ation with no less than seven alternative manifestations. Labeled as “multi-form suffix” (Dixon 2014: 337), the type is realized as -ation, e.g., causation (1905croom7b), -ion, e.g., confusion (doddridge-1747) or permission (fayrer-1900), -ication, e.g., application (burnetcha-e3-h), -tion, e.g., convention (1981longj8b), and -ition, e.g., addition (statutes-1805). The affixation process triggers variation of the base to different degrees: Besides stress shift, illustrated by causation, and base vowel change, demonstrated by convention, the base-final consonant may be palatalized, as in confusion, or spirantized, as in permission (see also Bauer, Lieber & Plag 2013: 161–177). In addition to these five affix forms, -ption and -ution are included as variants of the suffix -ation. Even though complex words such as conception (authold-e2-h) or evolution (1975hoges8b) are thought to exhibit “[c]lassically motivated allomorphy” (Bauer, Lieber & Plag 2013: 176), i.e., variation not motivated in English but in the donor language, the respective alternations are fairly systematic: Derivationally related pairs like describe – description (fielding-1749) and perceive – perception (thring-187x) show regular replacement of verb-final ⟨b⟩ and ⟨v⟩ by ⟨p⟩ when affixed with -ation, reflecting the phonological changes from /b/ and /v/ to /p/. Equally systematic are alternations between verbs and derived nouns such as solve – solution (george-1763): If the base-final ⟨v⟩ is preceded by a lateral, the form of choice is -ution. Unlike the five allomorphs mentioned above, however, -ption and -ution must be considered to attach to bound bases exclusively because the discrepancy between the verb in independent use and its occurrence in complex words cannot be accounted for by (supra-)segmental changes induced by affixation. Although all variants are attested by the data, the few new nouns derived by -ation and its allomorphs comprise free bases only, indicating that language users have been reluctant to employ bound bases. Should we then assume that speakers have, in general, avoided bound bases in coining new words by Romance affixation? This appears to be a reasonable conclusion, but how would we account for new nouns, such as resiancy ‘residence’, which obviously include a bound base? What is at issue here are the three suffixes -ance, -ancy and -acy, evident in pursuance (9), resiancy (10) and supremacy (11), respectively.41 While the majority
For ease of exposition, homophonous -ance and -ence as well as -ancy and -ency are denoted by -ance and -ancy in this book.
66
5 New additions to the lexicon
of complex words ending in -ance contain free bases, -ancy and -acy seem to prefer bound bases – a likely obstacle for speakers to create new lexemes. (9) Wee have thought fit in pursuance thereof to signify to you Our Pleasure (charles-1670-e3-h) en [Christian] Name Surname Mysterie and Place of Dwellinge and Re(10) Xp siancy (stat-1580-e2-h) (11) that was the deniall of the kings supremacye (roper-e1-h) However, virtually all complex types composed of a bound base and -ance, -ancy or -acy are accompanied by a parallel adjective ending in -ant/-ent or -ate, establishing pairs such as eloquence – eloquent, brilliancy – brilliant, accuracy – accurate. Although many of these nouns and adjectives were borrowed separately, in a synchronic view, they exhibit highly systematic alternations, suitable to derive new nouns by what is termed “correlative derivation” (Marchand 1969: 216). Due to the regular changes affecting only the (pre-)final segment of the derivationally related forms, it stands to reason that English speakers would derive nouns from the corresponding adjective by spirantization of the adjective-final /t/, adding the suffix -y3 if appropriate.42 While such cases may therefore be considered “morphologically conditioned morphophonemic alternation[s]” (Kastovsky 2006b: 166), I nevertheless followed Marchand’s proposal, classifying these words as instances of correlative derivation. (For further discussion and a comprehensive list of all Romance nominal affixes attested in the data see Chapter 6.)
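The correlative pattern described above can be approximated on the orthographic level. The following Python sketch is purely illustrative and not part of the study's procedure: the function name `correlative_noun` and the flag `with_y` are hypothetical, and the spelling rule (final t or te replaced by ce or cy) merely stands in for the spirantization of adjective-final /t/.

```python
def correlative_noun(adjective: str, with_y: bool = False) -> str:
    """Derive a noun in -ance/-ence, -ancy/-ency or -acy from a parallel adjective.

    Orthographic approximation only (an assumption of this sketch):
    adjective-final t or te is replaced by ce, or by cy when -y is added.
    """
    if adjective.endswith(("ant", "ent")):
        stem = adjective[:-1]              # drop the final t
        return stem + ("cy" if with_y else "ce")
    if adjective.endswith("ate"):
        return adjective[:-2] + "cy"       # accurate -> accura + cy
    raise ValueError(f"no correlative pattern for {adjective!r}")


# The three pairs cited in the text
print(correlative_noun("eloquent"))                 # eloquence
print(correlative_noun("brilliant", with_y=True))   # brilliancy
print(correlative_noun("accurate"))                 # accuracy
```

The sketch deliberately rejects adjectives outside the -ant/-ent and -ate sets, since the text restricts correlative derivation to these parallel forms.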
[Footnote 42] Index numbers following the affix are based on the OED’s classification scheme, used to distinguish homonymous affixes.

5.3 Distribution of new and old lexemes

Before focusing on the nouns newly added to the English lexicon in each century, it seems advisable to consider the distribution of old and new words to gain a better idea of the quantitative importance of the new additions in general. Figure 5 presents the proportions of old and new nominal tokens; for illustrative purposes, the numbers for the period from 1150 to 1300 (12th/13th century), based on a corpus of 150,000 words, are normalized to a corpus size of 100,000 tokens in conformity with the subsequent periods. As noted in Chapter 4, the overall numbers of tokens across the centuries differ only moderately. While the proportion of new to old nouns fluctuates, language users throughout the investigation period have chiefly drawn on well-known vocabulary, as evidenced by the very high proportions of old tokens, amounting to around 96% on average in each century.

Figure 5: Distribution of old and new noun tokens across centuries.

Table 3 displays the raw numbers of old and new tokens, non-normalized for the first period, with percentages in parentheses. The beginning of the study period shows a relatively high proportion of new to old tokens, gathering momentum in the subsequent century but followed by a sharp drop in the use of new tokens toward the end of ME. From the 16th to the 19th centuries, the number of new tokens oscillated between increases and decreases without reflecting a clear trend in either direction; by contrast, the last period exhibits a strong rise in token numbers, approaching the proportion of new tokens manifest in the earliest stage.

Table 3: Distribution of old and new lexemes across centuries. [Rows: old and new tokens, old and new types; columns: 12th/13th to 20th century, with percentages in parentheses. The cell values are not recoverable from the source.]
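The normalization of the 12th/13th-century counts mentioned above amounts to simple rescaling. A minimal Python sketch follows; the function names and the example counts are hypothetical, and only the corpus sizes (150,000 words, normalized to 100,000) come from the text.

```python
def normalize_frequency(raw_count: float, corpus_size: int, base: int = 100_000) -> float:
    """Rescale a raw frequency to a common corpus size (default: 100,000 tokens)."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count * base / corpus_size


def share_of_new(new: float, old: float) -> float:
    """Proportion of new items among all (old + new) items."""
    total = new + old
    if total <= 0:
        raise ValueError("at least one item is required")
    return new / total


# Hypothetical raw count from the 150,000-word first period
print(normalize_frequency(3_000, 150_000))   # 2000.0
# Hypothetical token counts: 40 new vs. 960 old
print(share_of_new(40, 960))                 # 0.04
```

Note that rescaling leaves the proportion of new to old items unchanged; it only makes the absolute bar heights of the first period comparable with those of later periods.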
Due to the large numbers of tokens, all chi-square tests for differences between adjacent centuries prove to be significant (p < .001); thus, in order to avoid the p-value fallacy (Wasserstein & Lazar 2016), claims are based on the effect size indicated by the phi coefficient.43 In this respect, then, merely two changes are remarkable: The decline of new tokens from the 14th to the 15th century instantiates a small to medium effect (phi = .22), and the increase from the 19th to the 20th century produces at least a small effect (phi = .10). Otherwise, the differences between old and new tokens tested for two adjacent periods show effect sizes well below small.

Unlike the overall numbers of tokens, which are spread fairly evenly across the centuries, the global distribution of types shows a steady increase throughout the study period, as illustrated in Figure 6; for visual effect, the data for the 12th/13th century are again normalized to a corpus size of 100,000 tokens. As already suggested in Chapter 4, the rise in the overall numbers of types coupled with the more or less constant numbers of tokens indicates that English speakers have become more versatile in their usage of nouns.
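The relation between the chi-square statistic and the phi coefficient used for the comparisons of adjacent centuries can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's actual computation: the counts are invented, the function names are hypothetical, and the Pearson statistic is calculated without continuity correction (the text does not say whether a correction was applied).

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Assumed layout: rows = two adjacent periods, columns = old vs. new items.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def phi_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Effect size for a 2x2 table: phi = sqrt(chi-square / N)."""
    n = a + b + c + d
    return math.sqrt(chi_square_2x2(a, b, c, d) / n)


# Hypothetical counts: period 1 with 900 old / 100 new, period 2 with 950 old / 50 new
chi2 = chi_square_2x2(900, 100, 950, 50)
phi = phi_coefficient(900, 100, 950, 50)
print(f"chi-square = {chi2:.2f}, phi = {phi:.2f}")   # chi-square = 18.02, phi = 0.09
```

Because chi-square grows with N while phi divides by N, a difference can be highly significant in a large sample and still yield an effect size below the conventional threshold for a small effect.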
Figure 6: Distribution of old and new noun types across centuries.
[Footnote 43] As recently discussed by Winter & Grice (2021), chi-square tests performed on corpus data are flawed because the statistic assumes that all data points are independent samples, which is rarely true as observations are usually clustered in texts and are thus interrelated by author and, on a global level, by register. On that account, the tests presented in this part of the book violate the test’s independence assumption as they are based on the sum totals of all occurrences, neglecting potential sources of non-independence such as author and/or register. For present purposes, however, this neglect is acceptable since the primary goal is to uncover diachronic tendencies for the first time, justifying the exclusive focus on the variable ‘century’.
The proportion of new to old types, however, reflects the same development displayed by the ratio between new and old tokens, albeit in a quantitatively more pronounced manner; for the non-normalized, raw frequencies of old and new types as well as the respective percentages in each period see Table 3. The low numbers of new types in the 15th to 19th centuries raise the question whether these might be an artifact of the periodization underlying this study. As a matter of course, any temporal subdivision and allocation of data to delimited time slices is inevitably artificial insofar as the emerging picture does not reflect the continuous development evident in language (Cowie & Dalton-Puffer 2002). But at issue here is whether delimiting, say, the 15th century by the years 1400 and 1499 instead of the selected demarcation points 1401 and 1500 would substantially change the proportion of new types. Tentatively realigning the data accordingly revealed that the number of new nouns in the 15th century would indeed rise from 169 to 222; however, in the subsequent periods, the differences, ranging between one and four types, were negligible. Consequently, the low numbers reported for new types can be considered reliable. Similar to the distribution of new tokens, the proportion of new types was remarkably high in the first two periods under investigation, followed by a significant decrease in the 15th century with a medium effect (phi = .32). Between the 15th and 19th centuries, their proportion first expanded, but subsequently shrank until the 19th century; as such, the new types in language use do not corroborate dictionary-based claims, namely that “the Early Modern English period is marked by an unprecedented lexical growth” (Nevalainen 1999: 332). 
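The periodization check described above (century slices running from 1401 to 1500 versus from 1400 to 1499) can be illustrated with a short Python sketch. The function names and the first-attestation years are hypothetical, and the combined first period (1150 to 1300) of the study is not modeled here.

```python
from collections import Counter


def century_of(year: int, first_year_ends_in: int = 1) -> int:
    """Return the century number for a year.

    first_year_ends_in=1: slices run 1401-1500 (the demarcation used in the study)
    first_year_ends_in=0: slices run 1400-1499 (the tentative realignment)
    """
    if first_year_ends_in not in (0, 1):
        raise ValueError("first_year_ends_in must be 0 or 1")
    return (year - first_year_ends_in) // 100 + 1


def new_types_per_century(first_attestations: dict, first_year_ends_in: int = 1) -> Counter:
    """Count new types per century from a mapping of type -> first attestation year."""
    return Counter(century_of(y, first_year_ends_in) for y in first_attestations.values())


# Hypothetical first-attestation years; only boundary years change their slice
attestations = {"type_a": 1400, "type_b": 1450, "type_c": 1500}
print(new_types_per_century(attestations, 1))   # 14th: 1, 15th: 2
print(new_types_per_century(attestations, 0))   # 15th: 2, 16th: 1
```

Only types first attested in a boundary year move between slices, which is why realignment matters most where many attestations are dated to a round year such as 1400.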
After the 18th century, which itself seems to have been the least innovative period in terms of new nouns (see also Durkin 2014: 308), type numbers started to rise again; the difference between the 19th and 20th centuries proves statistically significant, albeit with a small effect size only (phi = .12).

On closer inspection, the significant difference between the 14th and 15th centuries in the distribution of old and new types materializes in all registers, as demonstrated by the comparison of the text types instantiated in the 14th century and their equivalents in the subsequent period (see electronic Appendix E3): In all seven registers, the differences are statistically significant (p < .001), with effect sizes ranging from small (Bible, History, Sermon, Rule) to moderate (Fiction, Religious treatise) to large (Handbook: astronomy). By contrast, the significant deviation in the ratio of new and old types between the 19th and 20th centuries is, to some extent, a corpus artifact due to the novel register News with a large proportion of new nominal types. More importantly, however, the difference between the two centuries with respect to new and old types can be traced to scientific writings: Comparing the seven remaining text types of the 20th century to their analogs of the preceding period reveals no statistically significant differences except for Science: medicine (χ²(1) = 48.1, p < .001, phi = .19) and Science: other (χ²(1) = 75.89, p < .001, phi = .25). The sharp rise of new nominal types in scientific writings since 1901 may seem less peculiar if we consider the extensive linguistic changes in this register reported by Biber & Conrad (2009: 157–166). With regard to the 20th century, the authors observe important shifts in textual conventions which may correlate with the introduction of a large number of new nouns, but this suggestion must remain speculative at this point.
5.4 Processes used to add new nouns

We now turn to the means employed by language users to extend their nominal lexicon as evidenced by the new lexemes in each century. The section first provides the overall correlation of the diverse processes, followed by more detailed presentations of the individual means. Particular attention is devoted to Romance affixation due to its presumed impact on the typological development of the English lexicon, but the discouragingly low numbers of new nouns derived by this word formation process cast doubt on whether language users have indeed adopted Romance affixation as a profitable means to extend their nominal wordstock. This raises the question how English speakers might have integrated the Romance bound morphemes into their lexicon, which is examined more closely in the excursus at the end of the section.

5.4.1 Overview of the relative distributions

A general impression of the means used to enlarge the lexicon since 1150 is given by Figure 7, which represents the relative distribution of new types according to their origin. For completeness only, Figure 7 displays a category labeled ‘Misc.’, comprising unclear cases, alterations and minor processes; the distribution of these miscellaneous types together with their token frequencies across the centuries is detailed in Appendix A1. Correlating the different processes instantiated by new types within each century reveals that borrowing was the most frequently employed means in ME before compounding took over as the quantitatively most important process in eModE, continuously expanding its position ever since. Germanic affixation, while the second most used process in the first two periods, has steadily lost ground, which has not been compensated by the surprisingly small gains in the domain of Romance affixation.
Figure 7: Proportional distribution of new types by process of origin.
Again, the usage data do not testify to trends established on the basis of dictionaries, according to which, from 1460 to 1774, borrowing ranked first, followed by affixation in second and compounding in third place (Nevalainen 1999). Only with respect to conversion do the dictionary- and usage-based data agree, namely that conversion has been the least frequently employed method to derive new nouns. In the first period new lexemes were still predominantly derived by native means, namely compounding and Germanic affixation, but borrowing already accounted for nearly 40% of all new types. In consonance with the contact situation between English and French described in Chapter 3, borrowing increased after 1300 and still figured prominently in the 15th century. While compounding, after a brief decline in the second period, has progressively gained strength, Germanic affixation has constantly decreased, most strongly rivaled by compounding and, to a far lesser extent, by borrowing. Romance affixation has not become a serious competitor to Germanic affixation, although dictionary-based findings suggest that the proportion of Germanic to Romance affixes in newly derived words dropped from around 80% in the mid-15th century to 30% over the next 300 years (Nevalainen 1999). This development is not reflected by the new nominal types in language use: The proportion of Germanic affixation still surpassed that of Romance affixation in the 18th century. As regards the new tokens, Figure 8 depicts their relative distribution according to their origin. Overall, the new tokens are distributed similarly to the new
types with some noticeable deviations. As before, borrowing is the most frequent option in ME but the proportion manifest in new tokens is consistently higher across the centuries than that observed for new types. By contrast, the proportion of new tokens that represent compounding is consistently lower in all periods than that displayed by new types. While the numbers of new tokens indicate how frequently the new types surfaced in language use, they do not reveal if the new types, once adopted, spread as ready-made nouns or if the process from which they originated was applied repeatedly. Thus, the findings allow for alternative interpretations: Either we assume that a noun was borrowed repeatedly, which, at least in ME times, would not be implausible given the diffusion of Anglo-French throughout society, or we surmise that a newly borrowed word spread rapidly in language use. In any event, compounding generated more lexical variety in language use than borrowing.
Figure 8: Proportional distribution of new tokens by process of origin.
In general, the rankings established on the basis of the relative distribution of new tokens mirror those based on new types for each century. In the first two periods, borrowing was the most frequent strategy, followed by Germanic affixation, with compounding and conversion trailing in third and fourth place, respectively. In the 15th century, compounding started to increase substantially, relegating Germanic affixation to third place, a position it has maintained more or less ever since.
Conversion has never been a frequently used option to derive nouns, occupying the penultimate rank until the end of eModE, before being surpassed by Romance affixation.44 Conversely, Romance affixation ranked last until the end of the 17th century, and, while exceeding conversion since the 18th century, the process has created far fewer new types, and tokens, than I would have anticipated based on the literature (e.g., Nevalainen 1999). Only in the 20th century did Romance affixation apparently become the second most frequently used choice to form new nouns, transcending borrowing and Germanic affixation. Still, I need to add that this ranking of the processes is a purely academic exercise: As described in the next sections, type and token numbers of new lexemes derived by Romance affixation in the last period do not diverge considerably from those introduced by borrowing and Germanic affixation because the overwhelming majority of new nouns have been formed by compounding since the 18th century, leaving an ever-smaller number of new types and tokens to be distributed across the remaining processes. As a result, this distribution is fairly even when compared to the totals resulting from compounding. While compounding can be clearly identified as the preferred method for nominal derivation, the other means employed to this end cannot be ranked from two to five with the same confidence for the past three centuries. On the whole, the development of the distribution of the processes used to extend the nominal lexicon since 1150, as outlined in Figures 7 and 8, suggests that language users have become less diverse in adding new nouns to their wordstock: Except for the 14th century, the different means by which new words entered the English language were more balanced until the end of eModE than during the last three centuries, when speakers increasingly resorted to compounding at the expense of other processes.
[Footnote 44] The relatively high proportion of new tokens in the 17th century (12.6%), promoting conversion to the third rank, is grounded in a single scientific text (boyle-e3-h) that elaborates on electrics by using the newly derived noun electric 16 times throughout the (shortened) extract.

5.4.2 Borrowing

The development of borrowing as a means of lexicon extension is presented in Table 4, displaying the absolute numbers of new types and tokens as well as their percentages of the total new nouns in each century. After a significant increase in both types and tokens in the 14th century, with a smaller effect size on the type level (phi = .23) than on the token level (phi = .28), the numbers of new nouns added by borrowing have continuously declined. The decreases in new types and tokens in the 15th, 16th, 18th and 20th centuries are significant at least at the .001 level; beyond that, adjacent periods do not differ significantly. Even though the number of new tokens in the 19th century seems to indicate a rise compared to the preceding period, the difference does not prove significant.

Table 4: New types and tokens as a result of borrowing. [Rows: number of types, % of all new types, number of tokens, % of all new tokens, TTR; columns: 12th/13th to 20th century. The cell values are not recoverable from the source.]

While the data corroborate claims about the strong increase in borrowing between 1251 and 1400 (Nielsen 2005: 9–10), they do not support observations about huge gains by borrowing in eModE, according to which “during the 16th century 15,000 and in the course of the 17th century 16,000 new loanwords were introduced” (Dietz 2015a: 1644). Although direct comparisons in terms of absolute numbers are precluded due to the comparatively small size of my database, the findings of the present study allow for comparison with trends stated in the literature. Dictionary-based investigations note the profound impact of borrowing from Romance, in general, and Latin, in particular, in eModE times. More specifically, the proportion of loanwords to all new words in eModE is assumed to have varied between 40 and 50% (Nevalainen 1999). This estimate far exceeds the percentages of borrowing manifest in new nouns: Table 4 specifies that, in the 16th to 18th centuries, their proportion amounted to 17.7%, 23.6% and 8.7%, respectively. Given that nouns are borrowed more frequently than members of other word classes (e.g., Bybee 2015: 192; Dixon 1997: 20), the figures reported in the literature seem to be an artifact of the collection process.

Table 5: New types adopted by borrowing according to source language group. [Donor groups: Romance, WestG, Celtic, Greek, Other; periods: 12th/13th to 20th century, with token numbers in parentheses. The cell values are not recoverable from the source.]
With regard to the donor languages, Table 5 lists the language families that served as sources of borrowing; the number of new tokens is given in parentheses. Doubtlessly, Romance provided the most important input throughout the investigation period, distantly followed by West Germanic languages. Not surprisingly, borrowing from Celtic languages is attested only in the earliest documents dating from c.1225, whereas loanwords from Greek are not documented before the 19th century when modern scientific writings intensified. For reasons explained above, I did not distinguish between borrowings from French and from Latin but subsumed them under the umbrella term ‘Romance’; consequently, the data cannot be directly related to recurrent claims about the strong influence of borrowing from Latin on the eModE lexicon (e.g., Dietz 2015a; Kastovsky 2006b; Nevalainen 1999). Still, the supposed influence should be visible in the overall figures for Romance borrowings from the 16th to the 18th centuries, but instead of a resurgence of borrowing in eModE, the data consistently show decreases in new types and tokens. Since I excluded reborrowing, thereby dismissing previously borrowed nouns as new types borrowed again at a later date, the numbers of new lexemes are probably lower than if reborrowed nouns were included as new types at the time of reborrowing. However, it seems doubtful that the figures adjusted for reborrowing would rise to such an extent as to support previous observations about the massive gains by borrowing from Latin. Rather, it stands to reason that the discrepancy results from the kind of data used in lexicon- and usage-based approaches. As Cowie (2012: 610) aptly notes, “the apparently dramatic peak of Latinate vocabulary” is due partly to the OED’s disproportionate sampling of eModE records and partly to the inclusion of hard-word dictionaries, i.e., works translating Latin, especially, into English. 
In short, it does not reflect common language usage, which appears to have been far less affected by borrowing from Latin than implied in the literature.
5.4.3 Conversion

As illustrated in Figure 7 above, language users have rarely resorted to conversion to derive new nouns since 1150. After a decrease in the 14th century, the two subsequent periods witnessed an increase, followed by a continuous decline since 1601. Table 6 details the development of conversion in terms of new types and tokens as well as their proportions of the total new nouns per period. On the level of types, changes do not prove to be statistically significant due to the small numbers of new nouns derived by conversion compared to those obtained by other means. By contrast, the numbers of new tokens declined significantly in the 14th, 18th and 20th centuries, albeit with a small effect size.45 Since the numbers of new tokens have dropped more sharply than those of new types, the TTR has increased since late ME (except for the 17th century), which should, however, not be interpreted as a sign of greater lexical richness given the low totals of new nouns resulting from conversion.

Table 6: New types and tokens derived by conversion. [Rows: number of types, % of all new types, number of tokens, % of all new tokens, TTR; columns: 12th/13th to 20th century. The cell values are not recoverable from the source.]
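The behavior of the type-token ratio (TTR) noted above, rising when token numbers fall faster than type numbers, can be made concrete with a brief Python sketch. The function names and all counts are hypothetical, and the sketch assumes the TTR to be the plain quotient of types over tokens; whether the tables report it as a ratio or a percentage is not stated in the surviving text.

```python
def ttr_from_counts(n_types: int, n_tokens: int) -> float:
    """Type-token ratio computed from aggregate counts."""
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    return n_types / n_tokens


def type_token_ratio(tokens: list) -> float:
    """Type-token ratio of a token list: distinct forms divided by all occurrences."""
    if not tokens:
        raise ValueError("token list is empty")
    return len(set(tokens)) / len(tokens)


# Hypothetical periods: types drop from 10 to 8, tokens drop from 40 to 16
print(ttr_from_counts(10, 40))   # 0.25
print(ttr_from_counts(8, 16))    # 0.5

# Hypothetical token list with two types and four tokens
print(type_token_ratio(["zeal", "visage", "zeal", "zeal"]))   # 0.5
```

With totals this small, a doubling of the ratio reflects a handful of tokens, which is in line with the caution against reading the increase as greater lexical richness.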
In general, the loss of OE derivational affixes, on the one hand, and the introduction of Romance loanwords in ME, on the other, enlarged the stock of monomorphemic lexemes in English. This development, together with the adoption of formally identical nouns, verbs and/or adjectives from Romance, is considered to have advanced word formation by conversion (e.g., Dietz 2015b). Correlating the timelines of borrowing and conversion, we may find some support for the argument: The decrease in borrowing after the 14th century is mirrored, to some extent, by the increase in conversion in the 15th and 16th centuries. Still, we need to take into account that the rise in new nouns derived by conversion was not substantial enough to produce statistically significant effects and that this development was soon reversed. In short, the connection between borrowing and conversion seems less strong than implied in the literature, at least with respect to the derivation of nouns.

Table 7: New types derived by conversion according to original word class. [Only the header “Word class” and the row label “< ADJ” survive; the remaining table contents and the text that followed are not recoverable from the source.]

[...] N conversion during the lifetime of verbal zeal, which became obsolete after 1642. While phonological overlap between a complex lexeme and its possible constituents is not decisive for its analyzability, as illustrated by vanity, phonological developments may impair the morphological transparency of complex words, ultimately obscuring their compositional structure as earlier exemplified by lord and cupboard (see also Bybee 2015: 206). Phonological transparency is “a gradable concept” (Cutler 1981: 75), obfuscated to different degrees by processes ranging from stress shift to base vowel changes and from resyllabification to phonemic changes at the word-internal boundary. Determining phonological transparency is further complicated by the fact that, diachronically, words are affected by phonological operations at different rates, depending inter alia on their frequency of use (Bybee 2015: 39–41).
Instead of embarking on the hopeless task of ascertaining the pronunciation of some 12,000 types for each century since 1150, I relied on spelling to keep the work manageable. Frequently, the written language does not reflect changes in the pronunciation of complex words, so compounds, such as cupboard, may present themselves as transparently compositional in writing but not in the spoken language. However, we should not prematurely rule out that oral and written modes influence each other: On the one hand, phonological developments may translate into graphic changes, as evidenced by darling; on the other, written forms may have repercussions for the spoken language as testified by the spelling pronunciation of rarely used words such as forehead, which is now more commonly pronounced /fɔ:hɛd/ than /fɔrɪd/ in British English (Durkin 2009: 56). If a morpheme used in complex words has graphically diverged from its use in free occurrence, the complex constituent routinely loses its morphemic status, thereby obscuring the compositional structure of the lexeme. A case in point is darling, manifested in the first two study periods as deorling (cmancriw-1.m1) and derlyngges (cmaelr3.m23) and thus transparently related to adjectival deore, dẹ̄re ‘dear’ in ME; in these instances, then, the complex noun can be regarded as a model word for suffixation by -ling. When the type later resurfaced in the form darling (victoria-186x), however, it could only be considered non-compositional by language users since the formal sequence dar was not part of the English lexicon (see also Bauer, Lieber & Plag 2013: 387). In this vein, most, if not all, data can be analyzed as structurally simple or complex depending on the actual existence of their constituent morphemes during the centuries since 1150.

[Footnote] Lexemes were analyzed as having fallen into disuse if the lemma is explicitly marked by a dagger (†) or labeled ‘obs.’ in the OED. As regards dating, the type was assumed to have become obsolete after the last attestation date given by the OED unless the MED provides a later record.

Importantly, the relation between morphemes constituting complex words needs to be meaningful. Unlike Kastovsky (2006b), I would argue that the compositional meaning of cupboard is still recoverable in the sense of ‘a closet with shelves for keeping cups, dishes, etc.’ even if its ordinary meaning has undoubtedly been generalized.
Similarly, the meaning of presupposition (1933hodgh7b) or reaction (1925angus7b) is retrievable by combining the semantics of prefix and base, whereas neither prediction (tillots-a-e3-h) nor revenue (1904bensj7b) instantiate meaningful compositions of pre- ‘before’ or re- ‘again’ prefixed to diction and venue, respectively. Accordingly, only the former types were classified as model words for the particular prefixation process. Given the strong impact of semantics on the analyzability of complex nouns, minor graphic deviations, notably those occurring regularly in derivational processes, are likely to have been tolerated by language users. Hence, despite changes such as loss of medial schwa, e.g., remembrance (cmctpars.m3), or replication of base finals, e.g., potter (cmvices1.m1), these types qualify as morphologically transparent complex nouns to be classified as model lexemes. Beyond these general remarks, decisions on negligible alterations depend on the specific word formation process and are reported in the following discussions, which focus on the classification of nominal types as model words strengthening patterns for derivation by conversion, compounding and Germanic and Romance affixation.
6.2.2 Conversion As noted in Chapter 5, conversion predominantly involves members of open classes. In general, shifts between nouns and verbs are assumed to happen far more frequently than category changes between nouns and adjectives. As to the derivational direction, conversions from nouns into verbs seem to occur more often than transpositions of verbs into nouns, supposedly because the possibilities to derive verbs by affixation are less numerous than affixal options for nominal derivation (Marchand 1969: 364; Biese 1941: 406). By contrast, conversions from nouns into adjectives are considered to be realized less frequently than vice versa (Dixon 2014: 39–40). At this point, we may legitimately ask whether the derivational direction is relevant to language users and, if so, how it can be determined. As language users are commonly unaware of a word’s etymology, it hardly matters whether the lexeme was originally used as noun, verb or adjective before undergoing conversion. Synchronically, base and target word classes may be defined on semantic grounds, as proposed by Marchand (1964: 12): “The word that for its analysis is dependent on the content of the other pair member is necessarily the derivative.” While plausible at first glance, this suggestion has been rejected on the basis of experimental data demonstrating that native speakers’ intuitions about base and derivative differ extremely; obviously, conversion works in opposite directions simultaneously (Becker 1990: 49–50). Teddiman (2012), more precisely, establishes a correlation between speaker intuition and usage frequency. Reporting the results of a category decision task, she concludes that “75% of the time, participants categorized ambiguous target items in accordance with the more frequently occurring lexical category for those items” (Teddiman 2012: 241). 
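Teddiman's observation that speakers tend to assign an ambiguous item to its more frequent lexical category can be illustrated with a small Python sketch of relative categorical token frequency. The function name and the counts are hypothetical; the sketch shows only the calculation itself and does not supply a threshold at which a category would count as dominant.

```python
def dominant_category(token_counts: dict) -> tuple:
    """Return the most frequent lexical category of a form and its share of all tokens.

    token_counts maps category labels (e.g., "noun", "verb") to token frequencies.
    In case of a tie, the category listed first is returned.
    """
    total = sum(token_counts.values())
    if total <= 0:
        raise ValueError("at least one token is required")
    category = max(token_counts, key=token_counts.get)
    return category, token_counts[category] / total


# Hypothetical frequencies for one form in a single period
print(dominant_category({"noun": 30, "verb": 10}))   # ('noun', 0.75)
```

Since token frequencies change over time, such a calculation would have to be repeated for every period and every noun with a parallel verb or adjective.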
Hence, the formation of mental patterns for conversion, in general, and the derivational direction, in particular, crucially depends on the token frequencies of the parallel forms, which, needless to say, change over time. Accordingly, the relative frequencies of all nouns and their corresponding verbs/adjectives would have needed to be determined synchronically, i.e., for every period since 1150, to better approximate the number of model lexemes in each century. Still, relative categorical token frequency, while reminiscent of the relative base-derivative frequency introduced by Hay (2002), is fraught with the same problems noted in the excursus in Chapter 5 – the absence of discrete breakpoints. Given the lack of threshold values that are essential to the assessment of relative frequency, I eschewed the time- and labor-consuming task of collecting all pertinent tokens. Instead, I decided to ignore possible derivational directions and classified all types as model words if accompanied by a corresponding verb
and/or adjective, mindful of the possibility that this classification may well overemphasize the relevance of this process for analogical noun formation. While the respective verbs are easy to identify, the delimitation of adjectives is more challenging, especially in the case of adjectives derived by conversion. At issue are word class changes from nouns to adjectives that frequently arise from the use of a noun in premodifying position. Marchand considers these uses as syntactic transpositions that instantiate “a regular syntactic pattern which has nothing to do with word-formation and derivation” (Marchand 1969: 360). Somewhat less rigidly, Quirk et al. (1985: 1562) accept denominal conversion into adjectives but “only when the noun form occurs in predicative as well as in attributive position”. Still, this cannot be considered a decisive factor since a large adjectival subgroup, the so-called associative adjectives, such as dental, solar or medical, likewise defy predicative use (Giegerich 2009). In fact, adjectives derived by conversion from nouns may be used predicatively, dependent on the individual language user’s experience and tolerance. Discussing the apparent N + N sequence steel bridge, Giegerich (2009: 184) surmises that “steel is arguably a noun-to-adjective conversion for those speakers who also accept it in the predicate position (this bridge is steel) and simply a noun (like London in London college) for those speakers for whom the predicate is ungrammatical”. This inter-speaker variation is easily accommodated by conceptualizing adjectives converted from nouns as gradient in terms of word class membership. In the absence of clearly defined boundaries between nouns and adjectives originating from nouns, I ascertained whether the nominal types were accompanied by parallel adjectives based on the information supplied by the OED. 
Although the dictionary, at times, too generously assigns adjectival status to nouns used attributively, I decided to strictly follow the OED’s category ascription to ensure replicability of the study. The nominal types were classified as model nouns fostering the specific conversion pattern during the time that the verbal and/or adjectival form existed; accordingly, date of origin and, in case of obsolescence, the last attestation of the verb and/or adjective were ascertained on the basis of the dictionaries. Thus, nominal wyrgyn ‘virgin’ (cmsiege.m4), first attested as an adjective in 1400, provides a model word for ADJ > N conversion since the second period under investigation; likewise, the borrowed noun visage (cmctmeli.m3), paralleled by verbal visage from 1386 to 1531 according to the OED, was included as a model lexeme for V > N conversion from the 14th to the 16th centuries. As noted previously, conversion operates without formal modification, resulting in two words identical in shape, but this does, of course, not suggest that all outwardly identical pairs are related by conversion. The primary criterion here is
morphological identity; consequently, the large group of nouns and adjectives ending in -ing, such as driving (N) and driving (ADJ), do not qualify as model words for ADJ > N conversion since noun and adjective are derived by distinct homonymous suffixes, also distinguished by subscript numbers in the OED. Formal identity between corresponding nouns and verbs may be considered impaired by inflectional residues, which I decided to disregard, as argued in the preceding chapter. Additionally, conformity may be weakened by differences in stress assignment or contrasts in voicing of the word-final segment, possibly accompanied by changes of the base vowel, such as life (N) and live (V). These phonological changes may or may not be reflected in spelling, e.g., descent (N) and descend (V) but house (N/V). As stated earlier, I did not monitor the types’ phonological developments since 1150 but focused on their graphic representations. In this regard, the formal distinctions between nouns and verbs are minor ones that are likely to have been tolerated by language users; consequently, the respective types were classified as models for V > N conversion (see also Bauer 1983: 228–229). Nominal types manifested in both categories, i.e., by a parallel verb and adjective, represent models for language users to derive nouns either from verbs or from adjectives (see also Dixon 2014: 37). Therefore, these lexemes were included as model nouns for both V > N conversion and ADJ > N conversion as exemplified by advance (brightland-1711): The noun, borrowed from Romance in 1400, has been paralleled by the earlier loanverb since its inception in English and developed into an adjective in premodifying use in the 19th century, promoting one pattern for V > N conversion since the 15th century and another one for ADJ > N conversion from 1801 to 2000. 
Besides being used as verb and/or adjective, the types under investigation occasionally instantiate other word classes, such as east (cmpeterb.m1), which additionally functions as an adverb. Due to their relative insignificance, shifts between nouns and these categories were disregarded as model words, i.e., I did not record if and when a type occurred as adverb and/or function word. The few exceptions to this rule are those nouns that were, in fact, converted from a member of the minor word classes, such as nominal seolf ‘self’ (cmancriw-1.m1) or aside (yonge-1865) derived from the earlier pronoun and adverb, respectively.
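The dating procedure described in this section for conversion models can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure actually used in the study: the function names, the year-to-century convention (1301–1400 counting as the 14th century) and the cut-off years 1150 and 2000 are assumptions introduced here, while the dates for verbal visage are those cited from the OED above.

```python
from typing import List, Optional

STUDY_START = 1150  # assumed lower bound of the period under investigation
STUDY_END = 2000    # assumed upper bound of the period under investigation


def century_of(year: int) -> int:
    """Map a year to its century, e.g. 1386 -> 14 (assumed convention)."""
    return (year - 1) // 100 + 1


def model_centuries(first_attested: int,
                    last_attested: Optional[int] = None) -> List[int]:
    """Centuries in which a parallel verb or adjective is attested.

    last_attested is None if the parallel form is still in use.
    """
    start = max(first_attested, STUDY_START)
    end = STUDY_END if last_attested is None else min(last_attested, STUDY_END)
    if end < start:
        return []
    return list(range(century_of(start), century_of(end) + 1))


# Verbal 'visage' is attested from 1386 to 1531 according to the OED.
print(model_centuries(1386, 1531))  # [14, 15, 16]
```

Under these assumptions the noun visage counts as a model for V > N conversion from the 14th to the 16th centuries, in line with the classification reported above.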
6.2.3 Compounding
Due to its syntactic nature, compounding is an extremely prolific process, so that compounds, in particular, are often omitted from lexicographical works – in the present study, 20.5% of approximately 2,700 compounds could not be found in either the headword entries or the dictionary attestations.
The percentage is even higher among genitive compounds that may or may not have arisen by univerbation of the respective syntactic groups. While Sauer (1992: 152) traces ME cinnesmen ‘kinsmen’ to wið heora agenes cynnes mannum ‘against their own kinsmen’, he observes, at the same time, that preexisting syntactic groups were not prerequisite for the emergence of genitive compounds, as exemplified by the OE compound deofol-cræft ‘devil craft, sorcery’, which was resolved into the genitive construction deueles craftes ‘devil’s crafts’ during ME (Sauer 1992: 152). In this light, the following data from the early 13th century seem to testify to variation between these structures: woreld wele (16) and worldes wele (17) both realize ‘worldweal’, denoting earthly goods as opposed to heavenly wealth.
(16) gef þu hauest woreld wele, þu miht þarof wurðliche fare (cmtrinit.mx1)
(17) hus and ham, wif and child, and gold and seluer, and alle worldes wele (cmvices1.m1)
The sequence in (17) appears to instantiate a genitive construction, signaled by the formal marker -s; however, the string worldes wele does not reflect the OE genitive case marker since nominal world did not inflect in accordance with the a-stem declension (masculine, neuter) where the genitive -s originated. While all noun classes ultimately adopted the a-stem genitive -s in ME, the marker may equally well be interpreted as a sign that simply links two words, thus indicating N + N compounding (see also Bauer 2019: 111). In sum, I included all classifying N’s + N sequences as model words for compounding patterns – irrespective of the linking -s, which may have acquired lexical traits – since they are functionally equivalent to N + N compounds and both constructions have been used interchangeably. As noted in the previous chapter, the vast majority of English compounds display nominal modifiers, but head nouns may also be modified by members of other word classes. With respect to their origin, compounds with non-nominal modifiers developed from syntactic groups that underwent univerbation, but again not necessarily so: ADJ + N compounds and phrases, for instance, existed side by side, at least in early ME (e.g., Sauer 1992: 66, 164). Regardless of whether they arose from univerbation or compounding, all respective nouns were classified as model lexemes for compounding, further subcategorized for the word class of their constituents, e.g., ADJ + N, V + N. In line with the procedure detailed in Chapter 5, I classified all compounds as to their immediate constituents, assuming a fundamentally binary structure. If the compound consisted of more than two members, the immediate constituents were established on the basis of the lexicographical works. Since the OED does not include OE and early ME words that fell out of use, the immediate constituents of older compounds, such as acer sæd hwæte ‘acre seed wheat’ (cmpeterb.
m1), were ascertained by recourse to the MED and the OE dictionaries. Besides verifying the existence of potential constituents, I noted their date of origin and, in case of obsolescence, their last attested use. To give just one example, houedaunce ‘hove-dance, court dance’ (cmreynar.m4) was included as a model word for N + N compounding only until 1700 because its premodifier hove became obsolete after 1639, so that the compound’s internal makeup could no longer be properly recognized by language users. Although English compounds predominantly consist of a modifier–head sequence, we observe occasional instances of left-headed structures. A case in point are N + ADJ compounds, e.g., president-general (1959gua1n8b). Originally borrowed from French, they have preserved the Romance head–modifier order but their impact on English word formation is supposed to have been negligible (Kastovsky 2006b); hence, Bauer, Lieber & Plag (2013: 439) consider these cases “pieces of lexicalized syntax”. To determine whether such impressionistic claims bear empirical scrutiny is beyond the scope of this book. Still, I did not want to prematurely exclude the possibility that N + ADJ constructions might have advanced a pattern for coining N + ADJ compounds in English; therefore, the respective data, albeit accidental findings due to the data collection procedure, were classified as model nouns for N + ADJ derivations. Likewise, constructions such as sister-in-law (boswell-1776) are undoubtedly left-headed, but their status is unclear. Bauer, Lieber & Plag (2013: 438) regard such types as lexicalized phrases, and thus not part of word formation, while the OED assumes that they were derived by compounding a noun and a combining form. 
Neither analysis, however, is satisfactory: On the one hand, the sequence in-law is regularly attached to any semantically appropriate head noun, also evidenced by four different types in the data, thereby calling into question that the entire construction is lexicalized and non-compositional. On the other hand, the OED, possibly in an effort to capture the observed regularity, unfortunately allocates the postnominal sequence to the category ‘combining form’, which only exacerbates the problems surrounding this class of elements. Against this background, I decided to include the respective lexemes as model words for compounding patterns, namely a noun followed by a prepositional phrase (N + PP). As before, the existence and lifetime of the constituents of left-headed compounds were ascertained on the basis of the pertinent dictionaries in order to determine if and when the specific complex lexeme provided a model noun for left-headed compounding. Another relatively minor group comprises the so-called neoclassical compounds. While their process of origin was fairly easy to determine on the basis of dictionaries, establishing the kind of pattern illustrated by the respective types proved far more problematic, especially given the inflationary use of the term ‘combining form’ by the OED. In line with the general approach of the study, the
primary criterion for the ensuing classification was transparency for the language users. For a start, the lexemes should allow effortless morphological segmentation which presupposes transparent word-internal boundaries; but again, the respective decisions are at the researcher’s discretion. Rather frequently, neoclassical formations display vocalic -o- or -i- at the morpheme juncture, e.g., discography, insecticide, which may be analyzed either as the base-initial segment of the combining form (-ography, -icide) or “as an independent element intervening between two bases” (Bauer, Lieber & Plag 2013: 456). The former analysis would involve variant base forms, e.g., -graphy, -ography, and thus an increased degree of fusion, whereas the latter assumes independent linking elements that overtly signal the word-internal boundary, which would enhance the transparency of complex words (see also Section 6.2.5). Subsuming neoclassical complex forms under compounding implies that I opted for the second alternative, i.e., I supposed the formal structure to be transparent to the language user. As in the previous chapter, I classified combining forms as affixes, whenever possible on the basis of the pertinent reference works, assuming that language users have been more familiar with affixation than with neoclassical compounding. Consequently, many types that are often described as neoclassical compounds, e.g., monotheism (1951brodh8b) or photography (1925angus7b), were included as model words for affixation patterns, e.g., mono- + N or V + -y3, which presumably better accommodates the users’ perspective. This rather unconventional approach not only reduced the size of this fuzzy class but also factors in that neoclassical lexemes composed of bound and free morphemes compare more closely to affixation than compounding in English. 
Bound morphemes that do not realize affixes were considered combining forms if their status could be ascertained on the basis of lexicographical information; accordingly, the loanword agriculture (turner1-1799) was included as a model noun for CF + N compounding since the 19th century because agri- was established as a combining form in 1839. Most dates were determined by recourse to the OED as the dictionary frequently records the first use of the morpheme as a combining form in the respective entry. Otherwise, I searched the dictionary for complex lexemes that had, in fact, been derived by employing the combining form in English. A case in point is -graphy undated in its own entry but first attested as a combining form in 1573 in the derived complex pseudography. Moreover, if the combining form was attached to a free base, the date of origin and, in case of obsolescence, the last attested use of the free morpheme was recorded as well since only transparent, semantically motivated complex words qualify as model words for compounding patterns. That said, several types exhibited less transparency than desired, namely those composed of two bound morphemes,
such as autograph (1920firbd7b); these quintessentially neoclassical compounds, then, were included as model nouns for CF + CF compounding.
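The constituent check described earlier in this section for compounds such as houedaunce can be illustrated with a small Python sketch. It is a hypothetical illustration rather than the procedure used in the study: the function names and the year-to-century convention (1601–1700 counting as the 17th century) are assumptions introduced here, and only the last attestation of hove (1639) is taken from the text.

```python
from typing import Dict, Optional


def century_of(year: int) -> int:
    """Map a year to its century, e.g. 1639 -> 17 (assumed convention)."""
    return (year - 1) // 100 + 1


def last_model_century(last_attested: Dict[str, Optional[int]]) -> Optional[int]:
    """Last century in which a compound still counts as a model word.

    last_attested maps each constituent to the year of its last attested
    use, or to None if the constituent is still in use. The compound stops
    being a model once its earliest-lost constituent has become obsolete.
    Returns None if no constituent is obsolete.
    """
    obsolete_years = [y for y in last_attested.values() if y is not None]
    if not obsolete_years:
        return None
    return century_of(min(obsolete_years))


# 'hove' became obsolete after 1639; 'dance' is still in use.
houedaunce = {"hove": 1639, "dance": None}
print(last_model_century(houedaunce))  # 17
```

The result corresponds to the decision to treat houedaunce as a model for N + N compounding only until 1700.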
6.2.4 Germanic affixation
As in the preceding chapter, model nouns that may have strengthened specific affixation processes were classified in terms of the individual affix. To qualify as a model word, the complex noun needed to include a free base morpheme, since Germanic affixation became word-based during ME times. The existence of the free lexeme was determined and recorded for each possible base in the by now familiar fashion. The nominal affix inventory was established on the basis of the reference works by Dixon (2014) and Marchand (1969), both of which, however, focus on derivational morphology in PDE. Therefore, I resorted to the OED to ascertain the status of affixes that are thought to have fallen into disuse. Moreover, recourse to the OED was necessary in the case of affixes used for specialized, often scientific, purposes, such as -al2, evident in choral (poore-1876), or -ase, manifest in transaminase (1985lowem8b). A complete list of the Germanic nominal affixes discernible in the complex word types under investigation is given in Table 20. As mentioned earlier, homonymous affixes are followed by an index number based on the OED’s classification scheme; accordingly, the diminutive suffix -en1, e.g., stuchen ‘stitchen’ (cmancriw-1.m1), is differentiated from -en2, used to indicate female sex, e.g., wulfene ‘wolfen, she-wolf’ (cmancriw-1.m1). Likewise, the action nominalizer -ing1, e.g., fedying ‘feeding’ (cmwycser.m3), is distinguished from the patronymic and/or diminutive affix -ing3, e.g., lordinges ‘lording’ (cmkentse.m2).
Table 20: Germanic nominal affixes documented by the data in this study.
Prefix types: after-; back-; by-; ed-; fore-; gain-; half-; in-; kine-; mid-; mis-; off-; on-; out-; over-; self-; step-; through-; to-; twi-; un-; under-; up-; wan-; with-
Suffix types: -al2; -ard; -ase; -cade; -dom; -en1; -en2; -end; -er, -ar, -or; -ful; -hood; -ild; -ing1; -ing3; -le, -el; -ling; -lock; -ness; -ock; -red; -ship; -ster; -th; -y, -ie
An expanded list of the 25 prefixes and 24 suffixes presented in Table 20 is provided in Appendix A4, which not only displays the affix types and their various spelling forms but also offers an example from the data, followed by variant forms which are either phonologically or lexically motivated. Additionally, I registered whether the respective morpheme was, in fact, classified as a nominal affix by the three pertinent sources and whether Dixon (2014), Marchand (1969) and the OED determined variant forms and affix origin similar to the approach taken here. This information is intended to elucidate and justify classification decisions which are inevitably subjective since the authorities not infrequently disagree in this regard; prime examples are -lock or -hood and their variants, discussed in Chapter 5. The miscellany of affixes displayed in Table 20 is, of course, due to the long period under investigation. Many of the OE derivational morphemes, still evident and possibly used in ME, have meanwhile vanished without leaving a trace. Hence, complex words exhibiting the prefix ed-, e.g., edlen ‘reward, retribution’ (cmlambx1.mx1), or the agent suffix -end, e.g., helpend ‘helper’ (cmvices1.m1), are not attested in the data since the end of the ME period. However, some OE affixes are still discernible in complex words, although they seem to have lost their capability to derive new lexemes in eModE and afterwards. Cases in point are the suffixes -red, e.g., hatred (1949tim1n7b), and -th, e.g., warmth (1904bensj7b), which have become semantically obscured but may nevertheless be analyzable for today’s language users provided that the affixed base is not obsolete and that the suffix occurs in more than isolated instances (Kastovsky 1985). If complex words are morphologically transparent in this sense, we cannot exclude the possibility that word formation by suffixing -red or -th might be resurrected (see also Bauer 2006; Haselow 2011: 257–259). 
Consequently, all affixed nouns were included as model lexemes for Germanic affixation as long as the complex word was analyzable due to its transparent base.53 This approach is also preferable because it parallels the treatment of Romance affixation: While we cannot say with absolute certainty if and when a particular Germanic affix disappeared from language use, it is equally challenging to determine the point in time when a given Romance sequence acquired affix status in English, as discussed in the excursus in Chapter 5. Analyzability in terms of base transparency hence seems to be the best option.
53 The exception to this rule is -t, the by-form of suffix type -th, which appears to have become negligible as early as late OE (Kastovsky 1992a). Therefore, complex words displaying the variant -t were classified as model nouns promoting Germanic affixation only until the end of the ME period.
As noted in the previous chapter, any researcher investigating Germanic affixation needs to decide where to draw the line between compounding and affixal derivation, which is no trivial matter due to the blurred boundary between these word formation processes. Since I wanted to avoid skewing the model nouns toward compounding, and thus biasing the means of lexicon extension toward the analytic pole of the typological cline, I assumed that all erstwhile compound constituents had acquired affix status by 1150 unless there were good reasons to the contrary. Such good reasons are lexicographical entries in the MED and OE dictionaries that provide more comprehensive, and more reliable, information than the OED concerning (early) ME data. Accordingly, I followed the MED and BT in classifying complex words ending in -ware, e.g., heavenware ‘inhabitants of heaven’ (18), or -ric(he), e.g., abbotric ‘benefice or jurisdiction of an abbot’ (19), as compounds, in contrast to the OED, which supposes suffixation in these cases.
(18) Sunnedei wile ure drihten cumen [. . .] mid alle heouenware (cmlamb1.m1)
(19) Te king iaf ðat abbotrice an prior of Sanct Neod (cmpeterb.m1)
Besides these few instances, however, I categorized potential compound constituents as affixes, well aware that this blanket approach is not uncontroversial. Prefixes, especially, may be considered compound modifiers since they have preserved much of the semantic content and phonological structure of their freely occurring counterparts. Accordingly, Marchand (1969: 113–121) considers nearly half of the prefixes listed in Table 20 locative particles, regarding the respective complex nouns as preparticle compounds, whereas Bauer, Lieber & Plag (2013: 340–353) treat these elements as prefixes on formal and semantic grounds. In view of the discordant opinions about the morphemes’ status even in PDE, it seemed particularly futile to ascertain when a given morpheme might have completed the shift from compound constituent to affix; instead, I classified all affixed lexemes as model words for Germanic affixation since 1150. When appraising the empirical findings, however, we should be mindful of the fact that the boundary between compounding and Germanic prefixation is fuzzy and that the approach taken here slightly skews the results toward more fusional word formation patterns.
6.2.5 Romance affixation
Although we cannot determine with certainty when a specific Romance sequence achieved affix status in English, I do not want to dismiss the possibility that complex Romance nouns constituted model lexemes for language users to analogously
derive new words by Romance affixation since Middle English. Consequently, I classified Romance lexemes as model words provided that they were transparent in terms of their bases. If the complex noun incorporated a free base, the status of the latter, its date of origin and, in case of obsolescence, its last attested use was recorded in the customary way. With respect to bound bases, which may equally well serve as input for Romance affixation, the procedure is less straightforward; in particular, the determination of a bound base is subject to some disagreement in the literature. Greenberg (1960), for instance, treats recurring, formally identical sequences, such as -ceive in receive, deceive, etc. and -tain in retain, detain, etc., as bound morphemes, whereas Marchand (1969: 6) denies them morphemic status “as they are not united by a common significate”. Constant meaning displayed by formally identical parts in various contexts is, of course, the very essence of a morpheme, or a bound base for that matter. Observing the semantic criterion, scholars postulate a bound base on the grounds of two parallel derivatives, e.g., “consecr-acion vs. consecr-ate” (Dalton-Puffer 1996: 95), or on the basis of three corresponding derivatives, two of which may, however, be considered derivationally dependent on each other, e.g., stimul-ate vs. stimul-ant vs. stimul-ation (Dixon 2014: 24). From the language user’s perspective, it seems doubtful that two parallel formations suffice to elucidate the meaning of the bound base, especially if one derivative, such as consecrate or stimulate, may be perceived as the holistic base for further derivation, namely consecration or stimulation, by the speaker. A more probable scenario is advanced by Bauer (2017: 151), who establishes the bound base bapt- on the evidence of baptize, baptism and baptist, i.e., three complex words derived independently of each other. 
To avoid an artificial inflation of bound bases, I followed Bauer’s more cautious approach. The noun mechanism (lind-1753) may serve as an illustration: A search of the OED revealed nominal mechanist, attested since 1606, and the verb mechanize, documented since 1678; as none of the three derivatives became obsolete, I assumed the bound base mechan- and included the complex noun mechanism as a model lexeme promoting Romance affixation by -ism since the 17th century. This said, nouns composed of a bound base and -ance, -ancy or -acy but accompanied by an adjective ending in -ant or -ate were treated as instances of correlative derivation (see Chapter 5). As such, they did not require three independently derived lexemes testifying to the existence of the bound base; instead, the respective complex types were classified as model nouns for Romance affixation as long as the related adjective existed.
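The three-derivative criterion for bound bases can be expressed as a short Python sketch. It is a simplified illustration under stated assumptions, not the study's own procedure: the function names and the year-to-century convention are introduced here, the target noun itself is taken to be attested already, and obsolescence of the parallel derivatives is not modeled. The dates for mechanist and mechanize are those cited from the OED above.

```python
from typing import Dict, Optional


def century_of(year: int) -> int:
    """Map a year to its century, e.g. 1678 -> 17 (assumed convention)."""
    return (year - 1) // 100 + 1


def bound_base_model_since(parallel_derivatives: Dict[str, Optional[int]],
                           required: int = 2) -> Optional[int]:
    """Century from which a bound base counts as established.

    parallel_derivatives maps each independently derived lexeme that shares
    the bound base with the target noun (the noun itself excluded) to its
    year of first attestation, or to None if it is unattested. With the
    target noun included, required=2 amounts to three derivatives in total.
    Returns None if too few parallel derivatives are attested.
    """
    years = sorted(y for y in parallel_derivatives.values() if y is not None)
    if len(years) < required:
        return None
    # The base is supported once the last required derivative is attested.
    return century_of(years[required - 1])


# Parallel derivatives of 'mechanism' sharing the bound base mechan-.
print(bound_base_model_since({"mechanist": 1606, "mechanize": 1678}))  # 17
```

With a single parallel derivative the function returns None, which mirrors the decision not to postulate a bound base on the evidence of two formations alone.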
Table 21 provides a list of the Romance nominal affixes displayed by the complex types encountered in the data.54 Like the Germanic affix-stock, the inventory was established on the basis of the works by Dixon (2014) and Marchand (1969), occasionally supplemented by the OED. Again, an expanded list of the 38 prefixes and 36 suffixes is presented in Appendix A5, which illustrates the affix type by an example from the data and specifies the forms regarded as variants in this study, followed by a brief indication of how the respective morpheme is treated by the three pertinent sources.
Table 21: Romance nominal affixes documented by the data in this study.
Prefix types: ante-; anti-; arch-; auto-; bi-; circum-; co-; counter-; demi-; dis-; dys-; epi-; ex-; hemi-; hyper-; hypo-; in-; inter-; mal-; meta-; micro-; mono-; multi-; non-; para-; para-; peri-; pre-; pseudo-; re-; semi-; sub-; super-; sur-; trans-; tri-; ultra-; vice-
Suffix types: -acy; -age; -al; -an; -ance, -ence; -ancy, -ency; -ant, -ent; -arian; -ary; -ate; -ate; -ation; -cy; -ee; -eer; -ese; -ess; -et; -ette; -ice; -ide; -ier; -ine; -ism; -ist; -itude; -ity; -ium; -let; -ment; -ory; -osis; -rel; -ry; -ure; -y
54 Several affixes, such as anti-, epi-, -ess or -ist, ultimately originated in Greek but were introduced into English via Romance loanwords, which justifies their inclusion in the list of Romance affixes.
The list comprises various bound morphemes not discussed by Dixon (2014) and/or Marchand (1969) but documented in the OED; these are primarily affixes manifest in scientific words, such as -ide, realized in chloride (1905olivm7b), or -osis, evident in cyanosis (1985smitm8b). Beyond these register-specific affixes, a few more elements that are not regularly acknowledged in the literature require further comment. Besides classifying -ier as an affix in its own right (see Chapter 5), I followed the OED in ascribing suffix status to the ostensibly versatile -ice, apparent in nouns denoting ‘act’, ‘quality’ or ‘condition’, e.g., seruise ‘service’ (cmjulia.m1), iustice ‘justice’ (cmpeterb.m1), cowardice (holmes-letters-1749). The morpheme -ice is neither mentioned by Dixon (2014) nor listed by Marchand (1969), who treats the sequence as a mere adaptational ending without derivative potential in English, presumably exhibited in Romance loanwords only. While its status may be doubtful, it was nevertheless included in the affix inventory since I did not want to prematurely preclude the possibility that nouns ending in -ice provide model words for language users to analogously derive new lexemes. For similar reasons, I included the formal sequence -itude, e.g., longitude (cmastro.m3), as an affix, although in this instance the reference works do not even agree on the main suffix type. The element is completely absent in Dixon’s (2014) treatment, but Marchand (1969: 211) hesitantly allows for suffix status of -itude, possibly accompanied by the variant -ude. While he explicitly rejects a by-form -tude, the OED records this sequence as the suffix type and lists -itude as a variant of -tude, which, strangely enough, is classified as a combining form. Ignoring the confused, and confusing, approach by the dictionary, I proceeded from Marchand’s (1969) suggestions and accepted -itude as an affix, together with -ude as its variant form.
While Table 21 includes affixes not listed in the relevant literature, it also lacks three bound morphemes generally classified as affixes, despite their manifestation in the data. More precisely, the prefixes poly-, e.g., polygamy (20), and tele-, e.g., telegram (21), as well as the suffix -ology, e.g., ethnology (22), listed by Dixon (2014: 441, 444) could not be regarded as affixes.55 This is due to the fact that affixes attach to free forms and, occasionally, to bound bases as defined above, which is not the case for the complex nouns under consideration. Here, poly-, tele- and -ology combine with combining forms exclusively, instantiating neoclassical compounds, so that these elements had to be categorized as combining forms.
(20) propagation is a perfect struggle; polygamy becomes a law of nature (reade-1863)
(21) I returned yesterday summoned by a telegram from Bisham (thring-187x)
(22) The Hebrew cosmogony hurries rapidly down into ethnology (1951brodh8b)
55 I might emphasize that this conclusion is grounded in the morphological structure of the data under investigation and solely applies to this study; no general claim is intended.
As noted previously, complex Romance lexemes not only reintroduced bound bases but also reestablished affix allomorphy in English derivation, evidenced by the variant forms of derivational morphemes listed in Appendix A5. The suffix type -ure, e.g., closure (1985smitm8b), is accompanied by the variant forms -ature, e.g., signature (1919dai1n7b), and -iture, e.g., expenditures (1949tim1n7b), regarded by Dixon (2014: 344) as the “longer form of the suffix”. By contrast, Bauer, Lieber & Plag (2013: 181) propose segmenting such complex lexemes into base and suffix with an intervening “extender”, citing as examples sign-at-ure and compet-it-ion, among others. For the sake of simplicity, I classified sequences such as -ature and -ition as variants of -ure and -ation, respectively (see also Chapter 5).
In principle, the term ‘extender’ may also denote epenthetic vowels occurring at the morpheme boundary, such as vocalic -a- and -i- manifest in ornament (cmaelr3.m23) or impediments (bacon-e2-h). Etymologically, these loanwords were derived from the Romance verbal bases orna-, impedi-, and the base-final vowel, absent from the English verbs orn, impede, has prompted claims that the complex lexemes are opaque to English speakers (e.g., Lloyd 2011: 31). Yet the opposite position is equally tenable: We assume that the morpheme per se is a firmly integrated unit, usually disallowing intrusive material, whereas the polymorphemic word is less solidly unified, permitting intercalation at the morpheme juncture (Berg 2012). Along these lines, the epenthetic vowel would demarcate the constituent morphemes by signaling the word-internal boundary, enhancing the transparency of the complex noun. Hence, I classified the respective types as model words for Romance affixation provided that the free base existed in English; accordingly, ornament, for instance, was included as a pattern until 1600, by which time verbal orn had become obsolete.
6.2.6 Multi-model nouns
Finally, I need to briefly remark on complex types that provide model words for more than one possible word formation pattern. To begin with, Germanic and Romance affixed types like unlicnesse ‘unlikeness’ (cmhali.m1) and disintegration (boethja-1897) simultaneously constitute model lexemes for both prefixation and suffixation. Regardless of their actual process of origin, these nouns would enable speakers to analogically coin new words by prefixation with un-, dis- and by suffixation with -ness, -ation provided that they are familiar with the bases likeness, integration, unlike and disintegrate (see also Bauer 2019: 110). Consequently, such three-morphemic lexemes were included as model nouns for both affixation processes during the time that the respective base word was attested.
Broadening the scope to all word formation processes, we observe many types that allow for more than one possibility. Thus, tenys pleyeris ‘tennis player’ (cmreynes.m4) represents a model word for N + N compounding but also serves as a model lexeme for Germanic suffixation with -er as “it is not at all unusual for affixes to attach to compounds” (Bauer, Lieber & Plag 2013: 514). Similarly, the loanword dyocesan ‘diocesan’ (cminnoce.m4) constitutes a multi-model noun supporting two patterns, namely ADJ > N conversion and Romance affixation with -an, given that both bases, adjectival diocesan as well as nominal diocese, have been documented since the complex noun emerged in the 15th century. All data instantiating two or more patterns were classified accordingly and included as model nouns as long as the respective bases persisted in the English language.
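The bookkeeping implied by multi-model nouns, where one type counts toward every pattern it supports, can be sketched as a simple Python data structure. The class name, field names and pattern labels below are hypothetical choices made for illustration only; the forms, corpus identifiers and pattern assignments are those given in this section.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ModelNoun:
    """A nominal type together with every pattern it may serve as a model for."""
    form: str
    source: str  # corpus text identifier as cited in this section
    patterns: List[str] = field(default_factory=list)


def count_models_per_pattern(nouns: List[ModelNoun]) -> Dict[str, int]:
    """Count model nouns per pattern; a multi-model noun counts once for each."""
    counts: Dict[str, int] = {}
    for noun in nouns:
        for pattern in noun.patterns:
            counts[pattern] = counts.get(pattern, 0) + 1
    return counts


nouns = [
    ModelNoun("tenys pleyeris", "cmreynes.m4",
              ["N + N compounding", "Germanic suffixation: -er"]),
    ModelNoun("dyocesan", "cminnoce.m4",
              ["ADJ > N conversion", "Romance suffixation: -an"]),
    ModelNoun("unlicnesse", "cmhali.m1",
              ["Germanic prefixation: un-", "Germanic suffixation: -ness"]),
]

for pattern, n in count_models_per_pattern(nouns).items():
    print(pattern, n)
```

A fuller version would attach a time window to each pattern, since a noun serves as a model only as long as the respective base persists.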
6 Word formation patterns
In closing this section, I would like to emphasize that, ultimately, data classification is inevitably subjective. When specifying the classificatory criteria, I have repeatedly pointed out that neither the determination of the morphemes’ status nor the delimitation of the word formation processes is a foregone conclusion. On the contrary, many aspects are subject to contentious debate, such as conversion of nouns from adjectives, the determination of combining forms as opposed to affixes and decisions on the status of Romance affixed bases, to name just a few. In the light of the frequently dissenting opinions advanced in the literature, I based my own decisions on various lexicographical and scholarly works, always contemplating which analysis would have been most plausible from the language user’s perspective. As a result, my conclusions diverge from established views on the assessment of bound bases, the classification of neoclassical compounds and the status of Romance -ice and -ier, to give just a few examples. I hope to have sufficiently elucidated the corresponding decision processes, but, of course, any decision is susceptible to criticism.
6.3 Distribution of model and non-model nouns
Before evaluating the noun formation patterns in terms of the respective model lexemes, it seems expedient to gauge the quantitative relevance of the model nouns. To this end, Table 22 details the absolute and relative frequencies of all nominal types based on their potential, or lack thereof, to act as a model lexeme in each century. Remarkably, the vast majority of types can be considered to represent patterns for word formation processes, while the proportion of nominal types not reflecting any model has been surprisingly low throughout the study period. As early as the 12th/13th century, the proportion of types evincing possible patterns amounted to 62.3% of all nouns and has continuously increased, reaching 87.6% in the 20th century. On the token plane, the distributions presented in Table 22 have developed similarly, except for the 20th century when the number of tokens displaying a potential pattern decreased significantly, though the effect size is less than small.⁵⁶ The growth of types that may have served as models for noun formation has been gradual throughout the period of investigation; the differences between adjacent centuries, though at times statistically significant, are trivial since all effect sizes are well below small. That said, the increase in the numbers of types in the 16th century is exceptional insofar as the phi coefficient exceeds the threshold for a small effect size (phi = .15). The more pronounced rise of model lexemes in this period is largely due to the progressive analyzability of Romance loanwords in English: While in the 15th century the proportion of model nouns originally borrowed from Romance amounted to 28% of all nominal types, its proportion increased by a further 10 percentage points in the subsequent period, a significant rise in terms of statistics, albeit with no effect (phi = .06).
Table 22: Model and non-model nouns (absolute and relative frequencies). [Table body not recoverable from the source; columns: types (model, non-model) and tokens (model, non-model); rows: 12th/13th to 20th century.]
56 As noted in Chapter 5, statistical tests based on large numbers are prone to the p-value fallacy; hence, in addition to reporting the effect size, all chi-square statistics testing differences in token distributions in this chapter were performed on data normalized to a corpus of 10,000 words to improve the explanatory power of the statistics.
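The significance tests and effect sizes reported in this section can be illustrated with a minimal Python sketch. It assumes a 2 × 2 contingency table (model vs. non-model nouns in two adjacent centuries), scales raw token counts to a corpus of 10,000 words as described above, and derives phi from the chi-square value. All counts are invented, and the function names as well as the effect-size cut-offs (.10, .30, .50, i.e., Cohen's conventional values) are assumptions made for illustration, not figures or procedures taken from the study.

```python
import math


def normalize_per_10k(count, corpus_size):
    """Scale a raw frequency to a corpus of 10,000 words."""
    return count / corpus_size * 10_000


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Assumes that no row or column total is zero.
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def phi_coefficient(a, b, c, d):
    """Effect size for a 2 x 2 table: phi = sqrt(chi-square / N)."""
    n = a + b + c + d
    return math.sqrt(chi_square_2x2(a, b, c, d) / n)


def effect_label(phi):
    """Verbal label based on assumed conventional cut-offs (.10, .30, .50)."""
    if phi < 0.10:
        return "no effect"
    if phi < 0.30:
        return "small"
    if phi < 0.50:
        return "medium"
    return "large"


# Hypothetical counts: (model, non-model) nouns in two adjacent centuries.
earlier = (30, 10)
later = (20, 20)

phi = phi_coefficient(*earlier, *later)
print(round(phi, 3), effect_label(phi))
```

Because phi divides the chi-square value by the sample size, it stays small when a difference is statistically significant only by virtue of large counts, which is why the two measures are reported side by side.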
6.4 Patterns for noun formation in terms of model lexemes
6.4.1 Overview of the relative distributions
In order to assess the patterns potentially available for language users to extend their nominal lexicon, all model lexemes were allocated to the processes they represent, as detailed in Section 6.2. Figure 14 provides an overview of the proportional distribution of the four most important means based on the respective type totals in each century; for absolute numbers see Table 23 below. Perhaps most striking is the strong preponderance of model nouns for conversion, especially when compared to the modest numbers of new words derived by this process (see Chapter 5). Since the 14th century, conversion patterns appear to have constituted the largest share of all models for analogical formations, the numbers of nominal types accompanied by corresponding verbs and/or adjectives amounting to roughly half the number of all model lexemes in the 15th to 19th centuries. Equally surprising is the comparatively low proportion of patterns for compounding in the light of the effective utilization of this process to coin new nouns especially since eModE. After the stark decline in the 14th century, the numbers of model lexemes for compounding recovered to a certain extent. Yet the process ranked last until the 19th century, and it was only in the 20th century that the large number of model words promoted compounding to second place, which it shared with Romance affixation. Unlike conversion and compounding patterns, the development of models for Germanic affixation seems less unexpected against the background provided in the previous chapter: After figuring prominently in ME, the numbers of model nouns for Germanic affixation have continuously decreased, relegating this process to the third place by the 17th century. At this point, model lexemes for Romance affixation, which had already outnumbered model words for compounding back in the 14th century, finally surpassed those reflecting Germanic affixation, making Romance affixation the second most important process for analogical noun formation in terms of available model lexemes.
Figure 14: Relative distributions of model types by potential pattern.
On the token level, the developments have been very similar though more pronounced, as indicated by the distribution of tokens that instantiate the respective types, detailed in Table 23.⁵⁷ Tokens representing models for conversion materialize in exceptionally large numbers, their proportions amounting to around three quarters of all tokens displaying potential patterns in most periods. The diachronic development broadly mirrors that of model types for conversion: After an increase in the 15th century, the number of tokens started to decline in relative terms, but none of the differences between adjacent centuries reveals any palpable effect; with phi values not exceeding .07, not even a small effect is discernible.
Table 23: Type and token distributions of model nouns by potential pattern. [Table body not recoverable from the source; rows: types, tokens and TTR, each for conversion, compounding, Gmc affixation and Rom affixation; columns: 12th/13th to 20th century.]
57 On account of multi-model nouns, the numbers of types and tokens for each century presented in Table 23 do not add up to the totals given in Table 22.
In contrast to tokens displaying patterns for conversion, the overall proportion of tokens manifesting models for compounding has been extremely low, fluctuating well below 10% between 1150 and 1900. The development is undramatic as their distribution did not change significantly, except for the decrease in the 14th century and the increase in the 20th century, which are significant at the .000 level but exhibit only small effect sizes (phi = .10). As regards tokens reflecting Germanic affixation, their proportion was the second highest in ME but has since constantly declined, albeit with virtually no statistical effect, whereas the proportion of tokens representing patterns for Romance affixation has steadily risen. The number of nominal tokens exhibiting Romance affixation increased significantly with a small to medium effect in the 14th century (phi = .17), followed by a small but persistent growth throughout the study period, demoting Germanic affixation to third place in the 17th century, at the latest. Table 23 relates the type and token frequencies of model nouns for the different word formation processes yielding results only to be expected for conversion and compounding. While TTRs for compounding have been comparatively high since 1150, averaging 0.525 per century, TTRs for conversion have been consistently low with an average of 0.112 per century during the study period. More specifically, nominal types representing a potential pattern for conversion from verbs have been used more often than those exhibiting a model for conversion
from adjectives, their TTRs amounting to 0.106 and 0.156, respectively, on average per period. As to the affixation processes, model nouns for Romance affixation have exhibited lower ratios than model words for Germanic affixation since the 14th century, the average TTR per period equaling 0.272 and 0.321 for Romance and Germanic affixation, respectively. The development of the TTRs suggests an ever-expanding usage of nouns that evince Romance affixes, whereas model lexemes for Germanic affixation seem to have been increasingly less used. Regardless of such differences, however, Germanic and Romance affixations both attest to the suffixing preference: The average TTR for Germanic suffixed and prefixed model nouns amounts to 0.311 and 0.467, respectively, while the Romance data manifest an average TTR per period of 0.275 and 0.302 for suffixed and prefixed nouns, respectively.
Extremely high usage frequency has been argued to impact on the mental storage and cognitive access to linguistic units, or exemplars, as described in the excursus in Chapter 5. In the present context, the lexical strength and growing autonomy of exemplars are crucial insofar as complex nouns, having acquired a holistic representation, lose their potential to serve as model lexemes for patterns of noun formation. In this respect, the aforementioned TTRs are, at best, crude approximations since the ratios are calculated across all types within a certain period, neglecting the individual type’s usage frequency. To give just one example: The relatively low TTR for Romance prefixation stated above is largely due to the data distribution in the 12th/13th century, when 82.4% of all tokens instantiated one single model lexeme for prefixation, namely ærcebiscop ‘archbishop’ (cmpeterb.m1), thereby reducing the ratio to 0.088 in that period.
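To make this caveat concrete, the following Python sketch shows how a single high-frequency type depresses a type-token ratio. The token list is invented: it merely mimics a period in which one lexeme accounts for 14 of 17 tokens (82.4%), and the labels type_b to type_d are placeholders for arbitrary further types. The function names are likewise illustrative assumptions, and the resulting ratio is not the value reported for the corpus.

```python
from collections import Counter


def type_token_ratio(tokens):
    """Number of distinct types divided by the number of tokens."""
    if not tokens:
        raise ValueError("token list must not be empty")
    return len(set(tokens)) / len(tokens)


def top_type_share(tokens):
    """Share of all tokens contributed by the single most frequent type."""
    if not tokens:
        raise ValueError("token list must not be empty")
    _, freq = Counter(tokens).most_common(1)[0]
    return freq / len(tokens)


# Hypothetical sample: one dominant lexeme plus three types attested once each.
tokens = ["archbishop"] * 14 + ["type_b", "type_c", "type_d"]
rest = [t for t in tokens if t != "archbishop"]

print(round(type_token_ratio(tokens), 3))  # ratio with the dominant type
print(round(top_type_share(tokens), 3))    # share of the dominant type
print(type_token_ratio(rest))              # ratio once that type is set aside
```

Comparing the first and last ratios shows why a TTR computed across all types says little about how often the individual types are used.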
Thus, the most preferable approach would have been to exclude such extremely frequently used types from the subsequent analyses, but the methodological problems surrounding the determination of frequency thresholds and our rather limited knowledge of the actual impact of extremely high frequency on the mental lexicon, discussed in the excursus in Chapter 5, prevent me from doing so.⁵⁸
58 In earlier work, I excluded extremely high-frequency nouns, defined as outliers on the basis of the overall token distributions (Table 17), which led to the realignment of the relative distributions of processes displayed in Figure 14. Yet the distributions did not change substantially: Global chi-square statistics computed for each period showed the differences, while at times statistically significant, to be too small to produce any effect, Cramer’s V reaching .05 at most.
To sum up at this point, the development of types that may be considered model words for analogical noun formations, on the one hand, and that of their corresponding tokens, on the other, reveal basically the same trends. In early ME, model lexemes strongly promoted Germanic affixation, followed by conversion and compounding, whereas model words for Romance affixation were virtually non-existent. The situation changed in the 14th century when model nouns for conversion prevailed at the expense of Germanic affixation, and, importantly, model lexemes for Romance affixation occurred in numbers sufficient to advance the process to third place. Boosted by further increases in model nouns, Romance affixation patterns eventually started to outrank their Germanic counterparts in the 16th century and have since ranked second behind conversion.
Ranking the different processes in this way should enable us to assess the productivity of the various means in each century provided we accept “that ‘productivity’, defined as the likelihood that a pattern will apply to a new form, is a direct reflection of the type frequency of that pattern” (Bybee 1996: 248). This line of reasoning, however, is flawed insofar as the count of types instantiating a certain model does not allow predictions about future applications of that pattern; rather, the respective type counts indicate the model’s realized productivity, understood as its “past achievement” (Baayen 2009: 902).⁵⁹ Though certainly an improvement on previous concepts, the notion of realized productivity may still be misleading as not all types that display a given pattern were, in fact, derived by the respective process. A case in point is Romance affixation: As noted in Chapter 5, Romance words such as punisshement ‘punishment’ (elyot-e1-h) or treatment (george-1763) are often interpreted as products of Romance affixation because they have, in principle, been analyzable since their adoption into English, yet neither punishment nor treatment realized deverbal derivations with -ment as both nouns were borrowed holistically from Romance.
59 The concept of productivity has been further refined to allow predictions about a pattern’s future employment, namely its expanding productivity and/or potential productivity (see Baayen 2009). Both calculations are hapax-based and, thus, inappropriate for studies based on small corpora such as the present one.
While I do not utilize the concept of (realized) productivity as such, the basic idea underlying the present chapter is, obviously, very similar, namely to assess the importance of the various patterns of noun formation primarily as a function of type frequency (see also Divjak 2019: 45–46). In this respect, then, the same word of caution applies, i.e., types apparently attesting to a certain model may well have been derived by other means of lexicon extension. Hence, the following sections present the model words not only according to the processes they display but also in terms of their etymological origin. Although I do not expect this kind of etymological information to have been significant for language users, influencing their choice of the particular process to form new nouns, the comparison of potential and actually realized patterns may provide a more realistic picture of the models available for speakers.
6.4.2 Conversion
The development of nominal types that may be considered to represent instances of conversion is illustrated in Figure 15, reproducing the proportion of model nouns for conversion to all model lexemes in each century. For the sake of completeness, the illustration is supplemented by the absolute numbers of the model types for conversion.
Figure 15: Distribution of model words for conversion (relative and raw frequencies).
The number of model nouns steadily rose until 1700, marked by increases in the 14th and 15th centuries that prove statistically significant at the .000 level but show virtually no effect (phi = .07). The continuous rise was halted in the 18th century when the number of nominal types decreased, albeit only just significantly (p = .044), and, again, in the 20th century with a reduction significant at the .000 level, yet without any noticeable effect.
With respect to the word classes that comprise elements eligible to be converted into nouns, Table 24 provides an overview of the model nouns and their formal equivalents in other categories. In addition to adjectives and verbs, the category ‘Minor’ consists of four adverbial types and one closed-class member, namely the pronoun self. The word class that seems best suited to derive nouns by conversion is the verbal category. As early as the 12th/13th century, 73% of all model noun types [...]
Table 24: Model nouns for conversion by word class. [Table body not recoverable from the source.]
[...] ADJ > N conversion amounted to more than a quarter in the first period, its percentage dropped to less than 20% before the end of ME. Despite the category’s relatively small share of conversion patterns, however, adjectives have been converted into nouns more frequently than verbs, as demonstrated in Chapter 5. This discrepancy might be explained by recourse to usage frequencies. In Section 6.4.1 I noted that the TTR of nominal types representing a pattern for conversion from verbs has been lower than that of nouns exhibiting a model for conversion from adjectives. Provided that the lower TTR of nouns paralleled by verbs is due to specific types with extremely high usage frequency, promoting their autonomy, these types would not lend themselves as model lexemes for conversion, thereby reducing the quantitative strength of the V > N conversion pattern. Still, this reasoning would illuminate only one, possibly minor, aspect concerning the speaker’s choice of a model for analogical noun formation, which is discussed in more detail in Chapter 12.
The final question to be addressed here is whether the large numbers of model words for conversion, not nearly reflected by the numbers of new nouns derived by this process, may have been artificially boosted by the classification of the data. Assuming that conversion, in principle, works in both directions, I classified all nouns paralleled by a corresponding verb and/or adjective as potential instances of deverbal and/or deadjectival noun derivation. The plausibility of this assumption may be assessed by inquiring into the historical origin of all putative model lexemes in order to ascertain how many nouns were, in fact, converted from members of other categories.
To this end, Figure 16 contrasts all types classified as model nouns for conversion patterns with those that originated from this process according to the pertinent dictionaries. Unmistakably, the proportion of nouns originally derived by conversion has been very small since 1150, comprising less than 10% of all potential model words in most periods, whereas the majority of nominal types matched by identical verbs and/or adjectives emerged by other means.
Figure 16: Model nouns as potential and actual instances of conversion.
First of all, many nouns, such as scelde ‘shield’ (cmlamb1.m1), were inherited from (West) Germanic, formed by processes no longer operative in OE; in the 12th/13th century these types constituted the largest proportion, amounting to nearly 40% of all model nouns. Moreover, numerous nouns borrowed from Romance, such as counfort ‘comfort’ (cmwycser.m3), have prevailed among the model lexemes since the second period, representing more than half of all model words since the 15th century. Finally, the percentage of nouns coined by other processes such as suffixation, e.g., wourshypp ‘worship’ (cmedmund.m4), has regularly been higher than that of nominal types derived by conversion since 1301 (for a detailed listing see Appendix A6). In sum, conversion has only been used infrequently to derive nouns; formally identical noun–verb and/or noun–adjective pairs have either arisen randomly or by converting nouns into verbs and/or adjectives, which has been repeatedly noted (e.g., Biese 1941: 30; Dietz 2015b). Verbs, so the argument goes, would need to be converted from nouns due to the limited possibilities to derive members of this category by other means such as affixation (Marchand 1969: 364). Additionally, the difference in the size of the categories has been invoked: Since the class
of verbs is smaller than that of nouns, fewer options are available to convert verbs into nouns than vice versa (see Biese 1941: 406). At this stage I will not remark on the validity of these arguments, but see Chapter 12. In any event, the outcome of the survey on the historical origin of the model words seems to point to classification problems, as suspected in Section 6.2.2, resulting in artificially inflated numbers of model lexemes for conversion patterns and, ultimately, overstating the importance of the process.
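The comparison of potential and actual instances of a process amounts to a simple tally of model nouns by etymological origin. The Python sketch below assumes that each model noun has been annotated with exactly one origin label. The four records are a toy sample: shield, comfort and worship follow the origins given above, whereas noun_x is a placeholder for a noun actually formed by conversion, so the resulting percentages are not findings of the study. The function name and the label strings are assumptions.

```python
from collections import Counter


def origin_shares(records):
    """Return the proportion of model nouns per origin label.

    `records` is an iterable of (noun, origin) pairs.
    """
    counts = Counter(origin for _, origin in records)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no records given")
    return {origin: n / total for origin, n in counts.items()}


# Toy sample of model nouns for conversion with hypothetical origin labels.
records = [
    ("shield", "inherited from Germanic"),
    ("comfort", "borrowed from Romance"),
    ("worship", "other word formation process"),
    ("noun_x", "conversion"),  # placeholder, not an attested example
]

shares = origin_shares(records)
print(f"{shares['conversion']:.0%} of the sample originated from conversion")
```

Run per century on the annotated model nouns, such a tally would yield the kind of contrast between potential and actual instances that the figures in Sections 6.4.2 to 6.4.4 display.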
6.4.3 Compounding
As regards model lexemes for compounding, their development relative to all model words is depicted in Figure 17, accompanied by their absolute frequencies in each period. The statistically significant decrease in the number of model nouns in the 14th century was followed by an equally significant increase in the subsequent period. However, judging by the effect sizes, the loss of model words in the 14th century, reaching a near medium effect (phi = .25), had a greater impact than the gains in the 15th century, which show only a small effect (phi = .11). After non-significant changes over the next three periods, the number of model lexemes started to rise again significantly in the 19th century, albeit with no effect. The last period, finally, witnessed an increase in model nouns for compounding patterns significant at the .000 level with a small effect size (phi = .12).
Figure 17: Distribution of model words for compounding (relative and raw frequencies).
The trend manifest in the development of model lexemes for compounding since 1150 is highly similar to that noted for new nouns derived by compounding (see Chapter 5), which is not least due to the fact that the latter constitute a large proportion of the former. More precisely, the proportion of types newly derived by compounding amounts to an average of 35.5% of the model noun types for compounding; by contrast, the corresponding percentage for conversion does not exceed 1.7% on average per period. Therefore, it is hardly surprising that the development of model nouns corresponds more closely to that of new nouns in the domain of compounding than in the area of conversion. Turning to the internal structure of the model lexemes for compounding, Table 25 provides an overview of the distribution of the types according to the word classes of their constituents. As might have been expected, the vast majority of the model words display nouns in modifier position, their percentage oscillating between 81.5% and 89.7%. But as noted earlier, these figures should be considered with some caution due to the data collection procedure and the classification of category-ambiguous modifiers as nouns. Also, the low proportion of ADJ + N types, though definitely larger than that observed for new nouns, is misleading for the same methodological reasons. With respect to neoclassical elements, Table 25 indicates that the number of analyzable complex nouns including such constituents has continuously increased since 1601, their percentage rising from 0.7% in the 17th century to 6.2% in the 20th century. Within this cluster, neoclassical compounds have composed the largest proportion, followed by nouns premodified by a combining form, the number of which surged in the last period; by contrast, the number of N + CF types has been negligible throughout. 
Despite the rise of model lexemes reflecting CF + CF compounding, language users seem to have been reluctant to employ neoclassical elements as compound heads (see Chapter 5), which suggests the limited relevance of this pattern type for analogical noun formation.
Table 25: Model nouns for compounding by word class of constituents. [Table body not recoverable from the source; rows: N + N, ADJ + N, V + N, CF + CF, CF + N, N + CF, Minor; columns: 12th/13th to 20th century.]
In general, model words for compounding have diversified as regards the constituents’ word class and the compound’s headedness. In addition to the structural types listed in Table 25, the category ‘Minor’ comprises ten different patterns; for details see Appendix A7. Since I raised the issue of left-headed compounds in Section 6.2.3, it may be worth mentioning that only very few model nouns for this kind of compounding are found in the data, such as isolated instances of N + ADJ compounds, e.g., lettres patents ‘letters patent’ (cromwell-e1-h), or N + PN sequences, e.g., herb Robert, clustering with herb Ive, herb John, herb Walter in the same text extract (cmthorn.mx4). Although I did not systematically collect nouns exhibiting N + ADJ compounding, the analysis of complex Romance loanwords suggests that left-headed compounds were rarely borrowed, not providing enough model words to establish left-headed compounding as a viable option for noun formation in English.
To complete the overview of compounding patterns, the model nouns are contrasted with the nominal types actually derived by compounding; as usual, their origin was determined on the basis of the pertinent etymologies given in the dictionaries. Figure 18 reveals that the overwhelming majority of the model lexemes were, in fact, obtained by this word formation process: In all periods, more than 90% of the model words originated from compounding; for details see Appendix A6. The remaining nominal types, not formed by compounding, were in part borrowed from Romance. Some of these complex loanwords, such as cotte armur ‘coat-armor, blazonry’ (machyn-e1-h), became analyzable in English as patterns for N + N compounding, but most of them provided model lexemes for neoclassical compounding: Out of 32 nouns classified as CF + CF compounds in the 20th century, 14 types resulted from borrowing, e.g., symmetry (strutt-1890), as opposed to 15 types generated by neoclassical compounding in English, e.g., photophobia (poore-1876). Aside from borrowing, model words for compounding originated from other processes such as Germanic affixation, e.g., wyn dronkenes ‘wine-drunkenness’ (cmpolych.m3), or, in the case of ADJ + N types, univerbation of the respective phrase, e.g., sweetheart (davys-1716), mentioned only for the sake of completeness given their limited numbers.
Figure 18: Model nouns as potential and actual instances of compounding.
6.4.4 Germanic affixation
Unlike model lexemes for conversion and compounding with shifting trends in their development, model words for Germanic affixation have developed in one direction only, undergoing a steady decline since 1150. For illustrative purposes, Figure 19 depicts the relative distribution of types manifesting Germanic affixation across the centuries; again, absolute numbers are added for completeness. Relative to the combined type totals of model lexemes for the other processes, the numbers of types displaying Germanic affixation have continuously decreased throughout the study period. Except for the 14th and 19th centuries, differences between adjacent centuries prove to be significant but without any noticeable effect; only the reduction of model nouns in the 15th century evinces a small effect size (phi = .15).
Figure 19: Distribution of model words for Germanic affixation (relative and raw frequencies).
As regards the different affixation processes, Table 26 summarizes the numbers of prefixed and suffixed model noun types in each century; detailed lists of the prefixes and suffixes evident in the model words are given in Appendix A8. Similar to the distribution noted for new lexemes in Chapter 5, the overwhelming majority of the model nouns have provided patterns for suffixation throughout the investigation period. While in early ME models for suffixation still faced some competition from patterns for prefixation, the percentages of model lexemes equaling 82.2% and 17.8%, respectively, in the 12th/13th century, the proportion of types displaying suffixation subsequently amounted to more than 90%, dropping only slightly in the last two periods.
Table 26: Germanic prefixation and suffixation manifest in model nouns. [Table body not recoverable from the source; rows: No. of prefixed nouns, No. of suffixed nouns, No. of prefix types, No. of suffix types, Ratio prefix : nouns, Ratio suffix : nouns; columns: 12th/13th to 20th century.]
The addition of the different prefix and suffix types, recorded in Appendix A8, reveals that the numbers of prefix and suffix types manifest in model lexemes are fairly similar. Even in the centuries following ME times, the respective figures indicate no substantial difference, which is remarkable since the loss of OE prefixes is thought to have been more dramatic than that of OE suffixes (e.g., Burnley 1992). Still, the usage data on which this study is based do not corroborate this assumption: Although six prefixes (ed-, gain-, kine-, to-, wan-, with-) fell out of use during ME, this fate was shared by no less than five suffixes (-en2, -end, -ild, -lock and -ing3). Correlating the numbers of model lexeme types and affix types for each century in Table 26 highlights that prefix types have been manifested by far fewer nominal types than their suffix counterparts. On average per period, a single prefix is displayed by three model noun types, the range extending from two to six nominal types; by contrast, one suffix type is presented by 29 model word types on average, the range stretching from 23 to 35 types. The low ratio of suffix types to suffixed noun types can be interpreted as motivated by the suffixing preference (e.g., Greenberg 1963), and, at the same time, strengthening this propensity by providing the respective patterns.
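The relation between affix types and the noun types that display them can be sketched as follows. The Python snippet assumes a mapping from each affix to the set of model nouns exhibiting it; the handful of nouns is taken from examples cited in this chapter purely for illustration, so the resulting ratios bear no relation to the values in Table 26. A multi-model noun such as unlikeness is counted under both prefixation and suffixation. The function name and the dictionary keys are assumptions.

```python
def affix_stats(affix_to_nouns):
    """Summarize how many noun types display how many affix types.

    `affix_to_nouns` maps an affix to the set of model noun types showing it.
    """
    n_affixes = len(affix_to_nouns)
    nouns = set().union(*affix_to_nouns.values())
    if n_affixes == 0 or not nouns:
        raise ValueError("need at least one affix with at least one noun")
    return {
        "affix_types": n_affixes,
        "noun_types": len(nouns),
        "nouns_per_affix": len(nouns) / n_affixes,
        "ratio_affix_to_nouns": n_affixes / len(nouns),
    }


# Illustrative sample only; not the inventory recorded in Appendix A8.
prefixes = {
    "un-": {"untruth", "unlikeness"},
    "mis-": {"misbelief"},
}
suffixes = {
    "-ness": {"cursedness", "unlikeness", "home-sickness"},
    "-er": {"queller", "borrower"},
    "-ing": {"poetizing", "partaking"},
}

for label, data in (("prefixation", prefixes), ("suffixation", suffixes)):
    stats = affix_stats(data)
    print(label, stats["nouns_per_affix"], round(stats["ratio_affix_to_nouns"], 3))
```

The lower the ratio of affix types to noun types, the more model nouns each affix has on average to support it.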
With respect to the specific affixes evident in model words for prefixation, the most frequently occurring prefixes, found in all periods, are un-, e.g., untruth (1949tim1n7b), attested by a total of 76 different nominal types from 1150 to 2000, and mis-, e.g., misbileaue ‘misbelief’ (cmvices1.m1), apparent in 32 model nouns. As to suffixation, the most prevalent affixes throughout the study period are deverbal -ing1, e.g., poetizing (okeeffe-1826), evident in 842 different model lexeme types, followed by agentive -er, e.g., cwellere ‘queller’ (cmkathe.m1), exhibited by 471 nominal types, and -ness, e.g., cursydnesse ‘cursedness’ (cmboeth.m3), instantiated by 380 model nouns. This ranking corresponds exactly to the order established for new derivations by Germanic affixation in Chapter 5, suggesting that language users have analogically formed new nouns on the basis of the most frequently observed model lexemes. Contrasting the potential model words for patterns for Germanic affixation with those types effectively derived by this process exhibits that most model nouns, in fact, originated from Germanic affixation, as illustrated in Figure 20. In the first period, the proportion of these nouns to all model lexemes for Germanic affixation amounted to more than 90%, and, although their percentage decreased in the subsequent periods, it did not drop below 75.3%, marking its lowest level in the 18th century. Model words not derived by Germanic affixation chiefly originated from borrowing (see Appendix A6); among these, Romance nouns ending in -er constituted 92.5% of the respective types on average per century. Given that loanwords like affirmer (edward-e1-h) and English derivations like borrower (smith-e2-h) have been “distinguishable only by their etymology” (Durkin 2009: 99), borrowed
Figure 20: Model nouns as potential and actual instances of Germanic affixation.
6.4 Patterns for noun formation in terms of model lexemes
nouns that transparently display the suffix are best described as strengthening the formally and semantically identical Germanic agent morpheme (see also Burnley 1992). Besides isolated instances of nouns inherited from Germanic, e.g., minetere ‘minter’ (cmpeterb.m1), the remaining model words not formed by Germanic affixation arose by other word formation processes. Compounding, for instance, generated complex nouns like partaking (montefiore-1836) and home-sickness (1915bentx7b), which simultaneously provide patterns for derivations with -ing1 and -ness, respectively. Also, various nouns serving as model words for prefixation were coined by suffixation, e.g., unlicnesse ‘unlikeness’ (cmhali.m1); conversely, some model lexemes for suffixation originated from prefixation, e.g., ontreuþe ‘untruth’ (cmayenbi.m2). Finally, very few nominal types exposing Germanic affixes emerged by conversion and processes not treated as word formation, namely alteration and univerbation.
6.4.5 Romance affixation
As noted in Section 6.4.1, the number of model words for Romance affixation has grown considerably since 1150; Figure 21 illustrates the diachronic development of the respective types in terms of their relative proportion to all model lexemes. The number of nouns displaying Romance affixation rose significantly in the wake of the massive influx of loanwords in the 14th century, manifesting a small to medium effect (phi = .23). The 16th and 18th centuries saw further increases significant at the .000 and .001 level, respectively, but without any noticeable effects. The
Figure 21: Distribution of model words for Romance affixation (relative and raw frequencies).
steady rise of model lexemes for Romance affixation came to an end in the 19th century when the number of model noun types relative to all model lexemes declined. Concerning the different affixation strategies, Table 27 sums up the nominal types considered to serve as model lexemes for prefixation and suffixation; for detailed lists of both affix types evident in the model nouns see Appendix A9. Similar to the distribution of prefixes and suffixes in Germanic affixation described in Section 6.4.4, the vast majority of Romance model nouns reflected patterns for suffixation, their proportion amounting to 93% on average across all periods. On the more specific level, the prefixes most frequently displayed in model nouns for Romance affixation are negative in-4, e.g., indifferency (throckm-e1-h), attested by a total of 57 different nominal types throughout the study period, and dis-, e.g., displesauns ‘displeasance’ (cmcapchr.m4), apparent in 42 model lexemes, followed by re-, e.g., rebaptization (evelyn-e3-h), exhibited by 21 model words. As to suffixation, the most prevalent types from 1150 to 2000 are -ation, e.g., accusacioun ‘accusation’ (cmaelr3.m23), visible in 651 different nominal types, -ity, e.g., extremytee ‘extremity’ (cmfitzja.m4), figuring in 256 model nouns, -ance, e.g., dellyverance ‘deliverance’ (mowntayne-e1-h), displayed by 180 model words, and -ment, e.g., advancement (hayward-e2-h), evident in 138 model lexemes. Table 27: Romance prefixation and suffixation manifest in model nouns.
Rows: No. of prefixed nouns; No. of suffixed nouns; No. of prefix types; No. of suffix types; Ratio prefix : nouns; Ratio suffix : nouns. Columns: 12th/13th, 14th, 15th, 16th, 17th, 18th, 19th and 20th centuries. [Cell values not preserved.]
Counting the affix types listed in Appendix A9 reveals that the number of prefix types discernible in model words for Romance affixation has constantly been lower than that of suffix types. After small gains in ME, the number of prefix types stabilized at 13 in eModE and began to grow again in the 19th century, eventually equaling the number of suffix types in the last period. The sharp rise of prefix types in the 20th century appears to be grounded in the developments of scientific writings: 15 out of 33 different prefixes appear to have been used exclusively in medical and other scientific writings, displayed in nouns such as hypertension (1985lowem8b) or metamorphosis (1925gords7b). Suffix types, by contrast, seem to have been more evenly distributed across registers in the 20th century,
with only three different suffixes observed exclusively in scientific writings, e.g., -ium evident in potassium (1905olivm7b). Correlating the numbers of model lexeme types and affix types in each period shows that suffix types have materialized in considerably more nominal types than prefix types, as exposed by the ratios in Table 27. On average, each suffix is displayed by 18 model nouns in each century with a range between 11 and 26, except for the first period when the ratio of suffix to nominal type equaled 1:2.5. By contrast, each prefix is represented by three model words on average per period, the range extending from two to five. The low ratio of suffix type to suffixed noun types as well as the larger stock of Romance suffixes relative to prefixes indicates that the previously noted suffixing preference has been operative not only in English but also in Romance as the majority of the respective model nouns were originally borrowed into English (see below). Hence, the patterns for Romance affixation have reinforced the suffixing preference in concert with the models for Germanic affixation. Comparing the inventories of Germanic and Romance affixes (Tables 26 and 27) reveals that the number of Romance affix types has exceeded that of Germanic affixes since eModE due to developments in the suffixation domain. While the numbers of Germanic and Romance prefix types did not diverge considerably prior to the 20th century, Romance suffix types started to outperform their Germanic counterparts in the 16th century, reaching double the number of Germanic suffixes in the subsequent periods. Yet even in the 20th century, with its surge of Romance prefixes, the total number of Romance affixes amounted to no more than two thirds of all affix types, which is still below the 80% ratio projected by Haselow (2011: 5). Against this backdrop, I would conclude that the enrichment of the affix inventory with Romance elements has been overstated in previous research. 
The quantitative importance of theoretically available patterns for Romance affixation is not reflected in the usage of Romance affixes to derive new nouns. As reported in Chapter 5, the numbers of new nouns formed by Romance affixation have been extremely low throughout the investigation period, totaling fewer than 70 types, and it may be argued that an enlarged corpus would have provided more new nouns derived by Romance affixation. Alternatively, however, it may be worth speculating that speakers have, in fact, refrained from commonly resorting to Romance affixation because their actual stock of models for analogical formations has not been greatly expanded by Romance words that are considered complex by language experts but not perceived as such by language users. Some support for this suggestion may be found by inquiring into the historical origin of the nouns serving as model lexemes for Romance affixation.
To this end, all nominal types classified as model words for Romance affixation are contrasted with those etymologically derived by this process. The depiction in Figure 22 impressively demonstrates that the proportion of model lexemes formed by Romance affixation in English has been remarkably low at all times: Disregarding the ME period with hardly perceptible numbers, the percentage of nouns coined by Romance affixation only gradually increased from 6.5% in the 16th century to less than 18% in the last period; for details see Appendix A6. Model nouns that did not originate from the respective Romance affixation were, to some extent, generated by other processes, as exemplified by impoliteness (1904joycx7b) and dislike (1973trevf8b), which were included as model words for negative prefixation but originally derived by Germanic suffixation and conversion, respectively. The vast majority of types exhibiting Romance affixation, however, were holistically incorporated into the English lexicon via borrowing; though slowly decreasing since the 18th century, the proportion of complex loanwords assumed to have evolved into model nouns for Romance affixation still amounted to 72.1% in the 20th century.
Figure 22: Model nouns as potential and actual instances of Romance affixation.
The numerous loanwords, transparent with respect to their base and the putative affix, contributed to the variety of potential patterns established in this study. More precisely, 20 of the 74 affix types listed in Appendix A9, i.e., more than a quarter of the Romance affix inventory, were determined solely on the grounds of their occurrence in loanwords. These purported affix types are not limited to doubtful cases such as -ice or -itude, mentioned previously, but also include sequences usually attributed affix status. Cases in point are prefixal epi-, e.g., epicicle ‘epicycle’ (cmequato.m3), and suffixal -ant, e.g., assistant (montefiore-1836), which both appear only in nouns not formed by Romance affixation in English. Obviously, an enlarged corpus would have supplied more complex types, some of them probably coined by Romance affixation. But even so, we need to allow for the very possibility that language users may not have recognized affixes encountered in borrowed nouns only, thereby acquiring fewer patterns for analogical formations than those suggested by language experts. Although the language users’ faculty to discern patterns is not affected by a given word’s etymology, which is commonly unknown to the ordinary speaker, the potential of complex nouns to serve as model lexemes may be better approximated by factoring in the words’ etymological origins. The fact that model nouns for Romance affixation were chiefly borrowed from Romance, in addition to the delayed emergence of Romance affixes in English discussed in the excursus in Chapter 5, might suggest that many, if not most, complex loanwords have been used holistically, without being related to their possible analogs, since their inception in English.
In summary, it must be stressed that the stock of model words, i.e., the basis for the analyses presented in the previous sections, is less reliable than desired. Besides methodological problems concerning the model lexemes for conversion patterns, morphologically complex nouns have not necessarily provided suitable model words for language users – some may have never been considered in terms of their internal structure and potential analogs, while others may have become autonomous and dissociated from earlier analogs due to extremely high usage frequency. 
That said, I would still argue that language users have basically been able to perceive the internal structure of any given transparent complex noun when consciously contemplating the word and, ideally, its analogs. This way, speakers may well have expanded their inventory of model nouns, not only for Romance affixation but for any kind of analogical noun formation. To accommodate the individual’s potential to realize word-internal complexity and form new patterns, I allowed for all theoretically analyzable nouns to reflect patterns for conversion, compounding, Germanic or Romance affixation, assessing the strength of each pattern in terms of its type frequency.
6.5 Word formation patterns from a typological perspective
To round off the preceding sections, I review the model words that may have established and/or strengthened patterns for language users to analogically form new nouns from a typological perspective. Recall from Chapter 3 that conversion is classified as an isolating technique, whereas compounding and Germanic affixation are agglutinating processes and Romance affixation is the only fusional method under consideration. Summing up the figures of the respective model noun types listed in Table 23 gives the totals for the three typological techniques in each century. The resulting proportional distributions are illustrated in Figure 23; for absolute frequencies see Table 28 below. Overall, model lexemes for isolating and fusional word formation processes have increased since 1150 at the expense of model words for agglutinating methods. As regards isolating techniques, the number of types possibly serving as model nouns rose significantly in the 14th and 15th centuries, but not even a small effect manifests itself (phi = .07). In the subsequent centuries, the numbers of these types remained fairly stable, before decreasing significantly in the 20th century, but again with no effect (phi = .07).
Figure 23: Relative type distribution of model lexemes by typological technique.
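The significance and effect-size statements in this section (e.g., phi = .07) can be illustrated with a minimal sketch. It assumes, without confirmation from the text, that each comparison rests on a 2 × 2 table (model noun types of one technique vs. all other types, in two adjacent centuries) and that phi is derived from Pearson's chi-square as sqrt(chi2 / N). The counts are hypothetical, and the benchmarks of roughly .10 (small) and .30 (medium) are the conventional ones, adopted here as an assumption.

```python
from math import sqrt

# Critical chi-square value for df = 1 at the .05 level.
CHI2_CRITICAL_05 = 3.841


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def phi(a, b, c, d):
    """Phi coefficient: square root of chi-square divided by sample size."""
    n = a + b + c + d
    return sqrt(chi_square_2x2(a, b, c, d) / n)


# HYPOTHETICAL counts: rows = earlier / later century,
# columns = types of the technique under study / all other types.
earlier = (100, 300)
later = (200, 400)

if __name__ == "__main__":
    chi2 = chi_square_2x2(*earlier, *later)
    effect = phi(*earlier, *later)
    print(f"chi2 = {chi2:.2f}, significant at .05: {chi2 > CHI2_CRITICAL_05}")
    print(f"phi = {effect:.2f}")
```

With these placeholder counts the difference is significant while phi stays below .10, which mirrors the kind of result reported above: a statistically reliable change with a negligible effect.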
This development is somewhat surprising: Assuming that English has become increasingly analytic and that conversion has achieved greater system adequacy due to the loss of inflection (e.g., Dalton-Puffer 1996: 56), we would have expected
the growth of the respective model lexemes to continue well beyond ME. Yet even so, nominal types reflecting patterns for isolating techniques have constituted the majority of all model nouns since late ME with an average percentage of 51% per period between the 15th and 19th centuries.
Conversely, the proportion of model words for agglutinating processes steadily decreased from 62.4% in the first period to 24.4% in the 18th century, hence by nearly 40 percentage points. Until the 17th century, differences in type numbers between adjacent centuries prove statistically significant, but the effect sizes are below small, except for the decline in the 14th century, which displays a small to medium effect (phi = .19). In the 19th century, model lexemes for agglutinating techniques finally started to increase, followed by a statistically significant rise in the number of model noun types in the last period, although the effect size remained well below small. In the 20th century, then, the proportions of nominal types promoting patterns for isolating and agglutinating methods approximated each other, amounting to 44.7% and 32.9%, which roughly corresponds to the relative distribution of model words for these techniques in the 15th century (49.2% and 35.9%, respectively).
Model words displaying patterns for fusional processes have not prevailed over those evincing other typological techniques despite increases in type numbers. While the initial rise in the 14th century exhibits a small to moderate effect size (phi = .23), subsequent growth rates, though statistically significant in the 16th century as well as in the 18th century, show virtually no effect. As a consequence of the increases between 1150 and 1800, the number of model nouns for fusional methods reached that of model lexemes for agglutinating processes in the 18th century but dropped in the next period. 
On the whole, the relative proportion of model noun types for fusional techniques has remained remarkably constant since the beginning of eModE, fluctuating between 19.6% in the 16th century and 22.4% in the last period.
Table 28: Type and token distributions of model nouns by typological technique.
Rows: Types, Tokens and TTR, each broken down into Isolating, Agglutinating and Fusional. Columns: 12th/13th, 14th, 15th, 16th, 17th, 18th, 19th and 20th centuries. [Cell values not preserved.]
Turning briefly to the distribution of tokens instantiating model nouns for the three typological methods, the numbers summarized in Table 28 again follow from the figures detailed in Table 23. In Section 6.4.1, I noted the strong preponderance of tokens instantiating model nouns for conversion, i.e., isolating techniques, which is largely due to the fact that extremely high-frequency types have not been excluded. Lacking any threshold values in this respect, let us suppose that the calculation of outliers, based on the token distribution in each century as recorded in Table 17 (Chapter 5), serves as a rough approximation of extremely high usage frequency. Along these lines, model lexemes for isolating methods constitute the majority of all extremely high-frequency types, their proportion amounting to nearly 60% on average per period. To give just one example of their impact: In the 12th/13th century, the 20 most frequently used model nouns for isolating techniques account for 46% of all tokens reflecting isolating methods. The heavy exploitation of these types is only to be expected considering the lexemes’ old age and monomorphemic structure (compare Zipf’s law of abbreviation); however, their usage frequency has decreased to some extent during the study period, as attested by the continuously rising TTRs for isolating techniques, presented in Table 28. Despite this development, the TTRs of model words for isolating processes have remained the lowest throughout all centuries. By contrast, the TTRs of model lexemes for agglutinating techniques have been the highest since 1150, i.e., these types have generally been instantiated by fewer tokens than those reflecting the other typological methods. The TTRs of model lexemes for fusional processes, finally, have always been intermediate between those of model types for isolating and agglutinating methods, possibly with a slight bias toward the latter until the 18th century. 
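The two token-based quantities used above, the type-token ratio (TTR) per typological technique and the share of tokens contributed by the most frequent types, can be sketched as follows. The sample is a hypothetical toy list in which each token is assumed to be already normalized to its type; it does not reproduce the corpus data behind Table 28, and the function names are illustrative.

```python
from collections import Counter


def type_token_ratio(tokens):
    """Number of distinct types divided by the number of tokens."""
    tokens = list(tokens)
    if not tokens:
        raise ValueError("need at least one token")
    return len(set(tokens)) / len(tokens)


def top_k_share(tokens, k):
    """Share of all tokens contributed by the k most frequent types."""
    tokens = list(tokens)
    if not tokens:
        raise ValueError("need at least one token")
    counts = Counter(tokens)
    top = sum(count for _, count in counts.most_common(k))
    return top / len(tokens)


# HYPOTHETICAL toy sample, keyed by typological technique.
sample = {
    "isolating": ["king", "king", "king", "lord", "king", "lord"],
    "agglutinating": ["kingdom", "untruth", "cursedness", "kingdom"],
}

if __name__ == "__main__":
    for technique, tokens in sample.items():
        print(technique, round(type_token_ratio(tokens), 2))
    print("top-1 share, isolating:", round(top_k_share(sample["isolating"], 1), 2))
```

In the toy sample, a few heavily repeated isolating types depress the TTR, which is the mechanism the text attributes to extremely high-frequency monomorphemic nouns. TTRs are also sensitive to sample size, so values from periods with very different token totals should be compared with caution.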
Baayen (2009) proposes assessing the potential productivity of a certain affix by dividing the number of hapaxes that display the given morpheme by the token total affixed with the element: The higher the calculated ratio, the more productive is the affix. Applying the logic underlying this proposal to the present context, we may assume TTRs to be indicative of the model words’ aptness: The higher the TTR, the more suited are the types to supply patterns for analogical noun formation. Accordingly, model lexemes for agglutinating processes should provide better patterns than model words for fusional techniques, whereas model nouns for isolating methods seem least suitable to advance models for noun formation. Against this backdrop, the relative distribution of typological techniques in terms of model lexeme types, depicted in Figure 23, needs some qualification. Although model words for isolating methods have outnumbered those displaying other typological techniques since the 15th century, they may well have created a
less viable option for new noun formations, lacking the propensity to form patterns that is exhibited by model nouns for agglutinating processes.60 In other words, model lexemes for agglutinating techniques, though ranked second on the basis of type frequencies since late ME, are likely to have afforded better patterns for extending the nominal lexicon than model words for isolating methods. While the general importance of patterns for isolating and agglutinating techniques needs to be reconsidered along these lines, models for fusional processes have remained least important: On the one hand, their proportion in terms of type numbers has been the lowest in all periods, except for the 18th century; on the other, the model lexemes have been instantiated in language use more often than model nouns for agglutinating processes, as indicated by the respective TTRs. Consequently, we may assume that they have provided less suitable patterns for nominal derivations. This section offered a survey of possible models that language users have had at their disposal to analogically form new nouns, focusing on the type frequency of the various patterns. From a typological perspective, the distribution of model lexemes suggests that agglutinating techniques predominated until 1400, yielding to isolating methods in the subsequent periods but regaining strength in the 20th century. Vice versa, isolating strategies appear to have been promoted in the 15th to 19th centuries especially, whereas fusional methods, despite gains in the 14th to 18th centuries, have always been the alternative least encouraged by the pertinent model words. The generous classification of nouns with respect to the word formation process they presumably exemplify enabled me to establish an inventory of diverse patterns that has been potentially available for language users to extend their nominal lexicon. 
The operative word here is ‘potentially’ because, as argued above, model lexemes differ in their propensity to promote patterns due to their usage frequencies. Not only for this reason may speakers have deemed the models under consideration to provide more or less suitable grounds for analogical noun formation. By extension, language users may or may not have followed the typological trends evident in the stock of patterns when choosing among the different techniques to coin new words. This issue is addressed in the next section, which compares the typological developments apparent in noun formation patterns and new lexical additions.
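The hapax-based measure attributed to Baayen (2009) earlier in this section, i.e., the number of hapaxes displaying an affix divided by the token total affixed with it, can be sketched in a few lines. The token list is a hypothetical toy example for a single affix, and the function name is introduced here for illustration; the sketch only shows the arithmetic of the ratio, not the procedure applied to the corpus.

```python
from collections import Counter


def potential_productivity(tokens):
    """Hapaxes (types occurring exactly once) divided by the token total."""
    counts = Counter(tokens)
    n_tokens = sum(counts.values())
    if n_tokens == 0:
        raise ValueError("need at least one token")
    hapaxes = sum(1 for count in counts.values() if count == 1)
    return hapaxes / n_tokens


# HYPOTHETICAL tokens, all assumed to carry the same affix (-ness).
ness_tokens = ["kindness", "kindness", "darkness", "sadness", "kindness"]

if __name__ == "__main__":
    # Two hapaxes (darkness, sadness) over five tokens.
    print(potential_productivity(ness_tokens))
```

Like the TTR, the ratio rises when tokens are spread over many rarely used types, which is why the text treats high TTRs as indicative of model words that are better suited to supply patterns.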
60 Also, recall that the quantitative predominance of model lexemes for isolating techniques, i.e., conversion, may well be an artifact of the data classification.
6.6 Typological techniques in lexicon extension
For recapitulation purposes, the details provided in Chapters 5 and 6 are summarized by combining the discrete periods from 1150 to 2000 into three historical epochs. In line with common practice, ME comprises the 12th/13th to 15th centuries; for reasons of convenience, the 16th to 18th centuries are subsumed under eModE, and the final two periods are classified as lModE. For each epoch, the numbers of new types derived by isolating, agglutinating and/or fusional processes, on the one hand, and the numbers of model lexemes for every typological method, on the other, were correlated, yielding the relative distributions depicted in Figure 24.
Figure 24: Relative distribution of new nouns and model nouns by typological technique.
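The grouping of periods into epochs described above can be sketched as follows. The period-to-epoch mapping follows the text; everything else is an assumption: the counts are hypothetical, and the sketch simply sums per-period counts within an epoch before computing shares, which may differ from the procedure actually used (for instance, if types recurring in several centuries were counted only once).

```python
# Mapping of periods to epochs as stated in the text.
EPOCH_OF_PERIOD = {
    "12th/13th": "ME", "14th": "ME", "15th": "ME",
    "16th": "eModE", "17th": "eModE", "18th": "eModE",
    "19th": "lModE", "20th": "lModE",
}


def epoch_shares(counts_by_period):
    """Sum counts per epoch and return each technique's relative share.

    counts_by_period: {period: {technique: count}}
    returns: {epoch: {technique: share of the epoch total}}
    """
    totals = {}
    for period, counts in counts_by_period.items():
        bucket = totals.setdefault(EPOCH_OF_PERIOD[period], {})
        for technique, count in counts.items():
            bucket[technique] = bucket.get(technique, 0) + count
    shares = {}
    for epoch, bucket in totals.items():
        total = sum(bucket.values())
        if total == 0:
            raise ValueError(f"no counts recorded for epoch {epoch}")
        shares[epoch] = {t: c / total for t, c in bucket.items()}
    return shares


# HYPOTHETICAL counts for two ME periods only.
toy_counts = {
    "14th": {"isolating": 10, "agglutinating": 30},
    "15th": {"isolating": 30, "agglutinating": 10, "fusional": 20},
}

if __name__ == "__main__":
    print(epoch_shares(toy_counts))
```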
In order to assess the importance of each typological method in extending the nominal word-stock, the primary focus is on new additions to the lexicon since these directly reflect actual language behavior. The corresponding left-hand diagram in Figure 24 indicates that fusional processes have been used infrequently, as opposed to alternative techniques. While isolating methods prevailed in ME, they have since greatly declined but still exceeded fusional techniques even at their all-time low in lModE. By contrast, agglutinating processes, ranked second in ME, have become the preferred option; given their considerably large proportion even in the first epoch, I would conclude that English speakers have firmly relied on agglutinating techniques at all times.
With respect to the distribution of typological techniques based on model noun types, the right-hand diagram in Figure 24 reveals that patterns for all three methods have been available, though possibly varying in strength due to differences in type frequencies. Model lexemes for fusional processes, constituting the smallest proportion, are likely to have strengthened fusional techniques to a lesser degree, whereas the large proportion of model words for isolating methods suggests that these means have been boosted especially since eModE. Model nouns for agglutinating processes have occupied the middle ground: After the strong promotion of patterns for these techniques in ME, the proportion of model words for agglutinating methods decreased but still equaled nearly one third in lModE. In short, language users have had at their disposal sufficiently robust patterns for each typological technique to analogically form new nouns, but they have not followed the trends established by model lexemes for the different methods. Regardless of the (quantitative) importance of isolating techniques in eModE and lModE, English speakers have favored agglutinating methods. Thus, their choice seems not to have been dictated by the relative frequencies of the typological techniques abstracted from the inventory of patterns. For the purpose of direct comparison, Figure 25 reproduces the proportion of new nouns derived by fusional processes to all new types from Chapter 5 (Figure 12) as well as the proportion of model noun types for fusional methods relative to all model words from the preceding section (Figure 23). The added trend lines demonstrate that fusional techniques have constantly increased in both
Figure 25: Proportion of fusional techniques in new additions and pattern inventory.
areas, though the slope of the trend line indicating the rise of fusional processes in the pattern inventory is steeper than the gradual incline evident in new nominal additions. Unlike fusional methods, isolating techniques seem to have developed very differently in the two fields studied, which may or may not be related to the fact that isolating methods used to incorporate new types into the nominal lexicon include not only intralinguistically motivated processes (conversion) but also extralinguistically driven means (borrowing), especially prominent in ME. To elucidate this issue, we need to restrict the comparison to intralinguistically motivated isolating techniques. Accordingly, Figure 26 contrasts the proportion of new types derived by conversion relative to all new nouns from Chapter 5 (Figure 7) and the percentage of model lexemes for isolating methods, i.e., conversion, to all model words from the previous section (Figure 23). The resultant picture, supplemented by trend lines, exhibits an upward sloping trend of isolating techniques among the potential patterns, which should persist even if the percentages, oscillating around 47%, had to be revised down due to classification problems. This development differs fundamentally from the trend manifest in new nominal additions: While the proportion of newly converted types to all new nouns has never been particularly notable, the negative sloping line signals the ever-decreasing tendency for English speakers to derive nouns by this process.
Figure 26: Proportion of conversion in new additions and pattern inventory.
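The trend lines referred to for Figures 25 and 26 are not specified further in the text. A minimal sketch, assuming they are ordinary least-squares lines fitted to one proportion per period with the periods numbered consecutively, is given below. The series are hypothetical and only imitate the direction of the reported trends.

```python
def linear_trend(values):
    """Least-squares slope and intercept for values at positions 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# HYPOTHETICAL proportions (in %) for eight consecutive periods.
pattern_inventory = [40.0, 44.0, 49.0, 50.0, 52.0, 53.0, 50.0, 45.0]
new_additions = [12.0, 11.0, 9.0, 8.0, 8.0, 6.0, 5.0, 4.0]

if __name__ == "__main__":
    print("pattern inventory slope:", round(linear_trend(pattern_inventory)[0], 2))
    print("new additions slope:", round(linear_trend(new_additions)[0], 2))
```

A positive slope corresponds to an upward sloping trend line and a negative slope to a downward one; comparing the absolute values gives a rough sense of which line is steeper.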
From a typological point of view, the avoidance of intralinguistically driven isolating strategies seems paradoxical because, if we assume that English has drifted toward more analyticity, fostered by borrowing (Sapir 1921: 206), isolating methods in new additions to the lexicon should have been advanced. As a matter of fact, extralinguistically motivated isolating techniques have not inspired the use of other isolating processes for noun formation. In brief, the rare use of conversion as a means to enlarge the lexicon seems to suggest that English has become less analytic than previously assumed. This suspicion appears to be supported by the proportion of agglutinating processes, which has been disproportionately high in new nominal additions and has remained substantial in the pattern inventory despite losses in eModE. Before drawing hasty conclusions in this respect, however, we need to recall that agglutinating techniques comprise Germanic affixation and compounding. Since the two processes differ in their degree of fusion, they are best considered separately; to this end, Figure 27 compares the relative proportions of each process in new nominal derivations, described in Chapter 5 (Figure 7), to those of the respective method among potential patterns, presented in Section 6.4 (Figure 14).
Figure 27: Proportion of the different agglutinating techniques in new additions and pattern inventory.
With regard to Germanic affixation, then, the left-hand diagram in Figure 27 illustrates that in both areas the rather fusional agglutinating techniques have constantly declined during the study period; the downward sloping trend lines are fairly parallel, suggesting that language users may have been reluctant to form
new nouns by Germanic affixation because patterns for these methods have come to play a minor role compared to models for other noun formation strategies. In any event, the downtrend in Germanic affixation has continued the trend observed for OE, namely the reduced use of affixes, which, together with the change from stem-based to word-based derivation, has been taken to indicate the rise of analyticity in the English lexical domain (Haselow 2011: 239). The supposed shift toward increasing analyticity seems to be corroborated by the development of the more isolating agglutinating techniques, namely compounding. As evident in the right-hand diagram in Figure 27, the proportion of new words derived by this process has continuously risen, marked by a steep incline of the trend line. In the pattern inventory, however, the trend line reveals that the proportion of model lexemes for this method has not noticeably developed in either direction, lingering well below 20%. In contrast to other processes, which display higher proportions among model words than among new nouns, compounding has been extensively employed, although the proportion of respective model nouns has not encouraged its utilization for analogical noun formation. This asymmetry underlines the syntactic nature of the process, allowing for spontaneous innovative word formation. In summary, the typological techniques used to enlarge the nominal lexicon have not changed fundamentally considering that, in general, language users have vastly preferred agglutinating methods. Within this spectrum, however, we observe a shift from more fusional to more isolating processes, which may be interpreted as a trend toward increased analyticity. This development has not been substantially reversed by the reintroduction of fusional techniques in the wake of Romance borrowings, but neither has it reached completion, which would have entailed an increase in the use of distinctly isolating methods. 
Although ME, especially, witnessed an extensive usage of isolating techniques, these were chiefly motivated by extralinguistic factors, whereas intralinguistically driven processes, which would have accommodated the shift toward analyticity, have been rarely used for noun formation.
Besides extralinguistic determinants, the choice of typological techniques has been claimed to be influenced by the language’s typological profile; in this vein, Kastovsky (1992b: 427) contends that “the typological status of a language, or of a subsystem of a language, is not just a convenient construct of the linguist, but apparently a real linguistic factor, which subconsciously affects linguistic behaviour” (see also Haselow 2011: 27). But even if we were willing to accept this hypothesis, it would be difficult to reconcile with the results of the study so far: If we assume in line with Sapir (1921: 206) that English has become increasingly analytic, we would expect speakers to have employed more intralinguistically motivated isolating techniques to derive new nouns. Alternatively, if we accept that fusion was restored in English, as suggested by Haselow (2012b) and others, language users should have resorted to fusional techniques more frequently than observed in the data. The larger point at issue here is that the above claims about the typological profile of English, or of the English lexicon for that matter, are rather impressionistic. As far as I know, no study has yet attempted to determine the degree of analyticity and/or syntheticity in the nominal lexicon on a quantitative basis. To fill this void, the following chapters seek to establish the typological status of the English nominal word-stock since 1150.
Part 3: Typological profile of the nominal data since 1150
In usage-based approaches, the typological profile of a language is not only determined by the morphological makeup of its units (types) but crucially depends on the usage frequency of these units; consequently, this part of the study exclusively employs token-based methods. After introducing additional classification criteria, the synthetic index, developed by Greenberg (1960), is adapted to the nominal data, addressing the question whether nouns in use have become more or less synthetic during the past 850 years (Chapter 7). This measure, grounded in the number of morphemes per noun, provides a first impression of the degree of syntheticity evident in the nominal part of the language without specifying the type of syntheticity and possible changes therein, which requires an alternative approach. Therefore, the focus is subsequently shifted to the language subtypes, namely isolating, agglutinating and fusional, determined by the technique of combining morphemes. While the isolating subtype can be equated with analytic languages, the synthetic subtypes, i.e., agglutinating and fusional, are distinguished by the degree of fusion between two morphemes, as detailed in Chapter 2. In order to allocate the nouns used in each period since 1150 to the typological subtypes, Chapter 8 first elaborates on the parameter ‘fusion’, devising a rather unconventional method for its calculation, before arranging the historical stages of nominal English on Dixon’s clock to illustrate the development in terms of language subtypes. The Greenbergian synthetic index also does not capture the level of analyticity manifest in a language, as observed by Szmrecsanyi (2012), who transfers the unidimensional concept to a two-dimensional array by introducing two indices: the analyticity index and the syntheticity index. 
Chapter 9 applies these measures to the present data to track the developments of analyticity and syntheticity in the nominal language, which are then compared with the results in the domain of grammar presented by Szmrecsanyi (2012). Although the procedures performed in Chapters 7 to 9 highlight different aspects, they collectively reveal consistent typological trends in the structure of the nominal usage data, which are finally discussed in relation to developments evident in the use of typological techniques for noun formation (Chapter 10).
https://doi.org/10.1515/9783111317717-009
7 Overall development of syntheticity
7.1 Preliminary remarks on classification aspects
While Part 2 of the book focuses on noun formation, attending specifically to the process of origin and, in case of morpheme concatenation, considering the immediate constituents exclusively, the present chapter concentrates on the complexity of all nouns in terms of their constituent morphemes, regardless of their origin. For this reason, all morphemes of complex words, such as compound pollen tube growth inhibition (1975hoges8b) or converted executive (1989tim1n8b), had to be identified and counted. Thus, in addition to the nominal affixes previously introduced, morphemes to derive verbs, adjectives and, occasionally, adverbs as well as some further nominal affixes had to be taken into account. Appendices A10 and A11 list the additional Germanic and Romance affixes evident in the nouns under investigation, illustrated by an example from the data; moreover, variant forms are specified where applicable, followed by a short indication how the pertinent sources classify the given sequence. With regard to Germanic affixation, the reference works by Dixon (2014) and Marchand (1969) only record those morphemes deemed relevant to PDE word formation, e.g., verbal be- manifest in bedevilment (1930tomlf7b). Therefore, Germanic affixes that became obsolete were determined by recourse to the OED, supplemented by information gleaned from the MED if necessary. Although the literature and the dictionaries sometimes disagree on the status of these affixes in ME, I included elements like for-, e.g., forgyuenes ‘forgiveness’ (cmreynar.m4), or umbe-, e.g., vmby-thynkynge ‘umbethinking’ (cmedthor.m34), as morphemes until 1500 if the base existed in ME, attested in free usage by the dictionaries.
As before, the decisive criterion for determining the morphological structure of complex (base) words, such as bedevil, forgive or umbethink, was transparency predicated on the existence of the respective bases and the meaningful relation between the morphemes. Consequently, I classified an instance such as nominal behauiour ‘behavior’ (deloney-e2-h) as three-morphemic as long as verbal behave included the base have used reflexively in the sense ‘conduct oneself, behave’, last attested in 1586 according to the OED. In the same vein, transparency for language users guided the classification of complex lexemes affixed with Romance elements. To begin with, I considered only sequences that seem to have achieved affix status in English, excluding etymological items, such as ad-, displayed in admixture (oman-1895), or sym-, apparent in simpathie ‘sympathy’ (shakesp-e2-h). Although these sequences are listed as prefixes in the OED, they are not included in Dixon’s (2014) and Marchand’s
(1969) treatments, which served as the primary source for the present classification, occasionally supplemented by register-specific affixes. Besides having acquired morphemic status in English, expressing fairly constant meaning in various contexts, the affix needs to stand in a semantically plausible relation with a discernible base. Hence, the verbal base in degradation (1960boltd8b) consists of two morphemes, i.e., de- and grade, whereas I regarded the base verb in decoction (clowes-e2-h) as monomorphemic. With the primary focus on word structure, I concentrated exclusively on lexical material, i.e., derivational affixes and bases, ignoring grammatical morphemes; thus, nominal suffixes such as plural markers, whether contextually motivated, e.g., affections (bardsley-1807), or lexicalized, e.g., almes ‘alms’ (cmaelr4.m4), were ignored. Likewise, adjectival grammatical affixes, i.e., comparative and superlative morphemes observable in nominalized the latter (1905lewim7b) and the middest (vicary-e1-h), respectively, were dismissed. Also, verbal inflections like the past participle suffixes -ed1, e.g., blessedness (cmmandev.m3), and -en6, e.g., drunnkennesse ‘drunkenness’ (cmorm.m1), were disregarded, whereas the homonymous adjective-deriving suffixes -ed2, e.g., clearheadedness (whewell-1837), and -en4, e.g., Woollenyarn (langf-e3-h), were included. Proceeding from instances such as blessed or drunken, the distinction between inflectional and derivational -ed and -en is often difficult, if not impossible, to draw; therefore, I strictly followed the OED’s etymological suggestions, which cross-refer the reader to the respectively indexed affix.61 As mentioned before, the morphological analysis of the complex nouns rests on the lexemes’ transparency for language users, which essentially depends on whether the base morphemes are stored in their lexicon.
Accordingly, I determined for all types the ultimate base, drawing on the pertinent dictionaries described in Chapter 5, in order to calculate the number of morphemes of each noun; thus, the above-cited nominal decoction consists of two morphemes since the base decoct precludes further segmentation due to the lack of ✶coct in English. Importantly, the complex lexeme needed to be transparent to the speakers of the respective century, so I recorded the date of the base’s origin and, in case of obsolescence, its last attestation, following the procedure outlined in Chapter 6. Evidently, the life span of the ultimate base influences the morpheme count as exemplified by the aforementioned degradation: When making its first appearance in the data (burton-1762), the noun included two morphemes, namely degrade and -ation; in that period, degrade could not be analyzed as a complex verb
61 As a rule, the OED uses different index numbers to differentiate homonymous affixes according to their function, but there are some exceptions to this principle: -ing2, for instance, is described somewhat ambiguously as “[s]uffix of the present participle, and of adjectives thence derived, or so formed” (s.v. -ing2, emphasis added).
because the verbal base grade2 was not attested in the sense ‘pass from one grade into another’ before 1892. In its later instantiation (1960boltd8b), however, the noun was considered to comprise three morphemes, namely de-, grade and -ation. The example just adduced points to a further criterion: The factual existence of a base is a necessary but not sufficient condition; to qualify as suitable constituents of a complex (base) word, the morphemes have to be related in a meaningful way. Hence, it would have been improper to attribute three morphemes to Informer (fox-e3-h) and information (1985pullm8b), for instance, since the meaning of the verbal base cannot be derived from the semantics of prefixal in-3 and nominal form, as suggested by the OED. Not infrequently, bare and affixed forms with the same meaning exist, or existed, side by side: The verbal base of impairment (1985lowem8b) is paralleled by the aphetic verb form pair1; similarly, Evaungelie ‘evangely’ (cmctmeli.m3) and evangel1 (19xxcadmh7b), both adopted in the 14th century, denote ‘good news, gospel’. In such cases, language users have likely been able to formally recognize the affixes (here: prefixal im- and suffixal -y3), so I stipulated three and two morphemes for impairment and evangely, respectively. Even more conspicuous are derivatives that show double marking, such as nominal fruiterer (jetaylor-e3-h) or poulterer (walpole-174x), both extended forms of nouns already marked for agenthood, i.e., fruiter and poulter. As long as the simply and doubly affixed forms were concurrently used in the sense of ‘dealer’, I interpreted the rightmost suffix as reinforcing the agentive character of the noun and classified the complex types as three-morphemic lexemes. 
Similar ‘tautological’ markings are attested in the verbal domain because etymologically related verbs were borrowed from French and Latin, furnishing English with verb pairs such as abroge and abrogate or determine and determinate, at least for certain periods. If the members of such pairs expressed the same semantics, I considered the suffix -ate to strengthen the verbal function of its base, comparable to the Germanic verbal suffix -en5 in enlighten, which itself can be traced to the verb enlight, conveying the same meaning.62 Consequently, I assumed abrogation (conway-e2-h) to consist of three morphemes when the noun appeared in the 17th century data since the verbal base could be segmented into abroge and -ate3 until the bare verb form became obsolete in the 18th century. In the same vein, nominal determination was classified as three-morphemic on its first occurrence in the 16th century (roper-e1-h) because the noun may have been derived from the verb determinate, which, in turn, comprised the earlier verbal
62 Marchand (1969: 146) mentions similar instances in his discussion of verbal be-.
determine; after the loss of the complex verb in the 18th century, however, determination (wellesley-1815) had to be regarded as bimorphemic. Besides bases attested in free use by the dictionaries, I included bound bases, determined as identical sequences materializing with constant meaning in three, or more, independently derived words (see also Chapter 6). This approach was also adopted for cases treated as correlative derivation before, where the existence of a single related adjective such as eloquent provided sufficient grounds for classifying the noun eloquence in terms of bound base and affix. Like free base words, the bound base must have been discernible by speakers of the respective century; therefore, I established the existence of the bound base of a complex noun by noting the date of origin and, in case of obsolescence, the last attestation recorded by the OED for each of the parallel complex lexemes. Resuming the above example, the noun eloquence was accompanied by adjectival eloquent since its inception in English; in the 16th century an additional adjective, eloquious, emerged but failed to assert itself; the OED documents merely two usage instances within a very short time span, namely for the years 1599 and 1607. To postulate a bound base ✶eloqu- on this basis would hardly be justified; hence, I classified nominal eloquence as monomorphemic in all periods. While, in principle, potential bound bases needed to be manifest in at least three complex words to ascertain their status, combining forms are the exception to the rule because they are bound bases by definition (e.g., Bauer 2017: 148–157). Thus, I accepted combining forms as bound bases, following the OED’s categorization unless the morpheme qualified as affix; a list of these elements subdivided into initial and final combining forms, each illustrated by an example from the data, is provided in Appendix A12. 
As before, I noted not only whether the potential combining form/bound base existed but also its lifetime, based on the dating given by the dictionary and/or additional searches as described in Chapter 6. In accordance with the criteria specified above, the constituent morphemes of all nouns then were determined in order to assess the degree of syntheticity along the lines proposed by Greenberg (1960).
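The date-sensitive morpheme count described in this section can be pictured with a minimal sketch. It assumes that each noun carries a default segmentation plus optional finer segmentations that hold only while the relevant base is attested. The class and function names, and the segment labels such as "-iour", are hypothetical and serve illustration only; the years 1892 and 1586 are the OED datings cited above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Analysis:
    """A finer segmentation that is valid only while its base is attested."""
    morphemes: Tuple[str, ...]
    first_attested: Optional[int] = None  # None: no lower bound recorded
    last_attested: Optional[int] = None   # None: base still in use

    def transparent_in(self, year: int) -> bool:
        after_start = self.first_attested is None or year >= self.first_attested
        before_end = self.last_attested is None or year <= self.last_attested
        return after_start and before_end


def morpheme_count(default: Tuple[str, ...],
                   finer: Sequence[Analysis],
                   year: int) -> int:
    """Length of the finest segmentation that is transparent in the given year."""
    candidates = [default] + [a.morphemes for a in finer if a.transparent_in(year)]
    return max(len(c) for c in candidates)


# degradation: 'grade' in the relevant sense is not attested before 1892
DEGRADATION = (("degrade", "-ation"),
               [Analysis(("de-", "grade", "-ation"), first_attested=1892)])

# behaviour: reflexive 'have' is last attested in 1586 (segment labels illustrative)
BEHAVIOUR = (("behave", "-iour"),
             [Analysis(("be-", "have", "-iour"), last_attested=1586)])

print(morpheme_count(*DEGRADATION, year=1762))  # two morphemes
print(morpheme_count(*DEGRADATION, year=1960))  # three morphemes
```

The sketch deliberately leaves out the second criterion, the meaningful relation between the morphemes, which requires a semantic judgment for each lexeme.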
7.2 The synthetic index of the nominal data
As mentioned in Chapter 2, Greenberg (1960) lays the foundation for the assessment of the typological profile of a language in quantifiable terms based on actual language use. To this end, he suggests counting all morphemes and all word tokens within a text and dividing the sum of the first by the total of the latter; the ratio, labeled ‘synthetic index’, indicates the level of syntheticity evident in the studied text. As an illustration, he calculates the synthetic index for OE and PDE
based on 100-word extracts from both periods, concluding that syntheticity in English has decreased from 2.12 index points in OE to 1.68 index points in PDE. Synthetic index calculations for PDE were repeated for texts from different registers yielding similar results, the index value oscillating between 1.60 and 1.65 index points (Greenberg 1960: 194). In an apparently intuitive manner, Greenberg (1960) stipulates index ranges for analytic (1.00 to 1.99), synthetic (2.00 to 2.99) and polysynthetic (3.00 and greater) language types. Along these lines, English would have developed from a slightly synthetic to a fairly analytic language, but see Payne (2017: 86), who loosely interprets a synthetic index of 1.7 in PDE as “somewhat isolating or moderately synthetic”. The formula proposed by Greenberg can easily be applied to the present data to evaluate the degree of syntheticity in the nominal part of the language for each century. Since the focus is on changes of the morphological makeup of nouns, I divided the total number of morphemes observed in the nominal data by the noun totals per period; the development of the resulting synthetic index is displayed in Figure 28; token and morpheme totals, their ratio (SI) as well as dispersion measures are summarized in Table 29.63 As illustrated in Figure 28, the nouns have not developed consistently, although the added trend line indicates an increase in syntheticity during the study period. Until the end of ME the data attest to a decline in nominal syntheticity, but this trend has been reversed since eModE, the mean number of morphemes per word rising to 1.43 in the 20th century. 
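As a minimal sketch of the calculation just described, the following lines compute the synthetic index as the ratio of morphemes to noun tokens, together with the standard deviation and the coefficient of variation (the ratio of the standard deviation to the mean), and map an index value onto the ranges stipulated by Greenberg (1960). The morpheme counts are invented toy values, not figures from the study, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def synthetic_index(morphemes_per_token):
    """Total number of morphemes divided by the number of tokens."""
    return sum(morphemes_per_token) / len(morphemes_per_token)


def coefficient_of_variation(morphemes_per_token):
    """Standard deviation relative to the mean (sample SD is an assumption)."""
    return stdev(morphemes_per_token) / mean(morphemes_per_token)


def greenberg_type(si):
    """Index ranges stipulated by Greenberg (1960)."""
    if si < 2.0:
        return "analytic"
    if si < 3.0:
        return "synthetic"
    return "polysynthetic"


# Hypothetical morpheme counts for ten noun tokens of one period
toy_period = [1, 1, 2, 1, 3, 1, 2, 1, 1, 1]

si = synthetic_index(toy_period)
print(round(si, 2), greenberg_type(si))
print(round(coefficient_of_variation(toy_period), 2))
```

Applied to Greenberg's own values, the function assigns 2.12 (OE) to the synthetic range and 1.68 (PDE) to the analytic range.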
Variation in the synthetic index over the whole study period proves significant according to a Welch ANOVA performed on normalized data (Welch’s F (7, 5860) = 33.59, p < .001, ω² = .02).64 Still, the association between index values and the variable ‘century’ is weak, the effect size ω² not attaining the threshold for medium effect (Döring & Bortz 2016: 821).65
63 Since the standard deviation varies with the size of the mean, I additionally calculated the coefficient of variation (CV), i.e., the ratio of the standard deviation to the mean, to facilitate comparison across centuries.
64 Statistical tests were performed on data normalized to corpus sizes of 10,000 tokens to avoid the p-value fallacy. Moreover, ANOVAs and t-tests were carried out with Welch’s correction as Levene’s test showed variances between samples to violate the homogeneity assumption (F (7, 13719) = 124.78, p < .001).
65 Although the literature is reticent about effect size measures for Welch ANOVA, ω² seems robust enough to handle heteroscedasticity in the case of significant F statistics and large samples (Carroll & Nordholm 1975). Since both conditions are met, I computed ω² based on the F value and degrees of freedom, following Carroll & Nordholm (1975), who mathematically translate the standard formula into ω² = (F − 1) / (F + (N − J + 1) / (J − 1)), where N is the total of the nominal tokens and J equals the number of groups (periods).
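The effect size formula quoted from Carroll & Nordholm (1975) can be checked with a short sketch. The value N = 13,727 used below is not stated in the text; it is an assumption inferred from the degrees of freedom of Levene’s test (13,719 + 8 groups). The function name is hypothetical.

```python
import math


def omega_squared(f_value, n_total, n_groups):
    """omega^2 = (F - 1) / (F + (N - J + 1) / (J - 1))."""
    return (f_value - 1) / (f_value + (n_total - n_groups + 1) / (n_groups - 1))


F_WELCH = 33.59     # Welch's F reported for the synthetic index
N_TOKENS = 13727    # assumption: inferred from Levene's df (13719 + 8)
J_PERIODS = 8       # eight periods, 12th/13th to 20th century

omega = omega_squared(F_WELCH, N_TOKENS, J_PERIODS)
print(round(omega, 2))

# The same value follows from the algebraically equivalent form
# (J - 1)(F - 1) / ((J - 1)(F - 1) + N).
standard = ((J_PERIODS - 1) * (F_WELCH - 1)) / ((J_PERIODS - 1) * (F_WELCH - 1) + N_TOKENS)
print(math.isclose(omega, standard))
```

Under this assumption the result rounds to the reported ω² = .02.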
Figure 28: Development of the synthetic index in the nominal data.
With respect to changes between adjacent periods, independent Welch t-tests show the decreases in the 14th and 15th centuries to have been significant, but the effect size is less than small with Cohen’s d equaling .10 and .06 (Döring & Bortz 2016: 821). Similarly, the increases observed in the 16th and 17th centuries, while statistically significant, exhibit effect sizes less than small. Subsequently, the incremental changes in the synthetic index did not even reach statistical significance until the 20th century, which testifies to a significant increase in syntheticity with a nearly small effect size (d = .18). Thus, the nouns in usage changed gradually without sudden leaps, initially toward less syntheticity but ultimately toward more complexity in terms of constituent morphemes. The decreasing syntheticity in the nominal part of the language until the 15th century continued the trend reported by Haselow (2011), i.e., the shrinking use of bound morphemes in nominalizations in OE and early ME, and was partly backed by the adoption of Romance loanwords, which were morphologically intransparent when first introduced into English. In eModE the trend was reversed as indicated by the continuously rising synthetic index since the 16th century. In the final period, the mean number of morphemes per noun amounted to 1.43, reaching an all-time high though not surpassing the index value of 1.68 calculated for the 20th century by Greenberg (1960). At first glance, this result appears counterintuitive since we would expect lexical elements like nouns to evince higher morpheme–word ratios than running texts with their large proportion of
Table 29: Synthetic index: Measures of central tendency and dispersion per period.
Columns: Token total, Morpheme total, SI, SD, CV; rows: 12th/13th, 14th, 15th, 16th, 17th, 18th, 19th and 20th centuries.
predominantly monomorphemic function words.66 But this discrepancy is easily explained: In addition to lexical morphemes as defined in the preceding section, Greenberg’s count includes sequences like -ceive, considered non-morphemic in the present study (see Chapter 6), as well as grammatical morphemes.67 In any event, the synthetic index values computed for the nominal usage data in each century lie well within the range stipulated for analytic languages (see above). Regardless of whether we accept an upper limit of nearly two morphemes per word as an indication of an analytic language type at this point, the synthetic index as such is a convenient measure to obtain a first impression of the development of syntheticity in the nominal part of the language. While the synthetic index provides a general sense of the extent of syntheticity, it does not capture the kind of syntheticity since it focuses on the number of morphemes per word exclusively. To distinguish between the synthetic subtypes, i.e., agglutinating and fusional, Greenberg (1960: 185) introduces an index of agglutination, computed as the ratio of the number of agglutinative junctures to the number of all morpheme junctures. Index values greater than 0.50 are considered
66 To gauge the distribution of nouns and function words in PDE running texts, I performed an automatic search on the Freiburg-LOB Corpus of British English, comprising 500 British English texts, totaling roughly one million words. Limited to tokens tagged as common nouns, on the one hand, and grammatical words (modals, prepositions, pronouns, etc.), on the other, the search revealed that the proportions of nouns and function words to all tokens amounted to 20.6% and 42.7%, respectively.
67 As a thought experiment on how the inclusion of grammatical morphemes would have raised the synthetic index, consider the plural marker: Based on figures obtained from the Freiburg-LOB Corpus of British English, the proportions of plural and singular common nouns equal 25.8% and 74.2%, respectively; hence, we might assume that roughly a quarter of all nominal tokens in the 20th century would comprise one additional morpheme, namely -s. It follows that, if the respective instances of the plural marker were included in the morpheme count, the synthetic index would increase from 1.43 to 1.68 points, which happens to be the same value computed by Greenberg (1960).
indicative of agglutinating languages, whereas ratios below 0.50 are assumed to be symptomatic of fusional languages. Still, Greenberg’s approach has its shortcomings: Besides problems surrounding the definition of agglutinating constructions, especially the types of automatically induced phonological alterations allowed for, the agglutination index is a binary variable, unable to accommodate the gradualness of fusion between the constituent morphemes of complex words. Similarly, Payne (2017: 87), proposing an “index of fusion” as a complementary measure to Greenberg’s agglutination index, does not address these issues. For this reason, the next chapter presents alternative measures to compute the degree of fusion in order to locate the English nominal usage data within the synthetic spectrum.
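Greenberg’s index of agglutination, as summarized above, can be sketched in a few lines. The sketch assumes that every morpheme juncture has already been coded as agglutinative or not, which is precisely the step the text identifies as problematic. The function names and the toy data are hypothetical, and the treatment of a value of exactly 0.50 is an assumption because the source does not specify it.

```python
def agglutination_index(junctures):
    """Share of agglutinative junctures among all morpheme junctures.

    `junctures` is a sequence of booleans, True meaning 'agglutinative'.
    """
    junctures = list(junctures)
    if not junctures:
        raise ValueError("no morpheme junctures to evaluate")
    return sum(junctures) / len(junctures)


def subtype(index):
    """Greenberg's cut-off at 0.50; the boundary case is left open here."""
    if index > 0.5:
        return "agglutinating"
    if index < 0.5:
        return "fusional"
    return "undetermined"


# Hypothetical coding of four junctures, three of them agglutinative
toy_junctures = [True, True, True, False]
value = agglutination_index(toy_junctures)
print(value, subtype(value))
```

Because each juncture enters the ratio as a yes/no value, the sketch also makes visible why the index cannot express degrees of fusion within a single complex word.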
8 Typological subtypes: Between isolation and fusion
The typological subtypes isolating, agglutinating and fusional are thought to occupy distinct stages of a cycle, or spiral, around which languages move in the course of their existence. As noted in Chapter 2, Dixon (1997: 41–42) proposes to visualize the cycle as a clock, offering a neat descriptive tool, which is employed in this chapter to depict the successive typological positions of the English nominal language in use since 1150. To this end, all mono- and polymorphemic nouns need to be taken into account, but the first task to be tackled is to determine the extent of fusion between morphemes concatenated to form complex words.
8.1 Assessment of fusion
8.1.1 Previously suggested parameters of fusion
The most elaborate proposal to assess the degree of fusion in complex lexemes has been advanced by Dalton-Puffer (1996). Grounded in the concept of natural morphology, she develops a scale of morphotactic transparency “to grade complex words according to how well their morphophonemic shape reflects their morphosemantic compositionality” (Dalton-Puffer 1996: 56). For her ME data, she suggests a ranking that consists of six stages, illustrated by sample words, as follows: (1) concatenation without phonological variation, possibly exemplified by lauerdom, (2) resyllabification, e.g., accep$table, (3) morphophonological processes considered non-fusional, e.g., voicing of the base-final fricative in caitif > caiti$uete, and (4) those regarded as fusional, e.g., dyvyded > dyvysioun, (5) base vowel alternation, e.g., long > lengthe, and (6) weak suppletion, i.e., base allomorphy exhibited by corumpen > corupcion (Dalton-Puffer 1996: 58). The ME examples provided by Dalton-Puffer allow us to identify various stages based on the words’ spelling; however, orthography generally fails to reveal important features necessary for a phonological description of the data. A case in point, discussed by the author, is the position of a lexeme’s main stress, and thus the quantity and/or quality of its base vowel, which “is particularly unsatisfactory in the case of the Romance suffixes” (Dalton-Puffer 1996: 42). We assume that shifts in this regard occurred during ME times, “[b]ut what the state of
affairs was at different points during the Middle English period still is anybody’s guess” (Dalton-Puffer 1996: 42–43).68 Besides the difficulty, or even impossibility, of inferring the value of vowels from graphic representations, spelling may also misguide our interpretation of consonants. A short example from my 12th/13th century data suffices to illustrate the problem: The spelling of ‘verse’ as ferrs (cmorm.m1) or fers (cmancriw-1.m1) would indicate a voiceless fricative at the word’s onset, but, at the same time, the data seem to attest to the voiced instantiation represented by the letters <u> and <v>. While the difference in voicing may reflect regional variation in phonological realizations, the variants uers and vers as well as fers appear to have been written by the same hand (cmancriw-1.m1). It seems unlikely that the author oscillated in his pronunciation of a lexeme that, according to the dictionaries, had been adopted into English some 300 years earlier; instead, I would argue that the graphic variants testify to fluctuations in spelling, underlining the uncertainties of ME orthography. While ME pronunciation can only be reconstructed on the basis of comparative evidence, it is commonly believed that “from the mid-sixteenth century we have explicit and often quite reliable phonetic descriptions” (Lass 1992: 24). On closer inspection, however, such assumptions appear overly optimistic, as demonstrated by Dobson’s (1957) impressive study of eModE sources. Based on a careful scrutiny of more than 100 works by spelling reformers, language teachers, phoneticians and grammarians from the 16th and 17th centuries, including authorities such as Thomas Smith, John Hart, William Bullokar, Charles Butler, Alexander Gil, Richard Hodges and Robert Robinson, Dobson meticulously describes the phonetic value of vowels, diphthongs and consonants, comparing the evidence given by the eModE experts.
These comparisons reveal not only differences between the orthoepists but also variation within a single author. John Hart, to take a prominent example, alternated, for instance, between a long and short vowel in child and between the voiced and voiceless fricative in reason (Dobson 1957: 479, 930). Confidence in Hart’s An Orthographie, published in 1569, is further weakened when his work is compared with Alexander Gil’s Logonomia, issued in 1619.69 Since both writers
68 Dalton-Puffer resolves the issue by focusing on the theoretical stress-shifting potential of an affix, “the defining criterion being that a suffix may cause stress shift, but not that it necessarily does in a specific case at a specific time” (1996: 43, emphasis original).
69 Although published 50 years later, Gil’s Logonomia compares to Hart’s work because it is assumed to represent an older stage of pronunciation: As described by Dobson (1957: 131), in 1619 Gil had reached the age of 54 years “and his speech therefore may have by that time become ‘old-fashioned’”.
were professed representatives of Standard English, the speech of educated people in London, differences in their phonetic descriptions probably resulted from different personal speech styles. On the one hand, Hart seems to have been more conservative than Gil, as exemplified by their pronunciation of ‘answer’, rendered as aunsuer (Hart) vs. answer (Gil). On the other, Hart appears to have been unusually advanced since his realization of the ME diphthong /aɪ/ as monophthongal /ɛ:/, strongly criticized by Gil, contrasted with the diphthongal pronunciation preferred until the middle of the 17th century (Dobson 1957: 766). The inconsistencies of phonetic descriptions in the 16th and 17th centuries partly reflect faults made by the authors, but, more importantly, they also indicate variation in actual pronunciation in eModE (e.g., Dobson 1957: 421), making it difficult for today’s researchers to decide on the then prevalent pronunciation. During the next century, variation was increasingly reduced not only in sound descriptions; Johnson’s (1792) Dictionary of the English Language, originally published in 1755, fixed the vocabulary for the first time, and English grammar was codified (see also Baugh & Cable 2002: 285). Sheldon (1946) provides a succinct overview of the advance of phonetic descriptions throughout the 18th century: While Bailey (1763) was the first to mark primary stress in the 1727 edition of his dictionary, a practice copied by Johnson in 1755, it was not until the publication of Sheridan’s General Dictionary of the English Language in 1780 that all entry words were phonetically respelled, including the proper rendition of the pronunciation of consonants. 
In short, we have to accept that, even though phonological accounts of individual sounds were refined and supported by contemporary evidence in eModE, the detailed phonetic descriptions necessary for applying Dalton-Puffer’s (1996: 58) scale of morphotactic transparency did not become available before the end of the 18th century. While in theory her proposal would suitably capture various degrees of fusion between constituents of complex words, the six-stage scale is too fine-grained to handle the data: For most of the study period, we lack detailed information on the phonetic realization of the nouns under investigation, so that the allocation of a given lexeme to any stage would be driven by (informed) guesswork rather than solid phonetic evidence. An alternative method to assess the extent of word-internal fusion between morphemes can be derived from proposals advanced by Tauli, who introduces “some criteria for defining the concepts of analysis and synthesis” (Tauli 1945/49: 81) with respect to inflection. Regarding analyticity and syntheticity as relative notions, his criteria also account for cases that “have often been called agglutinative” (Tauli 1945/49: 84) and can therefore be applied to the distinction between
the synthetic subtypes. In total, Tauli suggests eleven parameters, including four criteria that focus on properties relevant to the present study.70 Two parameters address phonological variation, namely base and affix variance. According to Tauli (1945/49: 83), “a form with a common [i.e., invariable] stem is more analytic than one with an alternant stem”; in other words, if the base of a complex word displays variation relative to its free occurrence, the lexeme is more synthetic, or more fusional for that matter. The parameter ‘base variance’ does not discriminate between the various phonological alterations caused by affixation; instead, any modification signals a variable base, be it base vowel change, e.g., deep > depth, epenthesis or syncope in the base, e.g., marked > markedness or enter > entrance, alternation of the base-final segments, e.g., conclude > conclusion, or any combination thereof, e.g., antecede > antecessor. Likewise, affix variance induces more syntheticity, and thus more fusion; put conversely, “a form with a common morpheme is more analytic than a form with an alternant morpheme” (Tauli 1945/49: 83). The most obvious case of variation in this respect is, of course, affix allomorphy, which covers all kinds of alternations regardless of their motivation; in contrast to Greenberg (1960), Tauli (1958: 153) regards automatic variation as “the most simple instances of this phenomenon [i.e., affix allomorphy]”. Again, the parameter is a binary one, merely coding whether or not the affix realized in a complex word exhibits phonological variation. Base and affix variance are both indicative of fusion because, even if the morpheme juncture is not obscured by processes operating on the base final and/or affix onset, phonological alternations obliterate the one-to-one correspondence between form and meaning that is characteristic of agglutination.
Besides phonological aspects, however, fusional and agglutinative lexemes can be differentiated according to the status of their constituent morphemes. In this connection, two additional parameters, suggested by Tauli (1945/49), prove useful. As before, the first criterion addresses the base. If the base occurs independently, the complex word is considered more analytic; put differently, a bound base indicates more syntheticity than a free base. This parameter captures the importance of the status of the base, discussed by Haselow (2011: 233–234) and others as a crucial feature in the development of late OE from syntheticity to ana-
70 Six criteria address syntactic characteristics such as morpheme order or separability, which are irrelevant to the present purpose. A further criterion refers to the morpheme’s phonological strength, implying that “a phonetically stronger morpheme, e.g., a syllable, is more analytic than a phonetically weaker morpheme, e.g., a non-syllabic sign” (Tauli 1945/49: 83). I disregarded this parameter, too, because I find the postulated link hardly convincing: would words affixed with disyllabic morphemes like -ation need to be considered more analytic than those including monosyllabic affixes like -er?
8.1 Assessment of fusion
lyticity, namely from stem-based to word-based morphology. While I would object to analyzing complex words as instances of analyticity, the parameter ‘status of base’ conveniently distinguishes between the synthetic subtypes since agglutination takes free bases as input, whereas bound bases are typical of fusion. The fourth and final parameter attends to the status of the affix. Despite being a bound morpheme by definition, an affix may be accompanied by an identical form in free occurrence; for convenience, the affix may then be labeled ‘free morpheme’. Complex words comprising affixes that also occur as free morphemes are considered more analytic than lexemes including affixes that are attested in bound use only (Tauli 1945/49). Similarly, Gabelentz (1891: 332) notes that affixes used as free morphemes are “Anzeichen einer besonders losen Agglutination” [indicators of an especially loose agglutination]. Undoubtedly, such cases approximate the agglutinating stage directed toward isolation where compounds are located, inducing Marchand (1969: 113–121) to classify them as preparticle compounds.
8.1.2 Fusion quantified in terms of four parameters
The parameters ‘base variance’, ‘affix variance’, ‘status of base’ and ‘status of affix’ are binary variables, adequate to handle the morphemic status, i.e., boundedness vs. free occurrence, but less satisfactory with regard to phonological variation. In fact, I argued above that fusion as a function of phonological processes operating on the constituent morphemes is best considered gradual, but, at the same time, I had to conclude that the application of Dalton-Puffer’s (1996: 58) scale of morphotactic transparency to historical data is doomed to fail in the absence of phonetic respelling dictionaries prior to the end of the 18th century. Instead of ranking the phonological processes with respect to the degree of fusion they entail, I therefore included all alternations without distinction. Each parameter was coded for fusion by annotating the variable with 1 to indicate base/affix variance and base/affix boundedness, whereas instances of invariance and free occurrence of base/affix were marked with 0, implying agglutination. Essentially, the criteria are independent of each other and may “occur in very different combinations” (Tauli 1945/49: 83), i.e., they do not necessarily converge on either synthetic subtype. Nevertheless, they may conspire to establish highly fused complex words like posterity as opposed to purely agglutinating lexemes such as overtime. Summing up the values of each parameter for the individual complex noun, then, enables us to calculate a fusion index for the respective word, as illustrated in Table 30 by sample nouns from the 20th century.
Table 30: Fusion index calculated for sample nouns.
            Base               Affix
            Status  Variance   Status  Variance   ∑   FI
overtime    0       0          0       0          0   1
unrest      0       0          1       0          1   2
entrance    0       1          1       0          2   3
division    0       1          1       1          3   4
posterity   1       1          1       1          4   5
The first example in Table 30, overtime (1979obs1n8b), exhibits no fusion: The complex word consists of a free base (0) and a free morpheme as affix (0); moreover, its base does not alter in free and bound usage (0), and the prefix has no allomorphic variants (0). Consequently, the penultimate column displays a total of 0 index points. The fusion index (FI) is allocated the value 1 to account for the difference between multimorphemic lexemes and monomorphemes, which are assigned fusion index 0.71 Fusion index 2 is exemplified by unrest (1967stm1n8b), for which the parameter values add up to 1 since the noun merely satisfies one fusional criterion, namely nonoccurrence of the affix un- as free morpheme. By contrast, entrance (1936dugdj7b), undergoing syncope in the base and including an affix not attested in free usage, displays fusion index 3, i.e., the sum of the parameter values increased by one index point as before. In the prefinal example, division (1989tim1n8b), three criteria are annotated with 1: Base variance is manifest in the shortened base vowel and the spirantized base final; moreover, the affix is always a bound morpheme and exhibits allomorphy. The total of the parameter values plus the additional index point yields fusion index 4 for this lexeme. The noun posterity (19xxcadmh7b), finally, exhibits the highest fusion index since all parameters adopt the value 1: A bound base can be established on the basis of two parallel forms, adjectival posterious and nominal posterior; both contain the diphthong /ɪə/, which is reduced to /ɛ/ in posterity, thereby fulfilling the criterion ‘base variance’ as well. Furthermore, the affix is attested in bound use only and has allomorphic realizations, namely -ety and -ty, so that the total parameter values, augmented by one point as usual, result in fusion index 5.
71 For ease of exposition, the labels ‘fusion index 0’, ‘fusion index 1’, ‘fusion index 2’, etc. are abbreviations used to indicate the value assigned to the fusion index for a given noun. Monomorphemic nouns, accorded fusion index 0, are included in the final step because they instantiate isolation, thereby impacting on the position of the nominal lexicon on Dixon’s clock.
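The calculation described for Table 30 can be retraced in a few lines of Python. This is a minimal sketch only: the class name `ComplexNoun` and the function `fusion_index` are illustrative labels introduced here, and the parameter values are those stated in the discussion of the five sample nouns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComplexNoun:
    """Binary coding of one complex noun (1 = fusional, 0 = agglutinating).

    The class and field names are hypothetical; only the 0/1 coding
    scheme itself is taken from the text.
    """
    form: str
    base_status: int     # 1 = bound base, 0 = free base
    base_variance: int   # 1 = base alternates between free and bound use
    affix_status: int    # 1 = affix attested in bound use only
    affix_variance: int  # 1 = affix has allomorphic realizations


def fusion_index(noun: ComplexNoun) -> int:
    """Sum the four parameters and add one point for multimorphemic status."""
    total = (noun.base_status + noun.base_variance
             + noun.affix_status + noun.affix_variance)
    return total + 1


# Monomorphemic nouns are not coded for the parameters; they receive index 0.
MONOMORPHEME_FI = 0

samples = [
    ComplexNoun("overtime", 0, 0, 0, 0),
    ComplexNoun("unrest", 0, 0, 1, 0),
    ComplexNoun("entrance", 0, 1, 1, 0),
    ComplexNoun("division", 0, 1, 1, 1),
    ComplexNoun("posterity", 1, 1, 1, 1),
]

for noun in samples:
    print(noun.form, fusion_index(noun))
```

Because every parameter carries equal weight, the index depends only on how many criteria are met, not on which ones.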
This short sketch illustrates that, despite the binarity of each parameter, different degrees of fusion can be assessed for complex words by drawing on their combined values. For quantification purposes, I attributed equal weight to the four criteria and determined that fusion index 2 represents twice as much fusion as fusion index 1; likewise, fusion index 3 reflects thrice as much fusion as fusion index 1, while fusion indices 4 and 5 indicate four and five times as much, respectively. Admittedly, this calculation is extremely crude, and quite possibly misrepresents the impact of the individual parameters, but I ask the readers’ indulgence, bearing in mind that this is probably the first attempt to quantify and correlate some of Tauli’s criteria. At any rate, the mean fusion index per century can be computed on the basis of the fusion index calculated for each noun, which allows us to trace the gradual changes in the nominal language in use. The computation of the fusion index for the complex nouns was based solely on the immediate constituents, ignoring the fusional properties of complex bases, e.g., journaliz + ing (drummond-1718), and complex compound constituents, e.g., coronation + oath (montagu-1718). The determination of the immediate constituents followed the etymological analysis given by the pertinent dictionaries; if these sources provided conflicting information, e.g., misdoer, segmented into misdo + -er by the OED but interpreted as mis- + doer by the MED, the complex noun was excluded from the investigation.72 As regards complex nouns originating from borrowing, I followed the OED’s suggestions how to analyze the lexemes from an English perspective; accordingly, the Romance loanword resemblance1 (elyot-e1-h), for instance, could be understood in terms of the verbal base resemble1 and suffixal -ance. 
If nouns were converted from members of other word classes, the fusion index was calculated for the respective base word; in this vein, the adjectival base of indispensables (collier-1835) was split into its immediate constituents in- and dispensable. Complex lexemes resulting from inflectional splits, such as clothes, were dismissed. In these instances, one of the immediate constituents is the grammatical marker, which I disregarded since the focus is on fusion of lexical morphemes. Moreover, complex nouns derived by simultaneous concatenation of more than two constituents, e.g., polycythaemia (1985lowem8b) < poly + cyte + haemia, were excluded from further consideration. Discarding lexemes along these lines as well as unclear cases resulted in the omission of 0.7% of the complex tokens on average throughout the study period.
72 As usual, the date of origin and, in case of obsolescence, the last attestation date of the immediate constituents were recorded to ensure that only transparent complex nouns were included.
The values of the parameters were set for each noun along the following guidelines. The status of the base, i.e., free occurrence vs. boundedness, was determined as described in Chapter 7. With respect to the status of the affix, I considered Germanic elements classified as prefixes but listed as locative particles by Marchand (1969: 113–121) as freely occurring morphemes, in line with their etymological origin and historical development since ME. By the same token, suffixal -ful, e.g., mouthful (goldsmith-1773), was accorded free morphemic status. Furthermore, affixes were checked individually for their occurrence in free usage. Besides its appearance as an independent word, the secreted affix had to convey the same meaning as its bound counterpart; thus, the prefix arch-, documented as adjective in the sense of ‘chief, principal, prime, preeminent’ since 1574 by the OED, was annotated as free morpheme in 16th century’s archdekyn ‘archdeacon’ (merrytal-e1-h). Additionally, the combined meaning of the freely occurring affix and the base had to approximate that of the complex noun. The adjective micro, for instance, secreted from lexemes prefixed with micro- in the 20th century, has been used in the sense of ‘small-scale, very small’. This meaning is patently evident in the semantics of microorganism (1905lewim7b) but difficult to relate to the meaning of microscope (1905croom7b); therefore, the affix was marked for free status (0) in the first and for bound status (1) in the latter. Phonological variation was determined for the constituent morphemes, ignoring material inserted at the morpheme boundary, which is understood to demarcate the morphemes rather than creating fusion between them, as discussed in Chapter 6. 
Base variance was established on the basis of the phonetic descriptions provided by the OED; in other words, PDE pronunciation was projected backwards to previous centuries regardless of phonological processes that may have affected the lexemes during the study period. Evidently, this procedure fails to record the words’ actual realization throughout the centuries, but, as explicated above, authentic pronunciation is difficult, if not impossible, to ascertain before the advent of pronouncing dictionaries at the end of the 18th century. Lacking detailed phonetic evidence for most of the study period, retrojecting the pronunciation of the late 18th century to earlier periods would have been as arbitrary as resorting to the PDE pronunciation described by the OED, which has the additional advantage of including considerably more of the studied words than its precursors. Also, the drawbacks inherent in this approach are limited since the parameter ‘base variance’ is a binary variable, merely denoting whether the base differs between its free and bound instantiations. Passing over phonetic details, the parameter is able to capture, for instance, base vowel alternations for any period because changes in vowel quality in PDE, e.g., divine > divinity, usually correspond
to shifts in vowel quantity in ME; hence, marking divinity for base variance on the basis of PDE presumably sets the correct parameter value for ME.73 In short, I compared the British English pronunciations of base and complex words suggested by the OED to ascertain whether the phonetic realization of the base differs in free and bound usage. Cases of resyllabification were disregarded since syllabification is a general process affecting all non-monosyllabic words regardless of morpheme boundaries; as such, resyllabification is not decisive for morphological fusion. Similarly, I ignored differences concerning word-final <r>, which is not realized in the non-rhotic British dialects, e.g., anchor1 /aŋkə/, so that the base in anchorage (southey-1813) was not classified as a variant despite enunciation of the base-final <r> due to resyllabification. If the dictionary did not provide a phonetic description because the lexeme had become obsolete, e.g., ostage ‘hostage2, hostel, inn’ (cmaelr3.m23), the parameter ‘base variance’ was allocated the value 0 to avoid an artificial increase in the fusion index based on mere speculation. Moreover, I opted for an invariant base if the OED recorded different pronunciations, e.g., blessedness (cmmandev.m3) transcribed as /ˈblɛsɪdnɪs/ and /ˈblɛstnɪs/, and if one alternative showed the base to be pronounced identically in bound and free usage.74 The parameter ‘affix variance’ was set for variance if the affix had allomorphic realizations, determined on the basis of the works by Dixon (2014) and Marchand (1969) with the occasional recourse to the OED; the variant forms are included in the appended affix tables (A4, A5, A10, A11). Additionally, very few lexemes exhibit affix pronunciations that deviate from the morpheme’s usual rendition. A case in point is archangel (cmpolych.m3), where the affricate is replaced by the velar plosive; for these nouns, the parameter was allocated the value 1.
For convenience, combining forms, occupying a cline from bound bases to affixes (Bauer, Lieber & Plag 2013: 486), were treated as affixes if attached to free bases, whereas instances of neoclassical compounds, i.e., concatenations of initial and final combining forms, were considered to consist of a base (initial CF) and an affix (final CF) in keeping with the formal classification. In any event, their status is bound, so that the respective variables were accorded the value 1. The parameters ‘base variance’ and ‘affix variance’ were annotated for variation if
73 That said, the retrojection of PDE pronunciation to earlier centuries very likely antedates phonetic processes like assimilation. Such antedatings, also acknowledged in the context of Romance affixation (Chapter 6), are inevitable in historical studies as broad in scope as the present one.
74 At the time of writing, the OED, constantly under revision, no longer lists alternative pronunciations for the nouns under consideration, as was the practice when the data were analyzed and classified along these lines.
the combining form was not realized in its unabridged form, listed in Appendix A12; accordingly, ‘affix variance’ was assigned the value 1 for septangles (recorde1-h) since nominal angle is premodified by the shortened variant of the initial combining form septi-1.
8.1.3 Development of fusion since 1150
Before addressing the development of fusion in general, as denoted by the fusion index, a brief survey of the four parameters is expedient. Figure 29 presents the distribution of tokens that exhibit fusion with respect to the individual criterion, i.e., the numbers of tokens for which the parameter was set to 1 in each period. For ease of comparison, token numbers for the 12th/13th century were normalized to a corpus size of 100,000 words.
Figure 29: The four fusional parameters in language use since 1150.
The illustration immediately reveals that the parameters are not equally important insofar as their instantiations in terms of token numbers diverge considerably. We might speculate that bound bases, the least realized criterion, are more difficult for language users to process than bound affixes, which are more common in language use. Hence, the impact of the diverse parameters on the speakers’ perception would differ, corroborating concerns raised previously. Numerically, boundedness of the base has been the most negligible factor throughout the study period. In line with observations about the shift from stem-based to word-based morphology in OE (e.g., Haselow 2012a), the number of tokens evincing bound bases further decreased in ME before starting to rise again
in the 16th century. The moderate increase, caused by Romance affixed words, continued until the 19th century and gathered some pace in the last period, partly due to the growing use of neoclassical compounds. Overall, the reintroduction of stem-based morphology via Romance affixed lexemes seems less relevant than might have been expected on the basis of previous claims. In contrast to base boundedness, the boundedness of affixes has been the quantitatively most important criterion during the investigation period, which is hardly surprising since affixes are essentially bound morphemes. The increase in the respective tokens in the 14th century can be traced to the rising number of Romance affixed words, whereas the number of Germanic tokens affixed with bound morphemes started to decline. The decreasing number of tokens containing Germanic bound affixes led to the overall decline of words comprising bound affixes in the 15th century. In eModE the usage of complex lexemes with bound affixes increased notably before leveling out in the 18th century; again, the increase can be attributed to Romance affixed tokens. Between base boundedness, on the one hand, and affix boundedness, on the other, the parameters ‘base variance’ and ‘affix variance’ materialize, indicated by black and gray broken lines in Figure 29. The numbers of tokens exhibiting a variant base and those displaying affix variation have developed so similarly that the broken lines are difficult to distinguish. Since both criteria are indicative of Romance affixation, it is hardly surprising that their development reflects the trend displayed by model lexemes for Romance affixation as outlined in Chapter 6 (Figure 21). While it is a well-known fact that Romance affixed words reestablished stem-based morphology and phonological variation in English (e.g., Haselow 2012b), I am not aware of previous attempts to gauge the importance of the different features.
In this respect, the quantified distribution of tokens across the four parameters, illustrated in Figure 29, is the first to specify the extent of various fusion aspects that were restored in language use. Though interesting in itself, this brief survey should suffice since the primary purpose of this section is to assess the overall extent of fusion surfacing in usage. To this end, the parameter values were pooled for each token, including nonfusional complex lexemes like compounds with values of 0. After calculating the fusion index for each complex word as described before, I computed the mean fusion index per century by multiplying the number of tokens manifesting a given fusion index by this index value, adding up the resulting products and dividing the sum total by the total number of complex tokens. Table 31 provides an overview of the distribution of tokens across fusion indices 1 to 5 and the mean fusion index in each period.
Table 31: Development of fusion based on the computed fusion index per century.
[Table data not reproduced. Columns: 12th/13th, 14th, 15th, 16th, 17th, 18th, 19th and 20th century. Rows: number of tokens for FI 1, FI 2, FI 3, FI 4 and FI 5; mean FI; SD; CV.]
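The mean fusion index per century is a weighted average over the fusion index values. The following sketch shows the arithmetic; the function name `mean_fusion_index` and all token counts are hypothetical placeholders chosen for illustration, not figures from Table 31.

```python
def mean_fusion_index(token_counts: dict[int, int]) -> float:
    """Weighted mean: sum of (index value x token count) / total tokens.

    `token_counts` maps a fusion index value to the number of tokens
    that carry it in one period.
    """
    total_tokens = sum(token_counts.values())
    if total_tokens == 0:
        raise ValueError("no tokens to average over")
    weighted_sum = sum(fi * n for fi, n in token_counts.items())
    return weighted_sum / total_tokens


# Hypothetical counts for one period (complex nouns only, FI 1 to 5).
complex_counts = {1: 200, 2: 500, 3: 200, 4: 80, 5: 20}
print(mean_fusion_index(complex_counts))

# Adding monomorphemic tokens (FI 0) enlarges the denominator only,
# so the mean drops sharply when monomorphemes dominate.
all_counts = {0: 3000, **complex_counts}
print(mean_fusion_index(all_counts))
```

The second call illustrates why including monomorphemes lowers the mean: they contribute nothing to the numerator.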
A Welch ANOVA with ‘century’ as between-group variable, performed on data normalized to a sample size of 10,000 tokens per period, shows the mean fusion index to vary significantly over the whole study period, the effect size attesting to a small to medium effect (Fw (7, 1318) = 19.63, p = .000, ω2 = .04).75 The development of the mean fusion index indicates that language users have slowly but steadily employed more fused complex words. Apart from the significant increase in the 14th century with a nearly medium effect size (d = .41), most differences between adjacent periods turn out to be statistically non-significant, except for the growth in the 18th century and the decline in the 20th century, both manifesting, however, an effect size less than small. In conclusion, this short study confirms that fusion in language use has grown: At the beginning of the study period, speakers used considerably more complex words composed of free morphemes (FI 1) than in later centuries. Still, the majority of complex tokens in all periods exhibit a moderate degree of fusion (FI 2), whereas the proportion of strongly fused words in usage (FI 5) has always been negligible. Having established the fusion index for complex words, we can now take into account the entire nominal language in use, including monomorphemic words (FI 0), in order to determine its typological status in the centuries since 1150.
75 Since Levene’s test demonstrated a violation of homogeneity of variance (F (7, 3227) = 44.44, p = .000), the ANOVA and t-tests were carried out with Welch’s correction.
8.2 The slow move toward fusion: Four minutes in 850 years
Incorporating the monomorphemic tokens into the computation reduces the mean fusion index considerably due to the large proportion of monomorphemes, constituting more than 75% of the tokens on average across all periods. Table 32 provides an overview of the distribution of all tokens, except for those excluded (see above), according to the fusion index realized in each century; for convenience, the token numbers for fusion indices 1 to 5 are repeated from Table 31.
Table 32: Mean fusion index of the nominal usage data per century.
[Table data not reproduced. Columns: 12th/13th, 14th, 15th, 16th, 17th, 18th, 19th and 20th century. Rows: number of tokens for FI 0, FI 1, FI 2, FI 3, FI 4 and FI 5; mean FI; SD; CV.]
Throughout the centuries, the mean fusion index has remained well below 1, although the reductions in ME have been followed by continuous increases since the 16th century. An ANOVA, carried out on data normalized to 10,000 token samples, reveals significant variation over the whole study period, albeit with only a small effect (Fw (7, 5859) = 34.03, p = .000, ω2 = .02).76 Apart from the decrease in the 14th century and the small gain in the penultimate period, all differences between adjacent periods prove statistically significant, but even the largest increases, observed in the 16th and 17th centuries, evince effect sizes that fail to reach the threshold for small effects. Thus, the nominal language in use has become slightly more fusional in a slow but steady manner since eModE. Based on their mean fusion index, the nominal usage data for each century can be located on Dixon’s clock in order to depict the typological development since 1150. To this end, I consider the six fusion indices to correspond to 60 minutes on the clock face; hence, the fusion indices 0 to 5 can be arranged clockwise with
76 As before, the ANOVA and t-tests were performed with Welch’s correction because variances were unequal according to Levene’s test (F (7, 13695) = 113.08, p = .000).
ten-minute intervals, starting at four o’clock with the isolating type as proposed by Dixon (1997: 42). For illustrative purposes, Figure 30 presents the position of the fusion indices along with the respective sample nouns listed in Table 30 as well as monomorphemic anniversary (1905palln7b).
Figure 30: Fusion indices 0 to 5 arranged on Dixon’s clock.
The arrangement visualized in Figure 30 is perfectly suited to capture the gradience in language with respect to the typological subtypes: While anniversary (FI 0), unrest (FI 2) and division (FI 4) represent typical specimens of the isolating, agglutinating and fusional language types, respectively, less typical examples such as overtime (FI 1), entrance (FI 3) and posterity (FI 5) can plausibly be placed between the quintessential types. Accordingly, complex words that display no fusion (FI 1), such as compounds, are aptly located halfway between isolation and agglutination since their constituent morphemes largely retain their autonomy. Similarly, nouns satisfying two fusion criteria (FI 3) are well positioned between agglutination and fusion, whereas the morphemes of complex tokens such as posterity (FI 5) are so strongly fused that the words might easily be interpreted as instances of the isolating type, thereby motivating their position between fusion and isolation. This short illustration elucidates the rationale underlying the ordering of the fusion indices on Dixon’s clock; still, our primary concern is not the location of individual words but the position of the entire nominal usage data in each period. To this effect, the mean fusion indices per century, listed in Table 32, were transferred onto the clock face, as in the following example: Starting from four o’clock with a fusion index value of 0, the mean fusion index of the 12th/13th century, equaling 0.43, translates into four minutes past four o’clock. (For expository reasons, the figures were rounded to the nearest minute.) Along these lines, the data points for each period were situated around the clock face as depicted by Figure 31.
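The transfer of a fusion index onto the clock face can be sketched as follows. The sketch assumes that ‘four o’clock’ denotes a position on the dial, i.e., the 20-minute mark, and that each fusion index unit spans ten minute marks clockwise; the function and constant names are illustrative only.

```python
ISOLATING_MARK = 20   # four o'clock on the dial, expressed as a minute mark
MINUTES_PER_FI = 10   # six index values (0 to 5) share the 60 minute marks


def minutes_past_four(mean_fi: float) -> int:
    """Distance from the isolating position, rounded to the nearest minute."""
    return round(MINUTES_PER_FI * mean_fi)


def dial_minute(fi: float) -> float:
    """Minute mark on the dial (0 to 59) for a given fusion index."""
    return (ISOLATING_MARK + MINUTES_PER_FI * fi) % 60


def hour_position(fi: float) -> float:
    """The same position read off the hour scale (12 instead of 0)."""
    hour = dial_minute(fi) / 5
    return 12.0 if hour == 0 else hour


# The 12th/13th-century mean of 0.43 lies four minutes past four o'clock.
print(minutes_past_four(0.43))

# Whole index values fall on the even hour positions of the dial.
for fi in range(6):
    print(fi, hour_position(fi))
```

Under these assumptions, index 2 falls on eight o’clock (agglutinating) and index 4 on twelve o’clock (fusional), which matches the arrangement described for Figure 30.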
Figure 31: Typological stages of the nominal usage data since 1150.
Besides illustrating the development of the nominal part of the English language, Figure 31 graphically illuminates the perhaps most surprising result of the foregoing analysis, namely that the nominal lexicon in use has been a typological mixture of isolation and agglutination with a strong preponderance of isolating elements. Though developing toward the more agglutinating type, the language did not even reach the stage halfway between quintessential agglutination and isolation in the final period. This approach, then, provides insights impossible to obtain by any of the various indices proposed by Greenberg (1960). Occupying positions around five o’clock, English proceeded beyond the stages assumed for Proto-IE descendants by Dixon (1997: 42), who locates them between one and possibly three o’clock, thus between fusional and isolating language types. As to the direction of the development, Figure 31 displays the constant progression toward the agglutinating type, although the language appears to have been stuck during the ME period. In any case, the usage data do not corroborate claims that English seems to hasten toward a purely isolating system as maintained by Gabelentz (1891: 252); rather, the language has been crawling in the opposite direction. Since typological judgments like those expressed by Dixon (1997), Gabelentz (1891) and others are usually based on observations about the grammar, conclusions drawn from evidence in the lexical domain may well differ. At this point,
however, I would not preclude the possibility that the above postulations about the typological profile and development of the English language – or its grammar, for that matter – are purely impressionistic claims, which may not stand up to empirical scrutiny. Before addressing this issue in the next chapter, I would like to briefly comment on the rate of the typological development of the English lexicon. As apparent in Figure 31, the language moved four minutes toward agglutination during the study period, which translates into an average of half a minute per century. To get a sense of the pace, this figure may be compared with the development of Egyptian: According to Hodge (1970), the language is supposed to have rounded the clock in 3,000 years, which results in two minutes per century on average. Viewed from this perspective, Egyptian progressed four times faster than English. Even if we allow for different rates in grammar (Egyptian) and lexicon (English), the considerable difference in speed between the languages suggests that English has been slow to change its typological profile.
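The pace comparison rests on simple division, retraced here as a minimal sketch. The figures of four minutes in 850 years and a full round of 60 minutes in 3,000 years come from the text; the function name is an illustrative label.

```python
def minutes_per_century(minutes: float, years: float) -> float:
    """Average movement on the clock face per hundred years."""
    return minutes / (years / 100)


english = minutes_per_century(4, 850)     # about 0.47, i.e., roughly half a minute
egyptian = minutes_per_century(60, 3000)  # a full round of the clock: 2.0

print(round(english, 2), egyptian)

# With the English rate rounded to half a minute, Egyptian is four times faster;
# with the unrounded rate the factor is slightly higher.
print(egyptian / 0.5, round(egyptian / english, 2))
```

The factor of four thus depends on rounding the English rate to half a minute per century; the unrounded figures yield 4.25.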
9 Changes in syntheticity and analyticity
While the previous chapters have focused on changes within the nominal domain exclusively, the present chapter relates the development of nouns toward more analyticity and/or syntheticity to general language use, utilizing the syntheticity and analyticity indices introduced by Szmrecsanyi (2012, 2016). As mentioned in Chapter 2, Szmrecsanyi adopts the Greenbergian synthetic index in a slightly modified fashion to assess the extent of syntheticity and analyticity in grammatical coding evident in language use. Accordingly, the syntheticity index is calculated by dividing all word tokens with a bound grammatical morpheme in a given text by the token total of that text; the ratio is then normalized to a sample size of 1,000 tokens. Likewise, the analyticity index is computed as the ratio of all free grammatical morphemes (function words) in a document to the token total of that document, multiplied by 1,000 (Szmrecsanyi 2016: 97). Szmrecsanyi’s work is extremely beneficial for the present study. For a start, the proposed indices are easily applicable to the nominal data by inserting the number of complex (synthetic) nouns into the formula for the syntheticity index, on the one hand, and incorporating the number of monomorphemic (analytic) nouns into the calculation of the analyticity index, on the other. Moreover, the index values computed for the nominal area can be directly compared with the results obtained for the grammatical domain by Szmrecsanyi as both investigations are based on the Penn Parsed Corpora of Historical English series, applying the same subdivisions into centuries.77 Capitalizing on this circumstance, the syntheticity and analyticity indices of the nominal data (henceforth ‘lexical data’) are immediately compared with the figures presented by Szmrecsanyi (2012, 2016) in the following exposition.
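Both indices share one formula: a count divided by the token total of the text, scaled to 1,000 tokens. The sketch below applies it to the lexical data as described above; the function name and the counts for the sample text are hypothetical and serve only to show the arithmetic.

```python
def index_per_thousand(count: int, token_total: int) -> float:
    """Occurrences of a category per 1,000 word tokens of a text."""
    if token_total <= 0:
        raise ValueError("token total must be positive")
    return 1000 * count / token_total


# Hypothetical text: 5,000 word tokens, 200 complex nouns,
# 900 monomorphemic nouns.
token_total = 5000
complex_nouns = 200
monomorphemic_nouns = 900

lexical_si = index_per_thousand(complex_nouns, token_total)        # syntheticity
lexical_ai = index_per_thousand(monomorphemic_nouns, token_total)  # analyticity

print(lexical_si, lexical_ai)
```

Since the denominator is the token total of the whole text rather than the number of nouns, the two lexical indices do not add up to 1,000.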
The development of the mean syntheticity index in the domains of grammar and (nominal) lexicon is depicted in Figure 32; the mean values of the grammatical syntheticity and analyticity indices from the 12th to the 20th centuries are drawn from Szmrecsanyi (2016: 101), for the mean lexical SI in each century see Table 33 below. The illustration exhibits that the grammatical syntheticity index has been consistently higher than its lexical counterpart. Szmrecsanyi (2012: 658) reports an overall mean syntheticity index of 148 index points across all periods, whereas
77 That said, the databases differ insofar as Szmrecsanyi (2012, 2016) included all 605 texts from 1150 to 1913, while I ignored supplements to the original eModE Helsinki data; further, I disregarded the mere five files ranging from 1901 to 1913 but included 90 texts from the ARCHER to represent the 20th century (see Chapter 4).
https://doi.org/10.1515/9783111317717-012
the respective index generated by the lexical data merely amounts to 40 index points (standard deviation 20.33, coefficient of variation 0.51); for a detailed listing of the mean and dispersion measures per century see Table 33.78 On the whole, the lexical syntheticity index has varied significantly over the study period, as indicated by a Welch ANOVA (Fw (8, 37) = 10.53, p = .000, ω2 = .16); also, the effect size ω2 suggests a large effect of the variable ‘century’.79
Figure 32: Development of the grammatical and lexical syntheticity indices.
Still, the diachronic development appears to have been steady but slow as most differences between adjacent centuries do not prove statistically significant. During ME the syntheticity index generally decreased, but only the difference between the 13th and 15th centuries turns out to be significant, reaching a medium effect size (d = .59). Since the beginning of eModE the lexical syntheticity index has constantly risen. While the difference between the 15th and 16th centuries is marginally significant at the .052 level, the development gained momentum in the following two periods, marked by significant changes from the 16th to the 17th century (d = .33) and from the 17th to the 18th century (d = .45) with small to medium effect sizes, before slowing down in the final two periods.
78 Following Szmrecsanyi (2012), computations of dispersion measures and statistical tests (ANOVAs, t-tests) for syntheticity and analyticity are based on the respective index values calculated for each data file separately (n = 389).
79 Again, ANOVAs and t-tests were performed with Welch’s correction since Levene’s test showed that the variances between files violate the homogeneity assumption (F(8, 380) = 12.14, p < .001).
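For readers unfamiliar with Welch's correction, the following sketch shows how the Welch F statistic and its degrees of freedom can be computed from per-file index values grouped by century. It uses the standard textbook formula and is not a reconstruction of the software actually used in the study; the sample values are hypothetical, and the p-value is omitted because it requires the F distribution, which the Python standard library does not provide.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA.

    `groups` is a list of samples, each with at least two values and
    non-zero variance (e.g. per-file index values for one century).
    """
    k = len(groups)
    if k < 2:
        raise ValueError("need at least two groups")
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Each group is weighted by n / s^2, so noisy groups count less.
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_weight = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_weight

    between = sum(
        w * (m - weighted_mean) ** 2 for w, m in zip(weights, means)
    ) / (k - 1)
    lam = sum(
        (1 - w / total_weight) ** 2 / (n - 1) for w, n in zip(weights, sizes)
    )
    f_stat = between / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2


# Hypothetical index values for three centuries (not taken from the study).
samples = [
    [35.0, 42.0, 38.0, 51.0],
    [28.0, 30.0, 33.0],
    [44.0, 61.0, 39.0, 58.0, 47.0],
]
f_stat, df1, df2 = welch_anova(samples)
print(f"Fw({df1}, {df2:.1f}) = {f_stat:.2f}")
```

Because the denominator degrees of freedom depend on the group variances rather than on the sample size alone, a value such as 37 is compatible with a total of 389 files.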
The development of the lexical syntheticity index parallels the tendency observed for the respective index in the grammatical domain. Szmrecsanyi (2012) notes a significant decrease from the first to the second period, but subsequent differences between adjacent centuries seem to have been non-significant.80 Overall, however, the grammatical syntheticity index appears to have reached its nadir in the 15th and 16th centuries with 140 and 141 index points, respectively, before slowly increasing afterwards; the difference between the 17th and the 20th centuries is reported to be significant at the .011 level (Szmrecsanyi 2012: 658). In sum, the grammatical and lexical domains exhibit similar developments with regard to syntheticity, which is also corroborated by a global chi-square test revealing no significant difference in the distributions of the grammatical and lexical syntheticity index values across the centuries (χ²(8) = 11.77, p = .162).
Table 33: Syntheticity index: Mean and dispersion across texts per century.
[Table body: one row per century (12th to 20th); columns: No of texts, Mean SI, Minimum, Maximum, SD, CV. The cell values did not survive extraction.]
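The global chi-square comparison mentioned above can be sketched as follows. The sketch assumes that the test was run on a 2 × 9 table with one row per domain and one column per century, which the text does not spell out; all values below are hypothetical placeholders rather than the study's figures.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected value under independence of domain and century.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical mean index values per century (12th to 20th).
grammatical = [160, 152, 147, 142, 143, 145, 148, 151, 153]
lexical = [45, 36, 33, 30, 34, 40, 46, 48, 50]

stat, df = chi_square_statistic([grammatical, lexical])
print(f"chi-square({df}) = {stat:.2f}")
```

With two domains and nine centuries the test has (2 − 1) × (9 − 1) = 8 degrees of freedom, which matches the value reported in the text.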
Turning to analyticity, Figure 33 outlines the developments of the mean analyticity index in the grammatical and lexical areas during the study period. As before, the grammatical index is invariably higher than the index values calculated for the nominal data; more precisely, the overall mean indices of syntheticity and analyticity are 3.6 and 3.7 times higher in grammar than in lexical usage, suggesting that grammatical expressions, whether bound or free morphemes, generally occur 3.6 to 3.7 times more frequently than (nominal) lexemes. The grammatical mean analyticity index across all centuries amounts to 471 index points (Szmrecsanyi 2012: 657), whereas its lexical counterpart equals 130 points (standard deviation 27.37, coefficient of variation 0.21); mean and dispersion measures per century are detailed in Table 34. Variation of the lexical analyticity
80 Although I present the index values for the 12th and 13th centuries separately, following Szmrecsanyi’s (2012, 2016) period distinction, I am rather skeptical about the reliability of the results thus obtained due to the skewed distribution of the data with the 12th century comprising no more than three texts. For this reason, I merely relate the respective figures without further comment.
index over the entire investigation period proves significant with a medium to large effect size according to a Welch ANOVA (Fw(8, 37) = 6.60, p < .001, ω² = .10).81
Figure 33: Development of the grammatical and lexical analyticity indices.
Diachronically, the mean index has developed gradually without dramatic changes between adjacent periods, except for the 14th and 20th centuries. The development broadly displays the mirror image of that observed for the syntheticity index: Initially, the analyticity index increased, accentuated by a significant difference between the 13th and 14th centuries manifesting a large effect (d = 1.24), but started to slowly decline in the 16th century. The index remained fairly stable in the 16th to 19th centuries as demonstrated by an ANOVA (Fw (3, 126) = 1.11, p = .348), before significantly decreasing in the final period with an almost large effect size (d = .78). Comparison of the developments of the lexical and grammatical analyticity indices reveals parallel tendencies since 1150. Szmrecsanyi (2012: 658), too, observes an increase in the grammatical analyticity index in the 14th century, significant at the .002 level, followed by a period of equilibrium until the 17th century, when the index started to steadily decrease. Although no significant changes between adjacent centuries seem to have occurred, the difference between the 17th and 20th centuries is reported to be significant at the .012 level (Szmrecsanyi 2012: 658), which corresponds to the discrepancy between those periods evident in the lexical domain (p = .001). Hence, the lexical and grammatical analyticity indices have followed a remarkably similar course; again, this assessment is supported
81 ANOVAs and t-tests were computed with Welch’s correction due to heterogeneous variances of the samples (Levene’s F(8, 380) = 6.67, p < .001).
by a global chi-square test, confirming that the distributions of the grammatical and lexical analyticity index values across the centuries do not differ significantly (χ²(8) = 2.71, p = .951).
Table 34: Analyticity index: Mean and dispersion across texts per century.
[Table body: one row per century (12th to 20th); columns: No of texts, Mean AI, Minimum, Maximum, SD, CV. The cell values did not survive extraction.]
Apart from the surprisingly parallel diachronic trajectories of the syntheticity and analyticity indices in the grammatical and lexical domains, both areas show striking similarities on a general level. As noted above, the mean analyticity index across all periods in grammar and lexicon amounts to 471 and 130 points, respectively, whereas the overall mean syntheticity index in grammar and lexicon equals 148 and 40 index points, respectively. Relating the two indices to each other reveals that the ratio between analyticity and syntheticity is virtually identical in both domains: The mean analyticity index is around 3.2 times higher than the mean syntheticity index – in other words, in grammatical as well as lexical usage data, analytic items occur more than three times as often as synthetic words. Moreover, the dispersion of the syntheticity and analyticity indices per text around the respective means across all centuries is comparable in both areas. The standard deviation from the mean analyticity and syntheticity indices in the grammatical domain equals 32.8 and 30.2 index points, respectively (Szmrecsanyi 2012: 658); similarly, the standard deviation from the mean lexical analyticity and syntheticity indices, amounting to 27.37 and 20.33 points, respectively, suggests more variation in analyticity than in syntheticity in terms of absolute deviation. However, the relative deviation, adjusted for the mean values, exhibits the opposite: The coefficient of variation of the lexical and grammatical analyticity indices, equaling 0.21 and 0.07, respectively, is lower than that of the lexical and grammatical syntheticity indices, reaching 0.51 and 0.20, respectively.82 In
82 For the grammatical domain, I computed the coefficients of variation on the basis of the values of overall mean syntheticity and analyticity indices and standard deviations reported by Szmrecsanyi (2012: 657–658).
brief, the overall text frequency of analytic elements is not only higher but also more stable across text files than that of synthetic items in both domains. Besides systematically evaluating the extent of analyticity for the first time, the great merit of Szmrecsanyi’s (2012, 2016) expositions is the innovative proposal for correlating the syntheticity and analyticity indices to establish the status of the language in terms of analyticity and syntheticity in grammatical expression. For each century, both index values are combined into a two-dimensional matrix, positioning the language in use relative to its grammatical analyticity and syntheticity in each period and enabling us to trace historical developments in this respect. Following Szmrecsanyi (2012, 2016), the mean index values calculated for the lexical data in each period are plotted in the left-hand diagram in Figure 34, where the mean syntheticity index corresponds to the x-coordinate and the mean analyticity index is graphed along the y-axis. In contrast to Figures 32 and 33, the data point for the 12th century is omitted since I consider the respective indices, based on a mere three texts, too unreliable to provide a realistic picture of the status of the language in the earliest period; therefore, the illustration starts with the coordinates for the 13th century (36, 121).
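The overall ratios and coefficients of variation compared earlier in this chapter follow directly from the reported means and standard deviations. The sketch below recomputes them from the figures quoted in the running text; since those means are rounded, the results are approximations, and the helper name `summarize` is merely illustrative.

```python
def summarize(ai_mean, ai_sd, si_mean, si_sd):
    """Return (AI/SI ratio, CV of AI, CV of SI) from overall means and SDs."""
    return ai_mean / si_mean, ai_sd / ai_mean, si_sd / si_mean


# Overall means and standard deviations as quoted in the running text:
# (mean AI, SD of AI, mean SI, SD of SI).
reported = {
    "grammatical": (471, 32.8, 148, 30.2),
    "lexical": (130, 27.37, 40, 20.33),
}

for domain, figures in reported.items():
    ratio, cv_ai, cv_si = summarize(*figures)
    print(
        f"{domain}: AI/SI = {ratio:.2f}, "
        f"CV(AI) = {cv_ai:.2f}, CV(SI) = {cv_si:.2f}"
    )
```

The coefficient of variation is the standard deviation divided by the mean, so it is a dimensionless figure rather than a number of index points.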
Figure 34: Development of the coordinated mean lexical and grammatical indices.
The left-hand chart in Figure 34 summarily illustrates the development of syntheticity and analyticity in the usage data as a result of frequency changes in the use of synthetic and analytic nouns. While analyticity rose during ME, coinciding with minor decreases in syntheticity, it started to slowly decline in the following periods, attended by gains in syntheticity, especially strong in eModE. The depiction unveils a historical trajectory that is not discernible in the one-dimensional representations provided in the previous chapters, where development is gauged in terms of more or less syntheticity along a single axis. Integrating analyticity
into the survey, we observe a kind of semicircular movement from the 13th to the 20th centuries, which may even complete the full circle if data from OE periods were included in the analysis. This is the key proposition advanced by Szmrecsanyi (2012, 2016): Based on the distribution of free and bound grammatical elements in the 12th century, he concludes that the development has almost come full circle as illustrated by Figure 3 (Chapter 2). Although this suggestion seems extremely probable since OE is considered “a highly synthetic inflecting language” (Lass 1992: 94), more textual evidence would be needed to reliably determine the degree of grammatical syntheticity and analyticity in the 12th century. Preferring a more cautious approach, I took the liberty of reproducing Szmrecsanyi’s (2016) illustration without the data point for the 12th century, depicted as the right-hand diagram in Figure 34, in order to establish grounds for comparison. The resulting charts in Figure 34 show fairly similar developments of the syntheticity–analyticity coordinates, which is hardly surprising given the parallel trends observed above for both indices in the grammatical and lexical domains. Taken together, the depictions vividly underscore that the coding of grammatical information and the expression of lexical content have developed along parallel paths that appear to have formed a semicircle within the syntheticity–analyticity spectrum.
10 Typological shifts in lexical structure and word formation
The preceding chapters, highlighting different aspects of the typological profile of the nominal lexicon in use, produce convergent results with respect to the overall development. In summary, the extent of syntheticity decreased in ME but has steadily risen again since the 16th century. This trend is reflected by the Greenbergian synthetic index, by the mean fusion index computed for complex and simplex nouns and by the syntheticity index adapted from Szmrecsanyi. While the synthetic and fusion indices disclose the degree of analyticity only indirectly, the means being based on the aggregate frequencies of monomorphemic and complex tokens, the syntheticity index, established on the number of synthetic words, is complemented by an index that explicitly addresses the extent of analyticity in the data. The development of this analyticity index displays the reverse image of the trend observed for the syntheticity index; hence, we may conclude that changes in syntheticity correlate with shifts in analyticity rather than with shifts in synthetic components such as the number of morphemes or the degree of fusion in complex words. In fact, the overall decrease in syntheticity in ME is mirrored neither by the number of morphemes per complex token, which increased in the 14th and 15th centuries, nor by the development of the mean fusion index calculated for synthetic nouns. Treating syntheticity and analyticity as discrete phenomena enables us to evaluate their respective impact on the typological profile of the language. As mentioned previously, the overall mean analyticity index is 3.2 times larger than the overall mean syntheticity index, i.e., more than three quarters of the nouns in use have been structurally simple, while the proportion of morphologically complex nominal tokens has not reached 25% on average across the study period.
The strong preponderance of analytic tokens, then, accounts for the location of the nominal language on Dixon’s clock: Despite increases in syntheticity, described in terms of fusion in that context, the nominal lexicon in use has not even reached the stage midway between the quintessentially isolating and agglutinating subtypes in the 20th century. These observations, finally, tie in with the general impression conveyed by the Greenbergian synthetic index with values lying within the range stipulated for analytic language types. In short, the nominal word-stock in use has been largely analytic at all times, although its synthetic proportion has never been negligible and has gradually grown since the beginning of eModE. We are now in a position to address the question raised at the end of Part 2 of the book, namely whether language users have extended their lexicon by choosing techniques in line with the typological profile of the nominal language. https://doi.org/10.1515/9783111317717-013
On the whole, changes in the distribution of tokens instantiating new additions to the word-stock seem to reflect the shifts in the structural makeup of the lexicon in use. The trend toward more analyticity in ME is paralleled by especially high numbers of tokens realizing new analytic types in the first two periods; similarly, the development toward increasing syntheticity since the end of ME coincides with the growing numbers of tokens actualizing new types derived by agglutinating and fusional processes since the 16th century, as displayed in Figure 13 (Chapter 5). In brief, the typological developments on the token level appear to have followed highly similar trajectories, but we need to consider the possibility that this similarity is merely superficial, offering no insight into the correspondence between language type and speakers’ choices, since usage frequencies of new lexical additions are not truly indicative of language users’ creativity. Therefore, the preliminary conclusions based on token frequencies need to be substantiated with the focus turning to the distributions of analytic and synthetic types. To this end, Figure 35 reproduces the presentation of newly added types according to typological techniques from Part 2 (Figure 24), juxtaposed with the respective type distribution in the lexicon. To allow for comparison on a more specific level, the nominal types manifest in language use were classified for the typological subtypes they realize on the basis of the fusion indices. As described in Chapter 8, fusion index 0 denotes the isolating type, whereas fusion indices 1 and 2 correspond to the agglutinating subtype and fusion indices 3 to 5 represent the fusional subtype.83 Finally, the type totals of the three subtypes in ME, eModE and lModE were correlated to provide the relative distributions, displayed in the right-hand diagram in Figure 35. 
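The mapping from fusion indices to typological subtypes described above, together with the resulting relative distribution, can be sketched as follows. The function names and the sample list of fusion indices are hypothetical; only the index ranges are taken from the text.

```python
from collections import Counter

SUBTYPES = ("isolating", "agglutinating", "fusional")


def subtype(fusion_index: int) -> str:
    """Map a fusion index (0 to 5) to its typological subtype."""
    if fusion_index == 0:
        return "isolating"
    if fusion_index in (1, 2):
        return "agglutinating"
    if fusion_index in (3, 4, 5):
        return "fusional"
    raise ValueError(f"fusion index out of range: {fusion_index}")


def relative_distribution(fusion_indices):
    """Return the share of each subtype among the given noun types."""
    if not fusion_indices:
        raise ValueError("no types given")
    counts = Counter(subtype(fi) for fi in fusion_indices)
    total = sum(counts.values())
    return {name: counts[name] / total for name in SUBTYPES}


# Hypothetical fusion indices for eight noun types of one period.
period_types = [0, 0, 1, 2, 3, 0, 5, 0]
print(relative_distribution(period_types))
```

Applying the second function to the type inventories of ME, eModE and lModE separately would yield one relative distribution per period.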
In view of the above remarks on the typological makeup of the nominal word-stock, the large proportion of the isolating subtype in all historical epochs is hardly surprising. Although the extent of analyticity in the lexicon was previously determined on the basis of usage frequencies, which seem to have been relatively high for isolating lexemes considering the TTRs provided in Table 28 (Chapter 6), a similar picture emerges on the type plane. Even though the percentage of types actualizing the isolating subtype has decreased by 9 points during the study period, the lexicon has remained largely isolating.
83 With respect to the agglutinating subtype, I might add that fusion index 1 applies to compound or compound-like words, whereas fusion index 2 relates to quintessentially agglutinative lexemes, i.e., free bases affixed with a bound morpheme. Although fusion index 2 can, in principle, result from any of the four parameters set for fusion, in actual fact 99.9% of the respective types exhibit affix boundedness.
Figure 35: Relative type distribution by typological technique/subtype in new additions and the nominal lexicon.
The second most important subtype in terms of type frequencies is the agglutinating one. Constituting roughly a third of all nouns throughout the investigation period, the proportion of agglutinative types has always represented a substantial part of the lexicon despite decreases in eModE. By contrast, the impact of fusional types on the typological profile of the word-stock has been limited. While the proportion of nouns manifesting the fusional subtype doubled from ME to eModE, subsequent increases seem to have been modest; overall, the fusional component of the lexicon has only been half as large as the agglutinating part. Turning to the typological techniques used to extend the nominal lexicon, evident in the left-hand diagram in Figure 35, we observe a basic symmetry between the distribution of new types and the lexical structure in ME. In apparent consonance with the typological makeup of the word-stock, ME speakers predominantly employed isolating methods but also strongly relied on agglutinating processes while avoiding newly introduced fusional techniques to form new nouns. In the subsequent periods, the fairly close correspondence between the nominal lexicon and new additions seems to have weakened, at least quantitatively. However, the quantitative approach adopted here may be misleading in assessing whether language users’ behavior may have complied with the typological characteristics of the lexicon in a given period. A case in point is the high proportion of isolating techniques used by ME speakers to enlarge their nominal inventory, which coincides with an equally large proportion of the isolating subtype in the lexicon, suggesting that language users may have followed the trend manifest
in the word-stock. Still, closer inspection of the different isolating methods revealed that speakers were mostly inspired by language-external factors, namely contact with Romance, as they primarily resorted to borrowing. The typological structure of the lexicon may well have fostered the absorption of loanwords, as discussed by Sapir (1921: 208), but it has evidently not prompted language users to exploit internally motivated isolating techniques as few nouns were derived by conversion. Consequently, the extensive recourse to isolating strategies, chiefly prompted by external factors, and the high degree of analyticity in ME are best considered coincidental. By contrast, the utilization of agglutinating and fusional techniques might seem to have been motivated by the distribution of the synthetic subtypes in the word-stock. The widespread use of agglutinating strategies to form new nouns is in line with the substantial proportion of agglutinative types in the lexicon; likewise, the fairly limited application of fusional methods in noun formation correlates with the small fraction of the fusional subtype evident in the nominal word-stock. With respect to synthetic processes, we might thus construe a link between the typological profile of the nominal lexicon and the typological techniques chosen by language users to coin new nouns. Still, Figure 35 displays discrepancies between the lexicon and new additions in the relative distribution of agglutinating and fusional nouns. While the spread of the fusional and agglutinating subtypes stabilized at a ratio of 1:2 in eModE, the number of new derivations by fusional techniques was considerably lower and further declined in lModE, whereas agglutinating processes have been excessively exploited.
Obviously, language users have avoided fusional methods more strongly than suggested by the proportion of the fusional subtype in the lexicon; instead, they have taken recourse to agglutinating techniques, which have been firmly anchored in the language since early ME. Their fairly conservative behavior in this regard, accompanied by the strong reliance on old lexemes noted in Chapter 5 (see Figure 6), translates into the slow pace of the development of the nominal lexicon revealed by Dixon’s clock. In sum, the comparison of the typological profile of the word-stock and the typological techniques used to extend the lexicon shows clear correspondences between both areas, despite differences in numerical ratios. On the level of tokens, the development toward increasing analyticity in ME and the reversed trend toward growing syntheticity since eModE is manifest in the word-stock and in the means of lexicon extension. On the type plane, a broadly similar picture emerges with respect to the overall development. While the extent of analyticity in the ME lexicon was paralleled by an abundant use of isolating techniques in that period, there are no grounds to claim that the choice of isolating methods was guided by the structural makeup of the lexicon. The use of synthetic processes, however, might have been motivated by the typological configuration of the synthetic part of the word-stock. From the diachronic perspective, the development toward an increasingly synthetic lexicon would seem to have vastly advanced the employment of agglutinating processes since eModE. Against this background, one might be tempted to conclude with Kastovsky (1992b) that the typological profile of the nominal language has influenced language users in their choice of typological techniques to enlarge their nominal inventory, but this issue is addressed more specifically in Chapter 12.
Part 4: Discussion and conclusion
11 Typological trends in English morphology and beyond
11.1 Corresponding developments in the lexicon?
Based on the findings presented in Part 3, the nominal lexicon in language use can be characterized as largely analytic given that monomorphemic nouns have been employed three times more often than complex lexemes on average across the study period. Increases in analyticity during ME times culminated in the 15th century, with analytic nouns occurring five times more frequently than their synthetic counterparts. The reversed trend toward increasing syntheticity since eModE is reflected by the constantly declining ratio of the analytic to the synthetic index, reaching its lowest point in the 20th century, with simplex nouns being used only twice as often as multimorphemic types. The growth of syntheticity has not resulted in a fundamental redistribution of the synthetic subtypes in the nominal lexicon; although the fusional part of the word-stock has undoubtedly increased, the agglutinating subtype has prevailed throughout. Within the agglutinative spectrum, the data attest to a development toward isolation, i.e., the use of combinations of free morphemes. In particular, compounds, considered hybrid between agglutination and isolation (Gabelentz 1891: 341; see also Giegerich 2009), have been progressively employed, the proportion of compound tokens to all agglutinative tokens rising from 22.8% in eModE to an average of 33.0% in the final two centuries. Moreover, the trend toward isolation, or more loosely agglutinated structure, has been advanced by the replacement of OE bound prefixes by potentially free morphemes, which, according to Bauer (2003: 38), “looks like a change from synthesis towards analysis”. Having established the typological profile of the nominal usage data, we next need to address the question whether the findings can be generalized to other lexical areas.
In the absence of comparable investigations of typological developments in the verbal and/or adjectival domains, the answer to this question remains speculative and essentially depends on the researcher’s theoretical stance. Within the structuralist framework, for instance, language is considered to consist of different parts that may, in principle, display different typological traits (Croft 2003: 46). Along the same lines, Kortmann (2012: 614) notes “that shifts in language typology . . . may be of different nature on different structural levels in different periods of English”.
https://doi.org/10.1515/9783111317717-014
194
11 Typological trends in English morphology and beyond
11.2 The verbal domain
Still, there is some evidence that the increasing analyticity in ME not only affected the nominal area but also materialized in the verbal part of the lexicon, where “[a] historical drift has occurred from prefix-stem sequences to so-called particle verbs” (Berg 2015: 154). Considering that prefixed verbs are inherently synthetic units, while phrasal verbs are periphrastic, thus analytic, constructions (e.g., Aikhenvald 2007b), the change from prefixed to particle verbs signals a move from syntheticity to analyticity. The shift appears to have started in early ME, slowly gathering pace until the 15th century, when verb–particle sequences finally replaced prefix–verb constructions in new word formations (Burnley 1992: 445; see also Bolinger 1971: xi). The emergence and proliferation of particle verbs was probably effected by the change from OV to VO word order (e.g., Traugott 1999: 247), possibly supported by the adoption of borrowed syntactic units, such as bear ill will < porter male volonté (Nielsen 2005: 113–115); for an extensive listing of phrases adopted from Romance see Prins (1959, 1960). In any event, the increasing use of phrasal verbs in the 15th century coincided with the peak of the analytic development observed for nouns. The number of particle verbs is supposed to have grown exponentially during eModE (e.g., Traugott 1999), suggesting an ever-increasing tendency toward analytic verb formation. Given the lack of studies considering developments from the 17th to 20th centuries, the idea that the analytic trend has continued unabated would not have been questioned, but, fortunately, this research gap has recently been filled by Rodríguez-Puente (2019). Analyzing the use of phrasal verbs in ARCHER and the Old Bailey Corpus from 1650 to 1990, she finds that “phrasal combinations grow steadily in frequency from 1650 to 1799, but start a slow but continuous decrease until the end of the twentieth century” (Rodríguez-Puente 2019: 176).
Although we need to allow for register-specific differences, the combined results from both corpora seem to attest to the trend reversal evident in nominal morphology. Similar to phrasal verbs, composite predicates are periphrastic structures, instantiating analyticity in the verbal domain. Also labeled ‘light verb constructions’, these configurations comprise a semantically light verb, typically do, give, have, make, take, and a deverbal noun that is formally identical and/or derivationally related to the corresponding verb. Replacing a semantically dense verb such as care with its associated composite predicate take care entails an increase in verbal analyticity, whereas the substitution of light verb constructions by their related verbs, e.g., make use > use, indicates the opposite.
The growing use of composite predicates seems to have preceded that of phrasal verbs – according to Traugott (1999), the number of light verb constructions increased exponentially in ME, thus paralleling the development in the nominal domain as indicated by the analyticity index. In eModE, the overall proliferation of composite predicates appears to have decelerated, albeit at different rates contingent on the specific light verb. Kytö (1999), studying the development of light verb constructions comprising the aforementioned verbs during the 16th and 17th centuries, notes that structures with give rose more markedly than those with make, take and have, whereas the number of composite predicates with do decreased in the 17th century. Despite the declining use of do + N constructions, however, the general development of composite predicates seems to suggest that the analytic tendency in verb usage has steadily continued, albeit at a reduced pace. As such, it would run counter to the trend observed for nouns, i.e., the decrease in analyticity, especially pronounced in the 20th century. Still, I would caution against generalizing the findings on composite predicate distributions in ME and eModE beyond 1700, not least because of the apparent trend reversal recently detected in phrasal verb use. As in the field of particle verbs, we lack studies covering the developments of light verb constructions during lModE, but this research gap is on the brink of being filled. In her corpus-based investigation, Berlage (forthcoming) traces the development of various selected composite predicates from the 16th to 20th centuries, finding that several specific light verb constructions exhibit shrinking usage frequencies since 1800, which would parallel the reversal of the analytic tendency observed for phrasal verbs. 
In short, the combined research findings suggest broadly parallel developments in the areas of phrasal verbs and composite predicates: Analyticity continuously increased in ME and eModE but started to decline in lModE, which would roughly correspond to the typological development of nouns. While these results seem encouraging, evidence is too scarce to serve as a basis for the postulation of converging trends in the verbal and nominal domains – in fact, we may even doubt that periphrastic verbs are representative of the verbal category. Although I am not aware of any investigation studying the quantitative distribution of differently structured verb types in language use, past or present, I would suspect that the proportion of periphrastic verbs to all verbs has not been large enough to substantially impact on the entire verbal lexicon. Pending studies analyzing the typological profile of the verbal domain, i.e., the distribution of monomorphemic and complex verbs, we can only speculate about the degree of analyticity and/or syntheticity in this area. With regard to diachronic developments, however, speculations would be less wild if we took into account
the major processes available for verb formation, namely borrowing, conversion, affixation and possibly compounding. For a very first impression of the distributions of these means during the investigation period, Table 35 presents the relative proportions of new verbal types according to their origin, based on figures provided by the OED.84
Table 35: Relative distribution of new verbs according to origin based on the OED.
Conversion Borrowing Affixation/compounding
th/th
th
th
th
th
th
th
th
% % %
% % %
% % %
% % %
% % %
% % %
% % %
% % %
⁸⁴ An advanced search for verbs first used in the given periods was performed in July 2023, capitalizing on the OED’s new interface which returns type numbers classified for “type of formation”. Unfortunately, the figures are far less reliable than desired, providing no more than an extremely crude approximation.

Despite possible reservations pertaining to the credibility of the dictionary’s etymologies, Table 35 outlines a development that is not entirely unexpected: The continued increase in types derived by analytic means in ME, accompanied by decreasing numbers of types formed by synthetic processes, subsided in eModE, when the total of concatenated verbs started to exceed the number of verbal types derived by borrowing and/or conversion. Thus, the typological tendencies in verb formation seem to correspond to those observed in noun formation, although their quantitative impact on the structure of the entire verbal word-stock may well be as limited as in the nominal domain. In order to determine the degree of analyticity and/or syntheticity in verbal language use, we would have to count the occurrences of tokens instantiating monomorphemic and complex verbs – a time-consuming task awaiting future research. Still, I would hypothesize that the typological shift from increasing analyticity toward growing syntheticity in the nominal lexicon, evidenced by the developments of the respective indices discussed previously, materializes in the verbal domain as well. On the one hand, the analyticity index, calculated on the basis of monomorphemes, would certainly have risen during ME, not least due to the influx of Romance verbs that were structurally unanalyzable at their inception, but became transparently morphologically complex in subsequent centuries. On the other hand, the syntheticity index, based on verbs comprising more than one morpheme, must have increased since eModE as verbs have been progressively derived by concatenative processes if the figures supplied by the OED are not completely erroneous. Moreover, it stands to reason that verbs converted from nouns, i.e., their principal source, have become increasingly complex in view of the nouns’ structural complexity and the finite stock of monomorphemic words available for conversion. In sum, the hypothesized typological trajectory of verbs in general and the empirical evidence regarding the development of periphrastic verbs in particular indicate parallel trends in the verbal and nominal domains. While I have not addressed the issue of adjectives, the deliberations on verbs should provide sufficient grounds to generalize the typological profile of the nominal data to the lexical word-stock given that verbs constitute the open word class maximally distinct from nouns. Finally, we must bear in mind the nouns’ representativeness for the lexicon in terms of quantity; recall from Chapter 1 that the rationale for focusing the study on this lexical category was its size. Undoubtedly, nouns comprise the largest word class in terms of type and token frequencies; a quick search of the OED revealed that the ratio of verbs and adjectives to nouns equals 1:1.5.⁸⁵ At the token level, the analysis of PDE data extracted from the Freiburg-LOB Corpus of British English indicated that nouns were used twice as often as verbs and occurred 2.6 times more frequently than adjectives.⁸⁶ Against this background, I take the findings obtained for the nominal data to be representative of the typological makeup of the lexical data in their entirety until further research proves otherwise.
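As a quick plausibility check, the ratios cited above can be recomputed from the raw counts reported in the accompanying notes. The following Python sketch is merely illustrative: the variable names are my own, and the arithmetic assumes that the type-level ratio sets nouns against verbs and adjectives combined.

```python
# Counts as reported in the notes to this section; the variable names are
# illustrative assumptions and not part of the study itself.
oed_types = {"nouns": 142_400, "verbs": 22_400, "adjectives": 71_700}
flob_tokens = {"nouns": 206_889, "verbs": 105_375, "adjectives": 80_040}

# Type level: nouns relative to verbs and adjectives combined (assumed reading).
type_ratio = oed_types["nouns"] / (oed_types["verbs"] + oed_types["adjectives"])

# Token level: nouns relative to verbs and to adjectives separately.
noun_per_verb = flob_tokens["nouns"] / flob_tokens["verbs"]
noun_per_adjective = flob_tokens["nouns"] / flob_tokens["adjectives"]

print(f"types, verbs and adjectives to nouns: 1:{type_ratio:.1f}")
print(f"tokens, nouns per verb: {noun_per_verb:.1f}")
print(f"tokens, nouns per adjective: {noun_per_adjective:.1f}")
```

On these counts, the type-level ratio rounds to 1:1.5, while the token-level ratios round to 2.0 and 2.6, in line with the figures given in the text.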
11.3 Parallel shifts in derivational and inflectional morphology

Broadening the scope to morphology in general, we now turn to the perhaps most impressive result of Part 3 – the remarkably parallel trends in grammar and lexicon with respect to analyticity and syntheticity. I need to stress that this insight would have been impossible without Szmrecsanyi’s (2012, 2016) important contributions detailing the typological development of English grammar until recent times.
⁸⁵ In September 2021 the following figures were retrieved via the advanced search interface: 142,400 nouns, 22,400 verbs, 71,700 adjectives. Since the search routine ignores verbs and/or adjectives included in entries for nouns, and vice versa, the numbers provide a rough approximation only.
⁸⁶ An automatic corpus search for tokens tagged as common nouns, lexical verbs and adjectives produced the following results: 206,889 nouns, 105,375 verbs, 80,040 adjectives.
Prior to his investigation, we could only rely on received wisdom, which contends an ever-increasing analyticity in grammatical morphology since OE, based on observations about the reduction of inflectional affixes in the verbal, adjectival and nominal paradigms. Adopting a strictly quantificational approach to grammatical coding in successive centuries from early ME to PDE, Szmrecsanyi was able to demonstrate that the trend toward analyticity was reversed in eModE, readjusting our understanding of typological developments in English grammar. Comparison of the trends in grammatical and lexical morphology, then, exhibits an astonishing parallelism between the two domains. Similar to the development in the lexicon, grammatical analyticity constantly rose during ME times, peaking in the 15th and 16th centuries, before it started to continuously decline. Moreover, the 3:1 ratio of analytic to synthetic elements on average across the study period is virtually identical in grammar and lexicon, although deviations are less pronounced in the grammatical part of the language. At the zenith of analyticity, function words were used 3.4 times more frequently than inflectional morphemes – at its nadir in the 20th century, free grammatical elements were employed 2.5 times more often than their bound counterparts. Compared to the distribution of analytic and synthetic lexical words mentioned at the beginning of this chapter, English grammar proves to have been more stable with respect to analyticity, and its mirror image syntheticity, than the lexicon. Despite discrepancies in detail, the degree of correspondence between grammar and lexicon is surprising in view of the different characteristics of the two domains with respect to function, inventory size and semantics. Whereas lexical elements supply the content, grammatical material provides the structure in communication, entailing that the former are organized in open classes, while the latter is arranged in closed classes. 
As to semantics, Talmy (2011: 625) aptly states that “the meanings that open-class forms can express are virtually unrestricted, whereas those of closed-class forms are highly constrained”. Regardless of these fundamental differences, however, the coding of grammatical and lexical information has been fairly similar from a typological perspective. Parallel developments in grammar and lexicon have already been observed for OE, prompting Haselow (2011: 240) to conclude that “changes in English inflection were accompanied by changes in derivation, which both drifted into the same direction”. His findings demonstrate a trend toward increasing analyticity from 700 to 1250 and can now be extended along the temporal axis on the basis of Szmrecsanyi’s (2012, 2016) work and the present study: The development toward growing analyticity accelerated during ME but came to an end in eModE, when syntheticity started to rise again. The fairly similar picture emerging in grammar and lexicon from OE to PDE raises the question whether we can claim that the typological profile is characteristic of English morphology in general, transcending the grammatical and lexical categories. As noted previously, the answer ultimately depends on the researcher’s theoretical conception. In the structuralist tradition, which conceptualizes lexical and grammatical morphology as clearly distinct domains, typological developments in either category would be independent of each other; hence, any correspondence between grammar and lexicon in this respect would be due to pure chance. In the cognitive approach adopted in this book, however, lexical and grammatical categories are conceived as gradient; thus, morphological coding is not compartmentalized into distinct categories, so that analogous developments in grammar and lexicon are to be expected. Consequently, I do not consider the parallels in the morphological expression of grammatical and lexical material arbitrary; rather, I assume the typological developments in grammar and lexicon to represent a general tendency in English morphology. Accordingly, the (semi)cycles, depicted earlier in Figure 34, along which lexical and grammatical morphology have progressed since early ME jointly reflect the cyclic movement of English morphology. In general, cyclic changes affect specific instances, such as the development of negative marking described by Jespersen (1917), but they may also apply to languages in their entirety, referred to as “macrocycles” by Gelderen (2016: 5). The morphological cycle, denoting changes of a large subpart of the language, i.e., morphological coding, emerges between these planes and may further conspire with typological developments in other subparts to constitute a macrocycle in the sense of Gelderen. The macrocycle can be regarded as the modern conceptualization of the linguistic drift introduced by Sapir (1921: 183), who postulates that changes consistently affect linguistic units at different planes, propelling them in the same direction.
Similarly, Payne (2017: 85) notes that “languages do seem to ‘favour’ one or another [language type]”. In order to assess whether shifts in English might have converged in harmony, the remainder of this chapter focuses on developments in areas outside morphology.
11.4 Beyond morphology: Tendencies in syntax and semantics

Besides shifts in morphological typology, English has witnessed “a radical change [. . .] in word order typology” (Comrie 1989: 203), the highly flexible sentence patterns in OE having been replaced by the canonical subject–verb–object sequence. Word order in this sense is one of the three criteria of a basic order typology established by Greenberg (1963); the additional parameters denote the use of prepositions vs. postpositions and the occurrence of premodifying vs. postmodifying
adjectives. The English language type appears to be harmonic with respect to its basic order because in all three construction types, i.e., subject–verb–object declaratives, prepositional phrases and adjective–noun sequences, “the corresponding members tend to be in the same order” (Greenberg 1963: 77), thereby allowing for generalizations about sentential and phrasal syntax. While none of the basic order types can, of course, be argued to be more analytic than its counterparts, a fixed and conventionalized order is essential to deduce the intended meaning, such as agent or patient roles, in languages that have lost most, if not all, inflection. In this spirit, Sapir identifies three concomitant developments in English, “each [. . .] at work in other parts of our linguistic mechanism” (Sapir 1921: 174). Besides case reduction and fixed word order, he notes “the drift toward the invariable word”, i.e., “[the] striving for a simple, unnuanced correspondence between idea and word” (Sapir 1921: 180). As a result, words like whence, whither or thence, thither that basically derive their meaning from a simpler form (where, there) but are marked for distinctions deemed fastidious have not survived in contemporary language use. According to Sapir, the trend toward the invariable word appears to determine the developments in declension and word order; in any event, the interplay between the three domains seems indisputable. Seizing on Sapir’s ideas, Hawkins (2019) proposes that still more structural and semantic developments correlate with the trend toward invariant words. Comparing PDE with its cognate language German, which has remained fairly synthetic, he uncovers several syntactic and semantic discrepancies, all pointing to greater ambiguity in English. On the syntactic plane, the language admits category-ambiguous lexemes, such as house, as well as verbs that can be used transitively and intransitively (She opens the door. – The door opens.). 
Moreover, English tolerates syntactic ambiguities arising from raising constructions, e.g., She wants us to help, and tough movement, e.g., This problem is tough to solve; in German, by contrast, such syntax patterns are more restricted, if at all possible. On the semantic plane, Hawkins notes a broader set of thematic roles that can occupy the subject and/or object position in transitive sentences, such as locatives, e.g., This table sits four people. Greater semantic diversity, or less specificity, is also apparent in the selectional restrictions of English verbs, permitting a wider range of arguments than their German counterparts; accordingly, English speakers know a person, they know a language, and they certainly know how to use this verb, whereas German language users, for want of an equally versatile verb, need to resort to three different lexemes (kennen, können, wissen). Finally, Hawkins lists the loss of semantic contrasts formerly manifest in words such as whence and whither, which is assumed to have resulted from English’s “greater
collapsing of semantic distinctions and of different semantic types onto common surface forms” (Hawkins 2019: 704).⁸⁷ The emergence of structural and semantic ambiguity, or vagueness, is closely related to the shift from word-internal to word-external properties in English, which Hawkins regards as the consequence of the drift toward invariant words. Hence, “information once contained in lexical stems or inflectional and derivational affixes” (Hawkins 2019: 706) needs to be supplied by the context, so that the single, unnuanced word heavily relies on its neighboring words for its syntactic and semantic interpretation (see also Vachek 1961). The development toward invariable words and the concomitant change from word-internal to word-external coding affected not only inflectional paradigms but also derivational affixation, with important repercussions for complex lexemes. Sapir (1921) posits that derivatives tend to disappear if their meaning is not sufficiently distinct from that of their base words, i.e., if “the derivation runs danger of being felt as a mere nuancing of, a finicky play on, the primary concept” (Sapir 1921: 181). A more sober explanation is provided by Tauli (1958), who notes that, in the course of the drift toward arbitrary and unanalyzable words, lexical bases and their affixes are no longer discriminated, indicating that “the meaning of the derivational affix in the word has faded” (Tauli 1958: 173) and the base has ceased to exist as an independent unit in the mental lexicon. As a result, complex words may become so fused that the derivational morpheme eventually evaporates. The fading, or complete loss, of affix semantics is evident when words like enlighten are tautologically marked for their verbal status or when nouns like statement, formally specified by nominal suffixes, are used as verbs.
⁸⁷ The above criteria are a selective choice of the twelve structural and semantic differences between PDE and German recorded by Hawkins (2019: 703–704).

The dissolution of specific affix meaning may also account for competition in deverbal nominalizations such as admonishment and admonition, both used since ME without semantic differentiation – an extensive listing of competing suffixes in the 17th century is supplied by Bauer (2006: 186); for interchangeability of abstract noun suffixes in ME see Dalton-Puffer (1996: 126–130). In short, English words, regardless of their potential formal complexity, became syntactically and semantically more general, requiring context for their syntactic disambiguation and semantic specification. This development was accompanied by an increase in the use of phrasal verbs and composite predicates as well as prepositional verbs (Hawkins 2019). Similarly, the strong reliance on word-external properties to counter semantic underspecification may have fostered the growing use of compounds, as these manifest specification of their head nouns in an essentially syntactic fashion. As noted previously, context dependency is indicative of analytic languages (e.g., Aikhenvald 2007b), and the introduction of periphrastic verbal constructions in ME is symptomatic of the language’s development toward increasing analyticity. Hence, analyticity is reflected by the shift from formerly prefixed verbs, which had become semantically underspecified due to the loss of their prefixes, to phrasal verbs, which spread their meaning over free constituents “instead of packing a fat bundle of semantic features into one word” (Bolinger 1971: 45). But even verbs that had retained distinct semantics appear to have been affected, as evidenced by the tendency to replace a simple verb like surrender with its phrasal analog give up (Vachek 1961; see also Tauli 1958: 68). Such replacements, if confirmed in further studies, may have been entirely motivated by stylistic considerations; alternatively, we might hypothesize that increasing analyticity influenced speakers’ conceptualization to the effect that complex meanings, as manifest in surrender, were broken up into two simpler concepts, each coded separately, as illustrated by give and up. On the whole, the review of the literature documents some well-founded developments in morphology, syntax and semantics, which seem to have jointly supported the analytic tendency in English. Vachek (1961), proceeding from the assumption that the different linguistic levels are interrelated, considers the analytic trend in English to have impacted even more universal linguistic notions. For a start, he reminds us of the difficulties in demarcating the category ‘word’ in English because the language allows formations such as phrasal compounds, e.g., merry-go-round, that obscure the boundary between words and word groups.
By contrast, the respective borderline is more conspicuous in synthetic languages, where the category ‘word’, in general, is “definitely more clearcut and more strictly delimited” (Vachek 1961: 23) than in analytic languages. Moreover, Vachek suggests that the boundary between the categories ‘word’ and ‘affix’ is less distinct in analytic languages. In synthetic languages the affix invariably operates on the individual word, whereas the English affix may attach to units larger than words, as evident in group genitives, e.g., the Queen of England’s funeral, or dephrasal adjectives, e.g., an old-maidish lady. The larger scope of the affix implies that “the mutual relation of English words and affixes is much looser” (Vachek 1961: 22), which, according to the author, indicates that the categories ‘word’ and ‘affix’ are less discrete than in synthetic languages. Even if we do not want to follow his line of reasoning to its conclusion – the relaxed association between word and affix may be more indicative of a strongly syntactic nature than of indistinct category boundaries – the categories ‘word’ and ‘affix’ indeed seem less clearly demarcated in English, permitting researchers to classify
morphemes like back- or over- as compound constituents (Marchand 1969) or as prefixes (Dixon 2014). In view of converging developments on various linguistic planes, Vachek (1961: 62) concludes that it is necessary to regard “the analytical trend of English not as a purely morphological affair but rather as a principle”. While the notion of a principle pervading the entire language is certainly debatable, the expositions by Sapir (1921), Hawkins (2019), Tauli (1958) and Vachek (1961) clearly suggest the collaboration of developments at different levels in shaping the typological profile of English.
11.5 The global typological profile of the English language

Not surprisingly, all authors characterize English as an analytic language. The analytic traits are particularly noticeable when contrasting PDE with synthetic languages such as German (Hawkins 2019) or Czech and Russian (Vachek 1961), but the largely analytic character of the language is also visible when comparing the extent of analytic and synthetic coding in PDE morphology as described above. With respect to the language’s development, English morphology attests to a reversed trend from increasing analyticity in ME to growing syntheticity since eModE, while studies of syntactic and/or semantic phenomena have not yet scrutinized historical trajectories along these lines. So far, diachronic investigations have usually settled for comparisons between synthetic OE and analytic PDE, concluding an increase in analyticity which is explicitly or implicitly assumed to have continued unabated. An exception to this rule is Danchev’s (1992) attempt to identify “analytic and synthetic developments at all three basic language levels – phonology, grammar and the lexicon” (Danchev 1992: 27), in the course of which he presents some provocative ideas. The aforementioned group genitive, for instance, which is considered indicative of analyticity by Vachek (1961), is reinterpreted as suggestive of growing syntheticity “because a string of words functions as one structural and semantic unit” (Danchev 1992: 32). Since he dates the emergence of group genitives to late ME, this would nicely coincide with the gradual reversal of the typological trend in English morphology. A further striking proposal advanced by Danchev is to consider cases of blending as “[a]nother type of syntheticity” (Danchev 1992: 35), although he does not elaborate on the idea.
We might, however, speculate that the concept expressed by a blend like brunch synthesizes the meanings of breakfast and lunch, thereby representing an instance of semantic syntheticity. As such, it would counter, or reverse, the analytic development hypothesized above, namely the
breakup of complex meanings into simpler concepts, which may have motivated the replacement of semantically dense simplexes by periphrases, but these are mere speculations at this point. Similarly challenging are Danchev’s suggestions concerning analyticity and syntheticity in phonology. Analytic developments are thought to be represented by the decomposition of umlaut vowels in borrowed words, such as /y/ in Romance musique, into two sequential phonemes, i.e., /ju/ in English music.⁸⁸ Synthetic tendencies, by contrast, may be read into the nasal velar consonant; though well aware of the actual genesis of the phoneme, the author proposes that “/ŋ/ could be considered as partial synthesis of /n/ and /g/” (Danchev 1992: 38). On the whole, Danchev’s claims must be regarded as tentative; besides certain possibly inspiring ideas, some of the evidence presented to corroborate synthetic trends is flimsy at best.⁸⁹ As to the overall typological profile of English, he concludes that “whereas the dominant typological feature of Middle English is provided by the marked trend towards analyticity, the Modern English period is characterized both by continuing analyticity and by reemerging syntheticity” (Danchev 1992: 36, emphasis original). This assessment, though built on somewhat tenuous grounds, is perfectly in line with the findings of the present study. The survey on developments in areas outside morphology reveals desiderata for future research pertaining to the kind of evidence adduced and the historical depth of the studies. So far, typological evaluations of English syntax and semantics have been based solely on anecdotal evidence, which allows us to gain an initial impression but needs to be supplemented by quantificational approaches. Since the typological profile of a language is inherently gradient, it cannot be described appropriately by focusing on systematic dissimilarities between different language types; instead, usage frequencies must be taken into account.
Equally imperative are syntactic and semantic investigations that cover historical periods in more detail. The usual comparison between the rather synthetic OE and the rather analytic PDE shows English to have become more analytic but cannot detect possible trend reversals like those observed in morphology since eModE; accordingly, the narrative about English’s incessant movement toward analyticity has been perpetuated, regardless of its actuality. Due to the lack of such diachronic studies, typological developments in areas outside morphology cannot be compared to the trends established for English morphology, so that this chapter concludes on a more general note. The accelerated growth of analyticity in ME morphology was partly paralleled, partly followed by analytic tendencies in syntax and semantics; in any case, developments on various planes seem to have been closely interrelated, propelling the language in a uniform direction. Hence, we have good reason to assume a macrocycle as specified by Gelderen (2016), or a drift in the sense of Sapir (1921). Against this backdrop, I would expect that the shift toward increasing syntheticity attested in both morphological domains since eModE would surface in other subparts of the language as well. Claims to this effect are, of course, empirically unsubstantiated and need to be reassessed in future work; however, within a cognitive framework, converging developments are highly plausible, as expounded in the next chapter.

⁸⁸ While such cases would be analyzed as instances of agglutination, thus syntheticity, in the Western tradition, East European linguists consider them to reflect isolation, hence analyticity. In any event, the outlined decomposition of umlaut vowels indicates a reduction in fusion, i.e., a qualitative decrease in syntheticity.
⁸⁹ To give just one example, Danchev (1992: 34) believes synthetic trends in the lexicon to be “reflected in the spelling of the originally free collocation loan word through loan-word to the single lexeme loanword”; as noted earlier, however, orthography is not a decisive criterion for asserting compound or phrasal status.
12 Typology and change: Cognitive and sociocultural roots

If English has indeed developed harmoniously across different levels, one might be tempted to conclude that its typological profile has influenced language users in their choice of typological techniques to enlarge their nominal inventory. The view that the language type has subconsciously influenced speakers’ behavior is advanced by Kastovsky (1992b); similarly, Haselow (2011: 255) seems to endorse this position when he writes that the disappearance of derivational suffixes in late OE “was determined by a general typological change of the language”. Such statements appear to construe the language type as an abstract principle that impacts on the language, which is inherently incompatible with the cognitively inspired usage-based approach adopted here. Within this framework, language types and typological shifts are merely descriptive tools without explanatory potential; in other words, “[t]he names of structural tendencies (analytic, synthetic, etc.) indicate the direction or the results of the changes, but offer no explanation” (Tauli 1958: 56). The overall language profile resulting from developments in various parts of the language nicely compares to the beaten-track image invoked by Keller (1989): Choosing the straight path between the cafeteria and library across the campus lawn, students create a trodden path without planning to do so. While the individual actions are motivated by the general preference for the shortest route, they jointly produce a beaten track as if led by an invisible hand; similarly, “[language] is the unintended cumulative consequence of a countless number of intentional communicative acts by countless people” (Keller 1989: 115). Hence, Keller properly conceptualizes language as emergent from usage, but his exclusive focus on intentional acts ignores the fact that language is a social as well as a cognitive activity (e.g., Croft 2003: 289; Divjak 2019: 5–6).
A more comprehensive view is advanced by usage-based approaches that regard language as “changing through the interaction of social usage events with the cognitive processes characteristic of the human brain in general” (Bybee & Beckner 2010: 854). Along these lines, the present chapter explores the factors that have presumably determined the typological developments in the nominal domain. More specifically, Section 12.1 introduces a selection of physiological and cognitive mechanisms as well as sociocultural factors that potentially affect the use and extension of the nominal word-stock. Broadening the view, Section 12.2 briefly illustrates that the suggested factors operate in grammar as well, followed by a short discussion about the adequacy of the theoretical framework. Section 12.3, finally, reviews the typological shifts in the nominal lexicon since 1150 with reference to their physiological, cognitive and sociocultural roots.

https://doi.org/10.1515/9783111317717-015
12.1 Key factors in the use and extension of the nominal lexicon

Before considering some key factors in more detail, we need to note the basic precondition for usage instances to affect language and its structure. Prerequisite is the assumption that language is “a dynamic system that changes with experience” (Bybee & Beckner 2015: 504), which, in turn, presupposes that each experience with a given linguistic type, lexical or grammatical, is stored in the mental lexicon. Accordingly, an exemplar-based model, as proposed by Bybee (2010: 14–32), assumes a rich memory that not only includes specifics about phonetic details, meaning, contexts of use, etc. but also allows for redundant information like usage tokens to be accumulated. This concept of memory sharply contrasts with earlier notions entertained by structuralists and others who maintain that redundancies are not stored in permanent memory, partly because of limitations on mental storage. More recent research, however, suggests that the capacity of long-term memory is theoretically unlimited, thereby providing support for rich memory representations of exemplar models (Bybee 2010: 15). Hence, I presume that language originates and changes in actual usage, with invariant as well as altered realizations stored in exemplar representations. The next step, then, is to consider the disparate factors involved in language use, starting with domain-general physiological and cognitive mechanisms that operate when we employ language.
12.1.1 Physiological and cognitive mechanisms

First of all, language production is a neuromotor activity that becomes increasingly automated through practice. Thus, in frequent sequential realizations, articulatory gestures are reduced in magnitude and/or temporally overlap, resulting in phonetic reduction and/or assimilation to the point that originally multimorphemic elements develop into monomorphemes, as exemplified by the simplex noun lord, which evolved from the OE compound hlāfweard. Since repetition affects all neuromotor routines, its effects are observed not only in articulation but in all kinds of highly practiced behavior, such as tying shoelaces, bicycling or handwriting (e.g., Bybee 2015: 9, 238).
12 Typology and change: Cognitive and sociocultural roots
Importantly, language is a mental activity grounded in cognitive mechanisms that do not apply solely to the domain ‘language’ but operate in other areas like vision or reasoning as well (e.g., Bybee & Beckner 2010). In the following paragraphs, I single out several domain-general processes that are likely to have impacted on the nominal data; a more extensive list is provided by Bybee (2015: 238–239). A ubiquitous process in our daily life is categorization, an ability that is fundamental for survival; by constantly matching new experiences with stored impressions of past experiences, we distinguish between edible and harmful foods, between bonfire and wildfire, between friend and foe. Similarly, we categorize instances of language use, comparing them to stored representations and, if deemed sufficiently similar, allocating them to existing categories. Categories are built up at different levels of schematicity: While token frequency strengthens particular type representations (see below), type frequency, understood as “the use of a pattern with different items” (Bybee 2015: 238), reinforces patterns, thereby creating a more general category. Since patterns differ as to their granularity, e.g., affix specificity vs. word class membership, categories are formed at different levels of generality. These categories, or patterns, can be analogically extended to coin new words, as discussed at some length in Chapter 6; this way, analogical extension strengthens and possibly generalizes the respective pattern, depending on the number and kinds of types it applies to. Moreover, “[p]atterns with high type frequency tend to replace patterns with lower type frequency” (Bybee 2015: 238), so that differences between previously distinct patterns are leveled out. The process of analogical leveling is evident in the replacement of the OE agent morpheme -end, e.g., helpend (cmvices1.m1), by its more frequent counterpart -er, e.g., helper (cmearlps.m2), in ME. 
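The distinction between token frequency and type frequency drawn above can be made concrete with a small counting sketch. The following Python fragment is merely illustrative: the word list is invented for the purpose and is not taken from the corpora of this study, the function name pattern_frequencies is hypothetical, and identifying a suffix by plain string matching is a simplifying assumption that ignores morphological analysis.

```python
from collections import Counter


def pattern_frequencies(tokens, suffix):
    """Return (type_frequency, token_frequency) for words ending in `suffix`.

    Type frequency: number of distinct words instantiating the pattern.
    Token frequency: total number of occurrences of those words.
    Assumption: a pattern is identified by its final letters only.
    """
    counts = Counter(t for t in tokens if t.endswith(suffix))
    return len(counts), sum(counts.values())


# Invented toy usage data (hypothetical, for illustration only).
toy_tokens = ["helper", "helper", "helper", "baker", "singer",
              "helpend", "helpend"]

er_types, er_tokens = pattern_frequencies(toy_tokens, "er")
end_types, end_tokens = pattern_frequencies(toy_tokens, "end")

print("-er :", er_types, "types,", er_tokens, "tokens")
print("-end:", end_types, "types,", end_tokens, "tokens")
```

In this toy sample the -er pattern applies to three different items, whereas -end is instantiated by a single item; on the reasoning quoted from Bybee (2015: 238), the former would count as the more strongly reinforced pattern.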
While analogical extension to less frequently used patterns may result in the disappearance of categories composed of few types, such as agent nouns suffixed with -end, individual types are protected by frequent use. As described in the excursus in Chapter 5, an exemplar with extremely high token frequency acquires a mental representation strong enough to be accessed directly instead of through its constituent parts; thereby, previously utilized affixes may ‘survive’ in form, though not in meaning, as exemplified by -lock, which is still discernible in the frequently employed noun wedlock. The effects of token frequency are not only observed for individual, possibly complex, words but show at all levels of linguistic organization in the tendency of chunking; more precisely, “the repetition of strings of elements leads to their forming chunks in cognitive representation” (Bybee 2015: 238). Chunking seems to be the mechanism underlying cases that I labeled ‘univerbation’ when, for instance, the repeated sequencing of prepositional at and nominal onement yielded the noun attonement ‘atonement, harmony’ (moreric-e-1-h). Another domain-general mechanism is “[t]he tendency to associate meaning directly with form” (Bybee 2015: 238); in language, this process is intrinsic to what is commonly called ‘folk etymology’ by linguists. Thus, language users impose an internal structure on a lexeme by ascribing meaning to its supposed parts, so that originally meaningless sequences achieve morpheme status. This mechanism governs the emergence of new derivational affixes like -aholic and accounts for the genesis of Romance affixes in English, discussed in Chapter 5. Finally, Bybee exposes the mechanism responsible for the growing semantic vagueness, or generalization, discussed by Hawkins (2019). As noted in the previous chapter, Hawkins focuses on the consequences of this development, namely the strong reliance on contextual clues for semantic disambiguation, while attributing the loss of specific meaning to the drift toward invariant words. However, meaning is not generalized due to some elusive drift; rather, linguistic units lose their semantic specificity as they are used in increasingly diverse settings. The process perpetuates itself since “expanding contexts lead to generalization which then leads to further expansion of contexts” (Bybee 2015: 133).
12.1.2 Sociocultural forces
Besides the aforementioned mechanisms, which operate mostly beyond our awareness, intentional factors play a crucial role as “[l]anguage has a fundamentally social function” (Beckner et al. 2009: 2). By communicating, we coordinate social interaction and exchange knowledge and ideas, we express attitudes and emotions, and we establish and affirm our identity, to name just a few objectives. A prerequisite for communication to be successful is that speakers need to be clear in what they say in order to be understood; thus, one of the major forces driving language change is a “tendency toward clarity” (Tauli 1958: 50), counteracting the effects of automated speech production (see also Gabelentz 1891: 251). More specifically, this tendency manifests itself in the striving “to maintain a one-to-one mapping between underlying semantic structures and surface forms” (Slobin 1977: 186), which is best accommodated by word formation via compounding. Since language use is not restricted to conveying purely informational content, speakers need additional techniques to communicate their emotions, their self, their relation with their interlocutors, etc., utilizing elements of surprise, flattery, humiliation or other tools to impress their audience (e.g., Slobin 1977). A frequently used means to impart emotionality, for instance, is the recourse to
unusual, often longer expressions, i.e., “divergence from the linguistic norm” (Tauli 1958: 51). The utilization of novel expressions that diverge from standard forms is not limited to emotional contexts but is a general rhetorical device observed in grammatical and lexical change. Due to inflationary use, linguistic elements lose their expressiveness, so that speakers introduce new expressions “for their special extravagant effect” (Haspelmath 2018: 112; see also Tauli 1958: 57). Haspelmath (2018) considers inflation and extravagance the driving force behind the replacement of synthetic patterns by analytic ones in grammaticalization; similarly, Tauli (1958: 57) notes that analytic grammatical forms are more expressive than synthetic ones. Applied to the lexical domain, the desire to increase, or restore, expressiveness may well have motivated speakers to compensate for the loss of OE affixes by employing freely occurring particles, thereby creating compoundlike nouns and periphrastic verbs. While sociocultural factors like these affect the linguistic behavior in any society, populations engaged in contact with other speech communities also frequently modify their language by adopting foreign words. Depending on the purpose they serve, loanwords are divided into cultural and core borrowings (Haspelmath 2009b). Cultural borrowings denote concepts that are newly introduced into a language community; accordingly, cyteseyn (cmaelr3.m23) and nacioun (cmpolych.m3), for instance, refer to the novel concepts ‘citizen’ and ‘nation’ imported from the Continent in ME times. For reasons of convenience, speakers use loanwords to designate new ideas rather than employing loan translations or other word formation processes (Haspelmath 2009b). Core borrowings, on the other hand, are loanwords that label familiar concepts for which the borrowing language already has means for reference. 
Consequently, the borrowed word either replaces the old term, as in the case of edlen (cmlambx1.mx1) supplanted by reward (cmbrut3.m3), or the loanword is used in addition to an existing lexeme, creating synonymous pairs such as present (1917mansx7b) and gift (1910clifh7b). Core borrowings are assumed to be “associated with the prestige of the donor language” (Haspelmath 2009b: 48); hence, their usage is ultimately motivated by the speaker’s desire to construe and communicate an identity deemed advantageous.
12.2 Broadening the view
Many of the aforementioned processes are domain-general in the sense that they pervade all areas of human activity; by consequence, they influence linguistic behavior at all levels, whether lexical or grammatical. Neuromotor routinization,
for instance, does not distinguish between lexicon and grammar but generally impacts on phonological sequences; as a result, OE inflectional suffixes like the nominative case marking in nama ‘name’ were reduced to schwa. Similarly, the cognitive processes introduced above materialize in the grammatical domain as well; the following few examples should suffice to illustrate the point. Analogical leveling is evident in the development of plural markers; hence, early ME data still attest to substantial variation between -en and -s even within the same text, e.g., daȝen and daȝes ‘days’ (cmkentho.m1), but most plural forms had been regularized to -s by late ME. High usage frequency, however, preserved eyen ‘eyes’ (cmreynar.m4) throughout the 15th century and ensured the survival of some forms like children into PDE. The propensity to form chunks in mental representations is obvious in grammaticalizing sequences, such as the future construction formed with BE going to (Bybee 2015: 124). The human tendency to relate meaning directly to form shows in grammar when the umlauted vowel in feet, for instance, is associated with plural meaning (see also Bybee 2015: 76). Semantic generalization occurs when lexical items, such as will or going to, grammaticalize and gradually lose meaning components; finally, meaning may be completely lost, as happened to the oblique case marker in OE sceaduwe, sceadwe ‘shadow’, mentioned in Chapter 4. In short, both lexicon and grammar are shaped by physiological and cognitive mechanisms that mostly operate beyond awareness and by consciously performed actions intended to achieve sociocultural goals. Although the above factors do certainly not bear equally on all linguistic areas – lexical words may be more prone to semantic generalization, grammatical expressions may be more affected by chunking – they do not indicate a principled difference between grammar and lexicon. 
Rather, we have good reason to assume that developments in different parts of the language, grounded in the very same processes, converge, thereby producing a consistent typological profile as if guided by an invisible hand. Reflections on physiological and cognitive processes as well as human agency in language use have inspired Beckner et al. (2009) to regard language as a complex adaptive system, following proposals from evolutionary biology. The authors suggest that “[t]he structures of language emerge from interrelated patterns of experience, social interaction, and cognitive mechanisms” (Beckner et al. 2009: 2), which concisely summarizes the points raised in the previous section in more general terms. The deliberate generality, or vagueness, of the phrasing is inherent in the theoretical approach since the conception of language as a complex adaptive system should offer “a unified account of seemingly unrelated linguistic phenomena” (Beckner et al. 2009: 2), thus requiring a broad appeal to factors underlying human behavior.
The unspecificity of the approach is both its strength and its weakness, as succinctly noted by Hartmann (2020), who points out that the complex adaptive systems view threatens to “lead to massive theoretical vagueness and as such to a largely a-theoretical, essentially descriptive view of language and language change that lacks explanatory power” (Hartmann 2020: 17). I absolutely share these concerns if ‘explanatory power’ refers to falsifiability and/or predictability: On the one hand, it would be difficult, if not impossible, to falsify the presuppositions related to cognition; on the other, social interaction, an integral part of the theory, is shaped by factors that are prioritized according to varying extralinguistic needs which are essentially unpredictable. Along these lines, the present work, too, is atheoretical: I claim that language use, the locus of change, is rooted in neuromotor routines, cognition, society and culture, adducing nonfalsifiable cognitive factors and unpredictable sociocultural parameters. Still, I would argue that an approach which integrates the variables discussed above is necessary and suitable to explain causes for language change, if only in retrospect.90 Accordingly, the next section reviews the typological shifts in the nominal domain in the light of factors that I assume to have consciously or unconsciously influenced the linguistic behavior of English speakers during the past millennium.
12.3 A cognitive account of the typological development since 1150
Even in the earliest period under investigation, the nominal lexicon in use was largely analytic as a consequence of processes that had affected the internal structure of the nouns. First, neuromotor automation had caused the loss of the thematic element, i.e., the marker of the lexeme’s declension class, so that declensional types were differentiated only by their inflectional paradigms (Hogg 1992). The subsequent analogical extension of the inflectional pattern of the a-stem class, especially, to other declension classes not only leveled the inflectional paradigms but, at the same time, reduced the inventory as such because the most frequently used cases (nominative and accusative singular) had ceased to be morphologically marked in a-stem nouns. The disappearance of thematic phonemes and the reduction of obligatory case marking produced numerous monomorphemes since roots no longer
90 Frankly, I suspect that historical linguists, especially, have to settle either for theories that allow predictions but cannot account for all observations or for frameworks that provide more satisfactory explanations at the expense of predictive power.
needed to be supplemented by theme and inflection but could be used as independent nouns. Thus, the analytic development in the grammatical domain had immediate consequences for the typological profile of the lexicon (see also Haselow 2011: 216–237).91 Analyticity in the nominal word-stock further increased during ME due to sociocultural factors. The contact between English and French, more extensive and continuous than commonly recognized, set the stage for English language users to import new concepts, conveniently referred to by their Romance designation. The contact scenarios described in Chapter 3 suggest that Romance loanwords were also used by speakers to establish and confirm their affiliation with the prestigious professional classes or, more generally, to create a favorable impression. Hence, it is hardly surprising that the excessive recourse to borrowing in ME chronologically corresponded to the contact period, which continued well into the 15th century. Equally unsurprising, at least from a cognitive perspective, is the fact that Romance nouns were borrowed more frequently than members of other word classes (e.g., Dekeyser 1986). Unlike verbs or adjectives, nouns designate lexically dense concepts that do not rely on related concepts for their completeness; such isolated concepts, discussed in more detail below, can be easily adopted as independent units. Similarly, Bybee (2015: 192) suggests that the preponderance of nouns in borrowing is due to “their high degree of lexical content and their lesser degree of integration into discourse”. The progressive reduction of obligatory declension since OE is likely to have facilitated the inclusion of Romance nouns into the lexicon of ME speakers (see also Sapir 1921: 208).92 Although language users greatly enlarged their nominal inventory by borrowing, they also relied on well-anchored techniques if need for new words arose. 
Until 1400 new nouns were formed by Germanic affixation in particular; as such, they were semantically more transparent than loanwords, thereby serving the speaker’s objective to be understood. But the recourse to Germanic affixation was not primarily motivated by such intentional factors; rather, it was guided by
91 While theme elements may be considered inflectional, derivational or ambivalent between both (Kastovsky 2006b), it is the loss of grammatical case markers that eventually cleared the way for stems to be used as words.
92 In an inflectional language like German, loanword integration is less straightforward: Nouns are inflected according to their grammatical gender, but gender assignment to nouns borrowed from genderless languages, such as English, can be challenging. Thus, the authoritative source for German, Duden online, allows multiple genders for Feature (neuter and feminine), Essay (masculine and neuter), Runway (feminine and masculine), to name just a few, causing variation and possibly uncertainty in case marking.
the powerful cognitive mechanism of analogy. The ease with which language users analogically extended models to create new nouns by Germanic affixation during ME became apparent when borrowed words were immediately affixed with Germanic morphemes; in the 14th century, for instance, more than half of the new types derived by deverbal -ing1 contained Romance bases. That said, we have to bear in mind that the number of new nouns derived by Germanic affixation has constantly declined since 1150, continuing the trend observed for OE by Haselow (2011: 188). In the 15th century, Germanic affixation lost its dominant role in agglutinative noun formation; instead, English speakers started to increasingly resort to compounding. The shift from analogical extension to syntactic noun formation seems to have been grounded in the collaboration of cognitive and intentional factors against the background of an extremely analytic word-stock in use. To begin with, recall that the decrease in new Germanic affixed words was accompanied by a parallel decline of the respective model nouns, thereby reducing the inventory of model lexemes, i.e., the very basis for noun formation by analogy. Moreover, we need to take into account the fading or complete loss of affix semantics noted in the previous chapter; thus, Germanic affixation conflicts with the communicative goal of imparting information, whereas compounding accommodates the speaker’s desire to increase informational content by combining distinct concepts and to maximize semantic transparency in order to be understood. Additionally, and perhaps most importantly, the shift from analogical Germanic affixation to syntactic compounding has to be considered in the broader context of the lexicon in use. In the 15th century, specifically, when Germanic affixation steeply declined while compounding markedly rose, the number of monomorphemic tokens reached its all-time high, amounting to 84% of all nouns in usage. 
Surrounded by an overwhelmingly analytic word-stock, language users must have been accustomed to expressing themselves by drawing on monomorphemes, which they embedded in syntactic structures and/or concatenated into compounds in the fashion of speakers of other analytic languages. To sum up, changes in the typological profile of English were rooted in neuromotor and cognitive processes as well as sociocultural developments: First, articulatory automation and analogical leveling in OE resulted in a largely analytic nominal lexicon; subsequently, integration of new concepts from the Continent and speakers’ desire to convey an air of sophistication in ME furthered the analytic tendency, moderately curbed by the analogical extension of traditional affixation models; finally, the striving for expressiveness since late ME reversed the analytic trend, increasing syntheticity in the nominal word-stock. The kind of syntheticity introduced by newly derived nouns, however, profoundly differs from
earlier stages since language users, equipped with a largely monomorphemic word-stock, have employed agglutinating word formation processes that gravitate toward isolating strategies. The final question that needs to be addressed is why language users have only rarely derived new nouns by Romance affixation and conversion despite the large numbers of theoretically available models for these word formation strategies. Even if the respective figures were revised downward for reasons described in Chapter 6, the numbers of model lexemes would probably have been sufficient to invite analogical extensions. Still, in both processes, the powerful mechanism of analogy seems to have been counteracted. The rare employment of Romance affixes to form new nouns has to be considered in connection with the decrease in Germanic affixation in noun formation, indicating a general reluctance in the usage of bound morphemes, possibly due to the semantic unspecificity of the Germanic and Romance affixes. While Germanic affixes, which had developed from free morphemes, lost their concrete meaning over time, Romance affixes may never have been allocated more than vague meaning in English, as noted in the excursus in Chapter 5. Additionally, the fusional properties of Romance affixation obscure semantic transparency, thereby impeding understandability, which may also have discouraged the use of Romance affixes in noun formation. Based on the considerations advanced in Chapter 5, we can assume that Romance sequences acquired morphemic status in eModE, i.e., during a period when speakers were increasingly opting for compounding for reasons described above. Against this backdrop, it is hardly surprising that language users have largely avoided Romance affixation – why would they use an affix with low semantic content when they could easily resort to an alternative that best suited their desire to communicate as much meaning as possible? 
On balance, I would thus suggest that speakers have eschewed Romance affixation to enhance communication, hence attending to sociocultural factors. By contrast, the limited use of conversion to derive new nouns appears to be grounded in cognition. So far, observations that nouns are less frequently converted from verbs and/or adjectives than vice versa have not been satisfactorily accounted for; previous appeals to the size of affix inventories (e.g., Marchand 1969: 364) or to the size of the word classes (e.g., Biese 1941: 406) are essentially descriptive. On the one hand, the circumstance that the stock of verbal and/or adjectival affix types is smaller than the nominal affix inventory does not force speakers to resort to conversion instead of exploiting the available affixes to coin new verbs and adjectives. On the other, the fact that the verbal and adjectival word classes are smaller than their nominal counterpart, while suggesting that
deverbal and deadjectival conversions are less probable than denominal conversion, cannot explain the extreme rarity observed in Chapter 5. To fill the explanatory void, I would propose to regard the extremely low numbers of new nouns derived by conversion as indicative of a general reluctance to use this process for noun formation, which seems to originate in the fundamentally different mental representations of the word classes. Following Langacker (2008: 98–100), we basically distinguish nouns that profile things (in the broadest sense of the term) from verbs and adjectives that profile relationships; more precisely, verbs denote processes while adjectives designate nonprocessual relations. At the cognitive level, then, members of the category ‘noun’ evoke thing-concepts (again, in the most general sense), which are not only lexically dense but also fairly stable since they do not depend on other mental representations for their full meaning; accordingly, they can be considered isolated concepts. By contrast, members of the categories ‘verb’ and ‘adjective’ elicit relational concepts, which derive their actual meaning via relations to other representations; thus, their semantics are variable and “likely to be altered to fit the context” (Gentner 1981: 168). The fundamental difference between isolated and relational concepts materializes on the linguistic plane in that nouns can form a complete phrase by themselves, whereas (transitive) verbs combine with object arguments to constitute a verb phrase. Provided that concepts, whether isolated or relational, include information about their word class membership, word formation by conversion requires a transformation on the conceptual level, so that an isolated concept emerges as a relational one and vice versa. 
This said, it stands to reason that converting a relational mental representation is rather difficult because we have to decide which meaning components contained in related concepts are to be incorporated into the new isolated representation at the expense of others. Conversely, isolated concepts seem easy to convert as they comprise in themselves all semantic features, which, to a greater or lesser extent, are retained in the new relational representation. Against this backdrop, I would suggest ranking the convertibility of concepts based on their interrelatedness, i.e., the number of related mental representations that are indispensable for their meaning. At the top end we find isolated concepts, referred to by nouns, that are most effortlessly converted; at the bottom we may locate relational concepts designated by prepositions, which are highly dependent on a multitude of other representations and, consequently, the most difficult to convert. Between these extremes, further relational representations can be distinguished according to the number of participating concepts necessary to determine their content. Nonprocessual concepts, denoted by adjectives, would minimally require one related representation; thus, color concepts such as ‘red’, ‘yellow’ and ‘blue’, for instance, appear to be cognitively anchored by the concepts ‘fire’, ‘sun’ and ‘sky’ in the minds of English speakers (Wierzbicka 1990). By contrast, processual relational concepts, referred to by verbs, seem to derive their full meaning through their interrelatedness with at least two other concepts, most commonly representations of some schematic agent and patient. In short, I would claim that processual relational concepts are more demanding to convert than nonprocessual relational representations, which, in turn, are more challenging than isolated concepts. This ranking is, of course, based on a highly idealized depiction of the mental representations evoked by members of the different word classes and their interrelatedness; in actual fact, all concepts, whether isolated or related, may be modified to some extent via their connection to other representations, as suggested by psychological research. Goldstone (1996), for instance, concludes that the results of several visual experiments he performed indicate “a continuum between completely isolated and completely interrelated concepts” (Goldstone 1996: 626). These caveats notwithstanding, the proposed classification of the word classes as to their convertibility nicely accounts for the data of this study: Not only have nouns, in general, rarely been converted from members of other categories, but the few exceptions to the rule have realized deadjectival rather than deverbal conversion. In conclusion, a variety of sociocultural and cognitive factors have determined the processes employed by language users to extend their nominal lexicon since 1150. Their desire for understandability and expressiveness in a largely analytic environment has deterred speakers from utilizing fusional techniques; at the same time, cognitive constraints have prevented language users from adopting linguistically motivated isolating strategies, i.e., conversion. Hence, English speakers have avoided processes that may have significantly changed the typological profile of language.
Except for the ME period, when the word-stock was greatly extended by borrowing, language users have remained fairly conservative, resorting to long-established agglutinating techniques, first driven by analogy, later motivated by intentional factors. While Germanic affixation and compounding have both been available at all times, they have been exploited to different degrees to accommodate speakers’ shifting priorities. The changes in the utilization of means to extend the lexicon have had repercussions for the typological profile of the nominal language, albeit minor ones in view of the overall distribution of new and old lexemes, detailed in Chapter 5. Since the proportion of tokens instantiating old nouns amounted to roughly 96% on average across the study period, the impact of actually employed means should not be overestimated. The strong recourse to established lexemes, motivated by the need to be understood (e.g., Bybee 2015: 5–6), has moderated the typological development of the nominal word-stock. Though following the synthetic trend manifest in word formation, the lexicon in use has remained largely analytic during the past 850 years, attesting rather to typological stability than change. In this light, diachronic linguists would be well advised to not only attend to innovation but to balance novel expressions against those maintained by convention, which “keeps features of language the same across many generations of language users” (Bybee 2015: 9; see also Croft 2003: 289). Taking into account both innovative and customary facets of language use would doubtlessly improve our views on language development and change.
13 Conclusion
The objective of this book was to trace the development of nouns since early ME in terms of typology. This classification scheme provides the foundation for comparing typological trends in grammar and lexicon, thus enabling us to assess whether the supposedly analytic development of English grammar was paralleled by similar shifts in the nominal word-stock. Working within a cognitive framework, which presumes that all linguistic behavior is, in principle, governed by the same physiological, cognitive and sociocultural factors, I anticipated similar typological developments to surface in the lexical and grammatical domains. On the assumption that language and its structural properties emerge and change in usage, I approached the typological development of the nominal lexicon from two complementary angles – noun formation and noun structure. Since changes are most immediately reflected by speakers’ innovations, the primary focus of the book was on the means used to enlarge the nominal word-stock in each century between 1150 and 2000. These means comprise the central word formation processes, i.e., conversion, compounding and Germanic and Romance affixation, as well as borrowing. Although historical investigations in word formation have contributed detailed insights into a given process, their restriction to specific techniques, such as Haselow’s (2011) study of nominal suffixation, prevents us from assessing the importance of the respective method for the overall development of the vocabulary. To this end, each process needs to be put into perspective against all means of lexicon extension; accordingly, this investigation is the first to quantitatively correlate the different word formation processes and borrowing based on usage data. The systematic correlation of the means used for extending the nominal lexicon produced findings that challenge various ideas suggested in previous literature.
Perhaps most surprisingly, the impact of Romance affixation on English derivation seems to have been vastly overrated. When quantified and related to other means of lexicon extension, the importance of Romance affixation starts to evaporate; the small proportion of this process relative to other techniques employed to enlarge the nominal word-stock certainly does not support claims about a typological trend reversal caused by Romance affixation. Similarly remarkable are the findings related to conversion insofar as they do not confirm the supposed link between borrowing and conversion. Despite its prima facie plausibility, the supposition that the adoption of formally identical members of different word classes from Romance advanced word formation by conversion turned out to be untenable. Regardless of numerous borrowed word
pairs, the small numbers of nouns derived by conversion clearly suggest that this process has generally been an unpopular option for noun formation. In terms of typology, borrowing and conversion are isolating techniques, corresponding to the analytic language type; by contrast, compounding and Germanic affixation are agglutinating processes and Romance affixation is a fusional method, characteristic of synthetic languages. From this perspective, the analysis of newly added lexemes revealed that ME speakers continued and reinforced the analytic trend observed for OE by Haselow (2011) as they resorted to borrowing from Romance in particular; in the 14th century this process prevailed, accounting for roughly 64% of all new nominal types. Yet this typological development has been reversed since the end of the ME period, with language users increasingly employing agglutinating techniques to enlarge their word-stock. Within these techniques, the data attest to a shift from Germanic affixation to compounding: In the 12th/13th century nearly a third of all new nouns were still derived by Germanic affixation, whereas in the 16th century compounding started to predominate and has steadily increased since. Given that analogy is considered an especially powerful cognitive mechanism operative in word formation, I devoted a significant part of the study to possibilities for analogical extensions. To this end, all nouns were classified as to their potential to promote noun derivation by conversion, compounding and Germanic and Romance affixation in each century. The distribution of the typological techniques represented by these model lexemes differs markedly from that exhibited by newly added nouns: The proportion of agglutinating techniques has continuously decreased since the 14th century; conversely, the type frequency of patterns for isolating and fusional methods has increased. 
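The classification used in this summary (borrowing and conversion as isolating, compounding and Germanic affixation as agglutinating, Romance affixation as fusional) and the per-century proportions of new nominal types can be illustrated with a minimal sketch. The function name, the data layout and the toy records are assumptions made purely for illustration; they are not the procedure or the figures of the study.

```python
from collections import Counter

# Process-to-technique mapping as stated in the text above.
TECHNIQUE = {
    "borrowing": "isolating",
    "conversion": "isolating",
    "compounding": "agglutinating",
    "germanic_affixation": "agglutinating",
    "romance_affixation": "fusional",
}


def technique_shares(new_nouns):
    """Return {century: {technique: share}} for (century, process) records.

    Each record stands for one newly attested noun type (hypothetical layout).
    """
    counts = {}
    for century, process in new_nouns:
        counts.setdefault(century, Counter())[TECHNIQUE[process]] += 1
    shares = {}
    for century, counter in counts.items():
        total = sum(counter.values())
        shares[century] = {tech: n / total for tech, n in counter.items()}
    return shares


# Invented toy records, NOT data from the study.
toy = [
    (14, "borrowing"), (14, "borrowing"),
    (14, "compounding"), (14, "germanic_affixation"),
    (16, "compounding"), (16, "compounding"),
    (16, "romance_affixation"), (16, "borrowing"),
]
shares = technique_shares(toy)
print(shares[14]["isolating"])      # 0.5
print(shares[16]["agglutinating"])  # 0.5
```

Relating each process to the total of all new types in a century, rather than examining it in isolation, is what allows the relative weight of, say, Romance affixation to be compared with that of borrowing or compounding.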
The reduction of model words for agglutinating processes corresponds to the decrease in Germanic affixation observed for new derivations and is apparently unrelated to compounding, a rather syntactic word formation process, which does not seem to depend on analogy. The discrepancy between new lexemes and model nouns for isolating techniques is, to some extent, due to methodological problems, but I also discussed the possibility that analogy in conversion may be inhibited by cognitive difficulties in converting nouns from members of other word classes, despite the high type frequency of the pattern. With respect to fusional techniques, again, the low numbers of newly derived nouns stand in stark contrast to the relatively large proportion of model words for Romance affixation. Although the latter have constituted about one fifth of all model nouns since the 16th century, language users appear to have avoided fusional methods in noun formation. In short, the comparative study of available model lexemes and actually derived nouns discloses that the mechanism ‘analogy’ can be, and has been, superseded by other cognitive factors and/or sociocultural forces.

Against this background, I think the most important lesson to be drawn from the chapter on word formation patterns concerns the methodological approach of future research: Lexemes that display a specific word formation pattern should not be taken automatically to indicate that the respective pattern has been used to coin new words. It follows that conclusions about, say, the productivity of a certain affix may well be misleading if based solely on the type frequency of this affix pattern in language use. Consequently, we need to focus on new additions to the word-stock in order to track the typological development of the means used to extend the lexicon, which may be broadly summarized along the following lines: In ME new nouns originated from both isolating and agglutinating techniques; since eModE isolating strategies have been virtually abandoned. Instead, language users have chiefly resorted to agglutinating methods, shifting from more fusional to more isolating agglutination, i.e., from Germanic affixation to compounding. Fusional techniques, on the other hand, seem to have been the disfavored option at all times. Hence, the typological methods used to enlarge the nominal word-stock enhanced lexical analyticity during ME but subsequently increased the degree of syntheticity in the nominal lexicon.

This overall trajectory largely corresponds to the development of the typological profile of the nominal language from 1150 to 2000. In order to establish the structure of the word-stock in each century and the attendant typological shifts since early ME, all nouns were analyzed in detail for their morphological makeup, adopting token-based approaches that highlight different aspects.
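A token-based measure of this kind can be sketched as a morphemes-per-word ratio in the spirit of Greenberg’s (1960) synthetic index. The sketch below assumes that every noun token has already been segmented into morphemes by hand; the function name and the sample segmentations are hypothetical, and the adaptation actually used in this book may differ in detail.

```python
def synthetic_index(segmented_tokens):
    """Morphemes per word for a list of tokens, each given as a list of morphemes."""
    if not segmented_tokens:
        raise ValueError("need at least one token")
    morphemes = sum(len(token) for token in segmented_tokens)
    return morphemes / len(segmented_tokens)


# Hypothetical hand-segmented noun tokens (illustrative only, not corpus data).
sample = [
    ["king"],
    ["king", "dom"],
    ["house"],
    ["friend", "ship"],
    ["govern", "ment"],
    ["book", "case"],
    ["day"],
    ["wind", "mill"],
]
index = synthetic_index(sample)
print(index)  # 1.625 (13 morphemes across 8 tokens)
```

A value close to 1 indicates a predominantly monomorphemic, i.e., analytic, sample; higher values indicate more synthetic word structure. Because the measure counts tokens rather than types, frequently used simple nouns pull the index down.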
The adaption of Greenberg’s (1960) synthetic index, based on the number of morphemes per word, revealed that the degree of syntheticity in nominal language use decreased until the 15th century but has constantly increased since. In an attempt to determine the synthetic subtype – agglutinating or fusional – I subsequently assessed the degree of fusion manifest in the nominal lexicon, devising a new procedure to quantify the extent of fusion. Based on this fusion index, I established the structure of the nominal word-stock in each period and positioned the stages around Dixon’s clock face to uncover the typological development. In sum, the nominal lexicon turns out to be largely isolating, although the word-stock has slowly but steadily developed toward agglutination since the 16th century. To gauge the extent of analyticity against syntheticity, I finally adopted the indices proposed by Szmrecsanyi (2012, 2016), which explicitly address the degree of syntheticity, on the one hand, and the degree of analyticity, on the other. The developments of the two indices broadly mirror each other: While the syntheticity index constantly declined during ME and has increased again since eModE,
the analyticity index rose until the 15th century and started to decrease in the 16th century. In sum, the three different approaches produced converging results: The trend toward increasing analyticity during ME times was reversed in eModE, when syntheticity started to constantly increase. Hence, changes in the use of typological techniques to enlarge the word-stock corresponded to shifts in the typological profile of the nominal usage data, which is hardly surprising since processes and structure are somehow interrelated. Still, the question is how to best conceive this interrelatedness. The cognitive approach adopted in this study implies that the typological makeup of a language emerges from language use; consequently, changes in linguistic behavior, grounded in physiological, cognitive and sociocultural dynamics, entail shifts in the language’s type. But, as mentioned at the beginning of this book, the immediate impact of speakers’ behavior on the typological profile of the contemporary word-stock has been fairly modest in terms of new lexemes. After considerable additions to the lexicon during ME, primarily due to the influx of Romance loanwords, the need for new nouns seems to have greatly diminished in the subsequent centuries. Thus, shifts in speakers’ behavior would have barely affected the structure of the entire vocabulary in the short term. In the long run, however, the linguistic behavior of successive generations influenced and gradually changed the typological profile of the language. First, continued speech automation, resulting in assimilation and (morpho)phonemic reduction, together with extensive borrowing from Anglo-French entailed the growing analyticity of the lexicon in ME. As a result, eModE speakers ‘inherited’ an overwhelmingly analytic word-stock they had to cope with, and, if the need for new lexemes arose, they resorted to compounding in particular, enlarging their vocabulary in the fashion of speakers of other analytic languages. 
This way, language users were influenced in their choice of means by the typological shape of their language, which, in turn, has become increasingly synthetic due to the techniques used to extend the lexicon. Classifying the successive stages of the nominal lexicon in terms of typology provided the basis for direct comparison with findings in grammar, thus paving the way for “rethinking the history of English in light of larger patterns or correlations among structural changes” (Kortmann 2012: 606). On the whole, the development of the typological profile established for the nominal data has proven to be remarkably parallel to that recently determined for the grammatical domain by Szmrecsanyi (2012, 2016). In both areas, we observe trends toward increasing analyticity in ME, followed by developments toward greater syntheticity since eModE.
Perhaps even more importantly, the typological shape of the nominal and grammatical data has turned out to be surprisingly similar. Both grammar and the nominal lexicon have been largely analytic during the past 850 years, exhibiting a virtually identical ratio between analyticity and syntheticity; more precisely, analytic tokens occurred 3.2 times more often than synthetic ones on average per century. The parallel developments in the coding of grammatical and lexical information as well as the consistent ratios of analyticity to syntheticity in grammar and lexicon not only confirm the initial expectation of similar typological changes in both domains but ultimately suggest global morphological trends in English. In this respect, the overall outcome of the investigation is most encouraging and should, in fact, inspire us to rethink the history of English, as proposed by Kortmann (2012).

By its design, the present study afforded new insights but also raised a number of new questions. To begin with, I included all processes used to extend the word-stock to determine their relative importance in each century, focusing exclusively on quantitative aspects while ignoring qualitative features such as semantics. In this regard, future analyses could be strengthened – for instance, by incorporating the context of use, which I did not consider despite its pivotal role in figuring out an item’s meaning or its entrenchment in the mental lexicon. Moreover, due to the global approach of the investigation, I ignored register effects, although I noted that the number of new nouns sharply increased in scientific writings during the 20th century, which may be related to the major shifts in textual conventions observed for this text type. Hence, a register-specific examination may conclude that the findings of the present work are more pronounced in certain text types than in others.
In addition, I should point out that, even though I adopted strictly quantificational methods, the results presented in Part 2 of the book may be less reliable than desired because I disregarded the effects of usage frequency. As discussed at some length in connection with the emergence of Romance affixes in English, recent research has suggested that different frequency levels need to be factored in to approximate (mental) reality. Applying the proposed criteria – extremely high frequency, high frequency and relative frequency – to my data, however, was far from convincing due to the lack of threshold values signaling that activation of the constituents passes into activation of the whole. It remains an open question how to determine such breakpoints, necessary for any quantitative analysis, as long as we are still ignorant about the specific effects of repetition frequency on memory. Accordingly, the developments inferred from the analyses in the second part of the book describe broad tendencies, albeit quite robust ones.
Finally, I have to emphasize that, strictly speaking, the findings presented in this work do not apply to the lexicon as such but are restricted to its nominal part, although I discussed some evidence suggesting that the typological shift from growing analyticity toward increasing syntheticity observed for the nominal lexicon would materialize in the verbal vocabulary as well. In particular, recent research on phrasal verbs and composite predicates indicates that analyticity continuously rose in ME and eModE but started to decline in lModE, which would roughly correspond to the typological development of nouns. Besides, figures from the OED attest to a continued increase in verbal types derived by analytic means during ME – accompanied by decreasing numbers of verbs formed by synthetic processes – which subsided in eModE, when the total of concatenated verbal types started to exceed the number of verbs derived by analytic techniques. Against this positive backdrop, I would hypothesize that the verbal and nominal parts of the word-stock followed parallel typological trajectories and consider the results obtained for the nominal data to be representative of the lexical word-stock. Still, future studies would, of course, need to examine in detail the structural properties of the members of non-nominal lexical categories. Once the typological shapes of the major word classes are determined and correlated with each other in quantificational terms, the typological profile of the lexicon will be substantiated.

The corresponding tendencies in lexical and grammatical morphology provide sufficient grounds to justify claims about English morphology in general and may even signal a typologically harmonious development on other levels of the language, but it is the task of future research to show whether the results reported in this book can be generalized to developments in areas outside morphology.
Various aspects ranging from ambiguous sentence structure to highly generalized meaning have been discussed in the literature, and the summary review of these discussions encouraged me to suggest that the developments in morphology, syntax and semantics seem to have jointly supported the analytic tendency in English. However, while morphology manifested a trend reversal during eModE, studies in syntax and semantics still have to close the gap between OE and PDE in order to determine whether developments toward increasing syntheticity materialized in either or both domains as well. Equally imperative are quantificational investigations in these areas to accommodate the gradient nature of structural organization and conceptual representations. Meanwhile, the main conclusion to be drawn for the morphological typology of English is that, although derivational and inflectional morphology became increasingly analytic in ME, syntheticity has progressively been (re)introduced into English morphology since eModE.
References

Corpora and corpus tools

ARCHER. A Representative Corpus of Historical English Registers. Version 3.1. 2006. Originally compiled under the supervision of Douglas Biber and Edward Finegan at Northern Arizona University and University of Southern California; version 3.1 modified and expanded by Northern Arizona University, University of Southern California, University of Freiburg, University of Helsinki and Uppsala University.
Barlow, Michael. 2000. Concordancing with MonoConc Pro 2.0. Houston: Athelstan.
CLARIN-D/SfS-Uni. Tübingen. 2012. WebLicht: Web-Based Linguistic Chaining Tool. Online. https://weblicht.sfs.uni-tuebingen.de (last accessed October 10, 2023)
Freiburg-LOB Corpus of British English. Release 2007 (POS-tagged version). Compiled by Christian Mair (Albert-Ludwigs-Universität Freiburg) and Geoffrey Leech (Lancaster University).
Penn Parsed Corpus of Modern British English (PPCMBE). 2010. Compiled by Anthony Kroch, Beatrice Santorini & Ariel Diertani. University of Pennsylvania.
Penn-Helsinki Parsed Corpus of Early Modern English (PPCEME). 2004. Compiled by Anthony Kroch, Beatrice Santorini & Ariel Diertani. University of Pennsylvania.
Penn-Helsinki Parsed Corpus of Middle English (PPCME2). 2000. Compiled by Anthony Kroch & Ann Taylor. University of Pennsylvania.

Dictionaries

Anglo-Norman Dictionary. 2001–. Edited by William Rothwell, Stuart Gregory & David Trotter. [Revised online edition completed in 2021.] https://www.anglo-norman.net (last accessed October 10, 2023)
Bailey, Nathan. [1721] 1763. An universal etymological English dictionary (20th edn.). London: Printed for T. Osborne et al.
Bosworth-Toller Anglo-Saxon Dictionary. Based on Joseph Bosworth & Thomas Northcote Toller. 1898/1921. An Anglo-Saxon dictionary. Digitized and edited by Sean Crist, Ondrey Tichy et al. (2001–2010). https://bosworthtoller.com (last accessed October 10, 2023)
Chambers dictionary of etymology. 1999. Edited by Robert K. Barnhart. [Originally published as: The Barnhart dictionary of etymology. Bronx, N.Y.: H.W. Wilson Co., 1988.] Edinburgh: Chambers Harrap.
Cotgrave, Randle. 1611. A Dictionarie of the French and English Tongues. London: Adam Islip.
Duden online. [n. d.]. https://www.duden.de (last accessed October 10, 2023)
Hollyband, Claudius. 1593. A Dictionary French and English. London: Woodcock. https://leme.library.utoronto.ca/lexicons/205/details#details (last accessed October 10, 2023)
Johnson, Samuel. [1755] 1792. A Dictionary of the English Language (10th edn.). London: Printed for J. F. & C. Rivington.
Lexicons of Early Modern English (LEME). 2006–. Edited by Ian Lancashire. Toronto: University of Toronto Library & University of Toronto Press. https://leme.library.utoronto.ca (last accessed October 10, 2023)
Merriam-Webster Online Dictionary. 1996–. https://merriam-webster.com (last accessed October 10, 2023)
https://doi.org/10.1515/9783111317717-017
Middle English Dictionary. 1952–2001. Edited by Hans Kurath, Sherman Kuhn & Robert E. Lewis. Online edition in Middle English Compendium. Edited by Frances McSparran et al. Ann Arbor: University of Michigan. https://quod.lib.umich.edu/m/middle-english-dictionary (last accessed October 10, 2023)
Miège, Guy. 1677. A New Dictionary French and English, with another English and French. London: Thomas Dawks.
Oxford English Dictionary: OED online. 2000–. https://www.oed.com (last accessed October 10, 2023)
Sweet, Henry. 1897. The student’s dictionary of Anglo-Saxon. New York/London: Macmillan.

Literature

Aikhenvald, Alexandra Y. 2007a. Grammars in contact: A cross-linguistic perspective. In Alexandra Y. Aikhenvald & Robert M. W. Dixon (eds.), Grammars in contact: A cross-linguistic typology (Explorations in Linguistic Typology 4), 1–66. Oxford: Oxford University Press.
Aikhenvald, Alexandra Y. 2007b. Typological distinctions in word-formation. In Timothy Shopen (ed.), Language typology and syntactic description, vol. III: Grammatical categories and the lexicon (2nd edn.), 1–65. Cambridge/New York: Cambridge University Press.
Anderson, Stephen R. 1982. Where’s morphology? Linguistic Inquiry 13(4). 571–612.
Arndt-Lappe, Sabine. 2015. Word-formation and analogy. In Peter O. Müller, Ingeborg Ohnheiser, Susan Olsen & Franz Rainer (eds.), Word-formation: An international handbook of the languages of Europe, vol. II (Handbücher zur Sprach- und Kommunikationswissenschaft/Handbooks of Linguistics and Communication Science 40.2), 822–841. Berlin/Boston: De Gruyter Mouton.
Baayen, Harald R. 2009. Corpus linguistics in morphology: Morphological productivity. In Anke Lüdeling & Merja Kytö (eds.), Corpus linguistics: An international handbook, vol. II (Handbücher zur Sprach- und Kommunikationswissenschaft/Handbooks of Linguistics and Communication Science 29.2), 899–919. Berlin/New York: De Gruyter Mouton.
Bauer, Laurie. 1983. English word-formation. Cambridge: Cambridge University Press.
Bauer, Laurie. 1998. When is a sequence of two nouns a compound in English? English Language and Linguistics 2(1). 65–86.
Bauer, Laurie. 2003. English prefixation – a typological shift? Acta Linguistica Hungarica 50(1–2). 33–40.
Bauer, Laurie. 2006. Competition in English word formation. In Ans van Kemenade & Bettelou Los (eds.), The handbook of the history of English, 177–198. Malden/Oxford: Blackwell.
Bauer, Laurie. 2017. Compounds and compounding (Cambridge Studies in Linguistics 155). Cambridge/New York: Cambridge University Press.
Bauer, Laurie. 2019. Rethinking morphology. Edinburgh: Edinburgh University Press.
Bauer, Laurie, Rochelle Lieber & Ingo Plag. 2013. The Oxford reference guide to English morphology. Oxford: Oxford University Press.
Baugh, Albert C. & Thomas Cable. 2002. A history of the English language (5th edn.). London: Routledge.
Becker, Thomas. 1990. Analogie und morphologische Theorie (Studien zur theoretischen Linguistik 11). München: Fink.
Beckner, Clay, Richard Blythe, Joan Bybee, Morten H. Christiansen, William Croft, Nick C. Ellis, John Holland, Jinyun Ke, Diane Larsen-Freeman & Tom Schoenemann. 2009. Language is a complex adaptive system: Position paper. Language Learning 59 (Suppl. 1). 1–26.
Berg, Thomas. 2009. Structure in language: A dynamic perspective (Routledge Studies in Linguistics 10). New York: Routledge.
Berg, Thomas. 2012. The cohesiveness of English and German compounds. The Mental Lexicon 7(1). 1–33.
Berg, Thomas. 2014. On the relationship between type and token frequency. Journal of Quantitative Linguistics 21(3). 199–222.
Berg, Thomas. 2015. Locating affixes on the lexicon-grammar continuum. Cognitive Linguistic Studies 2(1). 150–180.
Berg, Thomas, Sabine Helmer, Marion Neubauer & Arne Lohmann. 2012. Determinants of the extent of compound use: A contrastive analysis. Linguistics 50(2). 269–303.
Berlage, Eva. forthcoming. The semantic-syntactic specialization of composite predicates in English. Cambridge: Cambridge University Press.
Biber, Douglas & Susan Conrad. 2009. Register, genre, and style. Cambridge: Cambridge University Press.
Biber, Douglas, Edward Finegan & Dwight Atkinson. 1994. ARCHER and its challenges: Compiling and exploring a representative corpus of historical English registers. In Udo Fries, Gunnel Tottie & Peter Schneider (eds.), Creating and using English language corpora: Papers from the fourteenth international conference on English language research on computerized corpora, Zürich 1993, 1–13. Amsterdam/Atlanta: Rodopi.
Biese, Yrjö M. J. 1941. Origin and development of conversions in English (Annales Academiae Scientiarum Fennicae B 45.2). Helsinki: Suomalaisen Kirjallisuuden Seuran Kirjapainon.
Blumenthal-Dramé, Alice. 2012. Entrenchment in usage-based theories: What corpus data do and do not reveal about the mind (Topics in English Linguistics 83). Berlin/Boston: De Gruyter Mouton.
Bolinger, Dwight. 1971. The phrasal verb in English. Cambridge: Harvard University Press.
Britnell, Richard. 2013. Uses of French language in medieval English towns. In Jocelyn Wogan-Browne (ed.), Language and culture in medieval Britain: The French of England, c.1100–c.1500, 81–89. Woodbridge: York Medieval Press.
Burnley, David. 1992. Lexis and semantics. In Norman F. Blake (ed.), The Cambridge history of the English language, vol. II: 1066–1476, 409–499. Cambridge: Cambridge University Press.
Bybee, Joan L. 1985. Morphology: A study of the relation between meaning and form (Typological Studies in Language 9). Amsterdam/Philadelphia: John Benjamins.
Bybee, Joan L. 1995. Diachronic and typological properties of morphology and their implications for representation. In Laurie Beth Feldman (ed.), Morphological aspects of language processing, 225–246. Hillsdale: Lawrence Erlbaum.
Bybee, Joan L. 1996. Productivity, regularity and fusion: How language use affects the lexicon. In Rajendra Singh (ed.), Trubetzkoy’s orphan: Proceedings of the Montréal roundtable “Morphonology: Contemporary responses” (Current Issues in Linguistic Theory 144), 247–269. Amsterdam/Philadelphia: John Benjamins.
Bybee, Joan L. 1998. The emergent lexicon. In Catherine M. Gruber, Derrick Higgins, Kenneth S. Olson & Tamra Wysocki (eds.), Proceedings of the Chicago Linguistic Society 34: The panels, 421–435. Chicago: Chicago Linguistic Society.
Bybee, Joan L. 2006. From usage to grammar: The mind’s response to repetition. Language 82(4). 711–733.
Bybee, Joan L. 2008. Usage-based grammar and second language acquisition. In Peter Robinson & Nick C. Ellis (eds.), Handbook of Cognitive Linguistics and second language acquisition, 216–236. New York: Routledge.
Bybee, Joan L. 2010. Language, usage and cognition. Cambridge/New York: Cambridge University Press.
Bybee, Joan L. 2015. Language change. Cambridge: Cambridge University Press.
Bybee, Joan L. & Clay Beckner. 2010. Usage-based theory. In Bernd Heine & Heiko Narrog (eds.), The Oxford handbook of linguistic analysis, 827–855. Oxford/New York: Oxford University Press.
Bybee, Joan L. & Clay Beckner. 2015. Language use, cognitive processes and linguistic change. In Claire Bowern & Bethwyn Evans (eds.), The Routledge handbook of historical linguistics, 503–518. London: Routledge.
Carroll, Robert M. & Lena A. Nordholm. 1975. Sampling characteristics of Kelley’s ε² and Hays’ ω². Educational and Psychological Measurement 35(3). 541–554.
Comrie, Bernard. 1989. Language universals and linguistic typology: Syntax and morphology (2nd edn.). Oxford: Blackwell.
Cowie, Claire. 1999. Diachronic word-formation: A corpus-based study of derived nominalizations in the history of English. Cambridge: University of Cambridge dissertation.
Cowie, Claire. 2012. Early Modern English: Morphology. In Alexander Bergs & Laurel J. Brinton (eds.), English historical linguistics: An international handbook, vol. I (Handbücher zur Sprach- und Kommunikationswissenschaft/Handbooks of Linguistics and Communication Science 34.1), 604–620. Berlin/Boston: De Gruyter Mouton.
Cowie, Claire & Christiane Dalton-Puffer. 2002. Diachronic word-formation and studying changes in productivity over time: Theoretical and methodological considerations. In Javier E. Díaz Vera (ed.), A changing world of words: Studies in English historical lexicography, lexicology and semantics (Costerus New Series 141), 410–437. Amsterdam/New York: Rodopi.
Croft, William. 2003. Typology and universals (2nd edn.). Cambridge: Cambridge University Press.
Cutler, Anne. 1981. Degrees of transparency in word formation. Canadian Journal of Linguistics/Revue canadienne de linguistique 26(1). 73–77.
Dalton-Puffer, Christiane. 1996. The French influence on Middle English morphology: A corpus-based study of derivation (Topics in English Linguistics 20). Berlin/New York: De Gruyter Mouton.
Danchev, Andrei. 1992. The evidence for analytic and synthetic developments in English. In Matti Rissanen, Ossi Ihalainen, Terttu Nevalainen & Irma Taavitsainen (eds.), History of Englishes: New methods and interpretations in historical linguistics (Topics in English Linguistics 10), 25–41. Berlin/Boston: De Gruyter Mouton.
Deeming, Helen. 2013. French devotional texts in thirteenth-century preachers’ anthologies. In Jocelyn Wogan-Browne (ed.), Language and culture in medieval Britain: The French of England, c.1100–c.1500, 254–265. Woodbridge: York Medieval Press.
Dekeyser, Xavier. 1986. Romance loans in Middle English: A re-assessment. In Dieter Kastovsky & Aleksander Szwedek (eds.), Linguistics across historical and geographical boundaries, vol. I: Linguistic theory and historical linguistics (Trends in Linguistics. Studies and Monographs 32), 253–265. Berlin/New York: De Gruyter Mouton.
Dietz, Klaus. 2002. Lexikalischer Transfer und Wortbildung am Beispiel des französischen Lehngutes im Mittelenglischen. In Mechthild Habermann, Peter O. Müller & Horst Haider Munske (eds.), Historische Wortbildung des Deutschen (Reihe Germanistische Linguistik 232), 381–405. Tübingen: Max Niemeyer.
Dietz, Klaus. 2015a. Foreign word-formation in English. In Peter O. Müller, Ingeborg Ohnheiser, Susan Olsen & Franz Rainer (eds.), Word-formation: An international handbook of the languages of Europe, vol. III (Handbücher zur Sprach- und Kommunikationswissenschaft/Handbooks of Linguistics and Communication Science 40.3), 1637–1660. Berlin/München/Boston: De Gruyter Mouton.
Dietz, Klaus. 2015b. Historical word-formation in English. In Peter O. Müller, Ingeborg Ohnheiser, Susan Olsen & Franz Rainer (eds.), Word-formation: An international handbook of the languages of Europe, vol. III (Handbücher zur Sprach- und Kommunikationswissenschaft/Handbooks of Linguistics and Communication Science 40.3), 1914–1930. Berlin/München/Boston: De Gruyter Mouton.
Divjak, Dagmar. 2019. Frequency in language: Memory, attention and learning. Cambridge/New York: Cambridge University Press.
Dixon, Robert M. W. 1997. The rise and fall of languages. Cambridge/New York: Cambridge University Press.
Dixon, Robert M. W. 2014. Making new words: Morphological derivation in English. Oxford: Oxford University Press.
Dobson, Eric J. 1957. English pronunciation 1500–1700. 2 vols. Oxford: Clarendon.
Döring, Nicola & Jürgen Bortz. 2016. Forschungsmethoden und Evaluation in den Sozial- und Humanwissenschaften (5th edn.). Berlin/Heidelberg: Springer.
Driver, Martha W. 2013. ‘Me fault faire’: French makers of manuscripts for English patrons. In Jocelyn Wogan-Browne (ed.), Language and culture in medieval Britain: The French of England, c.1100–c.1500, 420–443. Woodbridge: York Medieval Press.
Durkin, Philip. 2009. The Oxford guide to etymology. Oxford/New York: Oxford University Press.
Durkin, Philip. 2014. Borrowed words: A history of loanwords in English. Oxford: Oxford University Press.
Durkin, Philip. 2016. The OED and HTOED as tools in practical research: A test case examining the impact of loanwords on areas of the core lexicon. In Merja Kytö & Päivi Pahta (eds.), The Cambridge handbook of English historical linguistics, 390–406. Cambridge: Cambridge University Press.
Ellis, Nick C. 2017. Salience in language usage, learning and change. In Marianne Hundt, Sandra Mollin & Simone E. Pfenninger (eds.), The changing English language: Psycholinguistic perspectives, 71–92. Cambridge: Cambridge University Press.
Gabelentz, Georg von der. 1891. Die Sprachwissenschaft, ihre Aufgaben, Methoden und bisherigen Ergebnisse. Leipzig: T. O. Weigel Nachfolger.
Gadde, Fredrik. 1910. On the history and use of the suffixes -ery (-ry), -age and -ment in English. Lund: Gleerupska University dissertation.
Gardner, Anne-Christine. 2014. Derivation in Middle English: Regional and text type variation (Mémoires de la Société Néophilologique de Helsinki 92). Helsinki: Société Néophilologique.
Gelderen, Elly van. 2016. Cyclical change continued: Introduction. In Elly van Gelderen (ed.), Cyclical change continued (Linguistik Aktuell/Linguistics Today 227), 3–17. Amsterdam: John Benjamins.
Gentner, Dedre. 1981. Some interesting differences between verbs and nouns. Cognition and Brain Theory 4(2). 161–178.
Giegerich, Heinz. 2009. Compounding and lexicalism. In Rochelle Lieber & Pavol Štekauer (eds.), The Oxford handbook of compounding, 178–200. Oxford: Oxford University Press.
Goldberg, Adele E. 2009. The nature of generalization in language. Cognitive Linguistics 20(1). 93–127.
Goldstone, Robert L. 1996. Isolated and interrelated concepts. Memory & Cognition 24(5). 608–628.
Grant, Anthony P. 2009. Loanwords in British English. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world’s languages: A comparative handbook, 360–383. Berlin: De Gruyter Mouton.
Green, Monica H. 2013. Salerno on the Thames: The genesis of Anglo-Norman medical literature. In Jocelyn Wogan-Browne (ed.), Language and culture in medieval Britain: The French of England, c.1100–c.1500, 220–231. Woodbridge: York Medieval Press.
Greenberg, Joseph H. 1960. A quantitative approach to the morphological typology of language. International Journal of American Linguistics 26(3). 178–194.
Greenberg, Joseph H. 1963. Some universals of grammar with particular reference to the order of meaningful elements. In Joseph H. Greenberg (ed.), Universals of language: Report of a conference held at Dobbs Ferry, NY, April 13-15, 1961, 58–90. Cambridge, MA: MIT Press.
Hagège, Claude. 1990. Do the classical morphological types have clear-cut limits? In Wolfgang U. Dressler, Hans C. Luschützky, Oskar E. Pfeiffer & John R. Rennison (eds.), Contemporary morphology (Trends in Linguistics. Studies and Monographs 49), 297–308. Berlin/New York: De Gruyter Mouton.
Hartmann, Stefan. 2020. Language change and language evolution: Cousins, siblings, twins? Glottotheory 11(1). 1–25.
Haselow, Alexander. 2011. Typological changes in the lexicon: Analytic tendencies in English noun formation (Topics in English Linguistics 72). Berlin/New York: De Gruyter Mouton.
Haselow, Alexander. 2012a. A typological view on the development of English derivation. English Studies 93(2). 203–226.
Haselow, Alexander. 2012b. Lexical typology and typological changes in the English lexicon. In Terttu Nevalainen & Elizabeth C. Traugott (eds.), The Oxford handbook of the history of English, 643–653. Oxford/New York: Oxford University Press.
Haspelmath, Martin. 2009a. An empirical test of the agglutination hypothesis. In Sergio Scalise, Elisabetta Magni & Antonietta Bisetto (eds.), Universals of language today (Studies in Natural Language and Linguistic Theory 76), 13–29. Dordrecht: Springer.
Haspelmath, Martin. 2009b. Lexical borrowing: Concepts and issues. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world’s languages: A comparative handbook, 35–54. Berlin: De Gruyter Mouton.
Haspelmath, Martin. 2018. Revisiting the anasynthetic spiral. In Heiko Narrog & Bernd Heine (eds.), Grammaticalization from a typological perspective, 97–115. Oxford/New York: Oxford University Press.
Hawkins, John A. 2019. Word-external properties in a typology of Modern English: A comparison with German. English Language and Linguistics 23(3). 701–721.
Hay, Jennifer. 2002. From speech perception to morphology: Affix ordering revisited. Language 78(3). 527–555.
Hodge, Carleton T. 1970. The linguistic cycle. Language Sciences 13. 1–7.
Hogg, Richard M. 1992. Phonology and morphology. In Richard M. Hogg (ed.), The Cambridge history of the English language, vol. I: Beginnings to 1066, 67–167. Cambridge: Cambridge University Press.
Huddleston, Rodney D. & Geoffrey K. Pullum. 2002. The Cambridge grammar of the English language. Cambridge: Cambridge University Press.
Ingham, Richard. 2013. The persistence of Anglo-Norman 1230–1362: A linguistic perspective. In Jocelyn Wogan-Browne (ed.), Language and culture in medieval Britain: The French of England, c.1100–c.1500, 44–54. Woodbridge: York Medieval Press.
Jespersen, Otto. 1917. Negation in English and other languages. Copenhagen: A. F. Høst.
Jespersen, Otto. [1922] 1969. Language: Its nature, development and origin. London: Allen & Unwin.
Kastovsky, Dieter. 1985. Deverbal nouns in Old and Modern English: From stem-formation to word-formation. In Jacek Fisiak (ed.), Historical semantics, historical word-formation: International conference on historical semantics and historical word-formation, Błażejewko (Poland) 1984 (Trends in Linguistics. Studies and Monographs 29), 221–261. Berlin/New York: De Gruyter Mouton.
Kastovsky, Dieter. 1992a. Semantics and vocabulary. In Richard M. Hogg (ed.), The Cambridge history of the English language, vol. I: Beginnings to 1066, 290–408. Cambridge: Cambridge University Press.
Kastovsky, Dieter. 1992b. Typological reorientation as a result of level interaction: The case of English. In Günter Kellermann & Michael D. Morrissey (eds.), Diachrony within Synchrony, Language History and Cognition: Papers from the International Symposium at the University of Duisburg, 26–28 March 1990 (Duisburger Arbeiten zur Sprach- und Kulturwissenschaft 14), 411–428. Frankfurt am Main: Peter Lang.
Kastovsky, Dieter. 2006a. Historical morphology from a typological point of view. In Terttu Nevalainen, Juhani Klemola & Mikko Laitinen (eds.), Types of variation: Diachronic, dialectal and typological interfaces (Studies in Language Companion Series 76), 53–80. Amsterdam: John Benjamins.
Kastovsky, Dieter. 2006b. Typological changes in derivational morphology. In Ans van Kemenade & Bettelou Los (eds.), The handbook of the history of English, 151–176. Malden/Oxford: Blackwell.
Keller, Rudi. 1989. Invisible-hand theory and language evolution. Lingua 77(2). 113–127.
Kempf, Luise & Stefan Hartmann. 2018. Schema unification and morphological productivity: A diachronic perspective. In Geert E. Booij (ed.), The construction of words: Advances in Construction Morphology (Studies in Morphology 4), 441–474. Cham: Springer International Publishing.
Koefoed, Geert & Jaap van Marle. 2004. Fundamental concepts. In Geert E. Booij, Christian Lehmann, Joachim Mugdan & Stavros Skopeteas (eds.), Morphology: An international handbook on inflection and word-formation, vol. II (Handbücher zur Sprach- und Kommunikationswissenschaft/Handbooks of Linguistics and Communication Science 17.2), 1574–1589. Berlin/New York: De Gruyter Mouton.
Kortmann, Bernd. 2012. Typology and typological change in English historical linguistics. In Terttu Nevalainen & Elizabeth C. Traugott (eds.), The Oxford handbook of the history of English, 605–621. Oxford/New York: Oxford University Press.
Kowaleski, Maryanne. 2013. The French of England: A maritime lingua franca? In Jocelyn Wogan-Browne (ed.), Language and culture in medieval Britain: The French of England, c.1100–c.1500, 103–117. Woodbridge: York Medieval Press.
Kytö, Merja. 1999. Collocational and idiomatic aspects of verbs in early Modern English. In Laurel J. Brinton & Minoji Akimoto (eds.), Collocational and idiomatic aspects of composite predicates in the history of English, 167–206. Amsterdam/Philadelphia: John Benjamins.
Labov, William. 1994. Principles of linguistic change, vol. I: Internal factors (Language in Society 20). Malden/Oxford: Blackwell.
Labov, William. 2001. Principles of linguistic change, vol. II: Social factors (Language in Society 29). Malden/Oxford: Blackwell.
Langacker, Ronald W. 2008. Cognitive grammar: A basic introduction. Oxford/New York: Oxford University Press.
Lass, Roger. 1992. Phonology and morphology. In Norman F. Blake (ed.), The Cambridge history of the English language, vol. II: 1066–1476, 23–155. Cambridge: Cambridge University Press.
Libben, Gary. 2015. Word-formation in psycholinguistics and neurocognitive research. In Peter O. Müller, Ingeborg Ohnheiser, Susan Olsen & Franz Rainer (eds.), Word-formation: An international handbook of the languages of Europe, vol. I (Handbücher zur Sprach- und Kommunikationswissenschaft/Handbooks of Linguistics and Communication Science 40.1), 203–217. Berlin/München/Boston: De Gruyter Mouton.
Lloyd, Cynthia. 2011. Semantics and word formation: The semantic development of five French suffixes in Middle English (Studies in Historical Linguistics 6). Frankfurt am Main: Peter Lang.
Lusignan, Serge. 2013. French language in contact with English: Social context and linguistic change (mid-13th–14th centuries). In Jocelyn Wogan-Browne (ed.), Language and culture in medieval Britain: The French of England, c.1100–c.1500, 19–30. Woodbridge: York Medieval Press.
Marchand, Hans. 1964. A set of criteria for the establishing of derivational relationship between words unmarked by derivational morphemes. Indogermanische Forschungen 69(1). 10–19.
Marchand, Hans. 1969. The categories and types of present-day English word-formation: A synchronic-diachronic approach (2nd edn.). München: Beck.
Marslen-Wilson, William, Lorraine K. Tyler, Rachelle Waksler & Lianne Older. 1994. Morphology and meaning in the English mental lexicon. Psychological Review 101(1). 3–33.
Marvin, Julia. 2013. The vitality of Anglo-Norman in late medieval England: The case of the prose Brut chronicle. In Jocelyn Wogan-Browne (ed.), Language and culture in medieval Britain: The French of England, c.1100–c.1500, 303–319. Woodbridge: York Medieval Press.
Miller, D. Gary. 2012. External influences on English: From its beginnings to the Renaissance. Oxford: Oxford University Press.
Murray, James A. H. 1888. Preface. In James A. H. Murray (ed.), A new English dictionary on historical principles, vol. I: A and B, i–xiv. Oxford: Clarendon.
Nevalainen, Terttu. 1999. Early Modern English lexis and semantics. In Roger Lass (ed.), The Cambridge history of the English language, vol. III: 1476–1776, 332–458. Cambridge: Cambridge University Press.
Nielsen, Hans F. 2005. From dialect to standard: English in England 1154–1776. Odense: University Press of Southern Denmark.
Ormrod, W. Mark. 2003. The use of English: Language, law, and political culture in fourteenth-century England. Speculum 78(3). 750–787.
Ormrod, W. Mark. 2013. The language of complaint: Multilingualism and petitioning in later medieval England. In Jocelyn Wogan-Browne (ed.), Language and culture in medieval Britain: The French of England, c.1100–c.1500, 31–43. Woodbridge: York Medieval Press.
Palmer, Chris C. 2009. Borrowings, derivational morphology, and perceived productivity in English, 1300–1600. Ann Arbor: University of Michigan dissertation.
Payne, Thomas E. 2017. Morphological typology. In Alexandra Y. Aikhenvald & Robert M. W. Dixon (eds.), The Cambridge handbook of linguistic typology, 78–94. Cambridge: Cambridge University Press.
Petti, Anthony G. 1977. English literary hands from Chaucer to Dryden. London: Arnold.
Prins, Anton A. 1959. French influence in English phrasing: A supplement. English Studies 40. 27–32.
Prins, Anton A. 1960. French influence in English phrasing (continued). English Studies 41. 1–17.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech & Jan Svartvik. 1985. A comprehensive grammar of the English language. London/New York: Longman.
Rodríguez-Puente, Paula. 2019. The English phrasal verb, 1650–present: History, stylistic drifts, and lexicalisation. Cambridge: Cambridge University Press.
Rosenbach, Anette. 2006. Descriptive genitives in English: A case study on constructional gradience. English Language and Linguistics 10(1). 77–118.
Rothwell, William. 2001. English and French in England after 1362. English Studies 82(6). 539–559.
Rothwell, William. 2006. Anglo-French and the Anglo-Norman Dictionary. https://anglo-norman.net/anglo-french (last accessed October 10, 2023).
Sapir, Edward. 1921. Language: An introduction to the study of speech. New York: Harcourt, Brace.
Sarnowsky, Jürgen. 2002. England im Mittelalter. Darmstadt: Primus.
Sauer, Hans. 1992. Nominalkomposita im Frühmittelenglischen: Mit Ausblicken auf die Geschichte der englischen Nominalkomposition. Tübingen: Max Niemeyer.
Saussure, Ferdinand de. [1916] 1959. Course in general linguistics. Edited by Charles Bally & Albert Sechehaye, translated by Wade Baskin. New York: Philosophical Library.
Serjeantson, Mary S. 1936. A history of foreign words in English. New York: Dutton.
Sheldon, Esther K. 1946. Pronouncing systems in eighteenth-century dictionaries. Language 22(1). 27–41.
Slobin, Dan I. 1977. Language change in childhood and in history. In John Macnamara (ed.), Language learning and thought, 185–214. New York: Academic Press.
Stenroos, Merja. 2017. Perspectives on geographical variation. In Laurel J. Brinton (ed.), English historical linguistics: Approaches and perspectives, 303–331. Cambridge: Cambridge University Press.
Szmrecsanyi, Benedikt. 2012. Analyticity and syntheticity in the history of English. In Terttu Nevalainen & Elizabeth C. Traugott (eds.), The Oxford handbook of the history of English, 654–665. Oxford/New York: Oxford University Press.
Szmrecsanyi, Benedikt. 2016. An analytic-synthetic spiral in the history of English. In Elly van Gelderen (ed.), Cyclical change continued (Linguistik Aktuell/Linguistics Today 227), 93–112. Amsterdam: John Benjamins.
Taavitsainen, Irma. 2016. Genre dynamics in the history of English. In Merja Kytö & Päivi Pahta (eds.), The Cambridge handbook of English historical linguistics, 271–285. Cambridge: Cambridge University Press.
Taavitsainen, Irma & Päivi Pahta. 1997. Corpus of Early English medical writing 1375–1750. ICAME Journal 21. 71–78.
Talmy, Leonard. 2011. Cognitive semantics: An overview. In Claudia Maienborn, Klaus von Heusinger & Paul Portner (eds.), Semantics: An international handbook of natural language meaning, vol. I (Handbücher zur Sprach- und Kommunikationswissenschaft/Handbooks of Linguistics and Communication Science 33.1), 622–642. Berlin/Boston: De Gruyter Mouton.
Tauli, Valter. 1945/49. Morphological analysis and synthesis. Acta Linguistica 5(2). 80–85.
Tauli, Valter. 1958. The structural tendencies of languages, vol. I: General tendencies. Helsinki: Suomalainen Tiedeakatemia.
Teddiman, Laura. 2012. Conversion and the lexicon: Comparing evidence from corpora and experimentation. In Dagmar Divjak & Stefan Th. Gries (eds.), Frequency effects in language representation, vol. II (Trends in Linguistics. Studies and Monographs 244.2), 235–254. Berlin/Boston: De Gruyter Mouton.
Thomason, Sarah G. & Terrence Kaufman. 1988. Language contact, creolization, and genetic linguistics. Berkeley: University of California Press.
Timofeeva, Olga & Richard Ingham. 2018. Introduction. Special issue on mechanisms of French contact influence in Middle English: Diffusion and maintenance. English Language and Linguistics 22(2). 197–205.
Traugott, Elizabeth C. 1999. A historical overview of complex predicate types. In Laurel J. Brinton & Minoji Akimoto (eds.), Collocational and idiomatic aspects of composite predicates in the history of English, 239–260. Amsterdam/Philadelphia: John Benjamins.
Vachek, Joseph. 1961. Some less familiar aspects of the analytical trend of English. Brno Studies in English 3. 9–74.
Wasserstein, Ronald L. & Nicole A. Lazar. 2016. The ASA [American Statistical Association] statement on p-values: Context, process, and purpose. The American Statistician 70(2). 129–133.
Watson, Nicholas. 2013. Lollardy: The Anglo-Norman heresy? In Jocelyn Wogan-Browne (ed.), Language and culture in medieval Britain: The French of England, c.1100–c.1500, 334–346. Woodbridge: York Medieval Press.
Wierzbicka, Anna. 1990. The meaning of color terms: Semantics, culture, and cognition. Cognitive Linguistics 1(1). 99–150.
Winter, Bodo & Martine Grice. 2021. Independence and generalizability in linguistics. Linguistics 59(5). 1251–1277.
Wogan-Browne, Jocelyn. 2013. General introduction. What’s in a name: The ‘French’ of ‘England’. In Jocelyn Wogan-Browne (ed.), Language and culture in medieval Britain: The French of England, c.1100–c.1500, 1–13. Woodbridge: York Medieval Press.
Appendices

E1. Corpus configuration
E2. Files included from ARCHER 3.1
E3. Distribution of old and new types across registers

Electronic Appendices E1 to E3 are available at https://www.degruyter.com/document/isbn/9783111317717/html

A1. Unclear cases and miscellaneously derived new nouns per century

[Table A1. Columns: centuries. Rows: types and tokens for unclear cases, alteration, blending, commonization, imitative, initialism, suffix secretion and univerbation, followed by type totals (in % of all new types) and token totals (in % of all new tokens). Cell values are not preserved.]
https://doi.org/10.1515/9783111317717-018
A2. Germanic affixation: Distribution of new types (new tokens)

[Table A2. Columns: centuries; each cell gives new types, with new tokens in parentheses. Prefix types: back-; by-; fore-; gain-; half-; in-; kine-; mis-; on-; out-; over-; self-; to-; un-; under-; up-. Suffix types: -ard; -ase; -cade; -dom; -en; -end; -er, -ar, -or; -ful; -hood; -ild; -ing; -le, -el; -ling; -lock; -ness; -red; -ship; -th; -y, -ie. Cell values are not preserved.]
A3. Romance affixation: Distribution of new types (new tokens)

[Table A3. Columns: centuries; each cell gives new types, with new tokens in parentheses. Prefix types: co-; counter-; ex-; hyper-; mal-; non-; pseudo-; semi-; vice-. Suffix types: -acy; -al; -an; -ance, -ence; -ancy, -ency; -arian; -ary; -ation; -eer; -ess; -ette; -ier; -ine; -ism; -ist; -ity; -ment; -osis; -rel; -ry; -ure. Cell values are not preserved.]
A4. Germanic nominal affixes

Prefix          Example                     Variant form    D✶   M    OED
after-          after-effect                                +    −    +
back-           backdoor                                    +    −    o
by-             bypath                                      +    −    o
ed-             edlen ‘reward’                              −    −    +
fore-           foremast                                    +    +    +
gain-           gainclap                    again-          −    o    +
half-           half-circle                                 +    −    o
in-             in-patient                                  +    −    −
kine-           kine-ring                                   −    −    −
mid-            midday                                      +    +    −
mis-            misdeed                                     +    +    +
off-            offshoot                                    +    −    +
on-             onsight                                     +    −    +
out-            outhouse                                    +    −    +
over-           overtime                                    +    −    +
self-           self-control                                +    −    +
step-           stepson                                     +    +    −
through-        thoroughfare                thorough-       +    −    +
to-             tocome ‘arrival’                            −    −    +
twi-            twilight                                    −    +    −
un-             unfaith                                     +    +    +
under-          undersheriff                                +    −    +
up-             upfloor                                     +    −    +
wan-            wanhope ‘hopelessness’                      −    −    +
with-           withstander ‘opponent’                      −    o    +

Suffix          Example                     Variant form    D    M    OED
-al             chloral                                     −    −    +
-ard            bastard                                     +    o    +
-ase            transaminase                                −    −    +
-cade           motorcade                                   −    o    +
-dom            chiefdom                                    +    +    +
-en             stitchen ‘small piece’                      −    −    +
-en             wolfen ‘she-wolf’                           −    −    +
-end            helpend ‘helper’                            −    −    +
-er, -ar, -or   accuser                                     +    +    o
-ful            cartful                                     +    +    +
-hood           boyhood                     -head           o    +    o
-ild            cheapild ‘female trader’                    −    −    +
-ing            feeding                                     +    +    +
-ing            lording                                     +    +    +
-le, -el        sparkle                     -els            −    o    +
-ling           underling                                   +    +    +
-lock           ferlac ‘fear, terror’       -laik           −    −    o
-ness           briskness                                   +    +    +
-ock            hillock ‘little hill’                       −    −    +
-red            hatred                                      −    −    +
-ship           drunkship                                   +    +    +
-ster           washester ‘female washer’                   +    +    +
-th             growth                      -t              +    o    +
-y, -ie         sweetie                                     +    +    +

✶ D = Dixon (2014), M = Marchand (1969), OED = Oxford English Dictionary
+ = listed as nominal affix, − = not listed as nominal affix, o = listed as nominal affix but differing with respect to variant forms and/or origin
A5. Romance nominal affixes

Prefix          Example                     Variant form            D    M    OED
ante-           antechamber                                         +    +    +
anti-           anti-Americanism                                    +    +    +
arch-           archbishop                                          +    +    +
auto-           autocar                                             o    +    −
bi-             bicycle                                             +    +    −
circum-         circumstance                                        +    +    +
co-             co-worker                                           +    +    +
counter-        counter-irritation                                  +    +    +
demi-           demigod                                             +    +    +
dis-            discomfort                                          +    +    +
dys-            dysmenorrhœa                                        −    −    +
epi-            epicycle                                            +    +    +
ex-             ex-smoker                                           +    +    +
hemi-           hemisphere                                          +    −    +
hyper-          hypertension                                        +    +    +
hypo-           hypoglycaemia                                       +    +    +
in-             injustice                   im-, il-, ir-           +    +    +
inter-          intermarriage                                       +    +    +
mal-            malfunction                                         +    +    +
meta-           metaphysics                                         +    +    +
micro-          microorganism                                       +    +    −
mono-           monotheism                                          +    +    −
multi-          multipara                                           +    +    −
non-            non-actor                                           +    +    +
para-           paranymph                                           o    +    +
para-           parasol                                             o    −    +
peri-           peristome                                           +    +    +
pre-            presupposition                                      +    +    +
pseudo-         pseudopolycythaemia                                 +    +    −
re-             republication                                       +    +    +
semi-           semicircle                                          +    +    +
sub-            sub-chief                                           +    +    +
super-          superinvolution                                     +    +    +
sur-            surname                                             +    +    +
trans-          transaction                                         +    +    +
tri-            triangle                                            +    +    −
ultra-          ultraviolet                                         +    +    +
vice-           vice-admiral                                        +    +    +

Suffix          Example                     Variant form            D    M    OED
-acy            supremacy                                           o    +    +
-age            carriage                                            +    +    +
-al             arrival                                             +    +    +
-an             African, civilian           -ian                    o    +    +
-ance, -ence    acceptance                                          o    +    o
-ancy, -ency    expectancy                                          o    +    o
-ant, -ent      assistant                                           +    +    +
-arian          parliamentarian                                     +    +    +
-ary            missionary                                          +    −    +
-ate            magistrate                                          +    +    +
-ate            chlorate                                            +    +    +
-ation          formation, classification, assumption, absolution   -ion, -ication, -tion, -ition, -ption, -ution   +    o    o
-cy             prophecy                                            o    +    o
-ee             assignee                                            +    +    +
-eer            privateer                                           +    +    +
-ese            Japanese                                            +    +    +
-ess            servantess                                          +    +    +
-et             tablet                                              −    +    +
-ette           cigarette                                           +    +    +
-ice            service                                             −    −    +
-ide            chloride                                            −    −    +
-ier            glazier                                             o    o    +
-ine            creatinine                  -in, -ane, -ene, -one   −    +    o
-ism            cynicism                                            +    +    +
-ist            baptist                                             +    +    +
-itude          longitude                   -ude                    −    +    o
-ity            similarity                  -ety, -ty               +    o    o
-ium            calcium                                             −    −    +
-let            booklet                                             +    +    +
-ment           treatment                                           +    +    +
-ory            depository                  -atory                  +    −    +
-osis           cyanosis                                            −    −    +
-rel            puckerel                    -erel                   +    +    +
-ry             outlawry                    -ery                    +    +    +
-ure            closure                     -ature, -iture          +    o    o
-y              jealousy, assembly                                  +    −    o
A6. Model lexemes for conversion, compounding, Germanic and Romance affixations by origin (type frequency)

[Table A6. Columns: centuries. Rows: model lexemes for each process (conversion, compounding, Gmc affixation, Rom affixation), broken down by origin (inherited, borrowed < Rom, other processes, unclear). Cell values and section numbers are not preserved.]
A7. Model nouns reflecting minor compound patterns by word class of constituents

Pattern     Example
PREP + N    proportion
PN + N      Indiaman
PN + CF     saxophone
N + ADV     passer-by
ADV + N     welfare
N + PP      man-of-war
N + V       bloodshed
V + ADV     feedback
N + ADJ     president general
N + PN      herb Robert

[The per-century counts for each pattern are not preserved.]
A8. Germanic affixation: Distribution of model noun types

[Table A8. Columns: centuries. Prefix types: after-; back-; by-; ed-; fore-; gain-; half-; in-; kine-; mid-; mis-; off-; on-; out-; over-; self-; step-; through-; to-; twi-; un-; under-; up-; wan-; with-. Suffix types: -al; -ard; -ase; -cade; -dom; -en; -en; -end; -er, -ar, -or; -ful; -hood; -ild; -ing; -ing; -le, -el; -ling; -lock; -ness; -ock; -red; -ship; -ster; -th; -y, -ie. Cell values are not preserved.]
A9. Romance affixation: Distribution of model noun types

[Table A9. Columns: centuries. Prefix types: ante-; anti-; arch-; auto-; bi-; circum-; co-; counter-; demi-; dis-; dys-; epi-; ex-; hemi-; hyper-; hypo-; in-; inter-; mal-; meta-; micro-; mono-; multi-; non-; para-; para-; peri-; pre-; pseudo-; re-; semi-; sub-; super-; sur-; trans-; tri-; ultra-; vice-. Suffix types: -acy; -age; -al; -an; -ance; -ancy; -ant; -arian; -ary; -ate; -ate; -ation; -cy; -ee; -eer; -ese; -ess; -et; -ette; -ice; -ide; -ier; -ine; -ism; -ist; -itude; -ity; -ium; -let; -ment; -ory; -osis; -rel; -ry; -ure; -y. Cell values are not preserved.]
A10. Additional Germanic derivational affixes

Prefix      Example                                          Variant form    D    M    OED
a-          askew ‘slanted’, angin ‘begin’                   an-, on-        +    +    +
and-        and-sware ‘answer’                                               −    −    +
be-         bedevil, birewe ‘feel sorry for’                 bi-             +    +    +
for-        forgive                                                          −    −    +
of-         ofthink ‘be displeasing to’, ‘grieve, regret’    a-              −    −    +
or-         ortrow ‘distrust’, arise                         a-              −    −    +
to-         to-tread ‘trample down’                                          −    −    +
umbe-       umbethink ‘think about’                                          −    −    +
wither-     witherwin ‘enemy, adversary’                                     −    −    +

Suffix      Example                                          Variant form    D    M    OED
-ed         shrewd, clear-headed                                             +    +    +
-en         woolen                                                           +    +    +
-en         fasten, harden                                                   +    +    +
-ern        southern                                                         +    +    +
-fold       manifold                                                         +    +    +
-ing        living (adj.)                                                    −    −    +
-ish        childish                                                         +    +    +
-leche      neighleche ‘approach, draw near’                                 −    −    +
-less       heartless                                                        +    +    +
-ly         friendly                                                         +    +    +
-some       lovesome                                                         +    +    +
-th         fifth                                                            +    −    +
-ward       leeward (adv.)                                                   +    +    +
-wise       shedwise ‘in the manner of a shed’                               +    −    −
-y          greedy                                                           +    +    +
A11. Additional Romance derivational affixes

Prefix      Example                                   Variant form      D    M    OED
a-          atheist                                                     +    +    +
de-         demerit, degrade                                            +    +    +
di-         dihydrogen                                                  +    +    −
en-         enlarge, emplaster                        em-               +    +    +
ex-         excommunicate, export                                       +    −    +
in-         incorporate, imprison                     im-               +    +    +
juxta-      juxtaposition                                               −    −    +
omni-       omnibus                                                     +    +    −
pro-        progenitor                                                  +    +    +
uni-        uniform                                                     +    +    −

Suffix      Example                                   Variant form      D    M    OED
-able       acceptable, digestible                    -ible             +    +    +
-al         formal, potential, spiritual, familiar    -ial, -ual, -ar   +    o    +
-ase        transferase                                                 +    −    +
-ate        alienate                                                    +    +    +
-ic         periodic, schismatic                      -atic, -tic       +    +    +
-fy         classify                                  -ify              +    +    +
-ine        adamentine (adj.)                                           +    +    +
-ise        merchandise                                                 −    −    +
-ite        favorite                                                    +    +    +
-ive        creative, preservative, competitive       -ative, -itive    +    o    +
-ize        normalize                                                   +    +    +
-oid        deltoid                                                     −    −    +
-ol         imidazole                                 -ole              −    o    +
-ous        grievous, courteous, licentious           -eous, -ious      +    +    +
-sis        analysis                                                    −    −    +
A12. Initial and final combining forms

Initial CF      Example
aero-           aeroplane
agri-           agriculture
angio-          angiogram
anthropo-       anthropoid
arthro-         arthritis
astro-          astrology
bacterio-       bacteriology
benzo-          benzimidazole
bio-            biodegradation
calc-           calcite
carbo-          carbohydrate
cardio-         cardioplegia
centi-          centimeter
chloro-         chloroform
chromate-       chromatography
chrono-         chronology
cortico-        corticosteroid
cosmo-          cosmogony
cross-          cross-examination
cyano-          cyanometer
cyto-           cytoprotection
endo-           endometrium
entero-         enterocoele
equi-           equinox
ethno-          ethnology
geo-            geology
h(a)emato-      haemoglobin
heli-           helicopter
hexa-           hexane
hydro-          hydrogen
hystero-        hysterectomy
kilo-           kilometer
magni-          magnify, magnitude
mano-           manometer
manu-           manuscript
masto-          mastoid
meno-           menopause
metro-          metrorrhagia
milli-          milliliter
myo-            myocardium
myxo-           myxoma
neuro-          neuralgia
normo-          normotensive
oculo-          ocular
oo-             oosperm
ortho-          orthography
oxy-            oxygen
p(a)edo-        pedagogue
para-           paratrooper
patri-          patriarch
pedi-           pedal
penta-          pentagastrin
petro-          petroleum
pharmaco-       pharmacopoeia
photo-          photograph
pleni-          plenipotentiary
poly-           polygamy
pneumo-         pneumococcus
porno-          pornography
porte-          porte-crayon
psycho-         psychoanalysis
quadru-         quadrangle
radio-          radioimmunoassay
rheo-           rheophore
sarco-          sarcoma
septi-          septangle
septo-          septostomy
steno-          stenographer
strepto-        streptococcus
tachy-          tachycardia
techno-         technology
tele-           telescope
thermo-         thermometer
thoraco-        thoracotomy
thrombo-        thrombogenesis
topo-           topography
tracheo-        tracheostomy
ventriculo-     ventriculotomy

Final CF        Example
-(a)emia        septicaemia
-agogue         pedagogue
-algia          neuralgia
-arch           heresiarch
-cardia         tachycardia
-cardium        myocardium
-coele          enterocoele
-cracy          autocracy
-ectomy         myomectomy
-facient        rubefacient
-form           chloroform
-gamy           monogamy
-gen            fibrinogen
-gony           cosmogony
-gram           telegram
-graph          telegraph
-graphy         geography
-itis           arthritis
-meter          parameter
-metrium        endometrium
-metry          geometry
-nomy           astronomy
-olatry         idolatry
-ology          phraseology
-oma, -ome      carcinoma, genome
-pathy          antipathy
-ped            quadruped
-phobia         photophobia
-phone          saxophone
-phore          rheophore
-phorous        electrophorus
-plegia         cardioplegia
-p(o)eia        pharmacopoeia
-rrhagia        metrorrhagia
-scape          landscape
-scope          telescope
-stomy          tracheostomy
-tomy           thoracotomy
Index

adjective 58, 60, 66, 116, 197
affix 61, 112, 115, 167, 170, 173, 202 see also Germanic affix; Romance affix
– inventory 26, 34, 83, 137, 215
– semantics 63, 84, 156, 170, 201, 214, 215
affixation 15, 61, 62, 112
– Germanic see Germanic affixation
– Romance see Romance affixation
agglutinating language see language subtype
agglutination 15, 167, 176, 177, 178 see also typological technique
allomorphy 15
– affix 35, 65, 88–90, 118, 166, 171
– base see base variation
analogical extension 208, 212, 214, 220
analogical leveling 208, 211, 214
analogy 6, 43, 101, 102–103, 139, 143, 214, 215, 221
analytic language see language type
analyticity 13, 16, 19, 36, 147, 165, 167, 202, 204 see also grammar; lexicon
analyticity index 22, 179, 181–184, 186, 222
analyzability 85, 89, 92, 104, 106, 114, 121 see also transparency
base 156
– morphological status 24, 25, 35, 36, 65, 87–88, 104, 113, 116, 158, 166–167, 170, 172
– variation 16, 35, 65, 166, 170–171
blending 203
borrowing 33, 37, 54, 55–57, 138, 146, 189, 213, 220 see also contact situation
– core borrowing 210
– cultural borrowing 210
– impact 34–35, 36, 74, 82, 127
– new nouns 70, 73–75
bound morpheme 61, 64, 112, 116, 172, 173 see also affix; base
categorization 102, 208
category see word class
chunking 208
cognitive approach 1, 96, 199, 206
https://doi.org/10.1515/9783111317717-019
combining form 61–62, 64, 79, 111, 112, 118, 130, 158, 171
complex adaptive system 211
complex word see word structure
compound 20, 48, 193, 202
– combining form see combining form
– constituents 38, 48, 78, 105, 110–111, 115, 130, 131
– genitive 48, 60, 110
– left-headed 111, 131
– modifier 47, 60, 78, 110, 130
– neoclassical 61–62, 111–113, 130
compounding 38–39, 59–62, 109, 115, 147, 209, 214, 220
– model nouns 105, 110–113, 121, 129–132
– neoclassical 61, 79, 132
– new nouns 70, 77–79, 130
contact situation 26–34, 53, 55, 71, 213
conversion 38, 57, 107, 140, 146, 169, 215–217, 219
– adjective 38, 58–59, 77, 107, 108, 124, 127, 216–217
– model nouns 105, 108, 121, 126–129
– new nouns 71, 75–77, 127
– verb 38, 57, 77, 107, 123, 126, 128, 216–217
cumulative exponence 14, 15, 24
derivation see morphology; word formation process
– correlative 66, 116, 158
Dixon’s clock 21–22, 153, 163, 175, 176, 186
domain-general mechanisms 2, 102, 207–209, 210
entrenchment 90, 92 see also token frequency
exemplar 90, 93, 124, 207, 208
expressiveness 210, 214
extender 118
folk etymology see reanalysis
frequency
– token see token frequency
– type see type frequency
fusion 14, 15, 20, 24, 147, 149, 162, 163–176 see also typological technique
fusion index 167–169, 173, 175–176, 187, 221
fusional language see language subtype
Germanic affix 63, 80
– inventory 80, 83, 113, 137, 155
– prefix 63, 81, 115, 133–134, 170, 193
– suffix 63, 81, 133–134
Germanic affixation 34, 37, 62–64, 115, 147, 213, 220
– model nouns 105, 113–115, 122, 132–135
– new nouns 70, 79–81, 134
gradience 2, 17, 18, 37, 92, 103, 105, 162, 176, 199, 204
grammar 1, 37, 178, 181, 183, 206, 211
– analyticity 1, 22–23, 181
– inflection see morphology
– syntheticity 22–23, 179–181
– typological development 19, 177, 184, 197–198, 213
hybrids 55–56
inflection see morphology
isolating language see language subtype
isolation 176, 177 see also typological technique
language change 19, 33, 199, 209 see also domain-general mechanisms; sociocultural factors
– and usage 3, 206, 207, 212
– invisible hand 206, 211
language subtype 17, 161, 163, 176, 186, 187, 189 see also typological technique
– agglutinating 14, 15, 17, 18, 162
– fusional 14, 15, 17, 18, 25, 162
– isolating 14, 39
language type 159, 206
– analytic 2, 19, 24, 37, 38, 140, 161
– synthetic 19, 24, 37
lexical strata 24–25, 35–36
lexicon 1, 37, 178, 211
– analyticity 26, 148, 149, 179–185, 189, 196, 198, 213, 221
– syntheticity 149, 158, 159, 179–185, 189, 196, 198, 221
– typological development 26, 36, 37, 70, 97–101, 140–143, 148, 160, 177, 184, 186, 195, 198, 217
lexicon extension 4, 34, 43, 70–73, 144, 188 see also borrowing; word formation process
lexicon structure 5, 187, 188, 190, 214 see also typological profile
linguistic behavior 5, 148, 188, 189, 210, 222
linking element 110, 112
loanword 2, 26, 34, 35, 36, 53, 84, 97, 160, 210 see also borrowing
meaning see semantics
memory 90, 92, 96, 97, 124, 207
mental concept
– isolated 213, 216–217
– relational 216–217
morpheme boundary 14, 15, 17, 36, 37, 38, 88, 105, 112, 119, 166, 170
morphological typology 1, 5, 11, 16, 17, 23, 199, 224
morphology 2, 34, 36
– derivational 25, 36
– global trend 197–198, 199, 223
– inflectional 58, 109, 127, 210–211
neologism see noun
neuromotor routinization 207, 212
noun 3, 49, 74
– dating 52–53, 59
– distribution old–new 5, 66–70, 217
– etymology 53–54, 125, 127–128, 131–132, 134–135, 138
– model nouns see word formation pattern
– new nouns 51, 52, 70 see also borrowing; compounding; conversion; Germanic affixation; Romance affixation
Old English 25–26, 35, 36, 58, 76, 77, 148, 160, 166, 172, 198, 214
phonological erosion 26, 57, 63
phonological variation 15–16, 25, 35, 36, 65, 87, 88, 162, 163, 166, 167, 170
phonology 34, 105, 164–165, 204
prefixation
– Germanic 80, 81, 115, 133
– Romance 82, 136
productivity 51, 125, 142
reanalysis 34, 35, 84, 209
reborrowing 57, 72, 75
register 45, 46, 53, 69, 70, 78, 117, 136, 223
Romance affix 57, 64, 82
– emergence in English 56, 84–97, 139, 209
– inventory 83, 117–118, 137, 138, 155
– prefix 136–137
– suffix 55, 85, 136–137
Romance affixation 34, 35, 37, 56, 64–66, 97, 101, 125, 173, 215, 219
– model nouns 105, 115–119, 122, 135–139
– new nouns 70, 81–83, 87, 137
schema 103
semantics 20, 49, 59, 95–96, 104, 106, 157, 202, 203
– ambiguity 200–201
– context dependence 96, 201–202
simplex word see word structure
sociocultural factors 209–210, 213, 215 see also contact situation
speech automation see neuromotor routinization
structuralist approach 1, 193, 199, 207
structure
– lexicon see lexicon structure
– noun see word structure
suffixation
– Germanic 25, 80, 81, 133
– Romance 82, 83, 136
suffixing preference 79, 81, 124, 133, 137
syntax 30, 199–200, 205
synthetic index 17, 158–161, 221
synthetic language see language type
syntheticity 13, 17, 159, 161, 165, 166, 204 see also grammar; lexicon
syntheticity index 22, 179–181, 183–184, 186, 221
text type see register
token frequency 18, 22, 49, 67, 85, 90, 153, 172, 175, 187, 197, 204, 208, 223
– extremely high frequency 90–92, 95, 124, 127, 139, 142
– higher frequency 92–94
– relative frequency 95, 107
transparency 38, 85, 87–90, 104–105, 112, 119, 155, 156, 163, 165, 214, 215 see also analyzability
type frequency 49, 68, 85–87, 125, 139, 143, 145, 187, 197, 208
type-token ratio (TTR) 49, 74, 76, 77, 79, 82, 99, 100, 123–124, 127, 142
typological development 19, 21, 23, 203, 205, 214, 222 see also grammar; lexicon
– cycle 20–23, 163, 184–185, 199, 205
– drift 19, 199, 205
– spiral 21, 163
typological profile 5, 18, 148, 153, 158, 177, 186, 189, 197, 198, 203, 204, 206, 214, 217, 222
typological technique 14, 140, 144, 186, 189
– agglutinating 14, 17, 24, 37, 38, 98, 99, 101, 141, 142, 147, 188, 217
– fusional 14, 17, 24, 37, 83, 98, 101, 141, 142, 145, 188, 217
– isolating 16, 25, 38, 98, 99, 140, 142, 146, 187, 189, 217
uniformitarian principle 104
univerbation 59, 110, 132, 208 see also chunking
usage frequency see token frequency
usage-based approach 3, 18, 71, 75, 153
verb 60, 156, 157
– composite predicates 194
– phrasal verbs 194
– typological development 196–197, 224
– verb formation 196
word class 62, 217 see also adjective; noun; verb
– closed class 57, 109, 126
– size 3, 128, 197, 215
word formation pattern 43, 102, 115, 119, 121–125, 126, 139, 143, 208, 221
– compounding see compounding
– conversion see conversion
– Germanic affixation see Germanic affixation
– model noun 102, 103–104, 119, 120–121, 125, 145, 214
– Romance affixation see Romance affixation
word formation process 4, 54, 73, 84, 101, 148
– compounding see compounding
– conversion see conversion
– Germanic affixation see Germanic affixation
– Romance affixation see Romance affixation
– typological classification 37–39
word structure 21, 101, 106, 139
– immediate constituents 61, 110, 169
– monomorphemic 76, 91, 142, 156, 168, 175, 179, 186, 193
– polymorphemic 60, 84, 155, 168, 179, 186, 193
word-internal boundary see morpheme boundary
zero-derivation 38, 58 see also conversion