The Lexis and Lexicogrammar of Sri Lankan English (Varieties of English Around the World) [UK ed.] 9027249148, 9789027249142

This book offers the first in-depth corpus-based description of written Sri Lankan English. In comparison to British and

English Pages 262 [263] Year 2015

Table of contents :
The Lexis and Lexicogrammar of Sri Lankan English
Editorial page
Title page
LCC data
Table of contents
List of figures
List of tables
List of abbreviations
Acknowledgments
1. Sri Lankan English and Sri Lankan Englishes
2. The development of Sri Lankan English
2.1 Sri Lankan English under British Colonial Rule (1796–1948)
2.2 Sri Lankan English in the Postcolonial Era (1948–2010)
2.3 Sri Lankan English: The State of the Debate
2.3.1 Sociocultural considerations
2.3.2 Sociolinguistic considerations
2.3.3 Sociopolitical considerations
3. Methodology
3.1 The corpus environment
3.1.1 The International Corpus of English
3.1.2 The newspaper corpora
3.1.3 The Google Advanced Search Tool
3.2 Indicators of structural nativisation in Sri Lankan English
4. Sri Lankan English lexis
4.1 Formality markers
4.1.1 Formality markers: Frequency
4.1.2 Formality markers: Genre-specificity
4.1.3 Formality markers: Case studies
4.2 Pan-South Asian English lexemes
4.2.1 Pan-South Asian English lexemes: Frequency
4.2.2 Pan-South Asian English lexemes: Genre-specificity
4.2.3 Pan-South Asian English lexemes: Case studies
4.3 Archaism markers
4.3.1 Archaism markers: Frequency
4.3.2 Archaism markers: Case studies
4.4 Sri Lankan English lexis: An overview
5. Sri Lankan English lexicogrammar
5.1 Particle verbs
5.1.1 Particle verbs: Frequency
5.1.2 Particle verbs: Genre-specificity
5.1.3 Particle verbs: Unrecorded particle verbs
5.2 Light-verb constructions
5.2.1 Light-verb constructions: Frequency, genre-specificity and types
5.2.2 Light-verb constructions: Potentially innovative light-verb constructions
5.3 Verb complementation
5.3.1 Verb Complementation: HATE
5.3.2 Verb Complementation: LIKE
5.3.3 Verb Complementation: LOVE
5.4 Sri Lankan English lexicogrammar: An overview
6. A model of (the emergence of) distinctive structural profiles of semiautonomous varieties of English
References
Appendix
Index

Author / Uploaded
Tobias Bernaisch

The Lexis and Lexicogrammar of Sri Lankan English

Varieties of English Around the World (VEAW) issn 0172-7362

A monograph series devoted to sociolinguistic research, surveys and annotated text collections. The VEAW series is divided into two parts: a text series contains carefully selected specimens of Englishes documenting the coexistence of regional, social, stylistic and diachronic varieties in a particular region; and a general series which contains outstanding studies in the field, collections of papers devoted to one region or written by one scholar, bibliographies and other reference works. For an overview of all books published in this series, please see http://benjamins.com/catalog/veaw

Editor
Stephanie Hackert, University of Munich (LMU)

Editorial Board
Manfred Görlach, Cologne
Rajend Mesthrie, University of Cape Town
Peter L. Patrick, University of Essex
Edgar W. Schneider, University of Regensburg
Peter Trudgill, University of Fribourg
Walt Wolfram, North Carolina State University

Volume G54 The Lexis and Lexicogrammar of Sri Lankan English by Tobias Bernaisch

The Lexis and Lexicogrammar of Sri Lankan English Tobias Bernaisch Justus Liebig University Giessen

John Benjamins Publishing Company Amsterdam / Philadelphia

The paper used in this publication meets the minimum requirements of the American National Standard for Information Sciences – Permanence of Paper for Printed Library Materials, ansi z39.48-1984.

doi 10.1075/veaw.g54 Cataloging-in-Publication Data available from Library of Congress: lccn 2015020904 (print) / 2015021693 (e-book) isbn 978 90 272 4914 2 (Hb) isbn 978 90 272 6822 8 (e-book)

© 2015 – John Benjamins B.V. No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publisher. John Benjamins Publishing Co. · https://benjamins.com

List of figures
Figure 1. A political map of Sri Lanka
Figure 2. The SLE dialect continuum
Figure 3. The corpus environment of the present study
Figure 4. Mean normalised (pmw) frequencies of formality markers in the ICE, newspaper and GAST data
Figure 5. The factor scores of the spoken and written ICE genres for Factor 1: interactive casual discourse vs. informative elaborate discourse
Figure 6. Normalised (pmw) and absolute frequencies of formality markers in the genres of ICE-SL, ICE-IND and ICE-GB
Figure 7. Association plot of formality markers in the genres of ICE-SL, ICE-IND and ICE-GB
Figure 8. Relative and absolute frequencies of formality markers in social and business letters in ICE
Figure 9. Mean normalised (pmw) frequencies of PSA lexemes in the ICE, newspaper and GAST data
Figure 10. Normalised (pmw) and absolute frequencies of PSA lexemes in the genres of ICE-SL and ICE-IND
Figure 11. Relative and absolute frequencies of lakh and its numeric/lexical alternatives in the newspaper data
Figure 12. Relative and absolute frequencies of tank and reservoir in the newspaper data
Figure 13. Mean normalised (pmw) frequencies of archaism markers in the ICE, newspaper and GAST data
Figure 14. Relative and absolute frequencies of madam and its alternatives in the correspondence (W1B) sections in ICE
Figure 15. Relative and absolute frequencies of sir and its alternatives in the correspondence (W1B) sections in ICE
Figure 16. Relative and absolute frequencies of HAIL from + place/family/other in the newspaper data
Figure 17. Normalised (pmw) and absolute frequencies of PVs in the ICE and newspaper data
Figure 18. Normalised (pmw) and absolute frequencies of PVUs, PVOUs and PVOFs in the genres of ICE-SL, ICE-IND and ICE-GB
Figure 19. Relative and absolute frequencies of COPE with and COPE up with in the ICE, newspaper and GAST data
Figure 20. Normalised (pmw) and absolute frequencies of unrecorded PVOUs in the newspaper and GAST data
Figure 21. Relative and absolute frequencies of LEASE and LEASE out in the newspaper data
Figure 22. Relative and absolute frequencies of WAIVE and WAIVE off in the newspaper data
Figure 23. Relative and absolute frequencies of BOAST of and BOAST off in the newspaper and GAST data
Figure 24. Relative and absolute frequencies of DISPOSE of and DISPOSE off in the newspaper and GAST data
Figure 25. Normalised (pmw) and absolute frequencies of LVCs with GIVE, HAVE, PUT and TAKE in the ICE and newspaper data
Figure 26. Association plot of LVCs with GIVE, HAVE, PUT and TAKE in ICE
Figure 27. Normalised (pmw) and absolute frequencies of LVCs in the genres of ICE-SL, ICE-IND and ICE-GB
Figure 28. Relative and absolute frequencies of article variants in LVCs in the ICE and newspaper data
Figure 29. Relative and absolute frequencies of LVCs with call in GAST
Figure 30. Relative and absolute frequencies of LVCs with nap in GAST
Figure 31. Relative and absolute frequencies of LVCs with rest in GAST
Figure 32. Relative and absolute frequencies of HATE Ving and HATE to V in the ICE, newspaper and GAST data
Figure 33. Relative and absolute frequencies of LIKE Ving and LIKE to V in the ICE, newspaper and GAST data
Figure 34. Relative and absolute frequencies of LOVE Ving and LOVE to V in the ICE, newspaper and GAST data
Figure 35. Multiple language contact situations of PCEs (SLE)
Figure 36. A model (of the emergence) of distinctive structural profiles of semiautonomous varieties of English

List of tables
Table 1. Endonormative stabilisation of present-day SLE (adapted from Mukherjee 2008: 360)
Table 2. The ICE corpus design for written texts (cf. Greenbaum & Nelson 1996: 13–14)
Table 3. The sizes and sources of the newspaper corpora
Table 4. Estimates of the number of English words in the top-level country domains (as on 23 June 2011)
Table 5. Limitations in the use of web data
Table 6. An overview of the objects of investigation of the present study
Table 7. Absolute and normalised (pmw) frequencies of formality markers in ICE
Table 8. Absolute and normalised (pmw) frequencies of formality markers in the newspaper data
Table 9. Absolute and normalised (pmw) frequencies of formality markers in GAST
Table 10. Absolute and relative frequencies of definite numeral and other premodification of persons in the ICE and newspaper data
Table 11. Absolute and normalised (pmw) frequencies of refrigerator and fridge in GAST
Table 12. Absolute and normalised (pmw) frequencies of PSA lexemes in ICE
Table 13. Absolute and normalised (pmw) frequencies of PSA lexemes in the newspaper data
Table 14. Mean absolute and normalised (pmw) frequencies of PSA lexemes in GAST
Table 15. Absolute and normalised (pmw) frequencies of three PSA lexemes in the genres of ICE-SL
Table 16. Absolute and normalised (pmw) frequencies of three PSA lexemes in the genres of ICE-IND
Table 17. Absolute and normalised (pmw) frequencies of three usage patterns of rupee in GAST
Table 18. Absolute and normalised (pmw) frequencies of archaism markers in ICE
Table 19. Absolute and normalised (pmw) frequencies of archaism markers in the newspaper data
Table 20. Absolute and normalised (pmw) frequencies of archaism markers in GAST
Table 21. Absolute and normalised (pmw) frequencies of three usage patterns of that + name of a person + fellow in GAST
Table 22. Absolute and normalised (pmw) frequencies of patterns of hails from in GAST
Table 23. Absolute and relative frequencies of CARRY out, PERFORM and CONDUCT in ICE
Table 24. DP values of PVs in ICE
Table 25. Absolute and relative frequencies of PVUs in social and business letters in ICE
Table 26. Absolute and relative frequencies of PVOUs in social and business letters in ICE
Table 27. Absolute and normalised (pmw) frequencies of RILE up and GLOW up in GAST
Table 28. Absolute and normalised (pmw) frequencies of WAIVE off in GAST
Table 29. Absolute and relative frequencies of LVCs in social and business letters in ICE
Table 30. Absolute and normalised (pmw) frequencies of HAVE a(n)/the/Ø glimpse, TAKE a(n)/the/Ø benefit from and TAKE a(n)/the/Ø lease in the newspaper data
Table 31. Absolute and normalised (pmw) frequencies of HAVE a glimpse, TAKE Ø benefit from and TAKE a lease in GAST
Table 32. Absolute and relative frequencies of LVCs with call in the newspaper data

List of abbreviations
abs. freq. = absolute frequency
BNC news = news section of the British National Corpus
BrE = British English
BSE = British Standard English
cf. = confer
CPVD = Collins COBUILD Phrasal Verbs Dictionary
DP = deviation of proportions
e.g. = exempli gratia
ed. = editor
eds = editors
EFL = English as a foreign language
ENL = English as a native language
ESL = English as a second language
et al. = et alii
etc. = et cetera
GAST = Google Advanced Search Tool
GAST-GB = British English online data
GAST-IND = Indian English online data
GAST-SL = Sri Lankan English online data
GloWbE = Corpus of Global Web-based English
i.e. = id est
ICE = International Corpus of English
ICE-GB = written part of the British component of the International Corpus of English
ICE-IND = written part of the Indian component of the International Corpus of English
ICE-SL = written part of the Sri Lankan component of the International Corpus of English
IndE = Indian English
LPVD = Longman Phrasal Verbs Dictionary
LTTE = Liberation Tigers of Tamil Eelam
LV(s) = light-verb(s)
LVC(s) = light-verb construction(s)
MEPs = Members of the European Parliament
norm. freq. = normalised frequency
OED = Oxford English Dictionary
PCE(s) = postcolonial English(es)
pmw = per million words
PSA = pan-South Asian English
PV(s) = particle verb(s)
PVOF(s) = particle verb(s) with off
PVOU(s) = particle verb(s) with out
PVU(s) = particle verb(s) with up
rel. freq. = relative frequency
SAVE = South Asian Varieties of English (Corpus)
SAVE-IND = Indian component of the South Asian Varieties of English Corpus
SAVE-SL = Sri Lankan component of the South Asian Varieties of English Corpus
SLE = Sri Lankan English
SLFP = Sri Lanka Freedom Party
SLICELT = Sri Lanka-India Centre of English Language Training
SLWE = Sri Lankan writing in English
SP = Sessional Paper
UNP = United National Party

Acknowledgments

More people than I can reasonably mention here have facilitated the writing of this book on so many different levels and I would like to take this opportunity to express my gratitude to at least some of them. Since we first met, Joybrato Mukherjee has repeatedly opened doors I did not know existed for me and his and Magnus Huber’s constructive and critical feedback on earlier versions of this text has provided me with invaluable sources of inspiration on how to improve this manuscript. I had the privilege of learning a lot about Indian English and more generally South Asian Englishes from Claudia Lange – also via first-hand experience during a research trip with her and Christopher and Natascha Koch in the spring of 2012. Carolin Biewer often gave me valuable food for thought on postcolonial Englishes not only when we studied the development of English in Hong Kong on site together. I would also like to express my sincerest thanks to the members of the project team working on the Sri Lankan component of the International Corpus of English for having made and still making the exploration of Sri Lankan English such an incredibly fascinating experience. I am particularly indebted to Dushyanthi Mendis and Shariya Dilini Algama for making Janina Werner and me feel at home and introducing us to their colleagues, friends and the Sri Lankan way of life in the course of a research stay in 2010, where I was also lucky enough to get to know Michael Meyler, who was nothing but helpful whenever I had any questions on his dictionary of Sri Lankan English. In the preparation of this manuscript, I have also had the pleasure of working alongside great people at the Chair of English Linguistics and at the Chair of Modern English Linguistics and more generally at the Department of English at Justus Liebig University Giessen.
My deepest thanks especially go out to Stefanie Dose-Heidelmayer and Sandra Götz, my academic big sisters, for having offered their opinions and guidance on many different aspects of academic (as well as non-academic) life. I would also like to thank Stefan Th. Gries for introducing me to the wonderful world of R and Marco Schilk for many discussions on verb complementation and topics way beyond. I am also maximally appreciative of Stephanie Hackert’s detailed comments and suggestions on drafts of this text that were salient factors in writing a better book and I would like to thank her and Kees Vaes for being the editors they are. Further, I am grateful for Andrew Liston’s stylistic feedback on the manuscript.

The German Academic Exchange Service (DAAD) and the International Graduate Centre for the Study of Culture (GCSC) provided funding for research trips and related conference presentations, which I highly appreciate. I am also truly thankful for having been and being surrounded by a great number of fun and caring people when I am not at a desk. I do not want to mention a selected few here because you know who you are anyway. Whom I would like to mention as explicitly as possible, however, is my soon-to-be wife Franziska Hegele. I could not have asked for a better partner. Finally, I would like to thank my parents. Although I know they would never want me to repay them in any way for everything they have done for me, I would at the very least like to dedicate this book to them.

Chapter 1. Sri Lankan English and Sri Lankan Englishes

South Asia certainly is one of the linguistically most diverse regions around the globe. While it is the large number of indigenous languages and varieties thereof that comes to mind as a source of this diversity first (and probably the number of languages in India in particular), the emergence of varieties of languages historically non-native to South Asia has also greatly contributed to its lectal diversification. This holds true for Southeast Asian languages such as Malay, for which a localised variant in Sri Lanka has already been systematically described (cf. Ansaldo 2008), but also for the English language. India was the first South Asian country where English – at the beginning mainly for trade between the East India Company and local Indian merchants – came to be used on a routine basis from the end of the 16th century onwards, making Indian English (IndE) the oldest South Asian variety of English. With the split of Pakistan from India in 1947 and the subsequent split of Bangladesh from Pakistan in 1971, two relatively young South Asian Englishes with what must be assumed to be strong IndE legacies added to the diversity of English in South Asia. Also in the Maldives and Nepal, English has played and is still playing an important role mainly for economic reasons although it has at no point been subject to similar degrees of institutionalisation as in India given that neither Nepal nor the Maldives were under direct British colonial rule. English arrived in Sri Lanka, which also became part of the British Empire at a later stage, towards the end of the 18th century, i.e. approximately two centuries later than in India, but underwent comparable institutionalisation processes via missionary schools and language policies under British administration.
It is, in conjunction with other factors, these distinctive (partly colonial) sociohistorical conditions that have led to the emergence of a set of distinct South Asian Englishes, among which IndE – at times even equated with the English language in South Asia in its entirety (cf. Kachru passim; Strevens 1980: 86–87) – has been given scholarly prominence as reflected in a relatively ample body of introspective and descriptive literature (cf. e.g. Nihalani et al. 2004; Balasubramanian 2009; Sedlatschek 2009; Schilk 2011; Lange 2012). Among the varieties of English outside India, Sri Lankan English (SLE) deserves particular attention since it is the only variety which has been and is institutionalised to a degree comparable to that



of India, i.e. in terms of its use in formal public domains such as education and administration as well as in terms of its codification (cf. Meyler 2007), without being a direct descendant of IndE itself such as Pakistani or Bangladeshi English. The tropical island of Sri Lanka is located in the Indian Ocean off the southeast coast of India with an area of 65,610 km².¹ Apart from India, which is separated from Sri Lanka by the Gulf of Mannar and the Palk Strait, a strip of shallow water, the countries in closest physical proximity are the Maldives to the west, the Nicobar Islands to the east and the Andaman Islands to the northeast. The total population of 19.8 million people lives in nine provinces and 24 districts that stretch over five major geographic regions as illustrated in Figure 1: “the central highlands, the well-watered southwest, the drier east and southeast, the northern lowland, and the coastal belt” (Coperahewa 2009: 70). Owing to the stimulating effect of an economic shift from agriculture to a more liberalised open economy, Sri Lanka, despite its comparably low per capita income, is currently considered a middle-income country with high ratings in quality of life indices based on life expectancy, infant mortality and literacy. In 2009, 14% of the population spoke English. Governmental and private institutions as well as associations offering classes in English cluster in urban areas such as Colombo and Kandy, where the quality of teaching is presumably at the highest local standard. In other regions, however, there is a general lack of well-trained teachers. Sri Lanka is an ethnically diverse country. According to the 2001 census, the Sinhalese comprising 82% of the Sri Lankan population represent the largest ethnic group and have a strong connection to Buddhism (cf. 〈http://www.statistics.gov.lk/PopHouSat/PDF/Population/p9p8%20Ethnicity.pdf〉 (17 October 2014)).
The second largest ethnic group – the Tamils – is comprised of Sri Lanka Tamils (4.3%) and Indian Tamils (5.1%). In addition to some characteristics of outer appearance, language is the main indicator of ethnic belonging in that the Sinhalese speak Sinhala (or Sinhalese) and the Tamils Tamil as their respective first language. Sinhala is a member of the Indo-Aryan language family and thus “related to such modern Indian languages as Hindi, Bengali, Gujarati, Marathi, Punjabi, and Kashmiri” (Coperahewa 2009: 75). Sinhala and Tamil have some features in common although Tamil, a Dravidian language, has a history distinct from that of Sinhala. Tamil, which is also used in South India, stems linguistically from the same source as other modern South Indian languages like Malayalam, Telugu or Kannada. As

1. The information on Sri Lanka in terms of geography and population as well as the figures on proficiency in English and ethnic groups in Sri Lanka are based on Coperahewa (cf. 2009: 70–96).

Figure 1. A political map of Sri Lanka (taken from 〈http://geology.com/world/sri-lanka-map.gif〉 (17 October 2014)). Map © 2007 Geology.com.

Tamil is present both in southern India and Sri Lanka, it used to play a key role in Sri Lankan trade and business. Further large ethnic groups in Sri Lanka are the Sri Lankan Muslims and the Burghers. The Sri Lankan Muslims can be subdivided into five groups, i.e. the Sri Lanka/Ceylon Moors, the Coast/Indian Moors, the Malays, the Memons and the Borahs. Due to the linguistic diversity of this ethnic group, which speaks Tamil, Malay and English, ethnic cohesiveness among the Sri Lankan Muslims is not achieved via language, but via religion.





The Burgher population is an ethnic group of European descent, which can be considered to be a reflection of Sri Lanka’s history of colonisation. As the island was occupied by the Portuguese and later by the Dutch from the beginning of the 16th century till the end of the 18th century, Portuguese and Dutch settlers came to Sri Lanka and some of them never left the island. The descendants of these settlers are referred to as the Dutch Burghers and the Portuguese Burghers in present-day Sri Lanka. The Burgher population, comprising 35,283 people in the 2001 census, is characterised by the adoption of English as the language of the home, and even before the British controlled the island, Burghers used English in private settings, which is why they obtained powerful positions during British administration. Two central aspects, which have been alluded to in the course of this short introduction to Sri Lanka, remain to be emphasised here. With the Sri Lankan Muslims as an exception, it can be observed that, first, ethnicity and language are inextricably intertwined, which is of utmost importance to fully grasp the developments in politics and language planning and policy in Sri Lanka under and after British rule as described in detail in Chapter 2. Second, because of this ethnolinguistic constellation, the Sri Lankan linguistic landscape needs to be understood as a volatile linguistic equilibrium with Sinhala, Tamil and English as the three major forces where shifting this equilibrium in favour of a particular language often means privileging one ethnic group to the disadvantage of the others. Given the unique linguistic ecology characterised by an unparalleled situation of multiple language contact in which English has developed in Sri Lanka for more than two centuries, it is to be expected that this uniqueness also finds reflection in the structures of the Sri Lankan variant of English, which have so far, however, not been systematically explored. 
Much of the literature available up to now on the features of SLE have been largely impressionistic accounts not supported by representative samples of speakers […] or by corpus data that reflect syntactic and grammatical language in use across a range of genres. (Mendis & Rambukwella 2010: 186)

This bird’s-eye view on studies on SLE adequately captures the dominance of intuition-based approaches to this South Asian English and highlights the peripheral role empirical investigations have played in its description. However, this lack of quantitative research into SLE on the basis of authentic text material is certainly also related to the only recently completed compilation of representative corpus data, which has been available for other varieties in Asia for some time already. It may thus not come as much of a surprise that in comparison to varieties such as Singapore English (cf. e.g. Lim 2004) or Hong Kong English (cf. e.g. Bolton 2002) representing different Southeast Asian Englishes and IndE (cf. e.g. Sedlatschek 2009; Schilk 2011), SLE has not yet received similar scholarly attention.

It is against this background that this study provides a first in-depth empirical description of written acrolectal SLE in terms of its lexis and lexicogrammar across a number of structural features on the basis of a complex corpus environment, which pursues three central and closely intertwined objectives. First, the formal structures of SLE need to be investigated in order to establish to what extent SLE can be regarded as a full-fledged South Asian variety of English with a variety-specific structural profile evident on different levels of language organisation. Second, the (earlier) classification of SLE as a mere variant of IndE (cf. Kachru passim; Strevens 1980: 86–87) as well as third, the inter alia local perception of SLE as an interlanguage or learner language (cf. Passé 1950: 133; Gunesekera 2000: 112) have to be critically revisited in the light of authentic text material. In this context, the emergence of linguistic structures characteristic of varieties in postcolonial settings is also closely scrutinised to theoretically establish and exemplify the relevance of additional principles and mechanisms complementary to the notion of structural nativisation as put forward by Schneider (2003, 2007) in the development of variety-specific structural profiles. In this synchronic corpus-based study, SLE is compared to IndE, a neighbour variety which has been identified as a linguistic epicentre for South Asia and can thus be assumed to exert epicentral influences on (the structures of) the varieties surrounding it (cf. Leitner 1992: 225), and British English (BrE), the historical input variety of SLE. The corpus data stem from the respective components of the International Corpus of English (ICE) and large-scale newspaper corpora. Online data complete the corpus environment, in which frequency-based structural profiles of SLE, IndE and BrE are contrastively delineated.
As there is a relatively diverse selection of labels which have been assigned to various kinds of uses of English in Sri Lanka, an overview of the terms and their respective meanings is beneficial in clearly delineating different SLEes and eventually also the variety of SLE the present study will be mainly concerned with. According to Meyler (2007: xi), “Standard Sri Lankan English”, “Lankan English”, “Singlish”, “Tamlish” and “Sinenglish” are terms referring to the English language as it is used in Sri Lanka while Passé (1943: 64) also employs the expression “Ceylonese English”. However, these terms are by no means interchangeable and the selection of one term over another is not ultimately a matter of taste. First, it needs to be pointed out that Singlish as well as Tamlish are not local variants of English, but variants of Sinhala and Tamil respectively since, for instance,

Singlish […] is as a linguistic system more Sinhala than English based i.e. Singlish would show a higher proportion of Sinhala phrases and sentences than English ones. In other words Singlish is a sub-variety of Sinhala, not a sub-variety of English. (Fernando 1977: 355)





The Lexis and Lexicogrammar of Sri Lankan English

The same relation holds between Tamlish and Tamil, which is why neither Singlish nor Tamlish can be understood as varieties of English. Furthermore, a clear difference between the concept of Standard SLE and Sinenglish needs to be stressed since the former should be considered an acrolectal variety of English whereas the latter is a mesolect (cf. Wickramasinghe 1999: 355). Ceylonese English as well as Lankan English both represent a Sri Lankan variety of English which displays localised features on the structural levels of language organisation (cf. Passé 1943: 64), but, in opposition to Standard SLE, these terms do not imply a certain degree of standardisation, social status or formality level. However, as the standards of SLE, which may not have even fully emerged yet, still lack description and, in consequence, also validation, the present study refrains from using the notion Standard SLE. The term Sri Lankan English is currently employed most frequently to refer neutrally to the Sri Lankan variant of English. It, however, implies that this variety of English is a homogeneous entity characterised by a relatively uniform set of linguistic features used across the island, but this certainly does not live up to the sociolinguistic reality of English in Sri Lanka. It is more adequate to consider SLE an umbrella term for various SLEes that have emerged as a result of the social, ethnic and demographic diversification of Sri Lanka and its speech community.

Even within a small country like Sri Lanka, and even within the relatively tiny English-speaking community, there are several sub-varieties of Sri Lankan English. Sinhalese, Tamils, Muslims and Burghers speak different varieties; Christians, Buddhists, Hindus and Muslims have their own vocabularies; the older generation speak a different language from the younger generation; and the wealthy Colombo elite (who tend to speak English as their first language) speak a different variety from the wider community (who are more likely to learn it as a second language). (Meyler 2007: x–xi)

Thus, (sub)varieties of SLE can inter alia be established along the lines of ethnicity and religion. Other factors which have fostered the emergence of additional varieties of SLE are medium, speaker age and proficiency. However, although these variation-inducing dimensions all seem relatively distinct, most of them can be traced back to what brought the English language to Sri Lanka in the first place – colonisation under British administration. The two most significant ethnic groups in Sri Lanka (both numerically and socially) are the Sinhalese and the Tamils with the former group generally acquiring Sinhalese as its first language and the latter being characterised by commonly learning Tamil first. As a consequence of this ethnic divide (and the linguistic background against which English is acquired), different varieties of SLE can be demarcated:


Sinhala speakers can drop Sinhala words into their conversation and make themselves understood, while Tamil words are less likely to be understood by non-Tamil speakers. […] Tamil speakers may do the same thing amongst themselves, but they cannot do so to the same extent outside their own language community. (Meyler 2007: xxviii–xxix)

In this context, the fact that these varieties do not exhibit a comparable degree of L1 influence deserves some more attention. The main reasons for this are the spread (and the (resulting) prominence) of the indigenous languages. As Sinhala is spoken by 74% of the Sri Lankan population (cf. Coperahewa 2009: 84–85), Sinhala terms can be expected to be understood more readily than their Tamil equivalents in English-medium discourse among Sri Lankans, thus rendering the inclusion of Sinhala loan words generally more functional than the integration of Tamil lexemes. The Sri Lankan subvariety of English with traces of Sinhalese is, according to Gunesekera (cf. 2005: 37), by and large more prominent than its Tamil counterpart and Burgher English, another ethnic variety of English with features from Portuguese Creole spoken by people of Eurasian descent in Sri Lanka. With regard to developmental prospects, it seems that [b]ecause of the given power relations between the Sinhala-speaking (and mostly Buddhist) majority and the Tamil-speaking (mostly Hindu) minority, the norm-developing potential of L1 Sinhala speakers of English is, by definition, much higher in present-day Sri Lanka. (Mukherjee 2012: 196–197)

While the Tamils predominantly cluster in the northern and eastern parts of Sri Lanka with smaller groups of Tamils also in the central areas of the island, the majority of the Sinhalese live in the remaining parts of the island. Thus, one could in fact speculate that, originating from this settlement pattern, regional subvarieties of SLE might develop as Tamil can be expected to influence English more strongly in the northern and eastern provinces whereas Sinhala might exert a stronger influence than Tamil in the southern and western territories. Thus, the ethnic varieties as posited above seem to be anchored in different areas of the island, giving rise to a potential development of regional Sri Lankan varieties of English. Ethnic and religious social group boundaries tend to coincide in Sri Lankan society (cf. de Silva 1981: 500). While the majority of the Sinhalese community is Buddhist, the Tamil population, with its cultural and religious roots in southern India, leans more towards Hinduism (cf. de Silva 1981: 352). Subsequently, it might be the case that the Sinhalese and Tamil ethnic varieties of SLE are also to some extent marked by the largely distinct religious backgrounds of their speakers. This may find reflection in the fact that the Sinhalese-influenced Sri Lankan variety of English has a tendency to incorporate Buddhist lexical items relatively






frequently, but Tamil SLE might display more lexemes from Hinduism (cf. Meyler 2007: x–xi). It is interesting to observe that, mainly due to economically motivated decisions, the British colonial administration is very much responsible for the above linguistic diversification of English by (a) introducing English to Sri Lanka in the first place and (b) unconsciously bringing what turned out to be agents of (linguistic) diversity to Sri Lanka at a later stage. In 1815, after the British had captured the Sri Lankan island in its entirety, the colonisers started “to bring in more Tamil speaking people to work in the tea, coffee and coconut plantations” (Senaratne 2009: 26), and, consequently, the number of permanent Tamil residents increased. This, as argued above, was a stimulus for the diversification of SLE, and has also led to additional long-term linguistic as well as non-linguistic effects in many social spheres. SLE is not medium-restricted as it is used in both oral and written communication (cf. Coperahewa 2009: 90–91). To mention but a few spoken and written domains, SLE is used in universities and schools, broadcasting companies, courtrooms, letters, literary works and newspapers. Nevertheless, it would be misleading to assume that the spoken and written varieties are identical. The scarcity of systematic studies contrasting spoken and written varieties of SLE due to the lack of empirical data, however, allows only tentative statements about the differences that may exist. One significant feature of SLE is a more marked difference between the written and spoken language than in standard British English. This distinction is perhaps a reflection of the gulf which exists in Sinhala and Tamil between formal written language and the everyday colloquial language. (Meyler 2007: xiv)

Notwithstanding the question whether or not the difference between spoken and written varieties is more pronounced in SLE than in BrE, which, in the absence of empirical studies, seems to be a claim rather than an observation, it can be deduced that the set of languages in a speaker’s linguistic repertoire might result in reciprocal linguistic effects (e.g. taking over medium-dependent conventions from one language to another). With regard to the distance between spoken and written texts in SLE, it has been argued that “adherence to archaic written norms” (Mendis & Rambukwella 2010: 188) as propagated by outdated school curricula and possibly induced via the indigenous languages, in particular in cases where English is acquired as a second or third language, gives the written variety of SLE its archaic flavour. In contrast to this, spoken SLE with frequent loans from the indigenous languages does not seem to stick as rigidly to exonormatively imposed standards. In consequence, the medium-dependent differences of SLE might partly be caused by the strictness with which users of SLE adhere to traditional norms of English.


Socio-political developments in the second half of the 20th century led to distinct environments in which the English language was used and taught in Sri Lanka. As a result of this, the acquisitional parameters and the degree of influence of various (exonormative) models varied across time leading to divergent speaker age-related varieties of SLE. The language spoken by today’s younger generation is very different from the language inherited from the British at independence. On the other hand, many of the features of ‘colonial’ English remain in SLE having fallen out of fashion in British English, which explains why some aspects of SLE seem outdated and formal to speakers of contemporary British English. Most significant perhaps was the ‘Sinhala only’ policy which led to English being dropped as the medium of education in Sri Lankan schools, and the emigration of most of the Burgher community and many other first-language English speakers. This resulted in what most people would regard as a general ‘lowering’ in the standard of English in the country. As Sinhala became the dominant language of education and government, its influence on SLE also increased. (Meyler 2007: xxvi–xxvii)

The generation of speakers of English in Sri Lanka who acquired and used the language from the 1940s to the 1960s were very much oriented towards BrE as a target variety and, as a direct outcome of the abolition of English in favour of Sinhala, employed English with decreasing frequency. From the 1970s onwards, however, the open market policy adopted by the government at that time and the increasing presence of American, Indian and Australian varieties of English via media and migration are, among other reasons, factors that can be held accountable for a distinct younger variety of SLE. In this context, the steady re-institutionalisation of English in the Sri Lankan education system certainly is another key aspect that promotes the adoption of English at an early age. The 1990s saw the introduction of government-sponsored interventions designed to strengthen the teaching of English in all state and private schools in which the medium of instruction was either Sinhala or Tamil. These interventions applied at all levels of the curriculum, from Grade 1 to Grade 13. Children were thus supposed to be exposed to English at a very early age. A policy of bilingual education came into practice in 2000, when English-medium instruction in science and mathematics subjects was introduced to selected schools at the secondary level (Grades 11 and 12). Around the same time, several of the faculties of arts and humanities in Sri Lanka’s universities which had either Sinhala or Tamil as a medium of instruction started considering the possibility of moving towards English-medium instruction. (Mendis & Rambukwella 2010: 185)

Thus, it is reasonable to assume that the disparate sociolinguistic sceneries in the middle of the 20th century and at the beginning of the 21st century gave and continue to give rise to distinct age-related varieties of SLE. The heavy reliance on BrE




and the decrease of usage around the 1950s and 1960s contrasts with the influx of additional models of English and a boost of English language usage and education around the new millennium resulting in a generation gap in SLE. What adds another layer of complexity to the concept of SLE is that, in contrast to the commonly expressed perspective that SLE is a second-language variety of English, several proficiency-related varieties of SLE with distinct acquisitional parameters exist alongside each other.2 [T]he picture of English in Sri Lanka or Sri Lankan English as a postcolonial institutionalised second-language variety would be an oversimplification. It is certainly true that for many competent and regular users of English in Sri Lanka, the language is an additional language besides Sinhala or Tamil as their mother tongue. And it is also true that in many structural and functional regards, the variant of English used by those speakers of English as an additional language is a classic case of an emerging New English variety in the Kachruvian ‘outer circle’. […] However, as in all other South Asian countries that once formed part of the British Empire, there is also a substantial group of speakers who only display a low proficiency in English and whose usage cannot be regarded as representing an institutionalised variety of English. What is more, there is also a distinct group of native speakers of English in Sri Lanka, including (but by no means being restricted to) the Burgher community (i.e. descendants of European colonists) […]. In essence, then, all three Kachruvian circles – relating to English as a native (ENL), second (ESL) and foreign language (EFL) respectively – are present in Sri Lanka today. (Mukherjee et al. 2010: 65)

At the same time, these proficiency-related varieties of SLE, which in certain ways correspond to a dialect continuum of acrolectal, mesolectal and basilectal SLE (cf. Strevens 1980: 67), correlate with a number of sociolinguistic parameters. The proficiency-related varieties of SLE can be regarded as the results of different constellations of the following interrelated factors: acquisitional parameters (e.g. access to and education in English), (resulting) English language competence and prestige. Acrolectal speakers of SLE generally have easy access to the English language. Either English is used as the language of the home, as is the case with, for example, a restricted number of upper class families residing in the Sri Lankan urban centres

2. The concept of English as a second language refers to those “[c]ountries […] where English is not spoken as a native language but where it has an important role as a means of communication within the country in the education system and/or the media and/or the government” (Trudgill 2003: 44). The notion of English as a second language is also related to the chronological order of language acquisition and does not necessarily imply certain or lower degrees of proficiency in English compared to the respective native language.


such as Colombo 7, where the “stereotypical residents […] are rich, Westernised, English-speaking, etc.” (Meyler 2007: 60), or competent ESL speakers formally adopt it early via the Sri Lankan education system.3 This early exposure to English generally results in a high degree of proficiency. Mainly due to the economic and political influence of the ENL speakers of Sri Lanka, the acrolectal variant of SLE is often considered most prestigious and is sometimes also referred to as “Standard Sri Lankan English” (Gunesekera 2005: 34) in the sense that it may provide a reference for which linguistic structures are locally acceptable and which are not. The further you travel – metaphorically, if not geographically – from ‘Colombo 7’, the greater is the influence of Sinhala and Tamil on the English people speak. Many Sri Lankans can claim to be bilingual in English and Sinhala or Tamil, but there are degrees of bilingualism, and many people speak English as their second language. The result is that, as well as the many Sinhala and Tamil words which have entered SLE, there are also many grammatical features which would be considered errors not only by speakers of British English, but also by educated speakers of Sri Lankan English. (Meyler 2007: xxvii)

Consequently, the mesolectal variant of SLE is clearly marked by a more visible influx of the indigenous languages, which, however, does not only lead to more structural variability. With the SLE mesolect, there is also the issue of where to draw the line between acceptable forms of SLE and mistakes stemming from limited proficiency in the English language partly caused by lack of exposure to it. For that reason, the mesolectal variety of SLE has often been referred to as “sub-standard or non-standard Sri Lankan English” (Gunesekera 2005: 36) or as an interlanguage (cf. Fernando 1985: 53), which is indicative of the lack of prestige of this variety encountered more often in the peripheral areas of the island. The nature of the distinction between the mesolectal and the basilectal variant of SLE is extremely fuzzy as both variants are marked by the influx of the indigenous languages, forms that would be regarded as errors from a prescriptive perspective and a lack of prestige. However, it may well be the case that the influence of the indigenous languages is more pronounced and that errors occur more frequently in the basilect than in the mesolect.

3. Still, the urban-rural divide in relation to access to and quality of English language teaching should not be overlooked here. In contrast to the urban centres, which generally have the highest Sri Lankan standards concerning English language teaching, there is a lack of competent English teachers in the more peripheral areas of the island (cf. Coperahewa 2009: 94). In this context, Kumarasamy (2007) shows on the basis of empirical data that rural children generally rate their proficiency in English lower than their urban counterparts.




Thus, proficiency in English seems to be the determining factor in identifying which subvariety of SLE a particular speaker uses. Kandiah elaborates that [t]he English forms and expressions issuing from the mouths and pens of Sri Lankans constitute, at all linguistic levels, a highly variegated range of linguistic elements. These extend from some incomprehensible and even, at times, apparently unsystematizable items at one end to items at the other that, except in phonology, appear to conform in most significant respects to the norms of St E [= Standard English]. The former are produced by those whose familiarity with the language is extremely limited, and they clearly have no significant contribution to make to the characterization of LnkE [= SLE] as a distinctive and viable linguistic organism. […] At first sight, it might appear that the items at the other end of the range too would, by their very faithfulness to the norms of St E have their usefulness limited from this point of view. […] [E]ven as they remind us of the original model out of which, in interaction with the native languages, LnkE emerged, they still belong in this new system. (Kandiah 1981b: 63)

The dialect continuum of SLE ranging from a comparatively unsystematic variant to a variety structurally close to, but still distinct from the notion of Standard English seems clearly related to the different proficiency levels of its speakers. However, what adds another layer of complexity to the matter is that, in addition to proficiency, contextual factors exert an influence on the choice of a particular (sub)variety of SLE in a given discourse setup and produce stylistic variation. The stereotype ‘Colombo 7’ family are likely to speak English as their first language, to send their children to international schools, to spend holidays visiting relatives abroad, and to be exposed to international English via the media, internet, etc. The English they speak is therefore likely to be closer to an international standard. They are also likely to have the ability to switch between ‘Sri Lankan mode’ and ‘international mode’ depending on the context they are in. (Meyler 2007: xxvii)

Consequently, it seems to be the case that proficiency is the prime determinant regarding the situating of a particular subvariety of SLE on the SLE dialect continuum, but contextually-triggered speech accommodation phenomena as described in the above quotation may, for example, cause speakers at the acrolectal end of the dialect continuum to resort to less acrolectal styles in given linguistic contexts. Against this background, it seems plausible that higher proficiency has the potential for a greater number of different styles. Thus, one might argue that the subvarieties on the dialect continuum of SLE encapsulate various context-dependent styles triggering a certain level of subvariety-internal heterogeneity. In line with this, it may sometimes be difficult to judge from certain pieces of textual evidence whether a speaker of SLE generally uses acrolectal or mesolectal SLE since there


are styles that belong to both the lower end of the acrolectal variant and the upper end of the mesolect. The label SLE, thus, seems to be best perceived as an umbrella term for various subvarieties of English used in Sri Lanka, which form a tripartite dialect continuum exhibiting subvariety-internal styles with the potential of the acrolect and mesolect to partly overlap.4 In accordance with this definition, Figure 2 represents the SLE dialect continuum graphically.

[Figure 2. The SLE dialect continuum. Dimensions shown: proficiency in English (high to low) and (potential for) stylistic variation. Subvarieties of SLE shown: acrolectal SLE, mesolectal SLE, basilectal SLE.]

The Sri Lankan Constitution provides a legal perspective on SLE. The de jure status of the English language in Sri Lanka is documented in the section on official languages.

Chapter IV – Language
Official Language.
18. (1) The Official Language of Sri Lanka shall be Sinhala.
(2) Tamil shall also be an official language.
(3) English shall be the link language.
(The Constitution of the Democratic Socialist Republic of Sri Lanka; 〈http://www.priu.gov.lk/Cons/1978Constitution/Chapter_04_Amd.html〉 (17 October 2014))

4. Note here that Fernando (cf. 1977: 348–349) also draws attention to fuzzy boundaries between the most proficient and second most proficient group in her classification of Sri Lankan bilinguals speaking English and Sinhala.


This piece of legislation has become a source of linguistic controversy in relation to the legal status of English in Sri Lanka. While there appears to be unanimous scholarly agreement that English is de facto an official language of Sri Lanka as evident from its usage in a wide range of formal and informal domains (cf. Künstler et al. 2009), the controversy originates from the question as to whether English should also de jure be regarded as a Sri Lankan official language. On the one hand, the fourth chapter of the Sri Lankan Constitution of 1978 and its amendments clearly list English as a link language under the heading “Official Language”, which could mean that – independent of the concrete status given to the individual languages – all languages listed under this heading should legally be considered official languages of Sri Lanka. This is the central argument of those in favour of arguing that “English was re-established as a de-jure official language” (Mukherjee 2012: 192) via this constitutional provision. On the other hand, it is obvious that, despite the constitutional heading “Official Language”, Sinhala and Tamil are explicitly defined as official languages in the respective clauses, which is evidently not the case with English and has led other scholars to the conclusion that English has “de facto status […] in administration and the judiciary” (Mendis & Rambukwella 2010: 184). Against the backdrop of these conflicting arguments, it has to remain unclear whether English is de jure an official language of Sri Lanka and only reliable legal advice will be able to resolve this issue, but the status of English as a de facto official language of Sri Lanka is undeniable.5 Given that no further explanation of the notion of English as a link language is provided in Article 18 of the constitution, this statute needs further clarification – in particular with regard to which groups of speakers English is supposed to link in the first place.
The clauses put forward in Article 22 concerned with the Sri Lankan languages of administration provide some insights as to which functions are ascribed to English.

(2) In any area where Sinhala is used as the language of administration a person other than an official acting in his official capacity, shall be entitled:
(a) to receive communications from, and to communicate and transact business with, any official in his official capacity, in either Tamil or English;

5. What is clear, however, is that this constitutional ambiguity has contributed to the perception that English is an official language of Sri Lanka in the local speech community. Mendis (2002) reports that approximately one out of ten Sinhalese and one out of three Tamil university students in Colombo think that English, Sinhala and Tamil are the official languages of Sri Lanka. In a comparable study based on 20 academics at the Open University of Sri Lanka, Raheem (2006) shows that more than every second Sinhalese informant and four out of ten Tamil informants considered English, Sinhala and Tamil to be Sri Lanka’s official languages as well.


(b) if the law recognizes his right to inspect or to obtain copies of or extracts from any official register, record, publication or other document, to obtain a copy of, or an extract from such register, record, publication or other document, or a translation thereof, as the case may be, in either Tamil or English;
(c) where a document is executed by any official for the purpose of being issued to him, to obtain such document or a translation thereof, in either Tamil or English;
(3) In any area where Tamil is used as the language of administration, a person other than an official acting in his official capacity, shall be entitled to exercise the rights, and to obtain the services, referred to in sub paragraphs (a), (b) and (c) of paragraph (2) of this Article, in Sinhala or English.
(The Constitution of the Democratic Socialist Republic of Sri Lanka; 〈http://www.priu.gov.lk/Cons/1978Constitution/Chapter_04_Amd.html〉 (17 October 2014))

Article 22 thus spells out that in regions of Sri Lanka where a person is not proficient in the local language of administration, i.e. either Sinhala or Tamil depending on the area, communication can take place in the respective other official language of administration or English. According to Mendis and Rambukwella (2010: 184), this “can certainly be read as the granting of some degree of official status or recognition to English”. In addition to that, Article 22 also provides clues in relation to the function and role of English as link language. As English can be used as a substitute alongside the respective other official language in case people are not proficient in the local language of administration, English clearly takes on the role of a medium of communication between speakers who do not share a first language – namely that of a lingua franca. Given that English is less strongly associated with any of the dominant ethnic groups in Sri Lanka than Sinhala or Tamil, one might assume that English as a link language is supposed to serve as a comparably neutral medium of communication between the two major ethnic groups in Sri Lanka (cf. Mukherjee et al. 2010: 65). While it is true that English is more neutral in terms of its connection to the Sinhalese and the Tamils than Sinhala or Tamil respectively, considering English a neutral means of communication in Sri Lanka would certainly be a delusion. As evident from the history of English in Sri Lanka depicted in Chapters 2.1 and 2.2, the English language – in particular in the first decades after its arrival at the end of the 18th century – was inextricably linked to and effectively used as a tool to spread Western values in particular via missionary schools (cf. Yogasundram 2008: 238), the remnants of which are still very much present today in the sociolinguistic association of English with a to a certain degree westernised lifestyle (cf. Meyler 2007: 60). 
From a present-day perspective, English – to the disadvantage of other local languages – is also perceived as “the most empowering language in contemporary Sri Lanka” (Coperahewa 2009: 94) since it provides access to well-paid jobs


and higher social classes. What should not be overlooked either is that, as elaborated above in this chapter, the Sinhala-based subvariety of English may be more prominent than its Tamil counterpart (cf. Gunesekera 2005: 37) under consideration of e.g. the mere number of speakers of Sinhala as a first language in contrast to Tamil. For this reason, English cannot be regarded as a neutral language in Sri Lanka since Western values and elitism – together with a certain degree of Sinhala – appear to be contemporary connotations surrounding it, but it is still probably the language which is most readily geared towards functioning as a link language from a sociocultural perspective. However, given that there is a substantial group of Sri Lankans who have limited knowledge of English (cf. Coperahewa 2009: 93), the practicality of offering and using English as a link language probably not only for administrative purposes remains to be evaluated. Acrolectal SLE as used by proficient speakers is the ideal link language. The acrolect is by definition a “variety or lect which is socially the highest, most prestigious variety in a social dialect continuum” (Trudgill 2003: 3). Thus, for Sri Lankan speakers (with distinct ethnic and linguistic backgrounds) to be on a par with each other, the acrolect is probably the most neutral and least face-threatening choice. In sum, then, it can be posited that SLE as a national variety of English is in fact a generic term for various SLEes existing alongside each other in the SLE speech community. The most central dimensions along which varieties of SLE can be established are ethnicity, religion, medium, speaker age and proficiency.
However, these dimensions are not independent from each other since, for instance, the notions of ethnicity and religion are closely intertwined in that a certain ethnic background is in many cases also indicative of a particular religious orientation, which, in turn, can be related to certain geographical regions of Sri Lanka. McArthur (2002: 330), in a similar vein, puts forward that “English in Sri Lanka has a range of subvarieties based on proficiency in its use and the language background of its users” and thus stresses the salience of proficiency and of speakers’ linguistic repertoires in the framework of SLEes again. Due to the sociolinguistic relevance of the acrolectal variety of English in Sri Lanka and the systematicity and regularity that is to be expected from it, the present study will mainly be concerned with this variety of SLE. This acrolect is adequately described as

the variety, or set of closely related varieties, which enjoys the highest social prestige. It serves as a reference system and target norm in formal situations, in the language used by people taking on a public persona (including, for example, anchorpersons in the news media), and as a model in the teaching of English […]. (Kortmann & Schneider 2008: 2)

In particular, the study at hand will focus on the written acrolectal variant of SLE as it is used in a wide array of text genres. Due to its prestige and spread, writing

of proficient users of SLE is of special interest since the written texts composed by these users (e.g. novels, newspaper articles, scientific publications) may (a) exert a standardising function given the absence of standard reference works such as variety-specific dictionaries or grammars (cf. Schilk 2011: 47) and (b) represent the variant of SLE most likely to be standardised (cf. Bernaisch et al. 2011: 1).6 After this introduction to the linguistic complexities to be encountered in conceptualising SLE and the identification of written acrolectal SLE as the object of the present study, historical and present-day perspectives on the development of SLE will be offered in Chapter 2. In the framework of Schneider’s (2003, 2007) dynamic model of the evolution of postcolonial Englishes (PCEs) and an examination of the developmental status of SLE in this model, SLE will be depicted in conjunction with perspectives on Sinhala and Tamil, both of which are closely connected to the respective ethnic groups. Chapter 2.1 will focus on SLE under British colonial rule from 1796 to 1948 and capture the changing face of, motivation behind and consequences of the British involvement in systematically institutionalising the English language in Sri Lanka. The postcolonial era after Sri Lankan independence from the British is characterised by separatist movements with the Sinhalese as well as the Tamils and this unfortunately not only sociolinguistically turbulent period marked by struggles for political power and recognition in the history of Sri Lanka will be described with a particular focus on its significance for SLE in Chapter 2.2. Chapter 2.3 will provide contemporary perspectives on (a) the identity construction of the Sri Lankan population in Chapter 2.3.1, (b) earlier research conducted into SLE in Chapter 2.3.2 and (c) current political campaigns related to SLE in Chapter 2.3.3. 
Chapter 3 will describe and critically evaluate the corpus environment and the structural objects of investigation. The corpus data consist of (a) the written Sri Lankan, Indian and British ICE components, (b) newspaper texts from Sri Lanka, India and Great Britain and (c) online data retrieved via Google-based searches in the respective country domains. The structure of the data as well as its assets and drawbacks relevant to the study at hand will be shown in Chapter 3.1. In said corpus environment, three lexical and three lexicogrammatical features will be scrutinised and they are presented in Chapter 3.2. The lexis of SLE will be analysed since the occurrence of certain individual lexemes or groups of lexemes in written texts (e.g. place names, proper nouns, references to food, etc.) provides comparatively overt indications of text provenance, which

6. The term SLE will primarily refer to this written SLE acrolect – especially in Chapters 4, 5 and 6 presenting empirical findings and a resulting model – while the term may on occasion also less restrictively denote English in Sri Lanka, in particular when earlier studies on SLE by various scholars with different definitions of SLE are under discussion.

is why vocabulary-related studies can be expected to produce relevant insights into variety-specific structural differentiation on the word level. The three groups of lexical items examined are formality markers, pan-South Asian English lexemes and archaism markers and each of these lexical groups will be described in Chapter 3.2. Chapter 3.2 also presents the lexicogrammatical features examined, i.e. particle verbs, light-verb constructions and verb-complementational patterns of HATE, LIKE and LOVE, with which it can generally be expected that variety-specific differences will manifest themselves mainly in disparate quantitative preferences and not in truly categorical differences (cf. Mukherjee 2007: 175). The analytical chapters are organised according to the individual lexical and lexicogrammatical objects of investigation. Chapter 4 covers the lexical analyses and depicts the frequency, genre-specificity and relevant case studies of formality markers in Chapter 4.1, pan-South Asian English lexemes in Chapter 4.2 and archaism markers in Chapter 4.3. The lexicogrammatical perspectives on SLE also including frequency- and genre-related approaches in Chapter 5 are constituted by examinations of particle verbs in Chapter 5.1, light-verb constructions in Chapter 5.2 and verb-complementational patterns in Chapter 5.3. In the light of the distinctive structural profile of SLE that Chapters 4 and 5 will delineate, Chapter 6 will scrutinise its potential origins with a particular focus on the language contact situation in which SLE, but also PCEs more generally have evolved and continue to do so. It is via this situation of multiple language contact that conservative and progressive forces emerge and influence the development of PCEs.
The model of (the emergence of) distinctive structural profiles of semiautonomous varieties of English describes how these relatively abstract forces can have an impact on concrete structural realisations and, against this background, provides novel and complementary perspectives on the notion of structural nativisation as proposed by Schneider (2003, 2007). Chapter 6 will also discuss potential avenues for future research into SLE and end with some concluding remarks.

Chapter 2

The development of Sri Lankan English

The development of SLE will be presented in three parts. The first section in 2.1 traces English from when it was brought to Sri Lanka in 1796 until Sri Lankan independence in 1948 while the second part in 2.2 focuses on its postcolonial development. This chronological account is complemented by a present-day perspective on central sociocultural, sociolinguistic and sociopolitical developments in 2.3 including the controversial discussion on the status and role of SLE in Sri Lankan society. This largely historical description is set against the background of tracing the emergence of SLE in Schneider’s (2003, 2007) dynamic model of the evolution of PCEs, which has attracted much scholarly attention due to its novel theoretical perspectives on the spread and development of English on a global scale as well as its wide practical applicability. Sociohistorical events and sociolinguistic characteristics salient to an evaluation of the evolution and present-day developmental status of SLE, which will be discussed towards the end of Chapter 2.3, will thus be related to and presented in the light of Schneider’s (2003, 2007) model, the central dimensions and characteristics of which are briefly elaborated on here. While earlier descriptions and models mainly focussed on highlighting structural differences between individual regional varieties to create an awareness of what is today commonly known as World Englishes (cf. e.g. Kachru passim), Schneider’s (2003, 2007) model stresses their developmental commonalities.

Despite all obvious dissimilarities, a fundamentally uniform developmental process, shaped by consistent sociolinguistic and language-contact conditions, has operated in the individual instances of relocating and re-rooting the English language in another territory, and therefore it is possible to present the individual histories of PCEs as instantiations of the same underlying process.
More specifically, it is posited that evolving new varieties of English go through a cyclic series of characteristic phases […] determined by extralinguistic conditions. Individual countries in which PCEs are spoken are regarded as positioned at different phases along this cycle, an explanation which accounts for some of the differences observed in the shapes and roles of PCEs. (Schneider 2007: 5)

As implied by the above statement, this by no means entails the denial of variety-specific differences that certainly exist between national varieties of English.

It merely puts parallel pan-varietal diachronic developments into the centre of attention. The model is organised along two central dimensions: (a) the sequence of five developmental evolutionary stages and, at each of these stages, (b) identity-related considerations as regards the major speech communities, i.e. the indigenous population and the settler community, involved in the process of variety-formation (cf. Schneider 2007: 30). The assumption that, over time, both speech communities (or “‘strands’ of communicative perspective” in Schneider’s (2007: 31) terminology) rewrite their identities and linguistically accommodate is presented as the driving force behind variety formation and argued to be the reason for the emergence of characteristic features of national varieties of English (cf. Schneider 2007: 29–30). The diachronic emergence of PCEs is depicted as “a typical developmental scenario, [in which] the history of PCEs can be described as a sequence of five distinct phases, labeled ‘Foundation,’ ‘Exonormative stabilization,’ ‘Nativization,’ ‘Endonormative stabilization,’ and ‘Differentiation […]’” (Schneider 2007: 6). To each of these phases with fuzzy boundaries, the following descriptive parameters apply (cf. Schneider 2007: 30–31):

– extralinguistic factors (e.g. historical events and political situations)
– characteristic identity constructions in both communicative strands
– sociolinguistic determinants (conditions of language contact, language use and language attitudes)
– structural effects in the varieties concerned.

What is important here is that Schneider (2007: 30) assumes that a “monodirectional causal relationship” holds between the individual parameters with (variety-specific) structural characteristics thus being the end product of certain configurations in the preceding parameters. These structural features develop via the process of structural nativisation, i.e.
“the emergence of locally characteristic linguistic patterns” (Schneider 2007: 5). The attractiveness of the concept of (structural) nativisation is put in a nutshell in the following statement:

The notion of nativization […] has attracted particular attention because it is a concept that bridges the gap between the norm-producing inner circle and the norm-developing outer circle and because it has helped to establish New Englishes as full-fledged varieties besides the native varieties of Englishes in the British Isles, North America, South Africa, and the Pacific Rim. (Mukherjee 2007: 160)

Structurally nativised (or localised) features in the respective varieties of English are described as originating in the language use of individuals in narrowly circumscribed functional and social contexts due to e.g. cultural particularities

(cf. Olavarría de Ersson & Shaw 2003), structural rules of simplification or (over) generalisation to avoid exceptions (cf. Williams 1987) or an orientation towards a particular language type (cf. Mesthrie 2006). In the course of time, however, the originally idiosyncratic use of certain structures becomes less restricted and the features permeate into larger spheres of the speech community concerned. Via continued and increasingly frequent use, these features become an integral part of the linguistic make-up of the English language in this specific country, thus turning into markers of this newly-emerging variety of English through the process of structural nativisation (cf. Schneider 2007: 85–86). When attempting to trace (the development of) structurally nativised features in varieties of English, it is recommendable to examine high-frequency items at the lexis-grammar interface as this is the structural level on which early linguistic phenomena of the “indigenization of language structure mostly occur” (Schneider 2007: 46). The intersection between lexis and grammar is argued to display innovative and thus potentially distinctive features in the sense that “certain words but not others of the same word class prefer specific grammatical rules or patterns. The patterns as such are not new, nor are the words, but what is novel is the habitual association between them in specific varieties” (Schneider 2007: 83). These co-occurrence phenomena are not limited to basilectal or mesolectal variants of PCEs, but they also form part of the respective acrolects. However, with acrolects, nativised variety-specific features may tend to manifest themselves rather in quantitative preferences than in categorical differences and thus escape speakers’ attention (cf. 
Schneider 2007: 82–83), but they can nevertheless be delineated on the basis of corpus data.1 In sum, Schneider’s (2003, 2007) macrosociolinguistic model describes the presumably analogous evolution of PCEs on a high level of abstraction (cf. Mukherjee 2007: 160). It can be distinguished from former representations of varieties of English in that it is holistic (it is argued to be applicable to all

1. In corpus-based research on South Asian Englishes, both quantitative and categorical differences have been documented and used to highlight variety-specific structural profiles. On the one hand, studies on the verb-complementational profiles of ditransitive verbs such as GIVE and OFFER (e.g. Mukherjee 2008; Bernaisch 2013) show that there are no categorical differences between the complementational patterns verbs enter in SLE as opposed to BrE and IndE. Nevertheless, there are noteworthy quantitative and statistically significant differences across the varieties reflecting their structural distinctiveness, which may have been caused by variety-specific collocational preferences or restrictions with the verbs studied. On the other hand, Lange (2007) is able to show that, in addition to its functions as a reflexive pronoun or an intensifier, itself is also used for presentational focus marking in IndE – a qualitative characteristic not present in BrE.

varieties of English – including pidgins and creoles – and not to isolated cases only (cf. Schneider 2007: 311)), dynamic (cf. Schneider 2007: 313) and innovative in adopting perspectives rooted in relevant speech communities and not in the abstract concept of nation (cf. Schneider 2007: 313). After this concise overview of the central pillars of Schneider’s (2003, 2007) model, one related aspect of relevance to the argument of the present study at later stages – in particular to that in Chapter 6 where the process of structural nativisation in SLE will be modelled in detail – will be discussed briefly, i.e. the negotiation of conservative and progressive forces in varietal development. Schneider (cf. 2007: 313) calls attention to the predictive power of his framework when he argues that, eventually, each variety has the potential to arrive in the phase of differentiation unless unpredictable sociohistorical events lead to fossilisation, i.e. “the situation that the development of English along the developmental cycle simply stops somewhere along the road” (Schneider 2007: 57).2 The notion of fossilisation implies that the dynamics involved in varietal maturation have disappeared, causing a halt in the evolutionary progress. While Mukherjee (2007: 173) also conceives of these halts in evolutionary progress in the sense that a variety remains at a certain evolutionary stage for a considerable amount of time without any strong indication of further development, he considers them “steady states as stable equilibria”, in which evolutionary dynamics, however, remain intact, but do not lead to evolutionary progress along the developmental cycle since conservative and progressive forces are constantly negotiated. In his description of IndE, Mukherjee (2007) puts forward that its current status is characterised by both conservative (e.g. the retention of traditionally BrE vocabulary items) and progressive forces (e.g.
the adoption of Indianisms in the lexicon of IndE), which causes IndE to momentarily not advance any further given that the effects of conservative and progressive forces cancel each other out. However, this conceptualisation of steady states entails – in contrast to Schneider’s (cf. 2007: 57) notion of fossilisation – the potential resumption of varietal development (in either a conservative or progressive direction) at some future point. The notion of a steady state thus represents a valuable conceptual contribution in that it (a) reconciles Schneider’s (2003, 2007) goal-oriented perspective on variety development with his concept of fossilisation (cf. Schneider 2007: 57) and (b) highlights the constant negotiation between progressive and conservative forces in the evolution of PCEs, on the basis of which evolutionary progress should not be interpreted as the absence of conservative

2. Contrastively, Moag (1982), in his life cycle of non-native Englishes, models restrictions of the usage of non-native varieties to certain domains as the default developmental end and also considers their deaths possible developmental endpoints.

forces, but rather as the effects of progressive forces outweighing those of the conservative ones at given points in time. When applying Schneider’s (2003, 2007) framework to SLE (and South Asian Englishes more generally), one needs to consider whether this model applies as readily to varieties with a historically only relatively small settler community as to “settler-strand-dominated varieties such as American and Australian English” (Mukherjee 2007: 173). In this regard, Schneider (2007: 311) argues that “the entire range of contributing effects” concerning identity construction and subsequent linguistic accommodation phenomena are always in operation in the relevant territories independent of colonization type and size of settler community.

In former exploitation colonies in South and South-East Asia, the STL [= settler] strand is often demographically weakened, or even almost completely removed, with the return of colonial administrators after independence, but the effects and attitudes generated by them linger on and remain effective. Factors like the appreciation of English, its persistent presence with important functions, and the desire to maintain contacts with the former colonial power and to participate in international communication have the same effect as the physical presence of large numbers of English speakers. (Schneider 2007: 42)

The perpetual contact with the former colonial power (e.g. via frequent business-related contacts, but also via more symbolic occasional visits of the British Royal Family to Sri Lanka) and the institutionalisation of BrE in an education system largely set up according to British models (cf. Künstler et al. 2009: 69) are inter alia central factors which, at least to a certain degree, compensate for the physical absence of permanent British residents in Sri Lanka. It is for these reasons as well as the continuous psycholinguistic relevance of BrE in the SLE speech community (cf. Bernaisch 2012) that the application of Schneider’s (2003, 2007) model to SLE can be justified. In this regard, the transition from phase 3 (nativisation) to phase 4 (endonormative stabilisation) is probably most significant along the developmental cycle of an emerging ESL variety since the completion of this transitional process marks the establishment and recognition of a new, full-fledged and linguistically self-dependent variety of English. This change in evolutionary status is also terminologically marked in that “the difference between phases 3 and 4 is commonly given symbolic expression by substituting a label of the ‘English in X’ type by a newly coined ‘X English’” (Schneider 2007: 50). This transition will be at the centre of attention in terms of the evolutionary status of SLE, but reference will also be made to SLE in relation to (the completion of) earlier phases in Schneider’s (2003, 2007) model in the description of the history of SLE.

2.1 Sri Lankan English under British colonial rule (1796–1948)

The first advances of the English language to appear on the linguistic surface of Sri Lanka find their origin in 1796 when the maritime provinces of what was then Ceylon came into contact with the East India Company (cf. de Silva 1981: 210; Gunesekera 2005: 11). This undoubtedly economically motivated step – for the Sri Lankan cinnamon business was highly lucrative at that time – marks the foundation of the Sri Lankan variety of English. The trade in cinnamon, for which the East India Company was guaranteed a monopoly, was expanding rapidly and yielded high profits at the beginning of the 19th century, making it one of the main motives for the East India Company to get involved in Sri Lanka (cf. de Silva 1981: 210). Due to this predominantly monetary incentive, the presence of the British did not initially seem as if it would become a long-term occupation of the island (cf. de Silva 1981: 211), which is why it can be assumed that the identity construction in both the settler and the indigenous strand was limited to an acknowledgment of the existence of the other. Consequently, the language contact between the two speech communities was probably restricted to the most basic communicative purposes, which leads to the assumption that only toponymic lexical items were likely to have been adopted in the then form of English since “anybody who is new to a region will ask for names of places and landmarks and accept them as naturally true, as the names which these localities simply ‘have […]’” (Schneider 2007: 36). The East India Company territories were incorporated into the British Crown Colony of Sri Lanka on 1 January 1802. Except for Kandy, which had also been successful in posing resistance to former Portuguese and Dutch colonial powers, the British took over the government of the whole island, which consequently promoted the English language in Sri Lanka to a powerful and prestigious position (cf.
Senaratne 2009: 26). In retrospect, it can be argued that English became de facto the official language of Sri Lanka when the East India Company first entered the country in 1796 (cf. Gunesekera 2005: 14) since, from this point in time onwards, “English began to be used as the medium of communication in higher-level domains such as administration, education, the legal system, and commerce” (Samarakkody & Braine 2005: 147). At this initial stage of introducing English to the island, the language was taught to the locals as a variety of BrE – an English seminary founded by Governor Frederick North in 1799 was one of the earliest English-medium institutions established to teach the British variety of English (cf. Gunesekera 2005: 15). This may have led to the present-day rejection of BrE (as a target variety and teaching model) in favour of SLE or an international standard variety by a selected circle of Sri Lankans (cf. Algama 2008; Seneviratne 2010).

On 2 March 1815, the British Crown Colony of Sri Lanka succeeded in formally annexing the Kandyan kingdom (cf. de Silva 1981: 230) and, thus, in bringing the whole island under British control. The fall of Kandy was preceded by neither a societal nor an economic collapse, but, at the onset of the 19th century, the Kandyan king, Sri Vikrama Rajasinha, faced continuous struggles with the chiefs in the ruling hierarchy (cf. de Silva 1981: 230), which eventually caused him to flee the city in the company of his family. The capture of Sri Vikrama Rajasinha, the last king of Sri Lanka, on 18 February 1815 by the British completed the colonisation of the whole of the island. Hereafter, all official business was conducted in English (cf. Gunesekera 2005: 15), which supported the status of English throughout the island.3 However, at that time, it was not only English which altered the balance of the linguistic forces in Sri Lanka; in order to maximally profit from the natural resources on offer, the British started relocating Tamils (mainly from South India) to Sri Lanka to have them work on plantations (cf. Senaratne 2009: 26) and, as a consequence, considerably increased the number (as well as their political need for governmental recognition and representation) of the people of this ethnolinguistic group. What added another facet of complexity to the linguistic situation is that in 1816 and 1817, Protestant missionaries from America set up teaching facilities (either as self-regulating facilities or as additions to already existing public buildings such as hospitals) in the area around Jaffna (cf. Yogasundram 2008: 286). Early examples of American involvement in the teaching of English in Sri Lanka are the foundation of the hospital with integrated classrooms in Tellipalai in 1817 and the establishment of boarding schools for boys such as the Batticotta Seminary in Vaddukodia in 1823 (cf. Jayawardena 2003: 205).
It has been claimed that some phonetic features of American English, which are said to find their origin in these teaching facilities, can be attested in the present-day English of Sri Lankan speakers from Jaffna and Batticaloa (cf. Gunesekera 2005: 38–39). Because of the lack of a consistent governmental policy regarding education, the relatively small number of American missionaries combined with the large number of British missionary schools played a vital role in providing education as they had managed to establish

3. As Sri Lanka in its entirety was only taken over by the British in 1815 after they had landed on the island in 1796, there is no unanimous agreement as regards the beginning of English as an official language in Sri Lanka. Some scholars (cf. e.g. Gunesekera 2005: 14) date it to the year 1796 since, in fact, English was used as an official language for the majority of territories in Sri Lanka from then on while other researchers (cf. e.g. Coperahewa 2009: 77) point to 1815 as the beginning of English as an official language. In this context, Yogasundram (cf. 2008: 286) claims that English became the sole official language of Sri Lanka as late as 1826 since he argues that, from this point in time onwards, it was mandatory to be proficient in English to obtain even a minor post in administration.

a “rudimentary organisational structure” (de Silva 1981: 252) for education, which came into being in the 1830s. As a result of the domination of anglo-oriented teaching institutions, English, though not by law, became a medium of instruction at that time (cf. Thirumalai 2002), which fostered minority bilingualism including an indigenous language and English. The establishment of the Colebrooke-Cameron Commission and the realisation of the recommendations of the related so-called Colebrooke Report marked a transition in governmental policy from a comparatively uncommitted laissez-faire attitude in favour of a stronger involvement of the British as regards education and language planning and policy.

The Colebrooke Report of 1831–1832 is a landmark in the colonial administration of the island […]. In effect, the Commissioners were recommending that similar institutions to those that already existed in Britain be established in Sri Lanka with the intention of converting a static, feudal society that had stagnated into a more vibrant and economically viable one – on the British pattern. This was, in fact, the commencement of a planned westernization of the country. […] The English language was to be the main tool for this conversion […]. He [= Colebrooke], like many other colonial administrators of the time, had great faith in the civilising influence of English. (Yogasundram 2008: 238)

In response to the commission’s suggestions, English was officially made the medium of instruction and the introduction of English as the language of administration, education and the courts was formally laid down (cf. Coperahewa 2009: 97) as had already been suggested earlier. A lack of dedication, however, had formerly prevented this systematic implementation. This change in attitude by the government might have also been fostered by the urgent need to develop efficient channels of communication. As the British administration was not proficient in any of the indigenous languages, mediators who could convey official resolutions to the local population had to be trained to ensure the governmental ability to act. As a consequence of the Colebrooke-Cameron Commission, “a native elite […] proficient in the official language of English for the Civil Services” (Kumarasamy 2007: 40) emerged.4 This strengthened the tie between the English language and high-status positions in society as well as in the labour market. The prestige associated

4. The Colebrooke-Cameron Commission’s report of 1831–1832 can be regarded as a less radical anticipation of Thomas Babington Macaulay’s 1835 minute for India (cf. Coperahewa 2009: 97–98). Although both documents stress the need to develop local elites via a transplanted British education system – a need which might have been more urgent in India because of a greater range of languages and dialects used – and are consequently in line with imperial thinking, Macaulay’s minute has been argued to be less accommodating as regards indigenous languages and dialects.

with the English language could be observed with westernised English-educated Sinhalese and Tamil families starting to adopt English as the language of the home, which came at the cost of downgrading the indigenous languages (cf. Coperahewa 2009: 98). It should not come as much of a surprise that colonial policies in general and some administrative decisions taken in the area of language planning and policy in particular met resistance in the local population. A “[f]irst revolt against English rule” (Gunesekera 2005: 15) is reported to have taken place in 1818 and the Kandyan rebellion of 1848 was a result of the frustration of the socially and vocationally marginalised local population, which was deprived of access to well-paid jobs (partly because of their lack of proficiency in English). In the light of the consequences of the Colebrooke-Cameron Commission, the functions of and attitudes towards the English language in the 1830s were manifold. On the one hand, English was seen as granting access to modern (Western) ways of thinking and technical innovations from Europe (cf. de Silva 1981: 479), which could make significant contributions to Sri Lanka’s social and economic prosperity with the latter development probably being in the centre of interest of the colonial power. On the other hand, it almost goes without saying that officially promoting English to the extent described above naturally finds reflection in the depreciation of the other languages of the respective speech community. Although English is said to have “revitalised the indigenous languages, Sinhalese and Tamil alike, as profoundly as Sanskrit had done in the past, and perhaps even more so” (de Silva 1981: 479), the number of domains in which Sinhalese and Tamil were used on a regular basis was effectively reduced. In the decades following the Colebrooke Report, English education was mainly coordinated by British missionary schools. 
Locals who had a command of the English language gained access to lucrative jobs, which, in conjunction with the development of a capitalist economy in the second half of the 19th century (cf. Yogasundram 2008: 281), resulted in an increasing affluence of a growing middle class which stood in stark contrast to the deprived majority of the people. Consequently, an education in British missionary schools was generally considered highly attractive. Still, it has to be borne in mind that the education offered in missionary schools was not of a secular nature; it brought along Christian values and, as a result of this, an English education by design also implied acceptance (or at least toleration) of Christian belief. This ecclesiastical load of English at that time had a detrimental effect on e.g. the Muslims, a relatively conservative and cohesive community then, since they rejected the English language on the grounds of religion and, consequently, “by the third quarter of the nineteenth century the more enlightened Muslim leaders were profoundly disturbed to find their community sunk in ignorance and apathy, parochial in outlook and grossly materialistic” (de Silva 1981: 354).

 The Lexis and Lexicogrammar of Sri Lankan English

In retrospect, the Colebrooke-Cameron Commission, which is nowadays inter alia evaluated as a “planned westernization of the country” (Yogasundram 2008: 238), can thus be regarded as ushering in the phase of exonormative stabilisation, i.e. the second phase in Schneider’s (2003, 2007) framework, given that the new English-educated native elite was taught a BrE standard (cf. Gunesekera 2005: 51) – a clear sociolinguistic indication of exonormative orientation. Via the related introduction of English in administration, education and the courts, the Commission thus fostered a “local-plus-British” (Schneider 2007: 55) identity in the indigenous speech community, although evidence of a similarly systematic identity-related convergence of the settler community towards the indigenous population is not available. The Glossary of Native and Foreign Words occurring in Official Correspondence and Other Documents released in 1869 (cf. Gunesekera 2005: 84), however, attests that lexical items from the indigenous languages started to be integrated more readily into English texts and may thus be considered a first indication of the beginning of structural nativisation on the word level, but given the need of a glossary for the respective words in the first place, their currency and degree of institutionalisation in an emergent SLE must be assumed to have been relatively limited. As a reaction to several ethnic groups being put at a disadvantage due to the dominance of English, the government allocated funding for the establishment of free-of-charge vernacular schools to provide at least basic education for the majority of the Sri Lankan population (cf. de Silva 1981: 329). In fact, the government not only established vernacular schools, but also allocated funding for the establishment of public English and bilingual schools (cf. Yogasundram 2008: 287). 
However, the English and bilingual schools, which taught in a vernacular language and English, levied fees hindering less affluent Sri Lankans from attending. This governmental attempt at providing access to schools to the entire Sri Lankan population met little sympathy with the ruling elite as a broad education was considered to be a potential threat to existing hierarchies, which, however, never came into being (cf. de Silva 1981: 332). There is ample proof that not education in general, but a certain degree of proficiency in English and a corresponding westernised lifestyle were the ticket to elite circles in Sri Lanka in the second half of the 19th century (cf. de Silva 1981: 332–333). Against this background, it is important to note that ethnic dissimilarities crystallise in the acceptance of English education and ways of living. While the Sinhalese readily adopted the English language and the westernised lifestyle, the Tamils, though eager to learn the English language, turned their back on European customs and mainly stuck with Hindu tradition (cf. de Silva 1981: 351). At the beginning of the 20th century, manifestations of nationalist currents against British rule re-surfaced – in particular among the Tamil community in

the north of the island. Tamil students, who generally did not show much interest in an anglicised lifestyle based on Western values, had developed a routine of going to India to pursue their studies. In the course of their stays, “[t]hey absorbed the political influences at work in India, and on their return sought to stimulate political activity in the island on the lines of Indian political movements” (de Silva 1981: 369). Nationalist pressures gained momentum in the first decade of the 20th century and, in 1915, culminated in a violent uprising against the language-based “‘divide and rule’ policy” (Gunesekera 2005: 15) of the British privileging a small Sri Lankan elite. The nationalist tendencies found their continuation in the Swabhasha movement of the 1920s, an initiative to replace English as the official language of the country with Sinhalese and Tamil. The proficiency rates of the Sri Lankan population in English give an indication as to why English was perceived as the language of a small elite in the 20th century. In 1921, 3.7% of the population were literate in English, in 1946 6.5% and in 1953 9.6% (cf. Coperahewa 2009: 92). Although a steady increase in relative terms can be observed, it is evident that only a minority of the population was able to function in English. Against this background, it seems only natural that there was a relatively strong demand for the local languages to become official languages of the country. It is all the more surprising that, in an environment that hostile to British rule as well as its language, the first novel in English written by a Sri Lankan was published in 1917 (Lucien de Zilwa’s The Dice of the Gods) and the first novel in English written by a Sri Lankan female followed only a couple of years later in 1928 (Rosalind Mendis’ The Tragedy of a Mystery: A Ceylon Story), which could be interpreted as one of the first signs of cultural accommodation of the English language in Sri Lanka (cf. Goonetilleke 2005: 9). 
Yet, in the face of a relatively restricted number of potential Sri Lankan readers and buyers, the books were published in London, England. Despite the challenge posed by the struggle to establish Sinhala and Tamil as official languages, English in Sri Lanka showed traces of advancement as regards (creative) performance and teaching facilities. At University College Colombo, which is the University of Colombo today, the Department of English worked at a comparatively high level under the auspices of the first Sri Lankan to become professor of English, E.F.C. Ludowyk, in 1936 (cf. Goonetilleke 2005: 200–201). Not only in tertiary education did English gain ground; state-owned English-medium schools were opened in rural areas under the Central School scheme in the 1940s, which consequently aimed at providing a more adequate education to those who had formerly been deprived of it and at ridding the English language of its (possibly negative) urban connotations (cf. de Silva 1981: 474). In addition to that, in 1943, the Special Committee on Education recommended that English should be made a compulsory subject in vernacular schools from grade

3 onwards, but the project, in spite of having increased the number of learners of English, eventually failed because of the lack of sufficient qualified English teachers (cf. Samarakkody & Braine 2005: 149). When it comes to the performance in English in the field of creative writing, English slowly started to establish itself, while in journalism, “the performance was truly professional – the style clear and sharp, the comments incisive and, on political issues, often witty and iconoclastic” (de Silva 1981: 481). A few years before Sri Lankan independence in 1948, a sense of self-governance started to be felt with regard to language planning and policy in that bills to replace the language of the colonisers as the official language with the two widely-used local languages Sinhala and Tamil entered parliament.

In 1943–1944, J. R. Jayewardene introduced a resolution in the State Council that Sinhalese alone should replace English. However, the Sinhalese and Tamil parties worked out a compromise based on which the Council passed a resolution that English be replaced by Sinhala and Tamil as the official languages of the nation. (Thirumalai 2002)

The above resolution “was more fundamental than all the previous attempts and led to a final decision by the State Council to make the ‘national languages’ – Sinhala and Tamil – the official languages of the country” (Coperahewa 2009: 105). At this point, it is noteworthy that, initially, only Sinhala, the majority language of Sri Lanka, was proposed as the language to replace English as the official language, which is indicative of a tendency to deny Tamil official status. This denial became the law with the Sinhala Only Policy in 1956. Interestingly, S.W.R.D. Bandaranaike, the Prime Minister of Sri Lanka who passed the controversial Sinhala Only Policy in his term of office from 1956 till his assassination in 1959, was in favour of promoting Sinhala as well as Tamil to official status in the 1940s (cf. de Silva 1981: 472). This goes to show that, in Sri Lanka, language planning and policy have been exploited to gain political power hazarding the consequences for the speech communities affected. On the basis of the decision to make Sinhala and Tamil official languages after independence, a select language committee was established in 1946 to sketch the necessary steps for a practicable transition from English to Sinhala and Tamil. Owing to the need to structurally and functionally expand both Sinhala and Tamil in order to gear them towards usage in the public sector, which prohibited an immediate transition from English to the native languages, the committee devised a ten-year program aimed at ensuring a smooth replacement of English by 1957 (cf. Kumarasamy 2007: 43; Coperahewa 2009: 107). Along with the envisaged replacement of English by Sinhala and Tamil came better conditions in the Sri Lankan education sector. From 1943 onwards, several

steps had been taken to make education readily available for the vast majority of the local population. The financial burden was removed from public education in 1943 when “education from the kindergarten to the university was declared free” (Yogasundram 2008: 288). In 1945, the government included private, formerly fee-levying English secondary schools in their policy of free education and the number of free state-owned English schools steadily increased in the 1940s (cf. de Silva 1981: 475). Almost parallel to the freedom movement in India, Sri Lanka became independent from the British on 4 February 1948 (cf. Yogasundram 2008: 267). In contrast to the granting of independence to other South Asian countries such as India or Burma, the transfer of power in Sri Lanka stands out in that it was achieved peacefully via existing political institutions, which was feasible because adequate preparations had been made in the last decades of colonial rule (cf. de Silva 1981: 449). What needs to be stressed here is that Sri Lanka also took appropriate steps for a linguistic independence from the British with relevant constitutional provisions and with the work of the aforementioned select language committee, which had developed a viable plan to guarantee a peaceful taking over of the local languages Sinhala and Tamil. The fact that Sinhala, the local majority language, was to be promoted to the status of an official language alongside Tamil representing the language of the second largest ethnic group in Sri Lanka actually promised to make this linguistic transition more successful than e.g. its Indian equivalent, which, in the absence of a dominant majority language, chose to replace English exclusively with Hindi, the language with the largest speaker group in India (cf. Mukherjee 2007: 167). Consequently, “[o]n the eve of independence, the understanding was that both Sinhala and Tamil would be made official national languages, thus giving both languages ‘parity of status’. 
The situation changed dramatically within a few years after independence” (Coperahewa 2009: 107).

2.2 Sri Lankan English in the postcolonial era (1948–2010)

As envisaged by the select language committee, English remained the official language of Sri Lanka for several years after independence, which meant that people educated in the vernacular were still, at least for the most part, deprived of the opportunity to work in the island’s administration (cf. Coperahewa 2009: 107–108). In order to structurally expand Sinhala and Tamil for official purposes, an official language committee was formed on 23 May 1951. This committee was supposed to identify structural shortcomings of both languages and derive suitable strategies for their linguistic elaboration. The government started implementing

these strategies based on five interim reports and one final report, also frequently referred to as the Sessional Paper (SP) XXII of 1953 with all-encompassing suggestions regarding corpus, status and acquisition planning (cf. Kumarasamy 2007: 43), from 1951 onwards (cf. Coperahewa 2009: 108; Thirumalai 2002). This very systematic approach to language planning and policy promised to lead to an adequate realisation of the anticipated linguistic transition thus satisfying the majority of the population, but it was also at the beginning of the 1950s that first separatist notions among the Tamils started to be felt more strongly in the country. In 1951, the Federal Party demanded recognition of the Tamil-speaking population as distinct from the Sinhalese speakers and stipulated a certain degree of regional autonomy for the Tamils (cf. de Silva 1981: 513). In 1952, English was still the language of administration in Sri Lanka and was of utmost importance when it came to issues of national scope (cf. Yogasundram 2008: 302; de Silva 1981: 499–500). This left the Sinhalese intelligentsia in particular extremely unsatisfied as they felt that not only the English-educated, but also the Tamils having received a better education were occupying prestigious (governmental) jobs (cf. de Silva 1981: 500). To mitigate this volatile situation and to make progress with the development of the local languages, a Swabhasha department – the department of national languages – was set up in 1955 on grounds of recommendations made by the official language committee in their fourth report in 1953. At that time, contrary to the political agenda of fostering a sense of unity in the country, incoherent, but persistent nationalist currents in the Sinhalese and Tamil communities started surfacing. The Sinhalese, on the basis of historical arguments, tended to equate Sinhalese nationalism with Sri Lankan nationalism, while the Tamils regarded themselves as a separate nation residing in Sri Lanka. 
Soon, these nationalist currents would be put on linguistic foundations (cf. de Silva 1981: 496), which, in combination with the struggle for political influence, probably constitute the origin of the civil war conditions characterising almost three decades of Sri Lanka’s postcolonial history. The aforementioned political agenda of the United National Party (UNP), the ruling party at that time, aimed at fostering an atmosphere of harmony or at least mutual toleration between the Sinhalese and Tamils, which, however, started disintegrating in 1955. The Sri Lanka Freedom Party (SLFP), in stark opposition to the spirit of earlier and contemporary national language planning and policy activities and, to some surprise, to the SLFP’s former agenda, adopted a policy of promoting Sinhala as the sole official language of Sri Lanka (with a certain amount of Tamil usage). It should not go unmentioned that this change in policy by the SLFP had most probably not been devised as an effective political manoeuvre before the elections. It was more likely an immediate, but all the more successful reaction (in terms of the number of voters) to the unanticipated public outrage in

the Sinhalese areas against the then Prime Minister’s proclamation to give Sinhala and Tamil parity of status (cf. de Silva 1981: 501). The instrumentalisation of language for political ends in Sri Lanka from the 1950s onwards is also stressed by Coperahewa (2009: 110) who argues that “the SLFP recognized the power of language as an aspect of group identity and used that fact effectively in their political campaign”. It was also around this time that the English language, in all probability due to its colonial burden, no longer figured prominently in debates on language planning and policy. The formal abolition of English as the official language in Sri Lanka in 1956 was facilitated by three central factors prior to the general elections in the same year: the popularity of the SLFP, the infamy of the governing party UNP and the establishment of a strong link between ethnicity, religion and language. S.W.R.D. Bandaranaike, who left the UNP to found the SLFP and was to become Prime Minister in 1956, capitalised on the emotions of the Sinhalese people by declaring that, if he was elected Prime Minister, he would make Sinhala the only official language of Sri Lanka within 24 hours (cf. Coperahewa 2009: 113). Due to the fact that “[i]n a multilingual society such as Sri Lanka, the role of each language is parallel to the importance of the community that speaks it” (Coperahewa 2009: 73), Bandaranaike received major support from the Sinhalese community, i.e. more than 73% of the population (cf. Goonetilleke 2005: 36), for the Sinhalese aimed at securing and advancing their social standing which had lagged behind due to the dominance of English for decades. It goes without saying that Tamil and thus the Tamil speech community had suffered as much from the prestigious role of English in society as the Sinhalese and desired an equal degree of official recognition, which, however, was not granted. 
For this reason, the applause the SLFP received from the Sinhalese met with riots and protests by the Tamils in 1956 as well as in 1958 (cf. Gunesekera 2005: 15). The UNP, the only major opponent to the SLFP in the elections in 1956, lost its credibility soon after it had publicly announced that it aimed at establishing Tamil alongside Sinhala as the official languages as described above.

As a reaction to the unexpected communal rage against this proposal in the Sinhalese parts of the country, the U.N.P. (in February 1956) reversed its position on language rights and adopted one that was even more thoroughgoing in its commitment to Sinhalese as the official language than that of the S.L.F.P. But the patent insincerity of the conversion discredited both the Prime Minister and the U.N.P. (de Silva 1981: 501)

The UNP had thus lost its footing and, in addition, worsened its prospects of remaining in power by advancing the general elections, which had originally

been scheduled for 1957, to 1956, the year of the Buddha Jayanthi, a world-wide Buddhist celebration. As the Buddhists, who were and still are the largest religious group in Sri Lanka, intended to keep 1956 free of political agitation because of the religious festivities, they were offended by the decision of the UNP to reschedule the general elections, which made the UNP lose even more voters and paved the way for the SLFP to win the next general elections (cf. de Silva 1981: 501). This uneven constellation between the SLFP and the UNP was triangulated by the strong link between ethnicity, religion and language, all of which were interrelated to such an extent that they could not be treated independently. The Sinhalese feared that if Sinhala lost some of its societal standing, Buddhism, along with the Sinhalese culture, would be disadvantaged in Sri Lanka (cf. de Silva 1981: 500). The SLFP managed to foster exactly this apprehension for their political ends. Because of the developments sketched above, the SLFP won the general elections in 1956 and S.W.R.D. Bandaranaike became Prime Minister of Sri Lanka (cf. Goonetilleke 2005: 36). In order to live up to his pre-election promise, he introduced what came to be known as the Sinhala Only Bill in June 1956 (cf. Yogasundram 2008: 302), but it soon became obvious that the resistance posed by the Tamils and the need to structurally expand the Sinhala language in particular in terms of vocabulary to gear it towards being used as an official language prevented the instantaneous realisation of the bill resulting in a delay of its full implementation until 1961 (cf. Kumarasamy 2007: 46). Nevertheless, the Official Language Act, No. 33 of 1956 based on the Sinhala Only Bill formally deprived English of its status as an official language of Sri Lanka and replaced it with Sinhala, but it also accommodated the usage of either English or Tamil for administrative purposes under certain conditions till 1961 (cf. 
Kumarasamy 2007: 46). In the aftermath of the general elections of 1956, the functional and sociolinguistic changes that the English language underwent are particularly noteworthy due to their far-reaching consequences. In addition to the devaluation of English reflected in the change from its official-language status to the status of a second language, the use of English as a medium of instruction in schools and universities continuously decreased (cf. Goonetilleke 2005: 139). Nevertheless, Goonetilleke (cf. 2005: 36) puts forward that English was never treated as a second language as people proficient in English remained in the centre of power despite the official status of Sinhala. However, the abolition of English in combination with the Sinhala Only Policy had more far-reaching devastating effects than could probably have been anticipated. Sri Lanka’s societal unity was severely threatened.

[T]he fragmenting of Ceylonese or Sri Lankan identity was ironically, the result of English losing its position as the only official language of the country. The fight for independence was fought in English, the fruits of freedom were enjoyed in

English, but 1956 meant the breakdown of collective consciousness and a new beginning of identity in terms of ethnicity: Sinhalese, Tamil, Burgher or Muslim rather than Ceylonese or Sri Lankan. (Gunesekera 2005: 18)

In the years preceding Sinhala as the sole official language, the English language, either as a shared privilege or a common enemy, had a unifying effect for the various ethnic groups in Sri Lanka. However, when English was relegated to the status of a second language, this unifying effect ceased to exist accordingly, which led to each ethnic group fighting for niches in the social hierarchy. With the Sinhala Only Policy, the ethnic equilibrium characterised by comparable levels of suppression of each ethnicity under English rule was disturbed and only the largest ethnic group, the Sinhalese, found governmental support. From that point in time onwards, “it was language which provided the sharp cutting edge of a new national self-consciousness” (de Silva 1981: 513) resulting in a lasting social division of the Sri Lankan population. Owing to the ensuing dissatisfaction of the Tamil community, which frequently found expression in riots and protests against the dominance of Sinhala in administration denying Tamils access to prestigious jobs in the bureaucracy, the Bandaranaike-Chelvanayakam Pact was signed in July 1957. Among other measures granting more autonomy to the Tamils, it was agreed that Tamil should be the official language in the northern and eastern provinces of the country and a national language of a minority (but not of Sri Lanka; cf. Yogasundram 2008: 303). The resulting Tamil (Special Provisions) Act was implemented against the background of massive protests from the Sinhalese in January 1966 (cf. de Silva 1981: 530). Both the Official Language Act, No. 33 of 1956 and the Tamil (Special Provisions) Act of 1966 were incorporated in the 1972 constitution of Sri Lanka, which did not include any reference to the English language (cf. Coperahewa 2009: 119). 
The constitution marked the official end of Sri Lanka’s dominion status in the British Commonwealth and founded the Democratic Socialist Republic of Sri Lanka, which found reflection in the name change from Ceylon to Sri Lanka (cf. Senaratne 2009: 26). Meanwhile, though not given any place in public discourse, the English language retained its prestige, which finds expression in the following facts. As early as 1957, one year after the Sinhala Only Policy, first recommendations to make English a compulsory subject up to the 8th grade for all pupils were discussed in public (cf. Coperahewa 2009: 93). In the 1960s, literary writing in English firmly established itself (cf. Goonetilleke 2005: 60) and in the 1970s, the government developed an open economic system entailing international trade, which made knowledge of English a prerequisite for success and, thus, gave a significant boost to the popularity of English (cf. Samarakkody & Braine 2005: 149).

The study of English offered all Sri Lankans very substantial material advantages. To the non-cultivator castes it offered, in addition, the possibility of moving away from a caste system based on hereditary occupation towards a class system based on education, government or commercial employment, and money. (Fernando 1977: 343)

The unceasing power of English in Sri Lanka was mirrored in that “[i]t was during this period [= the 1960s and 1970s] that the term kaduwa [bold in original], the Sinhala word for ‘sword’, to refer to the English language, was coined and gained a currency which continues till today” (Goonetilleke 2005: 50). The metaphor of a sword is ambivalent in that it can be understood either as English dividing society into two segments with only one segment having access to certain social ranks and job opportunities or as English being used as a linguistic weapon to oppress those who have no command of it. The divisive power of English could also be observed in Sri Lankan universities in the 1970s and 1980s, where a distinction between the Haras and the Kults was made. “The term ‘Hara’ was an abbreviation from the Sinhala name ‘Haramanis’ associated with the village. ‘Kult’ on the other hand, was an abbreviation of ‘kultur […]’ associated with the elite or the cultured ones” (Gunesekera 2005: 22). The Kults differentiated themselves from the Haras in their competence in English as well as in a certain degree of westernisation in their clothing and patterns of behaviour. One party literally fighting the dominance of English was the JVP (short for Janatha Vimukthi Peramuna or the People’s Liberation Front). It succeeded in mobilising the Sinhalese youth, which, though well-educated in Sinhala, frequently faced unemployment due to its lack of English skills. The anger of this group culminated in Marxist rebellions against English and the English-speaking decision makers in 1971 and from 1988 until 1989 (cf. Goonetilleke 2005: 44) to change the existing order of things towards a more communist state, but the group ultimately failed to achieve its political ends. 
With growing disputes between the Sinhalese and the Tamils, the 1978 constitution marks the next important step in language planning and policy, in which Tamil stakes were represented to a larger extent than in the former constitution of 1972 (cf. Thirumalai 2002). Sinhala, though not laid down as radically as before, was still retained as the sole official language of Sri Lanka in Chapter IV of the 1978 constitution, but Tamil was promoted to the status of a national language of Sri Lanka alongside Sinhala (cf. Coperahewa 2009: 120; Yogasundram 2008: 323; Gunesekera 2005: 16). English was also given constitutional recognition in 1978, namely when it was stated that all laws had to be published in the national languages with a translation in English. Issues in language planning and policy were again handled by the official languages department after its re-establishment in 1979 (cf. Coperahewa 2009: 121).

Following communal riots triggered by Tamil extremist youth groups in 1977 as well as 1979 and the launch of the fight for a separate state in the north of the country by the Liberation Tigers of Tamil Eelam (LTTE) in 1983 (cf. Coperahewa 2009: 79), Sri Lanka faced severe ethnic uprisings throughout the country (cf. Goonetilleke 2005: 82). For this reason, it can be argued that 1983 marks the beginning of the Sri Lankan civil war, the causes for which, however, had already manifested themselves in society over decades. As a reaction to the island-wide outbreak of violence, the government aimed at putting Sinhala and Tamil on equal footing with the help of several governmental measures in order to settle the conflict. On 29 July 1987, the Indo-Sri Lanka Accord, which gave Sinhala, Tamil and, remarkably, English the status of official languages of Sri Lanka, was signed (cf. Yogasundram 2008: 339). As a result, English was revived as an official language of Sri Lanka, though only for a couple of months, after it had been abandoned in 1956. With the 13th amendment to the constitution in November 1987, English was again relegated to “be a ‘link language’, although no further definition or clarification of its use and status in relation to Sinhala or Tamil was provided” (Coperahewa 2009: 93), which, eventually, left the major part of the Sri Lankan society confused as regards the formal status of English in Sri Lanka. Gunesekera (2005: 76) states that “[t]he changing status of English has led to a great deal of confusion about its legality. The majority of Sri Lankans believes that English is a ‘national language’ or a ‘link’ language, and most people are reluctant to call it a language of Sri Lanka”. In this 13th amendment to the constitution, it was also documented that Sinhala and Tamil were both formally promoted to the status of official and national languages of Sri Lanka (cf. Gunesekera 2005: 11). 
However, the wording employed to give official status to Tamil is noteworthy. Coperahewa (2009: 121) calls attention to this when he comments that “the 13th amendment to the constitution stated, ‘Tamil shall also be an official language’. However, the legality of the word also was not explained in the relevant constitutional provision”. The usage of also could be interpreted as a remnant of the fact that Sinhala had been granted official status before Tamil, which might be considered an implicit signal of a certain degree of constitutional primacy of Sinhala. For the implementation of the above language policies, an official language commission with legislative powers was set up (cf. Coperahewa 2009: 123). When it comes to the educational sector, a huge demand for instruction in English in all spheres of society could be observed. The choice of English courses or English-medium instruction was wide as programmes of differing quality, most of which were liable to costs, were offered by private and international schools, several associations such as the English Speakers’ Association and the English


Teachers’ Association and institutions like the British Council (cf. Samarakkody & Braine 2005: 150). These courses, however, were almost exclusively attended by students from the relatively well-off fraction of Sri Lankan society for which investing a relatively huge amount of money in the education of the next generation was financially viable. It needs to be pointed out that the above type of English education was offered by (private) institutions and associations because, following independence, English had been discarded from government schools and universities till 1997. Until then, the segments of society in desperate need for economic development would be left behind from a financial perspective and wealth as well as power stayed with those who had already had it (cf. Samarakkody & Braine 2005: 156). The education reforms of 1997, in the context of which Standard SLE is supposed to have become the target model in learning material provided to students such as books, textbooks and audio cassettes (cf. Gunesekera 2006: 30), aimed at changing this imbalance and introduced English as a subject in grade 1 in public schools and English-medium instruction was promoted from grade 5 onwards in schools that had adequately trained teachers at their disposal (cf. Gunesekera 2005: 16). In the new millennium, the role of English has remained relatively unchanged since English has not ceased to be associated with power and prestige (cf. Gunesekera 2005: 13) while some interesting developments with implications for the status of SLE as a variety of English in its own right could be observed – these will be considered in more detail in Chapter 2.3. As regards the perception of and attitude towards SLE, “at least some users of English are prepared to say they speak or use Sri Lankan English” (Gunesekera 2005: 11). The fact that SLE was adopted as a target model in teaching is very much in line with positive attitudes towards it (cf. Samarakkody & Braine 2005: 156). 
From an aerial view on the postcolonial history of SLE then, Sri Lankan independence from the British in 1948 marks the onset of the phase of nativisation, i.e. phase 3 in Schneider’s (2003, 2007) framework. Despite this political disengagement, connections on various levels between the former colonisers and Sri Lanka did not cease to exist. The retention of English as a de facto official language and the pervasiveness and acceptance of BrE as the production goal in education with some speakers (cf. Künstler et al. 2009: 69) are examples of the persistence of British influence in the Sri Lankan linguistic sphere, but also other cultural influences remained, for example in architecture or in sports (e.g. cricket or tennis). Despite less frequent contact between the former settler and the indigenous strand, bi- or multilingualism (meaning proficiency in at least one indigenous language and in English here) is as evident in the Sri Lanka of the 1950s as in the Sri Lanka of the 21st century (cf. Kumarasamy 2007), which, in conjunction with the continued presence of native speakers of SLE (cf. Mukherjee et al. 2010: 65), may be indicative of the maturation of SLE. The linguistic developments


of SLE in this phase are marked by extensive lexical borrowing in a wide range of semantic fields as well as lexical productivity (cf. Meyler 2007). Code-mixing, in particular in the spoken medium, is also a frequently employed discourse feature (cf. Senaratne 2009). Further, emerging characteristics on various structural levels as touched upon in Chapter 2.3.2 can be also observed, but it needs to be pointed out that the status of the features discussed in earlier research is not always clear. Due to the restricted datasets of some studies, it can be challenging to differentiate between relatively uninstitutionalised nonce formations in and well-established variety-specific structures of SLE. While there is sufficient evidence to show that SLE has already entered the phase of nativisation, a more in-depth analysis is needed to establish whether SLE has already completed the transition from nativisation to endonormative stabilisation (phase 4). Mukherjee (2008: 361), in his analysis of SLE in Schneider’s (2003, 2007) model, concludes that “it is reasonable to assume that Sri Lankan English is an institutionalised second-language variety of English which may well be on its way towards endonormative stabilisation”. His evaluation positions SLE between phase 3 and phase 4, but since some significant developments relevant to the evolutionary status of SLE have set in only relatively recently, a fresh look at to what extent SLE can be considered an endonormatively stabilised variety is necessary.

2.3 Sri Lankan English: The state of the debate

Chapter 2.3 covers contemporary developments in Sri Lanka relevant to an understanding of the present-day intricacies in its linguistic scenery and to an evaluation of the degree to which SLE can be regarded as an endonormatively stabilised PCE in Schneider’s (2003, 2007) evolutionary model. The main focus of Chapter 2.3.1 is sociocultural in that the identity construction of the Sri Lankan population and its valuation and perception of (Sri Lankan) English also in terms of Sri Lankan literature in English are scrutinised against the background of the current political situation of Sri Lanka. Chapter 2.3.2 presents a concise overview of earlier research into the structures and sociolinguistic aspects of SLE. Recent sociopolitical developments and debates and the role Meyler’s (2007) dictionary of SLE has played in them are under scrutiny in Chapter 2.3.3, where an assessment of the current evolutionary status of SLE constitutes the final chord of this sociohistorical overview of SLE.

2.3.1 Sociocultural considerations

The most recent version of the Sri Lankan Constitution states that “Sri Lanka (Ceylon) is a free, Sovereign, Independent and Democratic Socialist Republic and


shall be known as the Democratic Socialist Republic of Sri Lanka” (The Constitution of the Democratic Socialist Republic of Sri Lanka; 〈http://www.priu.gov.lk/Cons/1978Constitution/Chapter_01_Amd.html〉 (17 October 2014)). With regard to history and politics, this attests that Sri Lanka clearly is an independent and self-dependent nation and thus fulfils the historical and political criteria of an endonormatively stabilised variety. When it comes to identity construction, however, the absence of the settler strand renders an evaluation of this parameter difficult: “the European planters and merchants in Sri Lanka were all birds of passage and not permanent settlers with an abiding interest, i.e. not a true plantocracy” (de Silva 1981: 361). An overview of the ethnic groups of Sri Lanka as given in the 1981 census calls attention to the almost complete absence of descendants of the former colonisers.

According to the last, complete national census, which was carried out in 1981, the population of Sri Lanka comprises 73.98 percent Sinhalese, 12.6 percent Tamils, 7.12 percent Moors, 5.56 percent Indian Tamils, 0.29 percent Malays, 0.26 percent Burghers (descendants of the Portuguese and the Dutch), and 0.20 percent others. These overall percentages are still valid. (The most recent census was conducted in 2001, but it was incomplete because the Tamil terrorist organisation, the Liberation Tigers of Tamil Eelam, LTTE for short, did not permit it to be held in the areas in the North and East under its control.) (Goonetilleke 2005: 79)

The Burghers in Sri Lanka are also particularly relevant for a description of the local development of the English language. Coperahewa draws attention to the fact that although the Burghers may have a Portuguese or Dutch ancestry, they nevertheless often speak English as a first language. In modern times, English has been the language of the home for most Burghers. The Burghers became a leading community during British colonial times because of their command of the English language, and they dominated in government employment. Moreover, they had greater social contact with British offices. However, after independence, most of the Dutch Burghers began emigrating to Britain and Australia. (Coperahewa 2009: 79)

This indicates that even though a minority group of European descendants lives in Sri Lanka, it is for the most part non-British in origin, which shows the physical absence of the settler strand. It can thus by no valid means be argued that the “STL [= settler] strand community now perceive themselves as members of a newly born nation, definitely distinct from their country of origin” (Schneider 2007: 49). However, with the indigenous community, it may well be the case that its identity is nevertheless marked by British influence. Despite the physical absence of the British, their psychological legacy, in particular with the English-educated and


socially as well as economically powerful intelligentsia, lives on in that BrE still seems to be a central target model in both English language teaching and speech production. This is a tendency which can be observed for several former British colonies.

Frequently after a colony’s independence the STL community vanishes almost completely, having left the country and returned ‘home’ – but via the education system and the needs of international and also intranational communication the English language remains and retains a vital presence in the language contact setup. (Schneider 2007: 66)

The settler and the indigenous population are clearly not physically interwoven in Sri Lanka. Nevertheless, some traces of a psycholinguistic identity constructed under the persistent influence of the former British settler community are still evident in the indigenous speech community. Still, the challenge in evaluating to what extent a new nation with a new identity cutting across former ethnic boundaries has emerged does not only lie in the complex interrelations that hold between the indigenous and the settler community. When it comes to a new “pan-ethnic” (Schneider 2007: 56) identity merging British and local identities, it seems rather obvious that this can only be true to a very limited extent and, if at all, can probably be seen in (minor) readjustments in the indigenous community as the British implicitly signalled their disinterest in becoming further entrenched in Sri Lankan society and related identity constructions by leaving the country after economic incentives became unavailable. Although the British largely vanished from the scene after 1948, Sri Lankan identities have lately undergone drastic changes, which might eventually culminate in a fresh pan-Sri Lankan ethnic identity (though not in Schneider’s (2003, 2007) sense). This new identity does not fuse British and Sri Lankan, but local Sinhalese and Tamil identities. After the end of the long-lasting Sri Lankan civil war between the Sinhalese and the Tamils in 2009, it may well be the case that a new ethnic character of Sri Lanka emerges since both ethnic groups tend to mix more readily and learn each other’s language due to attractive job opportunities scattered all over Sri Lanka. This also finds reflection in the fact that some Sri Lankans – in fact more than every fifth informant in a recent attitudinal study on English in Sri Lanka (cf. Bernaisch 2012: 284) – have begun to describe their ethnicity as Sri Lankan and not as Sinhalese or Tamil any more. 
Consequently, it will be central to observe whether or not a new pan-Sri Lankan identity will develop, but it is probable that this identity will not be overtly marked by attributes stemming from British colonial rule decades ago. In this context, it needs to be mentioned that Schneider (2003: 250) suggests that “[w]hile the transition may be smooth and gradual, it is also possible


that the transition between stages 3 and 4 is caused by some exceptional, quasi-catastrophic political event”. The Sri Lankan civil war creating a depressing island-wide atmosphere of anxiety and mistrust certainly falls in this category of events. It is only the cessation of such a detrimental event which facilitates “a redirection of the emphasis of a collective identity from one’s social or ethnic group to one’s status as a member of a newly forging nation” (Schneider 2007: 185). Still, as the Sri Lankan civil war was ended only relatively recently, the new political climate will probably still need more time to fully enter the Sri Lankan psyche and the possibly resulting pan-Sri Lankan identity may emerge at a similar pace in the context of this process. With regard to sociolinguistic perspectives on the usage of and attitudes towards SLE, the Sri Lankan speech community seems to be characterised by both conservative and progressive forces. On the one hand, the scholarly community accepts SLE as a variety of English in its own right (cf. e.g. Samarakkody & Braine 2005: 156; Coperahewa 2009: 96) and welcomes the adoption of SLE teaching material in schools (cf. e.g. Samarakkody & Braine 2005: 152; Gunesekera 2006: 30). A generally positive attitude towards SLE in the local speech community (cf. Bernaisch 2012: 289) is certainly indicative of progressive sociolinguistic tendencies as is its sociofunctional profile. English still occupies an extremely powerful position in the linguistic equilibrium of Sri Lanka in that “it gives access to both the most lucrative jobs and social prestige” (Coperahewa 2009: 94) and perseveres in domains such as higher education, business, science and technology (cf. Coperahewa 2009: 94) as the medium of international communication. Also in the media, English is a salient asset. A number of daily and weekly newspapers (e.g. Daily Mirror, Daily News) are published in English and there are several English-medium TV channels (e.g.
ETV; cf. Künstler et al. 2009: 61). When it comes to certain spheres of electronic communication, English figures prominently as well.

More recently, the emergence of new internet-based forms of communication, in particular e-mails, has opened up new domains that are restricted to English, as Sinhala or Tamil writing-systems are not as readily and easily available to computer users in Sri Lanka as the Latin alphabet on the standard keyboard: e-mails in Sri Lanka are, thus, written in English even if language users would write equivalent traditional letters or notes in Sinhala or Tamil. (Künstler et al. 2009: 61)

On the other hand, BrE still functions as a target in education and linguistic production (cf. Künstler et al. 2009: 69) and the wider speech community seems relatively reluctant to claim to speak SLE (cf. Gunesekera 2005: 20). In sum, however, empirical attitudinal research could show that “most functional and attitudinal


requirements for the status of Sri Lankan English as an institutionalised second-language variety in its own right have been satisfied at least to some extent” (Künstler et al. 2009: 72). In relation to South Asian English(es), Kachru (1982: 367) calls attention to the fact that English “has developed a local body of writing in various literary genres”. The adoption of localised forms of English as vehicles for creative writing is a common tendency in many PCEs outside South Asia as well:

Creative writers in ex-colonies of Britain have reached a stage when the use of English in creative work has ceased to be an issue, and critics have now to think beyond the parameters to which they have been long accustomed. English has become a naturalized language in a great many countries. It has come to stay, is spreading, and literature in English is set to proliferate in every conceivable direction. Indeed, the world language will, in time, generate a world literature. (Goonetilleke 2005: 55)

SLE is no exception to the rule despite the current dearth of international recognition of its literature (cf. Goonetilleke 2005: 153). Only after the cessation of colonial occupation was SLE fully embraced for literary purposes since “[a]s commonly in the ex-colonies, in Sri Lanka the presence of the colonial ‘masters’ had a suffocating effect on the creative energies of the local inhabitants. English literature in Sri Lanka emerges from the growth of nationalist currents” (Goonetilleke 2005: 35). Mendis and Rambukwella elaborate on the development of Sri Lankan literature in English:

Creative writing in English has been a part of Sri Lankan literary culture since the late eighteenth century. In its early phases this writing was largely limited to the British expatriate community although a few Sri Lankan writers such as James de Alwis wrote and published in the early nineteenth century. A more substantial body of English writing is evident in the first half of the twentieth century, with British writers such as Leonard Woolf and Sri Lankan writers such as R.L. Spittel and Lucian de Zilwa producing novels which received some critical acclaim. However, it is with the increasing output of writing in the post-independence period that Sri Lankan writing in English (SLWE) becomes identifiable as a distinctive postcolonial category. There has been a steady increase in SLWE from the 1970s onwards with both resident and non-resident writers contributing to its regional and international profile. (Mendis & Rambukwella 2010: 191–192)

The writing of English-medium literature is also encouraged and rewarded via two literary prizes for outstanding Sri Lankan literary works in English. In addition to the annual Gratiaen prize awards, the Sri Lankan government also officially recognises and praises Sri Lankan literature in English with a special State Literary Award (cf. Mendis & Rambukwella 2010: 191–192). In sum, there is ample testimony of English being used by Sri Lankan authors for creative writing.


2.3.2 Sociolinguistic considerations

The shortage of empirical research into authentic SLE language data is a challenge as regards the evaluation of linguistic developments in Schneider’s (2003, 2007) model. With the exception of pragmatics, studies have been undertaken on all structural levels as well as on sociolinguistic aspects of SLE, but in case the respective studies worked empirically at all, the databases in which the corresponding findings were grounded have so far been either relatively small or not accurately documented, thus limiting the explanatory power of the claims put forward. After the finalisation of large-scale corpus projects on SLE, a general tendency towards empirical approaches based on authentic language data started surfacing. In terms of the sound system of SLE, Gunesekera (2005: 117) differentiates between features claimed to be characteristic of Standard SLE and features of substandard SLE arguing that “these few differences can make the difference between being employed and unemployed, because the better jobs are still reserved for the privileged few who speak SSLE [= Standard SLE]”. Meyler (2007) compares the SLE sound system to its BrE equivalent and highlights systematic differences, while Senaratne (2009) analyses SLE contrastively in relation to the sound systems of other South Asian varieties such as IndE. Although the studies on the SLE sound system principally offer collections of distinctive features that set SLE apart from other varieties of English, their currency in actual language use largely remains to be established via future empirical investigations. In one of the earliest studies of SLE morphology and lexis, Passé (1950: 133) investigates what he considers “errors in Ceylon English”. Already in the 1950s, there were first indications that these deviations from what was at that time the accepted and, thus, expected standard tended to establish themselves in SLE usage.
The ‘translation errors’ that concern us are those which have gained currency in Ceylon English. It is obvious that the more proficient a person becomes in the use of English, the less error of this kind there will be in his conversation and writing. But there are locutions, idioms, and even syntactical constructions that are used by quite good speakers and writers, that appear in newspapers and other writing, including the essays of University students; errors of expression that have become more or less fixed in Ceylon English and which the users would be startled and shocked to hear stigmatized as ‘un-English’. (Passé 1950: 133)

In retrospect, a diachronic paradigm shift from monolithic to pluricentric perspectives on and perceptions of the English language can be observed in (lexical) studies on SLE. In the 1950s, Sri Lankan peculiarities were considered to be errors due to their deviation from BrE. At that time, the relatively high degree of institutionalisation and currency of these features was disregarded and considered to be a menacing indication of lowering linguistic standards exacerbating the then


variety of English in Sri Lanka. Today, in contrast, exactly these features which have retained their status as integral elements of SLE inter alia mark the foundation for assigning SLE the status of a variety of English in its own right. In the light of this, it is of particular interest that a number of the forms which Passé (1950) sees as erroneous in spite of their institutionalisation have found their way into Meyler’s (2007) dictionary of SLE, which can be interpreted as a sign that these features are variety-specific elements of SLE due to their continual usage in SLE. Front house, for instance, stands for “the house across the street, facing, opposite to your house” (Passé 1950: 136) and exactly this meaning of front house is also attested in Meyler’s (cf. 2007: 93) dictionary. The same holds true for junction (cf. Passé 1950: 135), which, according to Meyler (2007: 126) is a lexeme present in both SLE and BrE, but semantic differences can be shown in that “[i]n BSE [= British Standard English], a ‘junction’ is simply an intersection of two or more roads; in SLE it suggests the focal point of a town or village, with shops, market, bus stand, etc.”. Further, SLE reduplications based on Sinhala equivalents such as hot hot (cf. Passé 1950: 148) meaning “freshly made and/or steaming hot” (Meyler 2007: 116) are also included in Meyler’s (2007) dictionary. Thus, one might argue that Passé (1950) unintentionally provided a list of lexemes which eventually developed into lexical variety markers of SLE. Based on their diachronic stability, these lexemes have become central parts of SLE lexis and are thus documented in respective reference works. Further lexical studies include e.g. Wickramasuriya (1962) focussing on spelling mistakes and their origin in the SLE sound system, Fernando (2003) illustrating the various sources of the SLE vocabulary such as languages indigenous to Sri Lanka (e.g. anaconda from Sinhala henakandaya (cf.
Fernando 2003: 15)) and the colonial languages preceding English, i.e. Portuguese and Dutch. Fernando (2003), Gunesekera (2005) and Meyler (2007) also evaluate the productivity of individual word formation processes, among which borrowing, compounding, derivation and semantic change are presented to be the most frequently used ones.5 These processes have created a distinct pool of SLE vocabulary, which can be argued to have become relatively stable in diachronic terms. Focussing on verb complementation and particle verbs, Passé (1950) briefly discusses the lexis-grammar interface in his early study of systematic deviations of SLE speakers from British standards. Verbs prototypically used ditransitively

5. Although semantic change is not a word formation process in the classic sense, it nevertheless alters layers of meaning of existing English lexemes and, eventually, adds to the lexical complexity and specificity of SLE.


in BrE (e.g. GIVE), he (cf. 1950: 141) argues, are used without any explicit objects due to a structural analogy in Sinhala, which allows using transitive verbs intransitively in case the relevant objects are contextually retrievable.6 Example (1) illustrates this usage:7

(1) Give, I’ll carry. [italics removed from original] (Passé 1950: 141)

When it comes to particle verbs, divergent particle use as well as divergent meaning of existing particle verbs are claimed to be attestable in SLE in comparison to BrE (cf. Passé 1950: 152–153). Zooming in on particle verbs with up, Passé (cf. 1950: 153) lists a number of examples with meanings and/or structures different from the respective BrE ones, among which “cope up with” (Passé 1950: 153), a relatively institutionalised alternative to COPE with in South Asia now (cf. Mukherjee 2012: 204–205; Zipp & Bernaisch 2012: 188), can also be found. Gunesekera (2000: 114) looks into the morphosyntax and syntax of written texts produced by proficient users of SLE in university, school and media contexts to establish “deviations from accepted usage in Sri Lanka, [which] will be considered errors”. The key (lexico)grammatical areas considered to cause SLE speakers problems are

– pluralisation (of e.g. uncountable nouns)
– subject-verb agreement
– (over)use of prepositions
– the confusion of active and passive voice (cf. Gunesekera 2000: 129).

However, it is revealing to observe that, in a later publication on SLE, Gunesekera revises her prescriptive perspective on prepositional usage given the pervasiveness of the structural features under discussion:

‘He posed off as a policeman.’
‘They did not sit for the exam.’
‘She cannot cope up with the work.’
The examples given above were considered errors initially, but they seem to be now part of the language, since most users of English are using them. (Gunesekera 2005: 136)

Thus, it might be argued that systematicity and continuity are also characteristics of particle verb usage in SLE.

6. Verbs spelled with capital letters stand for the lemmata of the verbs concerned.
7. In all examples, the feature to be exemplified is printed in bold.


With the availability of corpus-linguistic resources, a shift in focus from introspective to empirical approaches to SLE lexicogrammar can be observed. Mukherjee (2008) marks the first corpus-linguistic pilot study on SLE lexicogrammar with a particular focus on its verb-complementational profile. Other studies on various lexicogrammatical phenomena have followed since (e.g. Mendis 2010 on particle verbs in academic writing, Schilk et al. 2012 on transfer-caused-motion constructions, Koch & Bernaisch 2013 on new ditransitives, Hundt et al. 2012 on the hypothetical subjunctive, Bernaisch 2013 on the complementation of OFFER, etc.) and while they are able to depict structural intricacies particular to SLE, they – for the most part – also document pan-South Asian English features and structural characteristics of the common core highlighting the area of tension between progressive and conservative forces in which SLE evolves. While the study of SLE syntax is also marked by prescriptive (cf. e.g. Wickramasuriya 1961) and more introspective, but at the same time very detailed studies contrasting spoken and written SLE (cf. e.g. Gunesekera 2005) or SLE and BrE (cf. e.g. Meyler 2007), more recent empirical accounts are also available. Analysing interview material from 18 prototypical speakers of SLE for whom English is a second language, Herat (2005) conducts a study on zero copula in spoken SLE syntax as exemplified in (2) – a particularly interesting feature since the indigenous languages of Sri Lanka do not feature copula verbs equivalent to the English ones (cf. Herat 2005: 186).

(2) I never take them to the street stall, I said, you know, these places ^ very bad. (Herat 2005: 194)

She (cf. 2005: 206) extracts three factors that seem to be associated with copula deletion: (a) complementation type (e.g. going to and adjectival complementation favour BE absence), (b) preceding phonological environment (e.g. vowels lead to an omission of a copula) and (c) pronominality (e.g. zero copula usage occurs when the subject is realised as a pronoun). The structural configurations just described may trigger the relatively high frequency of copula deletion in the data (cf. Herat 2005: 187). Herat (2006: 65) examines “substitute one in its function as a pronoun which replaces a noun or noun phrase that has been mentioned or is inferred from the context”. Example (3) instantiates this SLE usage of substitute one.

(3) Now when we go to Mihintale I tell that is the first one declared as sanctuary from the whole world. (Herat 2006: 71)

The usage pattern of substitute one in SLE is characterised by (a) the occurrence of substitute one with missing antecedents which are also not textually recoverable, (b) the occurrence after noun phrases, pronouns and proper nouns as well as


(c) the more frequent usage of modified than unmodified substitute one (cf. Herat 2006: 71–72). This usage of substitute one has its origin in Sinhala since speakers of this language tend to add the Sinhala morpheme eka (meaning one) to words borrowed from English. It thus seems that speakers of SLE have creatively extended the usage of the English translation of eka to contexts where other varieties of English might not use substitute one (cf. Herat 2006: 70–71). In sum, studies on SLE syntax thus provide more evidence of prescriptive and/or introspective descriptions, which need to be critically revisited in the light of authentic data. However, the relevance of conceptualising SLE as a predominantly second-language variety acquired against the background of mostly Sinhala or Tamil also becomes evident in the syntax of SLE given the existence and recurrence of potential products of structural transfer from the indigenous languages. While the semantics and pragmatics of SLE are the most underresearched areas despite laudable exceptions such as Werner and Mukherjee (2012) establishing quantitative differences in the meaning facets of GIVE and TAKE in SLE in comparison to IndE and BrE, sociolinguistic investigations so far figure most prominently in research on SLE. The studies, which are for the most part survey-based and quantitative, yield a relatively diverse picture of the sociolinguistic characteristics of SLE, but three central observations resurface: first, the stark socio-economic and (possibly related) educational contrast between urban centres (with Colombo being the most prominent one) and more peripheral areas finds reflection in (socio-)linguistic attributes such as proficiency in and attitudes towards (Sri Lankan) English (cf. e.g. Gunesekera 2005; Kumarasamy 2007). Second, although there seem to be a growing acceptance of and positive attitudes towards SLE as a variety in its own right in the SLE speech community (cf. e.g.
Raheem 2006; Bernaisch 2012), which are salient characteristics of an endonormatively stabilised variety in Schneider’s (2003, 2007) model, at least in Colombo, the exonormative model of BrE still plays a central role when it comes to teaching and production goals (cf. e.g. Gunesekera 2005; Künstler et al. 2009). Third, English used to and seems to continue to be (perceived as) an instrument of socio-economic advancement, power and privilege (cf. e.g. Kandiah 1984; Senaratne 2009). With regard to the evolutionary status of SLE, this overview of research into SLE brings to the fore that only isolated corpus-linguistic studies have been conducted on a narrow selection of research phenomena, which have so far not made it possible to establish empirical foundations to trace (certain degrees of) the stabilisation and the homogeneity of SLE. Consequently, while the present study hopes to shed some more empirical light on recurrent structural patterns in SLE, noteworthy progress has been made in terms of its codification (cf. Meyler 2007) – another aspect indicative of an endonormatively stabilised variety according to Schneider (2003, 2007).

2.3.3 Sociopolitical considerations

With his A Dictionary of Sri Lankan English, Meyler (2007) has laid the cornerstone for the codification of SLE. However, as Meyler himself observes,

[…] in truth it is not really a dictionary, because it focuses only on the differences between ‘standard’ British English and ‘standard’ Sri Lankan English. And of course in reality there are infinitely more similarities than differences. A standard Sri Lankan English speaker will read this paragraph with as much ease as any other English speaker. A true ‘dictionary’ of Sri Lankan English would also include all the words and expressions which are common to every variety of English (cat, love, biscuit, whatever), and the distinctively Sri Lankan bits would suddenly become an insignificant part of the whole. (Meyler 2010)

Thus, although Meyler’s (2007) dictionary is rather a usage guide to SLE than a dictionary in the classic sense, it undoubtedly marks the first step in the codification process of SLE set in a larger, partly heated socio-political argument on the promotion of endonormative standards in SLE, which reached its climax in 2010 and 2011 – the main focus of the subsequent discussion. There is almost unanimous agreement that SLE is a variety of English in its own right. Various scholars (cf. e.g. Kandiah 1981a; Gunesekera 2005), though using different terminology, consider SLE a structured linguistic entity which can clearly be distinguished from other varieties of English. In this respect, Meyler (2007: ix) states that his dictionary “attempts to define Sri Lankan English (SLE), and to promote the acceptance of SLE as one of the many established varieties of English as an international language”. However, the promotion of SLE repeatedly faces resistance in the Sri Lankan public and in certain academic circles. The grounds on which the support of SLE is still criticised are, for example, put forward by Fonseka (2003: 2), who regards SLE as a collection of “all the language errors and vulgarisms committed by Sri Lankan speakers/writers of English”, in his argument that the institutionalisation of SLE will result in putting Sri Lankans at a disadvantage on the international job market, which is why an international standard variety should be taught in Sri Lanka instead. Yet, Fonseka (2003) does not elaborate on the concrete structural norms of this evasive notion of an international standard variety of English. 
In the context of the dictionary of SLE, another extralinguistic factor entered the discussion, namely that its compiler, Michael Meyler, is a speaker of BrE, which, to some extent, fits in with the perspective that although “the proponents of the so-called Sri Lankan form of English […] go on preaching on it they themselves try to use an internationally recognised standard form of English” (Fonseka 2003: 7). As will be illustrated below, these issues, as well as the Indian involvement in the governmental campaign English as a Life Skill (cf. Boange 2010), are currently hotly disputed in local newspapers such as Daily Mirror or Sunday Observer as well as on alternative journalist websites such as Groundviews 〈http://www.groundviews.org〉 (17 October 2014). Apart from the ideologically loaded arguments, the quintessence of the criticism of the promotion of SLE, however, principally lies in its lack of standardisation (cf. Meyler 2009: 55) since its major opponents perceive the label SLE as a toleration of an anything-goes policy (cf. Boange 2010).

With a presidential campaign, the Sri Lankan government drawing on local as well as international expertise in language planning and policy is actively working towards the creation of reference works for the standardisation of SLE on all structural levels. English as a Life Skill, which was launched in April 2008, is an initiative to foster the development of the English language in Sri Lanka.8 The campaign is jointly coordinated by the board of investment of the Ministry of Enterprise Development and the presidential secretariat. The English as a Life Skill initiative aims at improving the skills of English teachers in the private and public sector, at providing training in English for business communication, at exploiting electronic resources for distance learning and at involving more low-income groups in education in English. Equipping 50,000 people with adequate communication skills for the business sector has been set as a short-term goal of the initiative and reflects the involvement of the Ministry of Enterprise Development, which is in need of people proficient in English to make Sri Lanka more attractive for international trade and commerce.

In order to live up to the above expectations, the teaching of English in Sri Lanka is being developed in conjunction with Indian institutions. The collaboration manifests itself in the setting up of the Sri Lanka-India Centre of English Language Training (SLICELT) in Peradeniya (cf. 〈http://www.news.lk/index.php?option=com_content&task=view&id=14047&Itemid=44〉 (15 September 2010)) and in the three-month training of Sri Lankan master teachers in Hyderabad, India, where the Teacher Guide and Training Manual in Spoken English – a set of guidelines for Sri Lankan teachers of English – was developed (cf. 〈http://www.news.lk/index.php?option=com_content&task=view&id=10292&Itemid=44〉 (15 September 2010)). The short-term goal of equipping 50,000 Sri Lankans with an advanced knowledge of English is planned to be achieved via financially assisting people between 18 and 24 years of age over a period of three years in acquiring job-oriented English skills.

8. The information on English as a Life Skill is taken from a governmental press release (cf. 〈http://www.news.lk/index.php?option=com_content&task=view&id=5595&Itemid=44〉 (15 September 2010)) if not indicated otherwise.

The second phase of the English as a Life Skill campaign was launched on 19 July 2010 with an interim report on the progress of the programme.9 It was related that 60% of the country’s teachers had been trained to teach spoken English and that a new certificate for basic English skills, which entails the participation in a 100-hour curriculum, had been introduced to enlarge the English-speaking community in Sri Lanka. The campaign Speak English Our Way, which evolved in the context of English as a Life Skill to eradicate the fear associated with (British) English in the Sri Lankan population, is supposed to support the goals of the latter initiative.10 Due to its presence in national television and newspapers, Sri Lankans tend to be more aware of Speak English Our Way and, for this reason, it is also more frequently discussed in a controversial manner in letters to the editor of popular newspapers or in online forums. Speak English Our Way aims to spread the usage of English throughout Sri Lanka, but puts particular emphasis on the acceptance and promotion of the local variant of English. 
According to Sunimal Fernando, the then presidential advisor and coordinator of the initiative, there are three fundamental shortcomings that need to be addressed in the course of the programme: (a) the method of teaching English in Sri Lankan schools has been adopted from Britain where it is geared towards children learning English at home through the interaction with their parents; as the Sri Lankan pupils, however, generally do not speak English at home, with the urban elite being the only exception, the school curriculum is only suitable for this small group leaving approximately 90% of the pupils behind; (b) English language planning and policy as well as educational policies for English are being handled by a privileged group of people who are out of touch with the needs of rural English teachers; and (c) BrE, and not SLE, is being used as the target of English language teaching in Sri Lanka, which, in essence, means that established SLE forms deviating from the British norm may be considered incorrect, which alienates Sri Lankan pupils from English. The strategies of the initiative derived to cope with the above deficiencies are manifold. In order to improve the English skills of the majority of the pupils, the

9. The information on the progress and the achievements of the English as a Life Skill programme is taken from a governmental press release (cf. 〈http://www.news.lk/index.php?option=com_content&task=view&id=15912&Itemid=44〉 (15 September 2010)).

10. The information on the initiative Speak English Our Way is taken from an interview with Sunimal Fernando, the coordinator of the initiative. The respective article was published in the Sunday Observer (cf. 〈http://www.sundayobserver.lk/2010/07/18/spe05.asp〉 (17 October 2014)).

method of teaching English is adapted in that speaking and listening skills are taught prior to reading and writing skills, which is supposed to cater more adequately to the needs of those who do not have English as a home language. The control over (education) policies concerning English is decentralised since rural Sinhala and Tamil speaking English teachers from all provinces are consulted to settle issues in education and language planning and policy. In addition to that, SLE as it is currently codified will be given priority when it comes to the teaching of spoken English, which is also supposed to create a more open-minded attitude towards English with the local speech community and, thus, to facilitate the processes of teaching and learning it. However, while it may be possible to trigger changes on the administrative level, putting the above strategies into practice for the benefit of pupils may prove difficult simply because there still is a general lack of qualified English teachers in particular in rural schools (cf. Coperahewa 2009: 93). As mentioned above, measures have been taken against this lack of skilled personnel by training teachers as well as teacher trainers from all provinces in collaboration with Indian institutions and by devising a plan to persuade the government to recruit new teachers, but the success of this endeavour remains to be seen. Furthermore, the collaboration of Sri Lankan and Indian institutions in educating Sri Lankan teachers on how to pass on a spoken variant of SLE has unsurprisingly been the target of much local criticism (cf. e.g. Boange 2010). The above presidential campaigns indicate that spreading English skills throughout Sri Lanka and not restricting the language to an elite minority currently are central governmental goals. 
It is particularly noteworthy in this context that, although equipping the Sri Lankan workforce with adequate English skills to make it more productive in a competitive international market certainly is an overarching motive for the campaigns, the distinct identity of SLE, which has been largely neglected in the past, is also addressed. Several subcommittees currently work on the standardisation of SLE on the levels of phonology, morphology and syntax and document their findings in teacher guides, which are planned to serve as teaching models at schools.

Thus, taking a historical bird’s-eye view on the development of SLE from its arrival in Sri Lanka to its current status, one could state that, though only relatively recently, it has started turning from a once foreign language used to exploit the (natural) resources of Sri Lanka into a culturally-embedded link language that is being promoted as a means for the economic as well as sociocultural progress of Sri Lanka. This change in status having led to a stronger entrenchment of SLE with the local speech community is also mirrored in its present-day evolutionary status in Schneider’s (2003, 2007) dynamic model of the evolution of PCEs. Table 1 summarises the evaluation of SLE with regard to the parameters of endonormative stabilisation. ‘+’ means the criterion is met, ‘(+)’ represents that the criterion is partially met, ‘–’ shows that the criterion is not met and ‘?’ indicates that it is not possible to provide a reliable evaluation of the criterion at the time of writing.

Table 1. Endonormative stabilisation of present-day SLE (adapted from Mukherjee 2008: 360)

| Parameter | Criterion | +/(+)/−/? |
|---|---|---|
| History and politics | Post-independence? | + |
| | Self-dependence? | + |
| Identity construction | Settlers and indigenous population interwoven? | − |
| | New nation with panethnic identity? | (+) |
| Sociolinguistics of contact, use and attitudes | Acceptance of local norms? | (+) |
| | Positive attitude to local variety? | + |
| | Literary creativity? | + |
| Linguistic developments and structural effects | Stabilisation of a new variety? | ? |
| | Codification (e.g. dictionaries)? | (+) |
| | Relative homogeneity of local norms? | ? |

In sum, then, the assessment of endonormative stabilisation of SLE in Schneider’s (2003, 2007) dynamic model of the evolution of PCEs brings to the fore that the majority of the criteria are met. Sri Lanka’s political independence creates a liberating environment, which could potentially lead to further entrenchment of local norms and the growing acceptance of and positive attitude towards SLE as well as its literary tradition are certainly indicative of this. Still, despite the codification of SLE, parts of the local speech community are not yet ready to fully let go of their traditional and socially prestigious BrE model. The yet unpredictable impact of the recently-initiated standardisation processes (e.g. English as a Life Skill) might also become a significant factor as regards the future development of SLE and its variety-specific norms. In spite of minor conservative tendencies, this sociolinguistic constellation in particular in the light of the recent emergence of a pan-ethnic Sri Lankan identity seems to depict a concentration of endonormative forces in SLE. What this discussion of SLE in the framework of Schneider’s (2003, 2007) model has also clearly brought to the fore is that an adequate evaluation of the evolutionary status of SLE is highly dependent on further systematic and in-depth structural descriptions of SLE. It is certainly true that a number of investigations into the structures of SLE have already been conducted, but it is also true that, for a large part, they are either rooted in “random examples and personal experience”

(Parakrama 1995: 34) or constitute isolated studies on single phenomena based on different sets of empirical data. Against this background, the present study examines a number of lexical as well as lexicogrammatical (and thus related) objects of investigation on the basis of the same corpora to draw a more balanced picture of SLE across structural levels of language organisation. With the help of this approach, it may be possible to establish to what extent the lexical and lexicogrammatical structures of SLE warrant the description of SLE as a variety of English in its own right. In this context, it will also be scrutinised whether and by what means the South Asian identity of SLE finds expression in linguistic structures. The data on the basis of which these structural analyses will be conducted are presented and discussed in detail in Chapter 3, where the structural objects of investigation are also introduced.

Chapter 3

Methodology

The present study will primarily focus on written acrolectal SLE. Acrolectal writings can be defined as “writings held in high esteem by society, which is not the same thing as texts written by people of high social status” (Wright 2000: 6). There are conceptual, production-related and sociolinguistic reasons for this decision.

Due to the social prestige associated with the acrolect, as opposed to the other lects on the dialect continuum (cf. Trudgill 2003: 3), it is reasonable to assume that the wider speech community, which may also speak basilectal or mesolectal variants of a given variety, generally turns to acrolects – in either written or spoken form – as salient reference points for linguistic orientation. This sociolinguistic prominence makes acrolects likely candidates for codification (cf. Bernaisch et al. 2011: 1). In the light of this, acrolects can be considered variety-internal norm providers and are thus an adequate starting point for the exploration and description of the structures in scarcely documented New Englishes.

With first descriptions of comparatively unexplored New Englishes, production-related factors speak in favour of using written acrolectal text material as a basis. Due to the fact that the written medium, in contrast to spoken language, offers more time for planning lexical as well as syntactic choices and even provides the opportunity of going back to and editing these choices if the user is not satisfied with the linguistic output, written texts can to a certain extent be considered to reflect what the speaker perceives to be an appropriate choice in a given communicative context. Online speech production time constraints (e.g. the danger of losing the floor in face-to-face conversations) might lead to linguistic performances which the speakers themselves (in retrospect) might view as not entirely appropriate, but which they, however, can only correct to a very limited extent. 
Consequently, linguistic norms might be more readily observable in written than in spoken language.

From a sociolinguistic angle, the currency of written English texts in Sri Lankan everyday life should not be underestimated. In her study on Sri Lankan schoolchildren, Kumarasamy (2007: 87) attests that English is not yet prominent in informal spoken contexts, in which the indigenous languages still prevail, while English “was significantly predominating in the domain of media and entertainment”. Although there are a number of English-medium TV and radio stations as illustrated above, novels and newspapers in English also figure prominently in Sri Lanka. Given that English is by and large absent from informal spoken domains (with the exception of some upper class families), written English in Sri Lanka certainly needs to be considered more important in comparison to classic ENL countries, in which English is more widely used in spoken contexts. The data, on the basis of which acrolectal written SLE will be scrutinised, are presented in 3.1. In 3.2, the lexical and lexicogrammatical objects of investigation are delineated.

3.1 The corpus environment

The subsequent sections provide an overview of the text material used to contrastively analyse written acrolectal SLE. In 3.1.1, the ICE components relevant to the study at hand will be described. National components of the South Asian Varieties of English (SAVE) corpus, a newspaper text corpus, and the daily news section of the British National Corpus form the larger complements to the ICE components and will be illustrated in 3.1.2. In order to trace the degree to which certain acrolectal features have penetrated into the (wider) speech communities, online data will also be analysed with the help of Google advanced searches whenever feasible. This approach will be discussed in detail in 3.1.3. Figure 3 illustrates the structure of the corpus environment of the present study.

ICE

– The British component – The Indian component – The Sri Lankan component

Newspaper Corpora

– BNC news – The Indian component of SAVE – The Sri Lankan component of SAVE

The Google Advanced Search Tool

– The British top-level country domain (.uk) – The Indian top-level country domain (.in) – The Sri Lankan top-level country domain (.lk)

Figure 3. The corpus environment of the present study

On the basis of authentic, naturally-occurring texts, SLE is studied comparatively with the help of two varieties of English which are also highly relevant to the (study of the) English language in the Sri Lankan context, namely BrE and IndE. There are various historical, conceptual as well as sociolinguistic reasons for selecting these varieties as means of comparison. BrE is the historical input variety of SLE in the sense that the English language was brought to Sri Lanka in the form of the British variant when the British arrived on the coasts of the island at the end of the 18th century (cf. e.g. Gunesekera 2005: 11). Consequently, it is reasonable to assume that this initial contact as well as the prolonged presence of the British and their administrative structures on the island provided the basis for what SLE is today. What needs to be noted here is that the variant of BrE brought to Sri Lanka must be assumed to have been in the main a highly educated variant used by British businessmen and later missionaries who could read and write the language. SLE has developed in the main from this educated BrE variant, which was taught in Sri Lanka via a comparatively formal education system, for more than two centuries. Given that the ICE components as well as the newspaper texts represent exactly this type of educated English, the data is ideal in order to delineate to what degree contemporary SLE has developed structures which may not be present (to the same extent) in present-day BrE to shed light on local processes of structural nativisation and related implications for the evolutionary status of SLE. In addition, BrE continues to be of pivotal importance in many sociolinguistic spheres of Sri Lanka. In various sociolinguistic surveys, it has been shown that (a) BrE still is the production goal for many English speakers in Sri Lanka (cf. e.g. Künstler et al. 
2009: 69) and (b) a large proportion of the English-speaking community in Sri Lanka considers itself to be using a British variant of English (cf. e.g. Gunesekera 2005: 42). Thus, the (partly assumed) presence of and the prestige associated with BrE makes it an exocentric model of pivotal importance to the study of SLE. Furthermore, it will also be interesting to contrast SLE with an ENL variety to identify areas in which first and second language users converge or diverge. The comparison of SLE and IndE, two ESL varieties, is also imperative. First of all, due to the close physical proximity of India and Sri Lanka and the continuing migration between the countries, in particular between the northern part of Sri Lanka and the southern part of India (cf. de Silva 1981: 369), it might well be the case that linguistic norms show a certain degree of homogeneity. Very much in line with this, Leitner (cf. 1992: 225) identifies IndE as a norm-providing linguistic entity exerting epicentral influences on other varieties of English in South Asia. Even more drastically, earlier models of World English (cf. e.g. Strevens 1980: 86–87) considered several national varieties of English in the South Asian region to be varieties of IndE, thus lacking distinct characteristics. Consequently,

an empirical investigation contrasting SLE and IndE is called for in order to delineate linguistic features which might allow establishing pan-South Asian characteristics of English as well as variety-specific particularities with possible implications for national variety status of both IndE and SLE. In consequence, the comparison of SLE with BrE and IndE promises to reveal (developing) norms of SLE as well as perspectives on their potential presence in other varieties. These insights are salient for an empirical evaluation of the status of SLE as a variety in its own right. In what follows, the databases to be used for the investigation of these norms are presented.

3.1.1 The International Corpus of English

The International Corpus of English (ICE) project is by nature a megacorpus project, which was initiated by Sidney Greenbaum in the late 1980s. More than 25 international research teams have worked or are currently working on the compilation of regional first and second language corpora (cf. Greenbaum 1996: 3). Approximately 20 years after its launch, it seems as if two generations within the ICE framework had emerged, namely an older generation including the earliest examples of finalised ICE components stemming from the 1990s such as ICE-Great Britain or ICE-India and a fairly young generation of corpora still in the making with a particular focus on ESL varieties of English such as Fiji English or SLE (cf. e.g. Biewer et al. 2010: 11).

The corpus design of the individual ICE components prescribes the compilation of one million words of authentic text material which is divided into a total of 500 equally large texts featuring 2000 words each (cf. Nelson 1996: 27). The major distinction in the corpus design is a medium-dependent one; 60% of the texts in each ICE component are spoken while 40% are written (cf. Nelson 1996: 30). The spoken and written parts of the components encapsulate categories ranging from relatively informal texts (e.g. social letters or telephone conversations) to highly formal texts (e.g. academic writing or parliamentary debates). As sketched at the beginning of this chapter, the present study will exclusively focus on the written parts of the Sri Lankan (ICE-SL), British (ICE-GB) and Indian (ICE-IND) components of ICE. Table 2 illustrates the corpus design of the written part of ICE.1

1. Although all written ICE components are designed to include 400,000 words, there are differences with the individual national components concerning the number of words which are eventually featured in the databases. ICE-SL comprises 409,870 words, ICE-IND 420,077 words and ICE-GB 431,650 words.
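Since the three written components differ slightly in size, raw frequencies drawn from them are not directly comparable. The following is a minimal sketch of how the design figures and the reported component sizes might be handled in practice, assuming that raw counts are normalised to occurrences per million words. The design figures and word counts are those reported above; the function name `per_million` and the raw counts are hypothetical and serve only to illustrate the arithmetic, not to reproduce any result of the study.

```python
# ICE design figures as reported above (cf. Nelson 1996):
# 500 texts of 2,000 words each, 40% of which are written.
TEXTS_PER_COMPONENT = 500
WORDS_PER_TEXT = 2_000
WRITTEN_PERCENT = 40

design_total = TEXTS_PER_COMPONENT * WORDS_PER_TEXT            # 1,000,000 words
written_texts = TEXTS_PER_COMPONENT * WRITTEN_PERCENT // 100   # 200 texts
written_words = written_texts * WORDS_PER_TEXT                 # 400,000 words

# Actual sizes of the written components as reported in the footnote above.
WRITTEN_SIZES = {"ICE-SL": 409_870, "ICE-IND": 420_077, "ICE-GB": 431_650}


def per_million(raw_count: int, corpus_size: int) -> float:
    """Normalise a raw frequency to occurrences per million words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count / corpus_size * 1_000_000


# Hypothetical raw counts of some lexical item (NOT data from the study).
hypothetical_counts = {"ICE-SL": 41, "ICE-IND": 42, "ICE-GB": 43}

normalised = {
    corpus: per_million(count, WRITTEN_SIZES[corpus])
    for corpus, count in hypothetical_counts.items()
}

if __name__ == "__main__":
    print(design_total, written_texts, written_words)
    for corpus, value in normalised.items():
        print(f"{corpus}: {value:.1f} per million words")
```

Normalising against the actual rather than the designed component size avoids overstating frequencies in the larger components, which matters most for low-frequency lexical items.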

The compilation guidelines of the ICE project reflect that the components are supposed to represent acrolectal variants of the regional varieties they cover (cf. Mukherjee et al. 2010: 66). Against this background, age and English-medium education are the general criteria for text selection because they are “quantifiable” (Nelson 1996: 28).

Table 2. The ICE corpus design for written texts (cf. Greenbaum & Nelson 1996: 13–14)

| Text categories | Texts | Words |
|---|---|---|
| Non-printed (W1) | | |
| Non-professional writing (W1A) | | |
| Student untimed essays | 10 | 20,000 |
| Student examination scripts | 10 | 20,000 |
| Correspondence (W1B) | | |
| Social letters | 15 | 30,000 |
| Business letters | 15 | 30,000 |
| Printed (W2) | | |
| Academic writing (W2A) | | |
| Humanities | 10 | 20,000 |
| Social sciences | 10 | 20,000 |
| Natural sciences | 10 | 20,000 |
| Technology | 10 | 20,000 |
| Non-academic writing (W2B) | | |
| Humanities | 10 | 20,000 |
| Social sciences | 10 | 20,000 |
| Natural sciences | 10 | 20,000 |
| Technology | 10 | 20,000 |
| Reportage (W2C) | | |
| Press news reports | 20 | 40,000 |
| Instructional writing (W2D) | | |
| Administrative/regulatory | 10 | 20,000 |
| Skills/hobbies | 10 | 20,000 |
| Persuasive writing (W2E) | | |
| Press editorials | 10 | 20,000 |
| Creative writing (W2F) | | |
| Novels/stories | 20 | 40,000 |
| TOTAL | 200 | 400,000 |

ICE is investigating ‘educated’ or ‘standard’ English. However, we do not examine the texts to decide whether they conform to our conception of ‘educated’ or ‘standard’ English. To do so would introduce a subjective circularity that would downplay the variability among educated speakers and the variation due to situational factors. Our criterion for inclusion is not the language used in the texts but who uses the language. The people whose language is represented in the corpora are adults (18 or over) who have received formal education through the medium of English to the completion of secondary school, but we also include some who do not meet the education criterion if their public status (for example, as politicians, broadcasters, or writers) makes their inclusion appropriate. (Greenbaum 1996: 6)

Another criterion for the selection of adequate speakers is that of local nativeness. Nelson (1996: 28) states that, in the context of the ICE corpus, the concept of nativeness of speakers “means either that they were born in the country concerned, or if not, that they moved there at an early age and received their school education through the medium of English in that country”. Still, in many ESL countries, extensive stays abroad, in particular to obtain degrees in higher education, are characteristic features of the CVs of the speakers that meet the above criteria regarding age, education and nativeness. Thus, an exclusion of these speakers would most certainly distort the corpus-linguistic reflection of the sociolinguistic makeup of the respective speech communities, which is why some ICE teams including that of ICE-SL consciously opted for their inclusion in the respective components (cf. Körtvelyessy et al. 2012: 6). The speaker selection and the choice of corpus texts, however, are not guided by the aim of creating a balanced corpus representing the pattern of social variables in the respective speech communities. Nevertheless, the project coordinators argue in favour of including a wide range of female and male speakers with distinct social backgrounds (cf. Nelson 1996: 28). The relatively high degree of comparability of the individual components in the ICE framework might be the greatest asset of the project as it paves the way for novel cross-varietal as well as intra-varietal perspectives. Greenbaum, shortly after the launch of the ICE project, anticipates the contrastive analysis of national varieties of English: As the parallel corpora become available, new possibilities open up for rigorous comparative and contrastive studies. 
I envisage the search for typologies of national varieties of English: first-language versus second-language English, British-type versus American-type English, African versus Asian English, East African versus West African English. Researchers might explore what is common to English in all countries where it is used for internal communication, demonstrating how far it is legitimate to speak of a common core for English or of an international written standard. (Greenbaum 1996: 10)

Still, the wide array of text categories within a single national component can also be exploited to provide answers to research questions that could only be explored tentatively prior to the creation of the ICE corpus. ICE enables, for example, the description of medium-dependent norms peculiar to or shared by certain varieties or allows for the study of genre conventions. Owing to the multiplicity of genres covered, the individual components of ICE provide a fairly wide-ranging and accurate snapshot of actual language use. The result is that this “broad sampling procedure ensures that the corpora are representative of the English in general use in each participating country” (Nelson 1996: 35). Studies based on ICE may thus be claimed to have a relatively strong standing in the descriptive body covering the varieties investigated.

Cross-varietal comparability and representativeness of the ICE components can in fact be considered two competing goals since, for example, strongly adapting the text selection regarding academic and popular writing to particular sociolinguistic contexts would lead to discrepancies in the text categories concerned across various ICE components. With reference to the Fijian component of ICE, Biewer et al. state that

[t]he challenge that we face is to compile a Fiji English component which is at once representative of the current language use in Fiji, and at the same time comparable to all other ICE corpora. […] This corpus should not only enable us to get a clear idea of the actual usage of acrolectal English in Fiji but also to compare this variety with other standard varieties of English worldwide. This task of creating a corpus that matches the general ICE framework, but which is still representative of the universe of the local and culture-specific production of standard English texts in Fiji, quickly started to look as difficult as squaring the circle. (Biewer et al. 2010: 5)

This challenge of making a dataset comparable and representative at the same time also needed to be faced in the collection of data for ICE-SL. Marriage advertisements published in Sri Lankan newspapers such as the one in (4) illustrate the complexity of the issue.

(4) Partner sought by retired father for 5’, 1981, slim B.Sc. Eng., M.Sc. IELTS, employed daughter divorced from few months marriage 2008. Only brother is Doctor. Horoscope necessary. Those abroad also welcome. 〈http://www.sundayobserver.lk/2014/09/07/c_brides.asp〉 (17 October 2014)

Marriage advertisements of the above kind where a parent advertises the daughter for marriage are common in Sri Lankan newspapers and can also be found in other South Asian newspapers, but they are probably not as likely to occur in this format in other English-speaking countries outside South Asia. Consequently, the inclusion of these marriage advertisements – for instance in the reportage section

 The Lexis and Lexicogrammar of Sri Lankan English

(W2C) of ICE – would certainly make the dataset more representative in terms of published newspaper material, but at the same time, this would represent a digression from the news reports featured in this section in other ICE components. For the sake of cross-component comparability, marriage advertisements were thus not included in ICE-SL, although their inclusion would have certainly allowed the corpus to mirror yet another facet of how and where English is used in Sri Lanka. What is also of importance, particularly for the ESL varieties in the ICE project, is that most of the national components are the first systematic attempts at compiling representative databases of the respective varieties (cf. Greenbaum 1996: 10) – this is also true of ICE-SL, while for IndE and BrE, corpora predating the respective ICE components are available (e.g. the Kolhapur Corpus of Indian English (Shastri et al. 1986) or the Lancaster-Oslo/Bergen Corpus (Johansson et al. 1978)). As these databases are primarily meant to be used for the description of in some cases scarcely documented varieties, the compilation of a national ICE component marks a milestone in the codification of a given regional variety of English and might eventually strengthen its perception as a variety of English in its own right (cf. Mukherjee et al. 2010: 74). While it may be true that “[f]or most grammatical and discourse studies, corpora of a million words are more than adequate” (Greenbaum & Nelson 1996: 13), the relatively limited size of the ICE components surely is a restriction with regard to certain investigations, in particular if they are of a lexical nature. Nelson (cf. 1996: 35) himself is aware of this shortcoming and corpus sizes outside the ICE project have continued to increase since its start. 
However, given that compilation and annotation of ICE components is comparatively labour-intensive, it is to be doubted whether the creation of larger ICE components would be feasible and the restriction in size can certainly be compensated for by complementing the ICE texts with larger sets of data. Although comparability of the ICE components is generally claimed to be one of the main assets of the project, it is simultaneously also one of its covert limitations, which requires a critical awareness. While the individual components all look alike at face value due to the shared corpus design, the variety-specific texts that enter each text category may exhibit variation. This is a potential problem that is just as evident with the first generation ICE components compiled around the 1990s (cf. Greenbaum & Nelson 1996: 5) as with the second generation which is still largely in the making. Biewer et al. illustrate regional text conventions which need to be considered in the compilation of an ICE component: In Fiji, a novel can be a kind of patchwork creation including not only fictional prose but also poetry, autobiographical snippets and excursions into historical events. A student essay may include phrases from secondary literature that were

learned by heart, class lessons may sound like a teacher’s monologue. A public demonstration may turn out to be an actual demonstration without words. A discussion on TV may look more like an interview as only one person at a time answers the presenter’s questions – moreover, the answers themselves strongly suggest that the questions were known in advance and the answers rehearsed. (Biewer et al. 2010: 15)

In the compilation of ICE-SL, however, the boundaries between the individual text categories as laid down in the general corpus design could relatively easily be mapped onto the written text material available in Sri Lanka (e.g. there is a relatively sharp division between academic (W2A) and non-academic writing (W2B), student untimed essays and student examination scripts were available for nonprofessional writing (W1A), etc.). For this reason, variation induced by differing text conventions across the datasets under scrutiny can probably be regarded as marginal. Not only are text categories sometimes filled with material produced with different textual conventions; in particular in some ESL varieties, some ICE text categories need to be stretched as the relevant texts are not produced in the English language. For that reason, material that is considered to be relatively similar to the original text category may compensate for this lack of data as illustrated for ICE-Malta. Even more problematic, if not impossible, is the collection of data for the categories parliamentary debates, legal presentations and legal cross-examinations, since these are exclusively done in Maltese. For the first of these categories we have contacted Maltese Members of the European Parliament (MEPs), but since Maltese was granted the status of an official EU language as early as 2002, the vast majority, if not all speeches and debates, are held in Maltese. Translations by Maltese interpreters could be made available, but this would probably stretch the category beyond the boundaries originally envisaged by the corpus designers. (Hilbert & Krug 2010: 60)

Although the compilation of some written text categories was certainly also more time-consuming than that of others (e.g. the finalisation of the category of technology in non-academic writing (W2B) turned out to be more challenging than that of other categories of non-academic texts given that simply fewer texts were available and it thus took more time to locate them), the only deviation from the original corpus design in ICE-SL finds itself in the correspondence (W1B) section. Given that the ICE project is already more than two decades old, it is obvious that the compilation of components is subject to sociolinguistic as well as technological changes (with potential consequences for inter-component comparability). Correspondingly, several ICE teams including the one working on ICE-SL have opted for the inclusion of email communication in the social letters section as

[a]t the time of compilation, emails had already to a large extent replaced handwritten and typed letters and are therefore more representative of present-day Sri Lankan English letter writing. […] [F]rom the perspective of technological and social development handwritten letters have become nearly obsolete nowadays. (Körtvelyessy et al. 2012: 6)

Due to the labour-intensive nature of ICE corpus compilation, it is only natural that diachronic gaps of two kinds emerge. The fact that the first generation of ICE was compiled at the end of the 20th century and the newer ICE teams are still engaged in the process of corpus compilation in 2010 illustrates that there is a time gap between the components of the ICE corpus project (cf. e.g. Mukherjee et al. 2010: 67). However, there is no denying that the individual components of the ICE project themselves have a certain diachronic dimension to them. ICE-SL features texts covering a six-year time span, i.e. from 2003 to 2009 (cf. Körtvelyessy et al. 2012: 5), which “might raise difficulties because ongoing language change might thus have a skewing effect” (Biewer et al. 2010: 10). In this context, however, lexical analyses have generally been argued to be more affected by possible short-term diachronic distortions than (lexico)grammatical features (cf. Mukherjee et al. 2010: 74). As the ICE data lend themselves more readily to syntactic analyses in the first place due to their size, these diachronic concerns are probably of limited relevance to the majority of ICE-based studies. Still, given that lexical analyses also form part of this study, newspaper datasets and online queries, which are described in more detail in Chapters 3.1.2 and 3.1.3, are used to counteract the potentially undesirable effects of the diachronic dimension in and across the ICE components used. Mechanisms of text production also play a significant role with regard to an evaluation of the ICE project as a corpus-linguistic source. As ICE components are generally geared towards representing variety-specific language use, the role of non-local editors or semi-automatic editing tools is a sensitive issue.

[W]riters of printed works are usually required to follow the house style of the publisher or newspaper for which they are writing.
Printed material may have been edited by a number of different people, and the final version is often the product of several earlier revisions. Some non-printed texts, of course, are now regularly produced on wordprocessors (business letters, for example), and can benefit from the use of automatic spelling, grammar, and style checkers. (Nelson 1996: 32)

Although the above is certainly true, the texts produced via this editing routine nevertheless form part of the sociolinguistic reality of the respective speech communities and, as a consequence, are representative of local language use. Furthermore, this editing procedure is inherent not only in the ICE project and certainly

not restricted to a selection of varieties, which implies that editing does not impede comparability in this regard. With a critical awareness of the potential shortcomings of (the individual components of) ICE as regards corpus size and time lag within and between individual components and an adequate choice of objects of investigation which are only peripherally affected by these limitations, the database provides a unique corpus environment as it opens up new avenues for cross-varietal research with the help of representative data. Concerning the components used in the present study, i.e. ICE-SL, ICE-IND and ICE-GB, the only major difference with regard to text sampling is that ICE-SL features emails in the sections covering social and business letters (cf. Mukherjee et al. 2010: 67). The decision to mirror this text linguistic change from written social letters to email communication might negatively affect comparability in informal written communication with some research questions, but the inclusion of emails surely yields a more adequate reflection of the usage of English in present-day Sri Lanka. However, due to additional corpus material and the selection of the objects of investigation, the diachronic gap between and the difference in text selection of ICE-SL, ICE-IND and ICE-GB only minimally affect the comparability of the data with the study at hand. The corpus-based analyses comprise lexical analyses, each of which features (a) large numbers of lexical items and (b) lexemes stemming from several word classes, which pre-empts strong topic-based frequency effects across the datasets, as well as lexicogrammatical studies, with which the diachronic gap (cf. Mukherjee et al. 2010: 74) and topic-related differences between the individual texts in the datasets (cf. Olavarría de Ersson & Shaw 2003: 138) must be considered to be of only minor importance. In order to complement the ICE components with supplementary data, various options are available. 
Greenbaum and Nelson (cf. 1996: 6) suggest several alternatives regarding the expansion of the ICE components:

–– an expanded corpus with more material in each ICE category
–– a database featuring more texts of one/several ICE category/categories
–– a non-standard corpus containing texts from speakers who do not meet the ICE specifications for speaker selection
–– a monitor corpus which is continuously fed with new data.

The present study opts for the complementation of the ICE data with a larger set of (partly web-derived) newspaper corpora, i.e. for “specialized corpora” (Greenbaum & Nelson 1996: 6) featuring one basic text category only. The design of the newspaper text collections as well as the reasoning behind this choice of data addition will be illustrated in Chapter 3.1.2.

3.1.2 The newspaper corpora

The South Asian Varieties of English (SAVE) Corpus comprises six national components covering data from Bangladesh, India, the Maldives, Nepal, Pakistan and Sri Lanka.2 Each national component is divided into two subsections, each of which features newspaper texts drawn from the online archives of a major daily newspaper in the respective country. The corpora were compiled using an adapted version of Hoffmann’s (2007) web page to mega-corpus method (cf. Bernaisch et al. 2011: 6). Due to the proficiency of the language users involved in newspaper production, the above newspaper corpora can be considered to reflect acrolectal national standards of English and are thus adequate large-scale complements to the ICE components. Because of the systematic deletion of articles distributed by international news agencies from the corpus, the representativeness of SAVE as regards variety-specific language is at a relatively high level since the remaining articles can be considered to be fair reflections of local language use. The data cleaning was based on a list of approximately 200 news agencies, but as the identification of news agency reports relied on the presence of a reference to the respective news agency in a given article, it may be the case that an insignificant amount of news agency articles entered the corpus because the reference might have been omitted in the original for some reason (cf. Bernaisch et al. 2011: 3). The Sri Lankan (SAVE-SL) and the Indian (SAVE-IND) national components of SAVE are the ones of relevance for the study at hand. Each national component comprises approximately three million words of newspaper writing (cf. Bernaisch et al. 2011: 2). In order to complement the South Asian data with BrE text material, the daily news section of the British National Corpus (BNC news) represents a useful source for the study at hand.
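The agency-based data cleaning described above can be pictured with a short Python sketch. It is an illustration under stated assumptions, not the routine actually used for SAVE: the three agency names stand in for the list of approximately 200 agencies, and the function names and sample articles are hypothetical.

```python
import re

# Illustrative stand-in for the list of roughly 200 news agencies (assumption).
AGENCIES = ["Reuters", "AFP", "Associated Press"]


def build_agency_pattern(agencies):
    """Compile one case-sensitive pattern matching any agency name as a whole word."""
    # Longer names first so that multi-word names are tried before shorter ones.
    ordered = sorted(agencies, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in ordered)
    return re.compile(rf"\b(?:{alternatives})\b")


def remove_agency_articles(articles, agencies):
    """Keep only articles that contain no explicit reference to a listed agency."""
    pattern = build_agency_pattern(agencies)
    return [text for text in articles if not pattern.search(text)]


# Hypothetical sample articles.
sample = [
    "Colombo, Monday (Reuters) - Talks resumed in the capital.",
    "The minister opened a new school in Kandy yesterday.",
    "Markets closed higher after a volatile session.",  # agency copy without attribution
]
local_only = remove_agency_articles(sample, AGENCIES)
print(len(local_only), "of", len(sample), "articles kept")
```

The third sample article shows the limitation noted in the text: agency material that carries no reference to its source cannot be detected by this kind of filter and remains in the data.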
BNC news features about nine million words of BrE newspaper data.3 The newspaper text collections and their sizes and sources are shown in Table 3. There are sociolinguistic as well as corpus-linguistic considerations which make the investigation of the SAVE components in combination with BNC news attractive. From a sociolinguistic perspective, it has been argued that local newspaper English can have a standardising and codifying function for regional varieties which have not yet been documented in dictionaries or grammars

2. The SAVE Corpus was compiled in the project “Verb complementation in South Asian Englishes: a study of ditransitive verbs in web-derived corpora” (2008–2011; Principal Investigator: J. Mukherjee) funded by the Deutsche Forschungsgemeinschaft (MU 1683/3–1).

3. For a comprehensive documentation of the newspaper sources of the BNC, please visit 〈http://www.natcorp.ox.ac.uk/docs/URG/bibliog.html〉 (17 October 2014).

Table 3. The sizes and sources of the newspaper corpora

Corpus | Words | Sources
BNC news | 8,873,737 | various British daily newspapers
SAVE-IND | 3,071,735 | The Statesman 〈http://www.thestatesman.net〉 (17 October 2014); The Times of India 〈http://www.timesofindia.indiatimes.com〉 (17 October 2014)
SAVE-SL | 3,065,820 | Daily Mirror 〈http://www.dailymirror.lk〉 (17 October 2014); Daily News 〈http://www.dailynews.lk〉 (17 October 2014)

(cf. e.g. Herat 2001: 7–8; Schilk 2011: 47). While IndE may display a more ample body of descriptive work than SLE, the latter New English has largely been neglected academically, resulting in a lack of lexical as well as grammatical documentation with a few laudable exceptions (cf. e.g. Meyler 2007). It is in this context that local newspapers (and their respective house styles) become standardising forces (cf. McArthur 2001: 6; Coperahewa 2009: 90). In the line of thinking of the complaint tradition as regards the development of new varieties of English, emerging features of New Englishes are often universally discarded as learner mistakes (cf. e.g. Fonseka 2003: 2). While this radically conservative perspective is as contestable as embracing every earlier unrecorded form as a feature of a new variety of English, analyses of newspaper data can help to differentiate between idiosyncratic one-off learner mistakes and novel systematic features characterising a new variety of English since “the authors of these newspapers can be considered very proficient users of the English language, so nativized features are relatively unlikely to be what are often considered learner mistakes” (Schilk 2011: 47). Due to editing processes, newspaper texts are read (and possibly changed) by various speakers, which means that linguistic structures in the published text version have been approved of repeatedly and can thus be viewed as features of the local variety of English. In addition to the sociolinguistic perspectives in favour of the usage of newspaper corpora, they also offer a few corpus-linguistic advantages. First of all, the availability of (online) daily newspapers in India and Sri Lanka (cf. Coperahewa 2009: 90) allows for a compilation of a diversified database which does not have to rely on one major newspaper only, which, as a consequence, would to a certain degree yield a reflection of the respective house style. 
Although newspaper texts appear to be a homogenous text category at first sight, newspaper articles feature a relatively wide selection of texts ranging from

argumentative writing (e.g. editorials) to descriptive articles (e.g. sport reports; cf. Schilk et al. 2012: 147). Although this by no means implies that newspaper texts can be regarded as representative of a variety of English in its entirety since they mirror spoken usage to a very limited extent only as, for example, in interview articles, it is an extremely valuable database for the investigation of writing in varieties of English. Though SAVE features text material from online archives, the corpus itself is available as offline data. In contrast to research conducted with online web queries, the results created on the basis of SAVE and BNC news are replicable, which increases intersubjective comparability of findings and can eventually help to make the analysis more transparent. With regard to the technological aspects of corpus compilation, it needs to be pointed out that, once a workable routine has been set up, texts from online archives of newspapers can be collected at a relatively fast pace in comparison to traditional corpus compilation approaches as adopted in the ICE project. The semi-automatic downloads enable the creation of large databases, which can be exploited to investigate low-frequency phenomena, for which e.g. the ICE components would prove too small. In contrast to the web-as-corpus approach, the usage of relatively large offline databases entails several technological advantages. Corpus linguists have more control over what sort of text material or which kind of text types should be included in their databases and once they have been created, they can be accessed with the help of standard (corpus-linguistic) tools or programmes, which also allows for a more in-depth analysis of the data due to the possibility of annotation (cf. Hundt et al. 2007: 3). Nevertheless, the utilisation of newspapers as corpus-linguistic resources can also have certain drawbacks. 
In particular in regions where ESL varieties are used, it may be the case that newspaper publishers or other publishing houses employ editors from abroad as, for instance, in Fiji (cf. Biewer et al. 2010: 12–13).4 This, however, does not seem to hold true for Sri Lanka. From the experience of

4. In the context of text editing, the influence of BrE and American English on the final version of the text is often discussed. While there are various opinions regarding the extent to which the above varieties exert influences on (international) print standards (cf. e.g. McArthur 2001), Herat (2001: 8), in the context of SLE, puts forward that “[t]he usage and style manuals used by government newspapers, [sic!] advocate a conservative model of Standard English, usually British, which is held up as the ideal and correct form of the language”. Given the prestige the British variety of English enjoys in Sri Lanka, this may be true. Nevertheless, it remains unclear whether any structural level of language organisation would be severely affected by such style manuals.

a former employee of the Daily Mirror, the major Sri Lankan English-medium newspapers tend to employ Sri Lankan editors (Dilini Algama, p.c.). This observation also finds reflection in the fact that the current editors of the two Sri Lankan newspapers which provided the data for SAVE-SL, i.e. Daily Mirror and Daily News, are also Sri Lankan. Once newspaper texts are circulated, they become part of the sociolinguistic reality of the locale where they are produced and read, which, due to the linguistic prestige of newspaper publishing, very frequently renders those texts models for text production (cf. Schilk 2011: 47). Thus, in the study of (emerging) norms in a given variety, the investigation of newspaper texts promises to yield valuable insights since the newspapers themselves can be considered to be one of the origins of the linguistic norms spreading in the respective speech communities. Although it could be illustrated above that the newspaper texts under scrutiny stem from a variety of text genres, it is obvious that the representativeness of newspaper corpora is nevertheless limited since newspapers only rarely feature writing located on the lower end of the formality scale. Thus, newspaper corpora are most valuable when they are complemented with more representative databases as well as with even larger sets of data in terms of number of words in order to investigate to what degree a certain feature has penetrated a particular speech community. For that reason, the present study analyses the newspaper data in conjunction with the respective ICE components. In addition to that, the offline databases just mentioned are complemented with online data, which will be in the centre of attention in Chapter 3.1.3.

3.1.3 The Google Advanced Search Tool

While carefully designed and balanced corpora, which can be considered to be representative of a given variety of English, are crucial for the investigation of emerging linguistic norms in SLE, they are usually not geared towards the study of low-frequency phenomena such as “neologisms and coinages; newly-vogueish terms; rare or possibly obsolete terms; rare or possibly obsolete constructions” (Renouf et al. 2005: 3) due to size restrictions. From this perspective, the web presents itself as a (corpus-)linguistic resource featuring a steadily growing number of texts. For that reason, the present study uses the Google Advanced Search Tool (GAST) in addition to the ICE and newspaper data. With the help of GAST, the study at hand investigates lexical as well as lexicogrammatical features in SLE, IndE and BrE. The respective top-level country domains (i.e. .lk for Sri Lanka, .in for India and .uk for the United Kingdom) can be specified in the advanced search user interface, which enables what can

be assumed to be variety-specific searches of selected phenomena. In his corpus-based study on IndE, Sedlatschek (2009) also successfully consults additional evidence from a number of (variety-specific) domains via Google search queries to critically revisit and further refine his findings delineated on the basis of well-defined offline datasets. In general, the standard search settings were used for each of the GAST searches. However, the language of the websites to be retrieved was set to English since the resulting hits could then be considered to originate from (variety-specific) English language use by simultaneously excluding the possibility of counting in transfer phenomena of English lexemes or groups of English words in one of the indigenous languages, which is not the object of investigation of the study at hand. Particularly with regard to quantitative perspectives on GAST searches and resulting analyses, it needs to be stated that the three top-level country domains under scrutiny are not similar in size when it comes to the proportion of English language text. Against this background, it is necessary to provide an estimate of the number of English words in the domains concerned, which was calculated following the example of Grefenstette and Nioche (2000). Grefenstette and Nioche (cf. 2000: 239) provide a list of twenty English word forms and their average number of occurrences per thousand words as established on the basis of a varied set of offline corpora. By using the average frequency per thousand words of each of these word forms, the respective frequencies of each word form in the Sri Lankan, Indian and British domains as established via GAST searches can be used to extrapolate an estimate of the domain-specific total number of English words.
With the highest and lowest estimates being discarded respectively, the average of the remaining estimates as documented in Table 4 is considered to roughly represent the number of English words in the domains under investigation.

Table 4. Estimates of the number of English words in the top-level country domains (as on 23 June 2011)

Top-level country domain | Estimate of number of English words
.lk | 1,462,810,846.56
.in | 86,573,743,386.24
.uk | 352,986,111,111.11
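The extrapolation just described can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure's actual implementation: the function name, the five word forms and all figures are hypothetical placeholders, not the twenty word forms or rates given by Grefenstette and Nioche (2000), and the hit counts are simply assumed to be usable as word-form frequencies.

```python
from statistics import mean


def estimate_domain_size(hits_by_word, rate_per_thousand):
    """Estimate the number of English words in a domain.

    Each word form yields one estimate: its observed frequency divided by its
    expected rate per thousand words, scaled to words. The highest and lowest
    estimates are discarded and the rest are averaged.
    """
    estimates = sorted(
        hits_by_word[word] / rate_per_thousand[word] * 1000
        for word in hits_by_word
    )
    if len(estimates) < 3:
        raise ValueError("at least three word forms are needed for trimming")
    return mean(estimates[1:-1])


# Hypothetical rates per thousand words and hypothetical domain frequencies.
rates = {"the": 60.0, "of": 30.0, "and": 27.0, "to": 26.0, "in": 19.0}
hits = {
    "the": 90_000_000,
    "of": 42_000_000,
    "and": 40_500_000,
    "to": 41_600_000,
    "in": 26_600_000,
}

estimated_words = estimate_domain_size(hits, rates)
print(f"{estimated_words:,.2f} estimated English words")
```

Discarding the two extreme estimates before averaging makes the result less sensitive to a single word form whose frequency in the domain is unusually high or low.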

In principle, these estimates can be considered corpus sizes of the Sri Lankan (GAST-SL), Indian (GAST-IND) and British English (GAST-GB) online data. The number of English words in the three domains will be particularly relevant in

order to establish normalised frequencies of the phenomena to be scrutinised, which can then be compared across the different domains. The .in domain is roughly 60 times as big as the .lk domain and the .uk domain is about 240 times as big as the .lk domain, making the .uk domain approximately four times as big as the .in domain.5 In what follows, potential benefits as well as possible pitfalls of the usage of online data for (corpus-)linguistic investigations will be in the centre of attention. Thereafter, GAST as a search engine for linguistic purposes will be discussed in further detail. Corpus-linguistic as well as technological aspects can be employed to argue in favour of online data in English language research. The corpus-linguistic arguments in favour of the usage of web material are the quantity of English texts available online, the novelty and freshness of data and the steadily increasing representativeness of online texts. In discussions on Internet resources for linguistic investigations, one benefit of online data keeps resurfacing – the seemingly endless and continuously growing number of online texts in English (cf. Fletcher 2007: 26). It is undoubtedly true that increasingly large (standard) corpora of (varieties of) English are available for linguistic research, but for research questions on the word-level, for instance, it may well be the case that these databases still prove too small.

For some areas in corpus-linguistics, even the new mega-size corpora of the BNC-type are still not large enough. Examples would be most kinds of lexicographic research, in particular. The study of lexical innovations or morphological productivity really needs material that goes far beyond even the new megacorpora. But even the investigation of some of the more ephemeral points in English grammar is not possible on the basis of a 100 million word corpus. (Hundt et al. 2007: 1)
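The normalisation of GAST frequencies by domain size, introduced above together with the size ratios of the three domains, amounts to simple arithmetic, sketched here in Python. The domain sizes are the estimates from Table 4; the function names, the per-million-words scale and the hit count of 1,500 are hypothetical and serve illustration only.

```python
# Estimated number of English words per domain (Table 4, as on 23 June 2011).
DOMAIN_WORDS = {
    ".lk": 1_462_810_846.56,
    ".in": 86_573_743_386.24,
    ".uk": 352_986_111_111.11,
}


def per_million_words(hits, domain):
    """Normalise a raw hit count to a frequency per million words of the domain."""
    return hits / DOMAIN_WORDS[domain] * 1_000_000


def size_ratio(larger, smaller):
    """Return how many times bigger one domain is than another."""
    return DOMAIN_WORDS[larger] / DOMAIN_WORDS[smaller]


# Size relations between the domains.
for larger, smaller in [(".in", ".lk"), (".uk", ".lk"), (".uk", ".in")]:
    print(f"{larger} / {smaller}: {size_ratio(larger, smaller):.1f}")

# The same hypothetical raw count of 1,500 hits weighs very differently
# once it is related to the size of each domain.
for domain in DOMAIN_WORDS:
    print(domain, f"{per_million_words(1_500, domain):.4f} per million words")
```

Because the domains differ in size by up to two orders of magnitude, only the normalised values, not the raw hit counts, can be compared across GAST-SL, GAST-IND and GAST-GB.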

It is under these circumstances that researchers may wish to turn to online sources since “[t]he web has unique potential to yield large-volume data” (Renouf et al. 2007: 47). Given the sizes of the individual offline datasets of the present study, variety-specific online data represent an invaluable complement – in particular for the study of (the frequency of) lexical items in the varieties concerned. Still, it is not only due to the sheer quantity of available texts that linguists resort to web material; the freshness of the data is another salient factor that should not be

5. However, as the domain sizes are subject to change over relatively short time spans, updated estimates are used as the basis for calculations of normalised frequencies in GAST whenever necessary in the analytical Chapters 4 and 5.

underestimated since the novelty of the data available extends from text-level to variety-level. The compilation of standard corpora is an extremely labour-intensive business requiring substantial funding, in particular when the corpora concerned sample spoken language and/or require extensive mark-up. It is thus economically reasonable that once such a database has been created, interested scholars frequently exploit existing databases rather than start assembling text collections themselves. Although it has been argued that the relevance of time gaps in corpus-linguistic investigations correlates with the structural level under scrutiny (cf. Mukherjee et al. 2010: 74), this philosophy nevertheless comes at the cost of up-to-date and recent databases. Despite the fact that the SLE corpus resources are based on relatively recent data – ICE-SL includes texts from 2003 to 2009 (cf. Körtvelyessy et al. 2012: 5) and SAVE-SL from 2001 to 2007 (cf. Bernaisch et al. 2011: 2) – online queries always outperform corpus-linguistic studies based on offline datasets in terms of their currentness. It is in this context that online texts prove particularly valuable also for the study at hand because “the content of compiled corpora ages quickly, but texts on contemporary issues and authentic examples of current, nonstandard, or emerging language usage thrive online” (Fletcher 2007: 27). While the Internet and web-based communication currently are all-pervading elements of day-to-day life, their importance in the early 1990s was only marginal. Nelson, in a discussion of the text categories to be featured in the ICE design, writes that [w]e might wish to include electronic mail messages, faxes, and answer-phone messages, for example, in order to give a more complete view of British English in use in the 1990s. However, these text types are not available in all the ICE countries, and indeed still have restricted use even in Britain. 
For these reasons, they have been excluded from the general design. (Nelson 1996: 29)

Due to the technological developments that have manifested themselves in a steadily growing number of users of the Internet and in an increasing per capita consumption and production of online material (cf. Fletcher 2007: 25), it may well be argued that the relevance of web material for linguistic analyses is higher today than it was in earlier decades. To put it differently, web material, at present, might be considered to be more representative than at previous stages of language research: “as the proportion of information, communication and entertainment delivered via the net grows, language on and off the web increasingly reflects and enriches our tongue” (Fletcher 2007: 27). In addition to the corpus-linguistic arguments that throw a favourable light on the use of web material in linguistic analyses, there are also economically motivated aspects that need to be taken into consideration. The high speed of data

Chapter 3. Methodology 

processing certainly is a factor that should not be underestimated (cf. Hundt et al. 2007: 3) and the relative cost-efficient way of data analysis (cf. Hundt et al. 2007: 2) is also to be acknowledged. Nevertheless, the corpus-linguistic and economic benefits of the use of webbased text material do not come without their pitfalls, which need to be identified and sensitively handled in order to be able to draw reliable conclusions on the basis of web texts. The major downsides to online material are text-related issues regarding the quality of the texts themselves and the text categories available online, but linguistic data processing and analysis may also be inhibited by the lack of metainformation of many web texts, non-replicability of results, limited postprocessing of data and statistical limitations. The quality of English-language texts available on the net varies considerably (cf. Mair 2007: 244). While some texts may have been produced by highly proficient ENL/ESL users whose texts reflect the acrolectal end of the respective dialect continuum, other texts may have been written by EFL users who, as a consequence of their more limited familiarity with the English language, are likely to make mistakes when constructing English sentences. While the former web texts lend themselves ideally to e.g. the study of structural nativisation in varieties of English, the latter are only of little value in this endeavour. Still, it is not only English language proficiency which can be held accountable for mistakes in online web texts; due to the lack of editing, online material may also contain performance errors. Thus I considered them [= certain pronoun placements] to be attested but ungrammatical – some examples obtained on the internet in papers I read now strike me as quite the same, that is, possibly produced by nonnative speakers, or typed quickly and thus reflecting performance errors, and so on. (Joseph 2004: 383)

Another online-specific factor which may have an influence on the respective texts is the formats available online since “[t]he multi-media and HTML format […] is also likely to exercise its own constraints and preferences in the use of language” (Leech 2007: 144). Thus, the quality of English-language online material may be inhibited by factors which can hardly be controlled for such as proficiency in English, performance phenomena and web formats themselves. Against this background, it is ensured that the online searches are characterised by a high level of precision via different search routines, which are described in detail when the respective findings are presented and analysed. Given the unavailability of sociobiographic author information, the domain-specific online texts consulted are, however, not subject to further selection because this selection would rely on (a) the absence of preconceived notions of what constitutes acrolectal texts and (b) the presence of what are perceived to be markers of mesolectal and basilectal usage in the online texts and thus introduce an undesirable circularity in the description of the (variety-specific) acrolects. Accordingly, the online results are not exclusively based on acrolectal texts and, as a consequence, need to be carefully interpreted in the light of the more reliable offline data. Additional downsides regarding linguistic data processing and analysis are addressed in the overview in Table 5.

Table 5. Limitations in the use of web data

Limitations in access to and processing of web data
Restricted access to textual metainformation
– Author information (age, gender, proficiency in English, etc. (cf. Hundt et al. 2007: 4))
– Text provenance
– Place via top-level domain (cf. Renouf et al. 2007: 56)
– Time of text production (cf. Leech 2007: 145)
Limited possibility of post-processing (e.g. POS-tagging or parsing) data (cf. Leech 2007: 134)

Limitations in analyses of web data
Non-replicability of web-based results (cf. Fletcher 2007: 37)
Inadequacy of (inferential) statistical analyses of web-based results (cf. Lüdeling et al. 2007: 11–12)
Fluctuating size of the (English language portion) of the web (cf. Mair 2007: 244)

This discussion of the benefits and drawbacks of online text material brings to the fore that each upside has a corresponding downside. The huge amount of texts available renders manual analysis of non-post-processed data impossible, the large number of varieties of English covered can only be studied on the basis of fuzzy top-level country domains, the variety of text categories available online is difficult to grasp in terms of quantitative proportions, etc. It is true that “[w]eb data are dirty” (Mair 2007: 239) and although there are no ways of cleaning them, there are nevertheless ways to make them shine. It appears to be recommendable not to use web data as a detached source for linguistic information, but to triangulate findings with results from various databases ideally including “tried and tested closed corpora” (Mair 2007: 236). The present study adopts this approach in that it complements well-defined (standard) corpora of acrolectal varieties of English with online data. The online material will be accessed with the help of GAST, a commercial search engine in nature, which also needs to be evaluated from a critical perspective. Search engines are indispensable gateways to online data (cf. Fletcher 2007: 30). Given that linguistic search engines for the web are still in the making at the time of writing (cf. Renouf et al. 2007), the choice is limited to the selection between a number of commercial search engines. In this context, GAST is a particularly attractive option.


The user-friendliness of GAST is evident from each of the central search-related procedures and is the most central argument in favour of GAST. First of all, GAST has a clear search interface allowing the user to specify all details for the search to be conducted on one screen only. Once the search terms have been specified, it usually only takes split seconds for the search to be completed enabling researchers to conduct a high number of searches in a small amount of time. Additionally, the output of the search is also straightforwardly clear in that the total number of hits is displayed at the very beginning of the results page and the relevant websites are listed afterwards. Consequently, GAST can be considered to be an extremely fast and user-friendly way of obtaining linguistic “webidence” (Fletcher 2007: 25).

Nevertheless, several downsides of GAST relating to technological and linguistic issues exist and need to be addressed. These negative aspects of GAST make it obvious why accessing online text material via a commercial search engine adds another layer of challenges to the issues that have been addressed with regard to the difficulties regarding online texts in general. For quantitative approaches to linguistic data, it is particularly unsatisfying that GAST hit counts for a given phenomenon are not fully reliable because of a number of normalisation processes administered by Google, which cannot be disabled. It is obvious that these automated processes further impede the prospects of statistically reliable conclusions on the basis of web data.

In order to perform a quantitative study such as measuring the productivity of non-medical -itis, it is essential to have a complete list of types with reliable frequencies, to which a statistical model can then be applied. […] Using frequency data from a search engine (‘Google frequencies’) is much more problematic. For one thing, all search engines perform some sort of normalization: searches are usually insensitive to capitalization (‘poles’ and ‘Poles’ return the same number of matches), automatically recognize variants (‘white-space’ finds white space, white-space and whitespace) and implement stemming for certain languages […]. (Lüdeling et al. 2007: 13–14)
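The effect of such normalisation on frequency counts can be illustrated with a toy model. The following Python sketch is not a description of Google's actual, undisclosed processing: the function `normalise_query` is hypothetical and only mimics two of the behaviours named in the quotation, namely insensitivity to capitalisation and the conflation of spacing and hyphenation variants.

```python
import re
from collections import Counter


def normalise_query(form: str) -> str:
    """Toy normalisation: lower-case the form and drop hyphens and whitespace.

    Hypothetical helper for illustration only; real search engines apply
    further, undocumented steps such as stemming.
    """
    return re.sub(r"[-\s]+", "", form.lower())


# Surface forms a linguist might want to count separately
# (taken from the examples in the quotation above).
forms = ["poles", "Poles", "white-space", "white space", "whitespace"]

# After normalisation, several forms share one search key, so their
# individual frequencies can no longer be told apart in the hit count.
keys = Counter(normalise_query(form) for form in forms)

for key, merged in keys.items():
    print(f"{key}: {merged} surface forms conflated")
```

Under these assumptions, five surface forms collapse into two search keys, which is why hit counts for any one spelling variant cannot be read off a normalised search.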

Duplicates also artificially increase the hit count presented by GAST. In contrast to research on the basis of standard offline corpora, there are no means to take into account identical strips of text that might have been copied from one site to another and thus keep reoccurring.

Such duplication, which is much more common on the web than in a carefully compiled corpus, may inflate frequency counts drastically. Manual checking could in principle be used to correct the frequency counts, both for normalization and for duplication, but it is prohibitively time-consuming (since the original web pages have to be downloaded) and is hampered by artificial limits that Google imposes on the number of search results returned. (Lüdeling et al. 2007: 14)
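To make the duplication problem concrete, the sketch below shows how verbatim copies could be collapsed if the matching text passages were available offline, which, as the quotation notes, is rarely practicable for web searches. The function name `count_unique_hits` and the example snippets are invented for illustration; the procedure is an assumption about how such a correction might look, not one applied in the present study.

```python
import hashlib


def count_unique_hits(snippets: list[str]) -> int:
    """Count hits after collapsing copies that differ only in case or spacing."""
    seen = set()
    for text in snippets:
        # Canonical form: lower-cased, with runs of whitespace reduced to one space.
        canonical = " ".join(text.lower().split())
        seen.add(hashlib.sha1(canonical.encode("utf-8")).hexdigest())
    return len(seen)


# Invented snippets: the second is a copy of the first found on another site.
hits = [
    "She could not cope up with the pressure.",
    "She could not cope up  with the pressure.",
    "They had to cope up with long delays.",
]

raw_count = len(hits)
unique_count = count_unique_hits(hits)
print(raw_count, unique_count)
```

In this invented case, three raw hits reduce to two distinct attestations. A search engine's total hit count corresponds to the raw figure only.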


Given the partly distorted total hit counts presented by GAST due to normalisation processes, the lack of duplicate removal and the inability of commercial crawlers to access all websites available (cf. Hundt et al. 2007: 3), it seems to be most advisable not to put too much weight on the exact numbers produced by GAST, but to interpret hit counts contrastively on the basis of several potentially variety-specific searches whenever possible. As the same restrictions apply to each search, it is still possible to deduce certain trends from evaluating total hit counts in juxtaposition to one another. It is not only the quantity of hits that seems to be affected by procedures behind the curtains of Google; there are also concerns regarding the quality of the hits as the search mechanisms are largely opaque due to the fact that “Google employs algorithms which are totally mysterious to the average user” (Leech 2007: 144). What is, however, a certainty – and an unappealing one for linguists – is that search engines generally develop away from producing hits based on on-site information, the standard procedure of so-called first-generation web searches, in that, “third-generation approaches have attempted to identify the ‘need behind the query’ to identify relevant results” (Fletcher 2007: 30). Consequently, content and listing of GAST hits are certainly hampered by non-linguistic mechanisms as Google uses “off-page web-specific information like PageRank, the link popularity ranking introduced […] in 1998 as an indicator of page quality” (Fletcher 2007: 30). Another issue in the context of hit content and listing of search results is that Google, on the basis of IP addresses, produces a rough estimate of the location of the Google user and yields what are considered to be the locally most relevant results (first); as Hundt et al. (2007: 3) put it, “commercial crawlers have an inbuilt local bias”. 
Just as the identification of the user location allows Google to adjust search results to what it considers to be most relevant, the search behaviour of users themselves is traced.⁶ On the basis of the resulting user profiles, Google also fine-tunes the search results to adequately meet user demands.⁷ It is obvious that none of these features helps to make Google more attractive for linguistic searches as they radically decrease intersubjective comparability of online results.
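The contrastive reading of hit counts recommended above can be sketched as follows. Instead of interpreting any single total, the share of one variant among competing variants is compared across top-level country domains. All figures below are invented placeholders, and both the choice of competing variants and the function name `share_per_domain` are assumptions made purely for illustration.

```python
def share_per_domain(hits: dict[str, dict[str, int]], variant: str) -> dict[str, float]:
    """Return, per domain, the share of `variant` among all competing variants."""
    result = {}
    for domain, counts in hits.items():
        total = sum(counts.values())
        result[domain] = counts[variant] / total if total else 0.0
    return result


# Placeholder hit counts, invented for illustration only.
hypothetical_hits = {
    ".lk": {"cope up with": 120, "cope with": 880},
    ".in": {"cope up with": 300, "cope with": 2700},
    ".uk": {"cope up with": 50, "cope with": 9950},
}

shares = share_per_domain(hypothetical_hits, "cope up with")
for domain, share in shares.items():
    print(f"{domain}: {share:.1%}")
```

Because the same distortions apply to every search, such proportions are arguably more robust than the raw totals, although they remain rough indications rather than reliable frequencies.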

6. Google-based user localisation cannot be disabled (cf. 〈http://www.google.com/support/websearch/bin/answer.py?answer=179386〉 (17 October 2014)).
7. The web protocol, which saves earlier searches, can be and has been disabled for the study at hand (cf. 〈http://www.google.com/support/websearch/bin/answer.py?answer=54048〉 (17 October 2014)).


Apart from the technological intricacies of GAST, there are certain restrictions for linguistic analyses. In addition to the normalisation of search items illustrated above, the unavailability of text-type-specific search functions limits the scope of studies on online text data (cf. Mair 2007: 242) and special character searches with GAST do not yield satisfying results (cf. Fletcher 2007: 33). In sum, there is no denying that there are serious drawbacks when employing GAST for linguistic purposes. As the development of (linguistic) search engines stands at the time of writing, however, it is the only feasible way of accessing huge amounts of online text which are meant to complement findings in well-defined offline corpora with a rough-and-ready estimate of the pervasiveness of a given phenomenon. WebCorp, a web search tool adjusted to linguistic needs with the possibility of post-processing data and restricting searches to specific text types, certainly is an extremely valuable resource for the study of online data (cf. Renouf et al. 2007). However, for the present study, it is more important to complement findings from standard corpora with an approximate indication of the usage of a specific phenomenon in a given variety. As WebCorp reports hits on a maximum of 500 websites due to server space restrictions at this time, GAST appears to be more appropriate in the present context as it simply captures many more websites. A corpus-linguistic resource comprising web-based texts is the Corpus of Global Web-based English (GloWbE; cf. Davies 2013). GloWbE is a megacorpus featuring 1.9 billion words from 20 different countries all around the world also including Sri Lanka, India and Great Britain. 
The dataset can be accessed free of charge and circumvents many of the limitations of processing and analysing web data shown in Table 5 as GloWbE for instance allows searching a vast amount of online data with the help of POS-tags and facilitates replicating results because the corpus texts have been downloaded and, thus, the same set of texts is consulted each time a query is entered. However, as this database was not available when the analyses of the present study were conducted, using GloWbE, the texts of which were also retrieved via Google with concerns comparable to those just discussed, as a complement to the ICE and newspaper data was not an option. Furthermore, in addition to the amount of texts available, the recency of the GAST data constitutes one of the main reasons for including GAST in the corpus environment. While the amount of data available in GloWbE can certainly also be used to study variety-specific low-frequency items in unprecedented detail, it is the freshness of the online data that GloWbE compromises in order to make the texts processable in ways similar to state-of-the-art corpus tools. In the light of the rate at which the web changes and grows, however, this certainly is one of the major downsides of GloWbE. When searching GloWbE for structures typical of SLE and South Asian English(es) more generally such as COPE up with, you will find numerous examples including Example (5) drawn from the Sri Lankan part of the corpus:

(5) […] you will automatically get your body to adjust to the stress you put it through by repairing your muscles and making it larger and stronger to cope up with the stress you put it through whenever you workout. 〈http://www.fitness-nutrition-beauty.com/weightlifting-techniques.html〉

However, the domain of the webpage indicated in (5) has expired in the meantime and would thus no longer be retrieved if a GAST search of COPE up with was run right now. This loss of representativeness of corpus-linguistic resources in terms of present-day usage is clearly not restricted to web-derived corpora, but it is probably more pronounced in comparison to corpora sampling offline and in particular published texts. GloWbE certainly is a highly attractive complementary resource in examining World Englishes, but it was not included in the present study due to its release date and the concerns related to its representativeness of present-day English just voiced. Thus, the corpus environment on the basis of which the present study builds its research into contemporary SLE consists of well-defined offline corpora, which come with all the benefits and drawbacks of carefully compiled linguistic data sources, and variety-specific web data to be accessed via GAST. It goes without saying that the latter data source can by no means be regarded as yielding exclusively acrolectal texts, but based on findings in the ICE and newspaper text collections, it will be possible to establish to what degree linguistic trends have penetrated the wider SLE speech community. While it cannot be expected that the distorting effects each database entails will eventually cancel each other out altogether, the above discussion indicates highly attractive synergies of complementary usage from which the present study will profit (e.g. the huge amount of online data can counterbalance the size restrictions of the offline databases, the acrolectal nature of the texts in the offline data can compensate for the varying quality of online texts, etc.). In 3.2, the linguistic phenomena to be scrutinised in this corpus environment will be presented.

3.2 Indicators of structural nativisation in Sri Lankan English

As the discussion of the evolutionary status of SLE in Chapter 2 has shown, it seems to be the case that the Sri Lankan variant of English is in the middle of establishing its own variety-specific norms and it is against this background that the study at hand aims to empirically delineate some of these norms on the levels of lexis and lexicogrammar. Greenbaum (1988: 136) encourages studies of this kind when he puts forward that “[i]n countries that are still groping for national standards of English, grammatical research can highlight the emerging norms”. With a few notable exceptions, the scarcity of studies on lexical and lexicogrammatical features of SLE – in particular corpus-based ones – is alluded to in the selective review of the descriptive body covering SLE in Chapter 2.3.2 and is indicative of the need for a more in-depth look at SLE structural intricacies.

It needs to be pointed out that the present study does not intend to and cannot provide a comprehensive description of several structural levels of SLE. Given that the database is of a written nature, it is evident that SLE phonetics and phonology do not form part of the objects of investigation although the variety-specific sound system of SLE may certainly be considered to be one of the more overt markers of this South Asian variety of English (cf. Hundt 1998: 2). However, the lexicon also provides (ESL) varieties with a comparatively obvious variety-specific make-up, which may differentiate them from other national varieties. In this context, Meyler’s (2007) A Dictionary of Sri Lankan English serves as a valid starting point for studies of SLE lexis. The rich annotation accompanying each dictionary entry enables investigations of SLE usage of lexemes which would be considered rather formal or archaic in BrE, but not necessarily in SLE. The dictionary also highlights a number of lexical items which may be used across several South Asian Englishes, thus facilitating analyses of what could eventually constitute a pool of shared South Asian English lexemes. True, there may be alternative ways of establishing e.g. archaic lexemes or lexical items marked by formality in BrE by, for instance, investigating entries and their annotation in the Oxford English Dictionary (OED), but this would only yield one side of the story.
The main interest in studying the respective groups of lexical items lies in the different degrees of association of these lexemes with formality or archaic styles in BrE in comparison to SLE (and IndE) and possible resulting differences in their frequency of use in the varieties concerned. It is due to this focus that the annotation provided in Meyler’s (2007) dictionary is an extremely appealing resource. Meyler, originally a native speaker of BrE, started working in Sri Lanka in 1985 and has lived there since with only short intermissions (cf. Meyler 2007: ix). Consequently, the annotation in the dictionary relies on first-hand familiarity with both BrE and SLE putting Meyler in an ideal position to describe variety-related differences in formality and archaisms related to individual lexical items. In this light, it is understandable that, although Meyler also consulted dictionaries in the process, the annotation accompanying the dictionary entries is “not strictly objective or empirical” (Michael Meyler, p.c.), but it was verified by SLE speakers, the editors of the dictionary Dinali Fernando and Vivimarie VanderPoorten, and Meyler’s British Council colleagues at the time, who were native speakers of BrE. Due to Meyler’s long-standing experience with both SLE and BrE and the intersubjective verification of the resulting annotation, the groups of lexical items established on the basis of this annotation can certainly be considered reliable points of departure for a contrastive description of SLE lexis.

While lexical analyses of varieties of English provide insights into lexemes which may have accommodated themselves to or have developed in postcolonial settings, the analysis of lexicogrammatical features is generally given highest priority when it comes to the maturation of variety-specific structures. The decisive argument in favour of this level of analysis seems to be that emerging variety-specific structural features surface first at the meeting point of lexis and grammar.

In descriptive terms, it is interesting to observe that in its early stages this indigenization of language structure mostly occurs at the interface between grammar and lexis, affecting the syntactic behavior of certain lexical elements. […] Individual words, typically high-frequency items, adopt characteristic but marked usage and complementation patterns. (Schneider 2007: 46)

Particle verbs, light-verb constructions and verb complementation patterns are the lexicogrammatical features which the present study aims to investigate in the corpus data. Particle verbs (cf. e.g. Schneider 2004; Nesselhauf 2009; Zipp & Bernaisch 2012) and verb complementation patterns (cf. e.g. Mukherjee & Hoffmann 2006; Mukherjee & Schilk 2008; Mukherjee & Gries 2009; Mukherjee 2010; Schilk 2011; Schilk et al. 2012) have repeatedly been shown to exhibit mostly quantitative cross-varietal differences while research on light-verb constructions, with Hoffmann et al. (2011) as a notable exception, has so far been relatively underrepresented in the study of varieties of (South Asian) English(es). On the basis of the lexical and lexicogrammatical features documented in Table 6, the present study hopes to contrastively delineate the structural profile of SLE in comparison to IndE and BrE. In this context, it is to be expected that the phenomena under scrutiny will only rarely, if at all, display (innovative) usages which are categorically restricted to SLE.

Table 6. An overview of the objects of investigation of the present study

Lexis
– Lexemes considered to be formal in BrE
– Pan-South Asian English lexemes
– Lexemes considered to be archaic in BrE

Lexicogrammar
– Particle verbs
– Light-verb constructions
– Verb-complementational patterns


In particular with regard to the lexis-grammar interface, mostly quantitative differences are likely to be discovered (cf. Mukherjee 2007: 175), which, nevertheless, yield usage patterns typical of the variety under scrutiny. The individual lexical and lexicogrammatical studies will also provide insights into the process(es) of structural nativisation from a multitude of angles and can thus foster a more detailed understanding and modelling of the development of variety-specific norms in New Englishes.

Chapter 4. Sri Lankan English lexis

Among the structural phenomena under scrutiny in this study, lexical items probably rank among those features which, in comparative terms, overtly give a certain possibly regional character to the written variety of English in which they are used. Borrowings from Sinhala or Tamil are typically attestable in and characteristic of SLE (cf. Mukherjee et al. 2010: 69–70) while IndE might be more frequently marked by items taken over or borrowed from Hindi, Bengali and other indigenous languages of India (cf. Hohenthal 2003). Nevertheless, it is not only borrowed or anglicised lexical items originating from the respective local languages which eventually shape the lexis of new varieties of English; with regard to South Asian Englishes, it has been claimed that they typically (continue to) use a stock of words which is either restricted to more formal contexts or considered to be rather archaic in the present-day usage of the erstwhile input variety, namely BrE (cf. Meyler 2007: xiv). Against this background, it will thus be examined in Chapter 4.1 to what extent and in which contexts SLE displays the usage of lexical items which are considered to be formal in BrE. While South Asian Englishes seem to draw a certain number of vocabulary items from their respective indigenous languages as mentioned above, it is also the case that some of these borrowed lexemes are attestable beyond national boundaries resulting in a pan-South Asian English vocabulary (cf. Meyler 2007: xii), which constitutes the object of investigation in Chapter 4.2. In Chapter 4.3, the degree to which SLE is marked by what are perceived to be archaisms in present-day BrE will be examined.

4.1 Formality markers

South Asian Englishes have been described as having a somewhat bookish flavour to them (cf. Mesthrie & Bhatt 2008: 114–116).
In relation to IndE, Shastri (1988: 18) explains that “[t]his may be due to the predominance of written language over spoken in the Indian pedagogical context”. In this regard, SLE has not been described to be an exception to the rule. This might be related to the assumption that competent speakers of SLE habitually employ lexical items which are deemed formal (or archaic) in contemporary BrE.


SLE includes a number of features characteristic of British colonial language which has fallen out of fashion in contemporary British English, including ‘Anglo-Indian’ words which date from colonial times and are common to the whole Subcontinent. These words are considered archaic in British English, or are restricted to more formal contexts. (Meyler 2007: xiv)

The present analysis aims at investigating to what extent SLE is marked by vocabulary items associated with formality in BrE, but not in SLE. To be more precise, this part of the study examines lexemes which noticeably increase the perceived degree of formality of a textual exchange in BrE, but do not affect the perceived formality in SLE.

(6) You should consult a doctor immediately. (Meyler 2007: 63)

The sentence in (6) is part of everyday SLE and competent speakers of SLE do not notice this sentence as strikingly formal. However, for BrE speakers, the usage of CONSULT clearly marks this sentence as more formal in comparison to e.g. a sentence where CONSULT would have been replaced with SEE. The term formality markers will be used to refer to these lexical items that increase the perceived formality of a textual exchange in BrE, but not in SLE. The term archaism markers as used in the present study is to be understood in an analogous fashion.

In order to look at formality markers from an empirical perspective, the respective national components of ICE and the newspaper corpora are used along with GAST. As a basis for this analysis, Meyler’s (2007) dictionary of SLE is consulted because it features a tag marking certain lexemes as “more formal in BSE” (Meyler 2007: 1), which yields a list of 54 formality markers. An overview of the meanings of the formality markers SLE and BrE speakers perceive differently in terms of their association with formality can be found in Appendix A1: List of formality markers. For a lexeme to be included in the corpus-based analysis, it needs to be attested at least five times in each national component of ICE and in each newspaper dataset respectively, i.e. a lexeme needs to occur at least five times in the Sri Lankan, Indian and British national component of ICE to be considered for the subsequent analyses of the ICE data. Although this number is admittedly arbitrary, lower frequencies of occurrence would be evidence of an only peripheral relevance of these lexemes in the respective corpus environments. These low-frequency vocabulary items are thus excluded from the analyses. In addition to that, the threshold value of five instances has been shown to be a valuable cut-off point in previous studies of South Asian Englishes (cf. e.g. Schilk 2011: 57).
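The inclusion criterion just described can be expressed as a simple filter. The sketch below is illustrative only: the counts for COMMENCE are the ICE frequencies reported in Table 7, whereas `hypothetical_item` and its counts are invented to show a lexeme that fails the criterion; the names `MIN_FREQ` and `passes_threshold` are likewise assumptions.

```python
MIN_FREQ = 5  # threshold of five attestations per dataset, as stated in the text


def passes_threshold(counts: dict[str, int], minimum: int = MIN_FREQ) -> bool:
    """A lexeme is retained only if it reaches the minimum in every component."""
    return all(frequency >= minimum for frequency in counts.values())


candidates = {
    # Absolute ICE frequencies for COMMENCE as reported in Table 7.
    "commence": {"ICE-SL": 25, "ICE-IND": 8, "ICE-GB": 8},
    # Invented counts: frequent in one component, but below five in another.
    "hypothetical_item": {"ICE-SL": 12, "ICE-IND": 3, "ICE-GB": 6},
}

retained = [lexeme for lexeme, counts in candidates.items() if passes_threshold(counts)]
print(retained)
```

The filter uses `all(...)`, so a single component below the threshold excludes the lexeme from the analysis of that corpus environment, however frequent it may be elsewhere.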
Earlier lines of argumentation regarding formality markers postulate that although these lexemes are more restricted regarding the range of contexts in which they may be employed in BrE, this restriction does not hold (at least as rigidly) for postcolonial varieties of English (cf. e.g. Meyler 2007: xiv; Mesthrie & Bhatt 2008: 114–116). For the present analysis, two hypotheses result from this argument. First, the SLE (and IndE) data can be expected to display a more frequent overall use of formality markers than the BrE data since the range of contexts in which these lexemes potentially occur in SLE (and IndE) has been described to be wider than in BrE. Second, it is to be expected that formality markers are biased towards formal text genres in the BrE data while the cross-genre distribution of formality markers in the SLE (and IndE) data may be assumed to be more even. The frequencies of formality markers are shown and discussed in 4.1.1; 4.1.2 presents an analysis of the genre-specificity of formality markers, and a selection of case studies of formality markers is provided in 4.1.3.

4.1.1 Formality markers: Frequency

In the ICE data, 21 formality markers are attested five or more times in each of the national components. The variety-specific absolute and normalised frequencies of these lexical items are presented in Table 7.¹ ² The data show highly significant differences between the distributions of formality markers across the ICE components. The correlation is relatively weak (χ² ≈ 175.91, df = 40, p < 0.001, Cramer’s V ≈ 0.22).³ ⁴

1. Throughout the analytical chapters, absolute frequencies are normalised to occurrences per million words (pmw) in order to enable frequency-based comparisons across different datasets. It should be noted that normalising to a hypothetical corpus of one million words independent of whether the original corpus studied is smaller – as is the case with the written ICE components – or larger – as is the case with the newspaper corpora and the GAST data – rests on the assumption that there is a fixed ratio between the occurrence of the feature studied and the number of words in variety-specific texts. Still, in particular where absolute frequencies are comparatively low, the corresponding ratios may be subject to variation and the respective normalised frequencies need to be interpreted with a measure of caution.

2. The mean values in Table 7 are calculated by adding up the frequencies of each formality marker and dividing the sum by 21, i.e. the number of formality markers investigated. Analogous procedures apply to the calculations of the mean values for other objects of study.

3. For statistical results based on Pearson’s (1900) chi-square test, the chi-square value, degrees of freedom, the significance level and Cramer’s V are provided. Cramer’s V, unlike the chi-square value, is “a measure of effect size that is not influenced by sample size” (Gries 2009: 197). It is a correlation coefficient calculated on the basis of the chi-square value and 0 and 1 are, in theory, its extreme values (cf. Gries 2009: 197). As there is, however, no generally accepted scale on when to call a given effect expressed via Cramer’s V weak, moderate or strong (cf. Gries 2009: 240), effects smaller than 0.3 are considered weak, effects from 0.3 to 0.6 moderate and effects larger than 0.6 strong in the present study. For results based on Fisher’s (1922) exact test, i.e. the test used when expected frequencies are too low for Pearson’s (1900) chi-square test, only the significance level is given. In footnotes, however, only significance levels are given independent of the statistical test used.

4. All pairwise comparisons of the distributions of formality markers in the respective national components of ICE are highly significant as well (p < 0.001 for all pairwise comparisons).
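The normalisation to occurrences per million words and the effect-size labels set out in the notes above can be sketched in a few lines of Python. The corpus size of 400,000 words used in the example is an arbitrary round figure chosen for illustration, not a corpus size reported in this study, and the function names are assumptions.

```python
def per_million_words(absolute: int, corpus_size: int) -> float:
    """Normalise an absolute frequency to occurrences per million words (pmw)."""
    return absolute * 1_000_000 / corpus_size


def effect_label(cramers_v: float) -> str:
    """Label an effect size using the cut-offs adopted in the present study."""
    if cramers_v < 0.3:
        return "weak"
    if cramers_v <= 0.6:
        return "moderate"
    return "strong"


# Illustrative corpus size (an assumption, not a figure from the text).
ILLUSTRATIVE_CORPUS_SIZE = 400_000

normalised = per_million_words(36, ILLUSTRATIVE_CORPUS_SIZE)
label = effect_label(0.22)  # Cramer's V reported for the ICE comparison

print(normalised, label)
```

As the first note points out, the normalised value presupposes a fixed ratio between feature and corpus size, so with low absolute frequencies a single additional attestation shifts the pmw figure noticeably.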


Table 7. Absolute and normalised (pmw) frequencies of formality markers in ICE

                               ICE-SL                 ICE-IND                  ICE-GB
                        abs. freq. norm. freq.  abs. freq. norm. freq.  abs. freq. norm. freq.
attend                          36       87.83          58      138.07          42       97.30
cease                           13       31.72           9       21.42          23       53.28
commence                        25       60.99           8       19.04           8       18.53
correct                         36       87.83          45      107.12          37       85.72
earlier                         22       53.68          52      123.79          37       85.72
enter                           41      100.03          40       95.22          32       74.13
fully                           35       85.39          41       97.60          47      108.88
hence                           76      185.42          75      178.54          34       78.77
herewith                        18       43.92          33       78.56          12       27.80
highly                          25       60.99          54      128.55          49      113.52
leave                           11       26.84           8       19.04          14       32.43
persons                         48      117.11          76      180.92          20       46.33
previous/previously             36       87.83          23       54.75          50      115.83
proceed                         23       56.12          22       52.37          11       25.48
purchase                        26       63.43           9       21.42          13       30.12
regarding                       59      143.95          32       76.18          24       55.60
seem: it seems                  14       34.16          20       47.61          30       69.50
subsequently                    16       39.04          12       28.57          10       23.17
thereafter                      19       46.36          22       52.37           7       16.22
ultimately                      26       63.43          21       49.99          18       41.70
whom                            31       75.63          25       59.51          47      108.88
MEAN                         30.29       73.89       32.62       77.65       26.90       62.33
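The MEAN row of Table 7 follows the procedure described for the mean values, i.e. the sum of the frequencies divided by the number of formality markers. A minimal sketch, with the absolute frequencies copied from Table 7 in row order (attend to whom) and a hypothetical helper name:

```python
# Absolute frequencies from Table 7, in row order (attend ... whom).
ICE_SL = [36, 13, 25, 36, 22, 41, 35, 76, 18, 25, 11,
          48, 36, 23, 26, 59, 14, 16, 19, 26, 31]
ICE_IND = [58, 9, 8, 45, 52, 40, 41, 75, 33, 54, 8,
           76, 23, 22, 9, 32, 20, 12, 22, 21, 25]
ICE_GB = [42, 23, 8, 37, 37, 32, 47, 34, 12, 49, 14,
          20, 50, 11, 13, 24, 30, 10, 7, 18, 47]


def mean_frequency(frequencies: list[int]) -> float:
    """Sum of the frequencies divided by the number of markers (21 here)."""
    return sum(frequencies) / len(frequencies)


for name, freqs in [("ICE-SL", ICE_SL), ("ICE-IND", ICE_IND), ("ICE-GB", ICE_GB)]:
    print(name, sum(freqs), f"{mean_frequency(freqs):.2f}")
```

The column totals printed alongside the means (636, 685 and 565) are the overall numbers of formality-marker tokens per ICE component.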

Formality markers that are noticeably more frequent in ICE-SL than in both ICE-IND and ICE-GB and certainly contribute to significant differences across the variety-specific datasets are, among others, COMMENCE and regarding as illustrated in Examples (7) and (8).

(7) The party will commence at 5 p.m. All our friends will be there. Make sure you come at the earliest. We can have a great time! 〈ICE-SL:W1A-014#6–9:1〉

(8) I 〈w〉haven’t〈/w〉 done anything regarding the M phil. 〈@〉Ramaya〈/@〉 got me the latest book on gender and language. But I still 〈w〉haven’t〈/w〉 read it properly to think critically about it. 〈ICE-SL:W1B-008#160–162:8〉


Example (7) – an invitation to a party – is a nice illustration of a formality marker being used within a relatively informal context, which may hint at the fact that COMMENCE in SLE is not in all cases bound to appear in formal contexts. However, it has to be acknowledged that this very text passage is taken from a student examination essay, which is why the seeming informality of the content may be corrupted by the formality of the purpose for which the text was composed. While Example (7) thus does not necessarily offer evidence of a formality marker in use in an otherwise comparatively informal context, Example (8) does. Example (8) is taken from a social letter and regarding is used together with shortened word forms – haven’t in two cases here – implying a relatively low degree of formality. Consequently, the informal co- and context did not prevent the respective writer from using regarding in this social letter, which may be interpreted as a signal of this formality marker not being restricted to occur exclusively in formal contexts in SLE. With some of the formality markers represented in Table 7 such as hence, herewith or thereafter, the relatively formal nature is evident from the words themselves. Still, with other lexemes such as earlier, it is inter alia the constituent order in the sentence that leads to differences in the perceived formality in SLE as opposed to BrE.

(9) Land masses form outsides territories of the Kandyan Kingdom were annexed to the narrow coastal areas earlier held by the Dutch. 〈ICE-SL:W2B-015#26:1〉

(10) This place earlier was for her alone – but today Duleep is sitting in front of her, and she is trying to adjust her mind to his presence in her space. 〈ICE-SL:W2F-008#22:1〉

In the SLE Examples (9) and (10), the adverb earlier directly precedes the verb it premodifies. This may appear more formal in BrE, where earlier would probably be placed at the end of (9) and at the beginning of (10) in stylistically unmarked sentences (cf. Meyler 2007: 81). It goes without saying that ENTER can also be frequently attested in BrE as Tables 7 and 8 show, but it also tends to be employed in contexts in SLE where BrE would probably employ a less formal variant. An example of ENTER is shown in (11).

(11) So, I finally managed to pass the Advanced Level Exam and enter the medical faculty. 〈w〉I’ll〈/w〉 also step into the Medical profession in 6 〈w〉years’〈/w〉 time. 〈ICE-SL:W1A-016#71–72:4〉

Due to the reduced form I’ll, the particle verb STEP into and the connector so, the tone of (11) is relatively informal, but the SLE speaker employs ENTER instead of another more informal verb such as GET into, which might have been the choice


of a BrE speaker here (cf. Meyler 2007: 84). Consequently, the usage of ENTER instead of GET into, i.e. the usage of a simplex verb instead of its more informal corresponding particle verb, in (11) may represent a stylistic incoherence for a BrE speaker, but not for an SLE speaker because ENTER is not perceived to be as formal by SLE speakers as it is by BrE speakers. In a similar vein, the noun leave in the sense of a vacation is also characterised by a lower degree of formality in SLE in comparison to BrE (cf. Meyler 2007: 150). Examples (12) and (13) are taken from the social letters section of ICE-SL (W1B), where leave is repeatedly employed in informal communicative contexts, while the majority of instances of leave in ICE-GB can be found in more formal settings, namely in the business letters section (W1B) and in administrative/regulatory writing (W2D).

(12) 〈p〉Anyway 〈w〉that’s〈/w〉 all for 〈}〉〈-〉non〈/-〉〈+〉now〈/+〉〈/}〉. Glad to hear D.C. RA is out of your hair and on leave!〈/p〉 〈p〉Take care and see you soon,〈/p〉 〈p〉Much love〈/p〉 〈ICE-SL:W1B-001#12–15:1〉

(13) 〈p〉〈@〉Danudhri〈/@〉 is on leave these two days. Wonder how she would spend the day at home alone. There are some dvds and then some cooking she would have to do.〈/p〉 〈p〉So 〈w〉that’s〈/w〉 that for now.〈/p〉 〈ICE-SL:W1B-008#112–115:6〉

34 formality markers occur at least five times in each of the newspaper datasets. Their absolute and normalised frequencies are documented in Table 8. For the distribution of formality markers in the newspaper texts, there are highly significant differences, but the correlation between formality markers and the newspaper corpora is moderate (χ² ≈ 4051.20, df = 66, p < 0.001, Cramer’s V ≈ 0.31).⁵ In addition to COMMENCE and regarding, exemplified in (7) and (8), monies/moneys with the meaning of “sums of money” (Meyler 2007: 169), as shown in (14), and thereafter, as in (15), are among the formality markers that contribute significantly to the differences between the individual newspaper datasets in that they occur markedly more often in the Sri Lankan than in the Indian or British newspapers.

(14) Such corporations show profits due to the monies earned during the monopolistic era being deposited and now earning interest. This cannot be compared to generation of profits contributed by generation of production and employment, and therefore privatisation of these cannot be considered a national disaster. 〈SAVE-SL-DN_2003-01-08〉

5. All pairwise comparisons of formality markers in the newspaper datasets exhibit highly significant differences (p < 0.001).


Table 8. Absolute and normalised (pmw) frequencies of formality markers in the newspaper data

                               SAVE-SL                SAVE-IND                BNC news
                        abs. freq. norm. freq.  abs. freq. norm. freq.  abs. freq. norm. freq.
admit                           92       30.01         146       47.53         109       12.28
apply                           11        3.59          38       12.37          12        1.35
attend                         430      140.26         488      158.87       1,026      115.62
cease                           50       16.31          57       18.56         120       13.52
commence                       474      154.61          62       20.18          40        4.51
consult                         44       14.35          52       16.93         102       11.49
convey                          30        9.79          31       10.09          12        1.35
correct                        138       45.01          74       24.09         256       28.85
earlier                        525      171.24         851      277.04       1,533      172.76
enter                          396      129.17         394      128.27         741       83.50
family member                   80       26.09         121       39.39          33        3.72
fare                            14        4.57          27        8.79          61        6.87
forthwith                       18        5.87           5        1.63          10        1.13
fully                          373      121.66         170       55.34         594       66.94
gift                            27        8.81          29        9.44          29        3.27
hail from                       30        9.79          51       16.60          15        1.69
hence                          259       84.48         225       73.25          88        9.92
highly                         346      112.86         172       55.99         492       55.44
leave                           45       14.68          65       21.16         144       16.23
monies/moneys                   46       15.00           5        1.63          12        1.35
persons                        710      231.59         826      268.90          94       10.59
previous/previously            485      158.20         212       69.02         747       84.18
proceed                        173       56.43          88       28.65         192       21.64
purchase                       252       82.20         154       50.13         139       15.66
refrigerator                    29        9.46          30        9.77          12        1.35
regarding                      353      115.14         275       89.53         108       12.17
reside                          39       12.72          50       16.28          13        1.46
residence                      152       49.58         258       83.99          59        6.65
seated                          29        9.46          23        7.49          28        3.16
seem: it seems                  56       18.27         135       43.95         477       53.75
subsequently                   108       35.23         114       37.11         148       16.68
thereafter                     195       63.60          55       17.91          57        6.42
ultimately                      64       20.88         101       32.88         142       16.00
whom                           288       93.94         261       84.97         780       87.90
MEAN                        187.09       61.02      166.03       54.05      247.79       27.92


(15) The Fisheries Ministry took steps to revive this breeding centre which was neglected earlier. Thereafter all fish breeding tanks were renovated. 〈SAVE-SL-DN_2001-12-04〉

Again, there are lexemes in Table 8 whose more formal character in BrE, which is also occasionally only related to a certain meaning of a lexeme as documented in Appendix A1: List of formality markers, is not straightforwardly obvious. ADMIT is probably a case in point here. The meaning of ADMIT subject to varying degrees of formality in SLE and BrE is that of admitting someone to a hospital (cf. Meyler 2007: 2) as shown in (16) and (17).

(16) Baby 81 was the name given by a hospital in Sri Lanka to little Abilash Jeyarajah, who was found in rubble after the tsunami struck in December. Then two months old, he was the 81st patient admitted in the wake of the tsunami. 〈SAVE-SL-DM_2005-03-12〉

(17) The medical professionals say the new system has been very beneficial to patients. They have the same consultant and same nursing team treating them every time they are admitted. 〈BNC K35〉

ADMIT, the structures in which it is used and the related frequencies of occurrence are ideal to reiterate the rationale behind this corpus-based approach to formality markers. The underlying idea is to see whether the restriction of formality markers to more formal contexts in BrE and the corresponding implication of fewer contextual restrictions in SLE trigger generally higher frequencies of formality markers in SLE (and IndE) than in BrE. While constituent order or differences in contextual formality could illustrate different degrees of formality in SLE compared to BrE with earlier, ENTER or leave in the ICE data, this is not possible with ADMIT in the newspaper corpora. The structures and constituent orders in which ADMIT in the sense of checking someone into a hospital is used are comparable across the two varieties and so are the general degrees of formality of the newspapers. With these factors held constant, why is it that ADMIT in this sense is almost three times more frequent in SAVE-SL than in BNC news? True, one might argue that the Sri Lankan newspaper data was sampled during the times of the Sri Lankan civil war and the tsunami of 2004 and that this reality led to the need to more frequently write about people being admitted to hospitals, but how does this explain that ADMIT occurs in this sense almost four times as often in SAVE-IND compared to BNC news? With structures and contextual formality held constant across the varieties, the explanation probably is that ADMIT is more formal in BrE than in SLE or IndE. While SLE and IndE writers relatively frequently employ ADMIT in their newspaper articles, ADMIT may on occasion appear too formal to BrE writers making them choose other alternatives like taken to hospital as in (18). A collocational analysis supports this argument. Taken is a


more frequent L-collocate of hospital than admitted in BrE, but admitted is a more frequent L-collocate of hospital than taken in SLE and IndE.

(18) One tenant was taken to hospital with a broken leg after the explosion in Deptford, south-east London. 〈BNC CBF〉
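The kind of collocational comparison referred to above can be sketched as a simple count of word forms to the left of a node word. The sketch is illustrative only: the window of three tokens, the tokenisation and the function name are assumptions (the span used in the study is not stated), and the input consists of a shortened version of Example (18) plus one invented sentence rather than corpus data.

```python
import re
from collections import Counter


def left_collocates(sentences: list[str], node: str, span: int = 3) -> Counter:
    """Count word forms occurring up to `span` tokens to the left of `node`."""
    counts: Counter = Counter()
    for sentence in sentences:
        # Lower-cased alphabetic tokens; hyphenated forms stay together.
        tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", sentence.lower())
        for i, token in enumerate(tokens):
            if token == node:
                counts.update(tokens[max(0, i - span):i])
    return counts


# Toy input, NOT corpus data: Example (18), shortened, plus an invented sentence.
sentences = [
    "One tenant was taken to hospital with a broken leg after the explosion.",
    "He was admitted to hospital on Monday.",  # hypothetical sentence
]
counts = left_collocates(sentences, "hospital")
print(counts["taken"], counts["admitted"])
```

Applied separately to each newspaper corpus, counts of this kind would allow taken and admitted to be compared as L-collocates of hospital across the varieties.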

APPLY is another lexeme where only a certain meaning is subject to differences in perceived formality. According to Meyler (cf. 2007: 11), the meaning associated with differing degrees of formality in SLE (and IndE) and BrE is that of putting/rubbing on a certain substance as shown in (19) and (20). The normalised frequencies for APPLY again show a noticeably higher frequency in the South Asian varieties in comparison to BrE.

(19) A cleansing mask with Ginseng and Vitamin C is then applied to detoxify and left to set for about 20 minutes and peeled off to reveal a lighter, firmer skin. 〈SAVE-SL-DM_2005-04-25〉

(20) To make the most of Norma’s dark brown eyes, she applied shades from Estee Lauder’s Warm Effects shadow collection, creating a subtle blend of browns and golds. 〈BNC E9P〉

In order to be able to complement the results from the offline databases with findings via GAST, it is of pivotal importance to ensure that the hits displayed by GAST represent, at least for the vast majority of cases, the formal uses under scrutiny. Take the case of FARE, for instance, the meaning of which Meyler (2007: 88) describes as “do, get on (well/badly, e.g. in an exam)”. Consequently, homographic nouns of fare or fares should ideally not figure in the GAST analysis as formality has been described to be restricted to the use of the verb. This goes to show that homographic word forms representing different word classes may potentially impede GAST-based studies. Similarly, different semantic meanings within the same word class also need to be taken into consideration here since it may only be one semantic meaning of a given lexeme to which the restriction to formal contexts in BrE applies. This, for instance, is the case with the verb REMOVE. Meyler (2007: 220) describes the formal meaning of REMOVE in BrE as “take off (e.g. clothes)”. It goes without saying that there are a number of other facets of meaning associated with the signifier REMOVE which can hardly be argued to be more formal in BrE and have to be excluded from the present analysis. For the ICE and newspaper data, manual selection of the relevant instances is feasible, but for GAST, the sheer amount of data as well as technological restrictions prohibit manual cleaning. Thus, only those lexical items for which a high number of hits can actually be assumed to represent the usage deemed formal in BrE are selected for the GAST searches. In order to identify those lexical items for


which a given word form represents the more formal usage in the vast majority of cases, the ICE and newspaper data are used. To simulate GAST searches, the lexical items are searched for by using the lemmata of the respective lexemes as single search terms, thus excluding the occurrence of the search words in more complex (and unwanted) morphological constructions (e.g. ADMIT only produces hits for admit, admits, admitted and admitting, but does not retrieve admittance). For each lexeme, the mean relative frequency of irrelevant hits, i.e. those hits which do not match the formal semantic meaning under scrutiny, is calculated across the corpora. A low mean relative value of irrelevant hits thus indicates that most of the hits retrieved for a certain lexeme actually represent the meaning deemed more formal. For the GAST analysis, those lexemes with a mean relative value of irrelevant hits of 10% or lower are chosen.⁶ A list of 21 formality markers can thus be analysed via GAST.

In Table 9, the absolute number of hits of a given word form in the top-level country domains is provided along with normalised frequencies. As the respective normalised frequencies are calculated on the basis of (a) uncleaned Google hits and (b) estimates of the number of English words in the relevant domains as suggested by Grefenstette and Nioche (2000), it needs to be pointed out that these GAST-based normalised frequencies are not as reliable as frequencies based on studies of well-defined offline corpora. To illustrate the calculation of the normalised frequencies in GAST, above mentioned, as shown in (21), can be used.

(21) Assuming that the above mentioned advertisement was published in the Sunday Observer […]. 〈http://www.lib.cmb.ac.lk/wp-content/uploads/2012/04/2011-MKT-1200_English1.pdf〉 (11 February 2013)
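The selection step described above, in which only lexemes with a mean share of irrelevant hits of 10% or lower are retained for GAST, can be sketched as follows. All counts and lexeme names are invented for illustration, the function names are hypothetical, and the reading of the “mean relative frequency” as the mean of the per-corpus shares (rather than a pooled share) is an assumption.

```python
def mean_irrelevant_share(counts_per_corpus: list[tuple[int, int]]) -> float:
    """Mean of the per-corpus shares of irrelevant hits.

    counts_per_corpus holds one (irrelevant_hits, total_hits) pair per corpus.
    """
    shares = [irrelevant / total for irrelevant, total in counts_per_corpus]
    return sum(shares) / len(shares)


def select_for_gast(lexemes: dict[str, list[tuple[int, int]]],
                    threshold: float = 0.10) -> list[str]:
    """Keep lexemes whose mean share of irrelevant hits is <= threshold."""
    return sorted(name for name, counts in lexemes.items()
                  if mean_irrelevant_share(counts) <= threshold)


# HYPOTHETICAL counts (irrelevant, total) for three corpora; not study data.
hypothetical_counts = {
    "lexeme_a": [(1, 50), (2, 40), (0, 30)],    # shares 0.02, 0.05, 0.00
    "lexeme_b": [(20, 50), (10, 40), (9, 30)],  # shares 0.40, 0.25, 0.30
}
print(select_for_gast(hypothetical_counts))
```

In this toy input only lexeme_a passes the threshold; a lexeme such as FARE, whose homographic noun would inflate the share of irrelevant hits, would be expected to pattern like lexeme_b.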

Above mentioned is attested 67,000 times in GAST-SL and it can be estimated that the Sri Lankan domain contains 1,462,810,846.56 English words. Consequently, the (rounded) normalised (pmw) frequency of above mentioned is 45.80, i.e. 67,000 divided by 1,462,810,846.56 and multiplied by 1,000,000.

The general trends in terms of formality markers in the three varieties under scrutiny can be deduced from Figure 4. The mean values of the normalised frequencies of formality markers are displayed summarising the findings on the basis of the individual datasets. Concerning the ICE components, the picture that emerges is relatively clear-cut in that ICE-GB displays the lowest (62.33), ICE-SL the second highest (73.89) and ICE-IND the highest mean value for formality markers (77.65). The general tendency in the newspaper corpora reflects the findings from the ICE data,

6. With GAST, verbs are retrieved using their infinitive forms and for nouns, the singular forms are employed as search terms, with monies being the only exception.


Table 9. Absolute and normalised (pmw) frequencies of formality markers in GAST

                  GAST-SL                   GAST-IND                  GAST-GB
                  abs. freq. (norm. freq.)  abs. freq. (norm. freq.)  abs. freq. (norm. freq.)
above mentioned   67,000 (45.80)            988,000 (11.41)           1,580,000 (4.48)
benumbed          134 (0.09)                2,800 (0.03)              18,500 (0.05)
cease             194,000 (132.62)          682,000 (7.88)            7,330,000 (20.77)
commence          374,000 (255.67)          1,650,000 (19.06)         11,800,000 (33.43)
detrain           4 (0.00)                  638 (0.01)                7,520 (0.02)
family member     22,500 (15.38)            1,560,000 (18.02)         3,450,000 (9.77)
fillip            5,640 (3.86)              72,100 (0.83)             370,000 (1.05)
forthwith         24,600 (16.82)            235,000 (2.71)            848,000 (2.40)
fully             1,030,000 (704.12)        29,500,000 (340.75)       118,000,000 (334.29)
hail from         5,660 (3.87)              45,200 (0.52)             348,000 (0.99)
hence             442,000 (302.16)          7,280,000 (84.09)         24,600,000 (69.69)
hereafter         106,000 (72.46)           412,000 (4.76)            2,220,000 (6.29)
herewith          9,410 (6.43)              287,000 (3.32)            375,000 (1.06)
monies            66,700 (45.60)            124,000 (1.43)            3,990,000 (11.30)
persons           1,310,000 (895.54)        9,430,000 (108.92)        25,500,000 (72.24)
pugilist          463 (0.32)                7,590 (0.09)              40,800 (0.12)
refrigerator      39,000 (26.66)            4,250,000 (49.09)         11,500,000 (32.58)
regarding         711,000 (486.05)          23,800,000 (274.91)       48,400,000 (137.12)
subsequently      234,000 (159.97)          1,790,000 (20.68)         13,600,000 (38.53)
ultimately        196,000 (133.99)          2,260,000 (26.10)         41,700,000 (118.13)
whom              743,000 (507.93)          16,800,000 (194.05)       33,900,000 (96.04)
MEAN              265,767.19 (181.68)       4,817,920.38 (55.65)      16,646,562.86 (47.16)


i.e. the British data feature fewer formality markers (27.92) than the Indian (54.05) and the Sri Lankan data (61.02). Further, it appears to be the case that, on average, newspaper writing uniformly uses fewer formality markers than the texts in the ICE datasets.

[Figure 4: three bar charts (x-axis: corpus/data; y-axis: normalised frequency, pmw). Mean values shown: ICE-SL 73.89, ICE-IND 77.65, ICE-GB 62.33; SAVE-SL 61.02, SAVE-IND 54.05, BNC news 27.92; GAST-SL 181.68, GAST-IND 55.65, GAST-GB 47.16.]

Figure 4. Mean normalised (pmw) frequencies of formality markers in the ICE, newspaper and GAST data
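The normalisation to occurrences per million words that underlies Table 9 and Figure 4 can be reproduced with the figures given in the text for above mentioned in GAST-SL. The function name is hypothetical; the hit count and the word estimate for the Sri Lankan domain are taken from the worked example.

```python
def per_million_words(hits: float, corpus_words: float) -> float:
    """Normalise an absolute frequency to occurrences per million words."""
    return hits / corpus_words * 1_000_000


# Values from the worked example: hits for "above mentioned" in GAST-SL and
# the estimated number of English words in the Sri Lankan domain.
GAST_SL_WORDS = 1_462_810_846.56
pmw = per_million_words(67_000, GAST_SL_WORDS)
print(f"{pmw:.2f}")
```

The same function applies to the offline corpora once their word counts are known; since those counts are not restated here, they are left out of the sketch.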

On the basis of the mean normalised frequencies in GAST, the SLE online data feature the highest density of formality markers (181.68 pmw), while GAST-IND displays 55.65 pmw and GAST-GB 47.16 pmw. The pervasiveness of formality markers in GAST-SL is also evident from the fact that, except for detrain, family member and refrigerator in GAST-IND and detrain and refrigerator in GAST-GB, all formality markers occur less frequently in GAST-IND and GAST-GB than in GAST-SL in normalised terms. As an interim result, it can be stated that the ICE, newspaper and GAST data give clear indications that formality markers are generally more frequent in South Asian Englishes than in BrE. The question as to whether it is SLE or IndE which features most formality markers among the varieties covered cannot be given a definite answer as the different corpus environments yield diverging tendencies.


In the newspaper and GAST data, SLE appears to feature most formality markers while the ICE texts suggest that most formality markers can be found in IndE. Still, given that SLE features most formality markers in two sets of data, there may be a tendency that SLE features more formality markers than IndE. Another central insight is that SLE features systematically more formality markers than BrE throughout the ICE, newspaper and GAST data collections.

4.1.2 Formality markers: Genre-specificity

In order to examine to what extent formality markers occur in formal contexts across the varieties concerned and also to investigate whether or not the lexemes under scrutiny occur predominantly in more formal genres in ICE-GB while being distributed more evenly across a wider range of contexts in the South Asian data, the genres as represented in ICE can be utilised. However, in order to make valid judgements about the pervasiveness of formality markers in genres associated with different degrees of formality, a reliable formality scale on which the individual genres in the written ICE data are located is needed. Based on an extension of Biber’s (1988) multidimensional analysis, Xiao (2009: 424) establishes the frequency of “141 linguistic features that are functionally related and relevant to language variation research” in five ICE components and subsequently runs a factor analysis to identify central groups of features. In the nine-factor structure considered to best represent the underlying data, there is one factor which is of particular relevance to the study of formality, namely “Factor 1: Interactive casual discourse vs. informative elaborate discourse” (Xiao 2009: 430), where linguistic features characteristic of a high degree of interactivity (e.g. discourse markers) and a lower degree of stylistic elaboration (e.g. contracted forms) load positively on Factor 1 and information-heavy structures (e.g. 
attributive use of adjectives) and stylistically more elaborate features (e.g. past participial post-nominal clauses) load negatively on Factor 1 (cf. Xiao 2009: 429). In other words, when applied to the genres in the ICE data, Factor 1 provides an empirically valid formality scale with the positions of the individual genres established on the basis of a large set of linguistic features. The mean factor scores of the spoken as well as the written ICE genres for Factor 1 are given in Figure 5. Among the written genres, which are of relevance to the present study, academic writing (W2A) should thus be considered the most formal and creative writing (W2F) – with notable similarities to the spoken ICE genres – the most informal genre constituting the extreme ends of the written formality scale. It also appears to be the case that correspondence (W1B), non-academic writing (W2B) and non-professional writing (W1A) are not as formal as academic writing, while the factor scores for the remaining written genres are closer to that of academic writing and the differences between the genres thus less pronounced.


[Figure 5: horizontal bar chart of mean Factor 1 scores (axis from –60 to 60) for the ICE genres, listed in the following order: S-Private, S-Public, W-Printed-Creative writing, S-Mono-Unscripted, W-Nonprinted-Correspondence, W-Printed-Non-academic, W-Nonprinted-Non-prof. writing, S-Mono-Scripted, W-Printed-Persuasive writing, W-Printed-Instructional writing, W-Printed-Reportage, W-Printed-Academic writing.]

Figure 5. The factor scores of the spoken and written ICE genres for Factor 1: interactive casual discourse vs. informative elaborate discourse (taken from Xiao 2009: 436)

This genre-specific formality scale can in principle be considered to hold uniformly across different varieties since the linguistic features homogeneously produce positive or negative factor scores for a specific genre (with the exception of correspondence (W1B)) across the ICE components analysed and the scores themselves are relatively comparable in many cases as well (cf. Xiao 2009: 442–443). For formality-related genre analyses of the present study, differences in frequency of one feature in the same genre across different varieties should thus be interpreted as indicators of differing degrees of formality of that feature across the varieties concerned, but not as indicators of cross-varietal differences in the formality of the genre in which the feature is examined. Consequently, if a given feature sensitive to formality occurs e.g. (significantly) more frequently in the creative writing section of ICE-GB than in that of ICE-SL, this observation will be interpreted as a possible indication that this feature is more strongly associated with informality in BrE than in SLE, but not as a signal of differences in the degrees of formality of creative writing in SLE and BrE. Although Xiao (2009) should certainly be regarded as an empirically valid reference point for (a) different degrees of formality of the individual genres in ICE in general and (b) academic writing being the most and creative writing being the least formal written genre in the ICE framework in particular, it may nevertheless also be productive not to restrict genre-specific (and formality-related) perspectives to possible pairs of comparison derived from his statistical modelling (e.g. academic writing vs. creative writing, academic writing vs. non-academic writing, etc.). 
For instance, Xiao’s (2009) analysis implicitly assumes the homogeneity of the genre of correspondence, which – on the basis of his extended approach to Biber’s (1988) multidimensional analysis – probably is a perfectly valid assumption to make, but for formality-related studies, it may prove more fruitful to consider business letters and social letters, i.e. the two types of text constituting the genre of correspondence, as separate categories since the former must be considered more


formal than the latter on text-linguistic grounds. Consequently, while the study of different degrees of formality associated with a given feature across the varieties concerned is empirically rooted in the genres which are most distinct in terms of formality as established by Xiao (2009), it will not focus on them exclusively in the sense that comparisons of genres which may be interesting from a text-linguistic perspective will also be provided. The frequencies of occurrence of formality markers need to be interpreted against this background. The genre-specific normalised and absolute frequencies across the three ICE components are shown in Figure 6. There are statistically

[Figure 6: grouped bar chart (x-axis: corpus, i.e. ICE-SL, ICE-IND, ICE-GB; y-axis: normalised frequency, pmw, from 0 to 3000), with one bar per genre: Non-professional writing (W1A), Correspondence (W1B), Academic writing (W2A), Non-academic writing (W2B), Reportage (W2C), Instructional writing (W2D), Persuasive writing (W2E), Creative writing (W2F) and TOTAL. Totals: ICE-SL 1551.71 pmw (N = 636), ICE-IND 1630.65 pmw (N = 685), ICE-GB 1308.93 pmw (N = 565).]

Figure 6. Normalised (pmw) and absolute frequencies of formality markers in the genres of ICE-SL, ICE-IND and ICE-GB


highly significant differences between the distributions of formality markers across the ICE genres. The correlation between formality markers and the genres in the ICE components is comparatively weak (χ² ≈ 40.86, df = 14, p < 0.001, Cramer’s V ≈ 0.10).⁷ Due to the fact that the genres in the ICE components are not of equal size, it is the normalised frequencies which can give a first indication of the genre-specific prevalence of formality markers in the varieties concerned. If one zooms in on those genres which feature the highest and lowest normalised number of formality markers respectively, the picture is consistent across the varieties. For ICE-SL (2160.73), ICE-IND (2969.73) and ICE-GB (1897.62), the highest density of formality markers can be attested in the correspondence sections (W1B) while the lowest density can uniformly be found in the creative writing texts (W2F) with ICE-SL (730.18), ICE-IND (743.24) and ICE-GB (652.98). None of these findings is particularly surprising. The letters sections include social and business letters, with the latter requiring per se a comparatively formal style of writing and being thus likely to feature a large number of formality markers. The low normalised frequency of formality markers in creative writing is probably also genre-bound as novels in ICE generally tend to best reflect “interactive casual discourse” (Xiao 2009: 436). Although the ranking of the highest and lowest normalised values just presented could be interpreted as an indication that genre-specific conventions of formality marker use hold across the varieties covered here, it needs to be pointed out that there are nevertheless noteworthy differences. Taking a closer look at the normalised frequencies in the letters section, the number of formality markers in ICE-IND (2969.73) is much higher than that for ICE-GB (1897.62) although the letters sections display the highest density of formality markers in both varieties. 
Against this background, Figure 7 gives an indication of the extent to which single data points deviate from their expected values and, thus, of the importance of these data points for the interpretation of the data. The association plot shows dark and light boxes, which represent data points whose observed frequencies are greater and smaller than the expected ones respectively (i.e. what corresponds to negative and positive Pearson residuals), and the area of the box is proportional to the difference in observed and expected frequencies. (Gries 2009: 198)

7. In pairwise comparisons of the component-specific data, the differences in the distribution of formality markers across the genres are highly significant for the pair ICE-IND and ICE-GB (p < 0.001), but for the remaining pairs, i.e. ICE-SL and ICE-IND as well as ICE-SL and ICE-GB, no significant differences can be attested (p > 0.05).


[Figure 7: association plot with the axes Corpus (ICE-SL, ICE-IND, ICE-GB) and Genre (Non-professional writing (W1A), Correspondence (W1B), Academic writing (W2A), Non-academic writing (W2B), Reportage (W2C), Instructional writing (W2D), Persuasive writing (W2E), Creative writing (W2F)).]

Figure 7. Association plot of formality markers in the genres of ICE-SL, ICE-IND and ICE-GB
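The deviations that an association plot visualises are Pearson residuals, i.e. (observed − expected) / √expected for each cell, which are positive where the observed frequency exceeds the expected one. The sketch below is a simplification and not the analysis behind Figure 7: it collapses the eight genres into correspondence (W1B) versus all other written genres. The correspondence counts are the sums of the social and business letter counts in Figure 8, the remaining counts are obtained by subtracting them from the component totals, and the function name is hypothetical.

```python
import math


def pearson_residuals(table: list[list[int]]) -> list[list[float]]:
    """(observed - expected) / sqrt(expected) for every cell of the table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for row, row_total in zip(table, row_totals):
        res_row = []
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            res_row.append((observed - expected) / math.sqrt(expected))
        residuals.append(res_row)
    return residuals


# Columns: ICE-SL, ICE-IND, ICE-GB.
# Row 1: correspondence (W1B); row 2: all other written genres combined
# (component totals 636, 685 and 565 minus the correspondence counts).
observed = [[135, 186, 118],
            [501, 499, 447]]
for label, row in zip(["W1B", "other"], pearson_residuals(observed)):
    print(label, [round(r, 2) for r in row])
```

Even in this collapsed form, the residuals for correspondence are negative for ICE-SL and ICE-GB and positive for ICE-IND, which points in the same direction as the tendencies visible in the association plot.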

The representation of the data in the association plot suggests that for the ICE-SL data, the absolute number of occurrences of formality markers in the letters section is smaller than expected. For ICE-IND, the contrary is true. Another noteworthy tendency in the ICE-IND data is the remarkably less frequent use of formality markers in academic writing (W2A). With regard to the BrE data, the less frequent use of formality markers in the correspondence section and the more frequent occurrence in academic writing stand out. Thus, correspondence texts and academic writing in the ICE components deserve special attention. In order to gain adequate insights into formality in letter writing, social and business letters constituting the genre of correspondence (W1B) need to be analysed separately since they are associated with different degrees of formality (cf. Nelson 1996: 30). Figure 8 provides the relative and absolute frequencies of formality markers in social and business letters in the ICE components concerned. The differences in social and business letters across the ICE components are highly significant, but correlate only weakly (χ² ≈ 32.63, df = 2, p < 0.001, Cramer’s V ≈ 0.27).⁸ In line with the different degrees of formality in letters covered in ICE,

. The social and business letters exhibit highly significant differences in comparisons of ICE-SL and ICE-IND as well as ICE-IND and ICE-GB (p < 0.001), but there are no significant differences in a comparison of ICE-SL and ICE-GB (p > 0.05).

 The Lexis and Lexicogrammar of Sri Lankan English

i.e. social letters with a low and business letters with a high formality level, the vast majority of formality markers can be found in the business letters section in ICE-SL (72.59%) and ICE-GB (75.42%). What is particularly interesting is that this is not the case for the IndE data since less than half of the formality markers in the letters in ICE-IND occur in business letters (47.31%) and 52.69% of formality markers occur in social letters. 100

(Bar chart; y-axis: Rel.Freq. (%); x-axis: Corpus. Values shown:)

            Social letters       Business letters
ICE-SL      27.41% (n = 37)      72.59% (n = 98)
ICE-IND     52.69% (n = 98)      47.31% (n = 88)
ICE-GB      24.58% (n = 29)      75.42% (n = 89)

Figure 8. Relative and absolute frequencies of formality markers in social and business letters in ICE
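The test statistics reported for the letters data can be retraced from the absolute frequencies shown in Figure 8. The following is a minimal sketch in plain Python, not the procedure used in the study: the function names (`chi_square`, `cramers_v`, `p_value`, `compare`) are illustrative, no continuity correction is applied, and the p-value helper only covers the closed-form cases df = 1 and df = 2.

```python
import math

# Absolute frequencies of formality markers (social, business) from Figure 8.
LETTERS = {
    "ICE-SL": (37, 98),
    "ICE-IND": (98, 88),
    "ICE-GB": (29, 89),
}


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


def cramers_v(stat, table):
    """Cramér's V effect size for a chi-square statistic."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(stat / (n * k))


def p_value(stat, df):
    """Upper-tail chi-square probability (closed forms for df = 1 and df = 2 only)."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("only df = 1 and df = 2 are handled in this sketch")


def compare(*corpora):
    """Run the test on the selected corpora and return (chi2, df, p, V)."""
    table = [LETTERS[name] for name in corpora]
    stat, df = chi_square(table)
    return stat, df, p_value(stat, df), cramers_v(stat, table)


if __name__ == "__main__":
    print(compare("ICE-SL", "ICE-IND", "ICE-GB"))  # all three components
    print(compare("ICE-SL", "ICE-GB"))             # one pairwise comparison
```

Under these assumptions, the three-way comparison gives χ² ≈ 32.63 with df = 2 and V ≈ 0.27, in line with the values reported above, and the pairwise comparison of ICE-SL and ICE-GB stays above p = 0.05.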

The frequent use of formality markers in social letters in IndE results in a higher frequency of formality markers in IndE letter writing in general. Because formality markers are employed only rarely in social letters in SLE and BrE, the values for these two varieties are generally lower than those of their IndE counterpart.

The comparatively low normalised number of formality markers in academic writing (W2A) in ICE-IND (1407.24) as opposed to ICE-SL (1664.50) and ICE-GB (1639.78) should be complemented with perspectives on non-academic writing (W2B), as the two genres map differently onto the formality scale, with the latter generally being less formal than the former. The normalised frequencies of formality markers in non-academic writing are 1211.01 for ICE-SL, 1442.50 for ICE-IND and 944.15 for ICE-GB. Again, IndE stands out in that its numbers for formality markers in the two genres are more similar than those for ICE-SL or ICE-GB, both of which display systematically lower frequencies for the more informal genre of non-academic writing. The data could thus be seen as an instantiation of a lower degree of formality-related differentiation as regards formality markers in IndE, or, in other words, of a tendency in IndE to use formality markers in a wider range of genres associated with different degrees of formality compared to BrE and SLE. With regard to the occurrence of formality markers across the genres in ICE, SLE and BrE appear to share a relatively similar pattern, i.e. the more formal the genre, the more likely the usage of formality markers.

4.1.3 Formality markers: Case studies

In the ICE data in Table 7, persons stands out in that it occurs less frequently than expected in the BrE data (ICE-GB: 46.33) while it is found markedly more often in ICE-SL (117.11). In ICE-IND, persons is used in the sense of people with a normalised frequency of 180.92. In the newspaper data, persons is also infrequent in BNC news (10.59) and notably more frequent in SAVE-SL (231.59) and SAVE-IND (268.90). On closer inspection, phraseological restrictions seem to be in place in BrE, which is not the case in the South Asian varieties. Example (22) illustrates a typical use of persons in BrE.

(22) Models thus serve as guides in selecting both the kind of therapy to be used for persons with emotional problems, and the source of treatment and/or intervention required. 〈ICE-GB:W1A-007#12:1〉

In this context, persons could easily be replaced with people without causing a change in meaning of the sentence. This is certainly also the case for the use of persons in the SLE and IndE data. Nevertheless, it seems that in the South Asian varieties, persons can be more readily combined with definite numerals than in BrE as illustrated in (23) and (24).

(23) The population density in the South after deducting the land not available for habitation would be approximately 600 persons per square kilometre. 〈ICE-SL:W2C-003#97:3〉

(24) She had spoken to about 10 persons but didn’t find anyone she liked. 〈ICE-IND:W2F-015#146:1〉

The combinability of persons with definite numerals could thus be one factor accounting for the differences in frequencies across the varieties. Table 10 shows the absolute and relative frequencies of definite numeral and other premodification of persons in the ICE and newspaper data.

Table 10. Absolute and relative frequencies of definite numeral and other premodification of persons in the ICE and newspaper data

                         Definite numeral    Other
                         premodification     premodification    TOTAL
ICE-SL      abs. freq.   8                   40                 48
            rel. freq.   16.67%              83.33%             100.00%
ICE-IND     abs. freq.   17                  59                 76
            rel. freq.   22.37%              77.63%             100.00%
ICE-GB      abs. freq.   2                   18                 20
            rel. freq.   10.00%              90.00%             100.00%
SAVE-SL     abs. freq.   198                 512                710
            rel. freq.   27.89%              72.11%             100.00%
SAVE-IND    abs. freq.   374                 452                826
            rel. freq.   45.28%              54.72%             100.00%
BNC news    abs. freq.   8                   86                 94
            rel. freq.   8.51%               91.49%             100.00%

While the types of premodification of persons across the ICE components do not yield significant differences and only show a weak correlation (χ² ≈ 1.80, df = 2, p > 0.05, Cramer’s V ≈ 0.11), the newspaper data produce highly significant differences, but the correlation between the types of premodifiers and the datasets is comparatively weak (χ² ≈ 82.28, df = 2, p < 0.001, Cramer’s V ≈ 0.22).⁹, ¹⁰ The relative frequencies of definite numerals premodifying persons show that SAVE-IND contains the highest proportion of combinations of a definite numeral with persons (45.28%), SAVE-SL the second highest (27.89%) and BNC news the lowest (8.51%). Consequently, the above observations support the view that premodification via definite numerals is more firmly rooted in the South Asian colligational profiles of persons, which may result in a higher absolute frequency of usage for persons in SLE and IndE. However, the tendency to complement persons with definite numerals is more pronounced in IndE than in SLE.

Yet, it appears that it is not only colligational restrictions in BrE which lead to frequency-based cross-varietal differences; at least in SLE, certain collocations of persons also seem to trigger higher frequencies of use for this vocabulary item as shown in (25). (26) illustrates the usage of IDPs, the alphabetism for internally displaced persons.

9. None of the pairwise comparisons of the ICE data yields significant differences (p > 0.05).

10. All of the pairwise comparisons in the newspaper data show highly significant differences (p < 0.001).


(25) He said 180,000 internally displaced persons are returning to their original homes because of the peace process. 〈SAVE-SL-DN_2002-10-22〉

(26) UNHCR Assistant High Commissioner, Kamel Morjane and Director of the Asia Bureau, Marie Fakhouri, visited Trincomalee, Vavuniya, Jaffna and Puttalam earlier this week in order to directly assess the disruptive effects of the spontaneous return of more than 103,000 IDPs so far this year and develop both immediate and long term durable solutions. 〈SAVE-SL-DN_2002-08-15〉

Due to the Sri Lankan civil war, the collocations displaced persons and internally displaced persons occur repeatedly in written texts to refer to people who had to leave their homes either willingly or by force due to military action in close proximity. The combination (internally) displaced persons occurs five times in ICE-SL and 42 times in SAVE-SL in absolute terms; the abbreviations IDP/IDPs can be found 18 times in ICE-SL and 51 times in SAVE-SL. The recurrence of the lexical grouping (internally) displaced persons and of the related alphabetism in the Sri Lankan data points to the sociocultural saliency of the concept in the distinct Sri Lankan linguistic ecology. Thus, SLE verbalises the unique everyday reality of its speakers, which may lead to higher frequencies of use and/or variety-specific collocations of a given lexeme and possibly, as is the case with persons, to a higher density of formality markers in SLE texts. Along with the moderate complementation of persons with definite numerals in SLE, distinct collocational profiles may serve as an explanation of the high frequency of persons in the SAVE-SL data.

In the GAST data in Table 9, detrain, family member and refrigerator are worth discussing since they are noticeably frequent in the Indian and, with the exception of family member, in the British data. They thus deviate from the general trend of GAST-SL featuring the largest number of formality markers. Detrain is exemplified in (27) and (28).

(27) I quickly put away my notebook and pen to join the rest who detrain at this station into the chilly dark Fall evening. 〈http://www.island.lk/2003/10/19/featur07.html〉 (17 October 2014)

(28) The heavy crowd in the suburban trains in mumbai are making it difficult for the old and handicapped pasengers to entrain and detrain. 〈http://iricen.indianrailways.gov.in/IRICEN1/modules.php?name=Forums&file=viewtopic&p=5401〉 (17 October 2014)

A closer look at the occurrences of detrain reveals that all hits displayed in GAST-SL represent uses of detrain in the sense of “alight, get out of a train” (Meyler 2007: 73), which is described as more formal in BrE. While the IndE and BrE online searches also retrieve a large number of hits of detrain in the above sense, proper names, i.e. Detrain as a family name, are also found in the respective data, which may play a certain role in explaining the increased frequency of occurrence in the two varieties. Nevertheless, in particular for IndE, browsing the online material provides sufficient evidence to state that detrain in its formal sense is at least as pervasive in BrE and IndE as it is in SLE.

In order to be able to explain the comparatively high frequency of refrigerator in the IndE and BrE online data, it is useful to contrast the number of hits for refrigerator with that for fridge, the alternative denomination for the concept of refrigerator. Example (29) shows the usage of fridge in the Sri Lankan GAST data and (30) that of refrigerator in the Indian GAST data. Table 11 gives an overview of the absolute and normalised number of hits of refrigerator and fridge in the respective domains.

(29) Thieves allegedly broke into a house and took away a fridge valued at Rs.28,000 along with a gas cylinder while the occupants were away. 〈http://epaper.dailymirror.lk/epaper/viewer.aspx〉 (23 October 2012)

(30) Do you feel cold when you step out after a shower? That’s because the water on your skin is evaporating, and in the process, cooling your skin. This is basically what happens inside a refrigerator, only, instead of water, there are chemicals that do the cooling. 〈http://www.sharpeners.in/articles/facts/how-stuff-works/32-working-of-refrigerator.html〉 (11 February 2013)

Table 11. Absolute and normalised (pmw) frequencies of refrigerator and fridge in GAST

               GAST-SL                    GAST-IND                   GAST-GB
               abs. freq.   norm. freq.   abs. freq.   norm. freq.   abs. freq.   norm. freq.
refrigerator   39,000       26.66         4,250,000    49.09         11,500,000   32.58
fridge         192,000      131.25        4,500,000    51.98         38,900,000   110.20
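The relation between the two variants in Table 11 can be condensed into a single figure per dataset, namely the share of fridge among all hits for either word. The sketch below is an illustration in Python rather than part of the original analysis; the dictionary layout and the name `fridge_share` are assumptions introduced here.

```python
# Normalised (pmw) frequencies from Table 11: (refrigerator, fridge).
GAST = {
    "GAST-SL": (26.66, 131.25),
    "GAST-IND": (49.09, 51.98),
    "GAST-GB": (32.58, 110.20),
}


def fridge_share(refrigerator, fridge):
    """Proportion of 'fridge' among all hits for either variant."""
    return fridge / (refrigerator + fridge)


# Share of the less formal variant in each GAST dataset.
shares = {name: fridge_share(*freqs) for name, freqs in GAST.items()}

if __name__ == "__main__":
    for name, share in shares.items():
        print(f"{name}: {share:.1%} fridge")
```

On these figures, fridge accounts for roughly 83% of the hits in GAST-SL, 77% in GAST-GB and 51% in GAST-IND, so the two variants are close to evenly balanced only in the Indian data.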

The normalised frequencies in Table 11 indicate that fridge, the less formal option, occurs more often than refrigerator in each of the three GAST datasets. However, in GAST-IND, refrigerator (49.09) is almost as frequent as fridge (51.98) whereas in GAST-SL, the quantitative difference between refrigerator (26.66) and fridge (131.25) is most pronounced. The British data also show a relatively clear-cut preference for fridge (110.20) as opposed to refrigerator (32.58). Fridge is thus most clearly profiled as the default variant for a place to cool food in SLE, and although fridge is also dominant in the other two varieties (in particular in BrE), refrigerator has a certain currency in IndE and BrE, which may serve as an explanation as to why refrigerator figures more prominently in the Indian and British than in the Sri Lankan GAST data.

As regards the more frequent use of family member in GAST-IND compared to the equivalent SLE data, the online searches can be taken to corroborate the tendency attested in the newspaper data. For family member, the highest normalised frequency of occurrence could be found in SAVE-IND, and the tendency of IndE to employ this formality marker more frequently than the other varieties under scrutiny is given further support.

In sum, formality markers appear to be more frequent in South Asian Englishes than in BrE and there is a tendency in the present datasets which suggests that SLE may generally employ more formality markers than IndE. More fine-grained genre-based perspectives on the ICE data give indications that formality markers are more sensitive to contextual formality in the SLE and BrE than in the IndE data. On the basis of case studies of detrain, family member, persons and refrigerator, collocational profiles of given lexemes (and their potential to express distinct linguistic ecologies), colligational differences and (variety-specific) lexical preferences have been identified as potential sources of frequency-related differences of formality markers across the varieties examined.

4.2 Pan-South Asian English lexemes

The lexical items of SLE which characterise SLE as a South Asian English and are thus unlikely to be used in varieties of English outside South Asia overlap with the vocabularies of other South Asian Englishes. This common stock of South Asian English vocabulary items is documented in Meyler (2007) in a two-fold way. Meyler (2007: xii) marks those lexemes which may be part of the vocabulary of a number of South Asian Englishes with the tag “also India”, although he (2007: xii) points out that “[i]n this context, ‘India’ often refers loosely to the whole subcontinent”. Consequently, Meyler’s (cf. 2007: xii) tag also India does not imply that the tagged lexical item is exclusively or necessarily part of the IndE vocabulary, but possibly (also) of another South Asian variety of English. The second label highlighting shared South Asian English vocabulary items in Meyler (2007) is the tag “Anglo-Indian” (Meyler 2007: xiv), which marks lexemes imported to SLE during the occupation by the British and which, as a consequence of former British presence throughout most of South Asia, may be found in other South Asian varieties of English as well.

A total of 117 vocabulary items are marked with either of the above tags in Meyler (2007). A list of the 117 pan-South Asian English (PSA) lexemes and their corresponding meanings can be found in Appendix A2: List of Pan-South Asian English lexemes. These items form the lexical pool on the basis of which it is to be established to what degree SLE lexis can be characterised in terms of PSA lexemes.¹¹

11. The pan-South Asianness of English lexemes is established on the basis of whether a lexical item is shared by SLE and IndE in the study at hand; PSA lexemes should thus be understood as lexical items shared across national borders of South Asian varieties of English.


As this group of lexemes has a clear South Asian character in the sense that it denotes referents which may need to be verbalised more frequently in South Asia than elsewhere, it is to be expected that the lexical items under scrutiny will occur more frequently in the South Asian English data than in the BrE data. With regard to the genre-specific occurrence of PSA lexemes, the relevant literature does not give any indication as to clear patterns of preference or avoidance of PSA lexemes, which is why it is to be expected that PSA lexemes occur relatively evenly across genres.¹²

4.2.1 Pan-South Asian English lexemes: Frequency

In the South Asian components of ICE, three PSA lexemes meet the threshold value of five occurrences. These lexemes along with their absolute and normalised frequencies are displayed in Table 12.

Table 12. Absolute and normalised (pmw) frequencies of PSA lexemes in ICE

             ICE-SL                     ICE-IND                    ICE-GB
             abs. freq.   norm. freq.   abs. freq.   norm. freq.   abs. freq.   norm. freq.
gram         6            14.64         6            14.28         0            0
rupee        19           46.36         35           83.32         0            0
saree/sari   21           51.24         21           49.99         0            0
MEAN         15.33        37.41         20.67        49.20         0            0
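The normalised frequencies in Table 12 are expressed per million words (pmw). As a hedged illustration of that conversion, the Python sketch below backs out the corpus size implied by one table row and reuses it for the others. The size of ICE-SL is not stated in this passage, so the value derived here is only an estimate, and the function names are hypothetical.

```python
def per_million(abs_freq, corpus_size):
    """Normalise an absolute frequency to occurrences per million words."""
    return abs_freq / corpus_size * 1_000_000


def implied_corpus_size(abs_freq, norm_freq):
    """Corpus size (in words) implied by a pair of absolute and pmw frequencies."""
    return abs_freq / norm_freq * 1_000_000


# ICE-SL rows of Table 12: lexeme -> (absolute frequency, reported pmw frequency).
ICE_SL = {"gram": (6, 14.64), "rupee": (19, 46.36), "saree/sari": (21, 51.24)}

# Estimate the corpus size from the 'gram' row only (an assumption of this sketch).
size = implied_corpus_size(*ICE_SL["gram"])

# Recompute every pmw value from its absolute frequency and the estimated size.
recomputed = {lexeme: round(per_million(abs_freq, size), 2)
              for lexeme, (abs_freq, _) in ICE_SL.items()}
mean_pmw = round(sum(recomputed.values()) / len(recomputed), 2)

if __name__ == "__main__":
    print(round(size))   # estimated number of words behind the ICE-SL figures
    print(recomputed)
    print(mean_pmw)
```

With the size estimated from the gram row (roughly 410,000 words), the rupee and saree/sari rows and the MEAN value of 37.41 in Table 12 are reproduced.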

The lexical items retrieved are gram (meaning “chick peas” (Meyler 2007: 104)), rupee (“the currency of Sri Lanka” (Meyler 2007: 225)) and saree/sari (“a garment worn by women, consisting of a length of material wrapped around the body” (Meyler 2007: 229)).¹³ (31) to (33) exemplify the usage of gram, rupee and saree/sari in the ICE data.

12. For a lexeme to be included in the analysis of the ICE and newspaper data, it has to occur at least five times in the respective Sri Lankan and Indian national components. In this context, this criterion is particularly relevant because only if a certain lexeme can be attested with sufficient frequency in both the Sri Lankan and Indian data is it reasonable to claim that this is a PSA vocabulary item. Due to the fact that PSA lexemes can be expected to occur only rarely in the BrE data (cf. Meyler 2007: xiv), the frequency of occurrence of PSA lexemes in BrE is neither taken as a criterion for the inclusion in the analysis nor is it included in statistical tests in 4.2.1 and 4.2.2.

13. In the Sri Lankan and Indian data for rupee, only references to the respective national currency were considered. For the British data, either currency reference would have been included.


(31) Items found on their menu include 〈in-fo〉kadala gottu〈/in-fo〉 (roasted gram), ice creams and popsicles […]. 〈ICE-SL:W2B-027#108:4〉

(32) Even if he didn’t have change, he could have easily given the beggar a rupee. 〈ICE-IND:W2F-006#275:1〉

(33) Her dazzling saree at the engagement to a man of her 〈w〉parents’〈/w〉 choice does not blind Arjie to the reality. 〈ICE-SL:W2A-010#122:1〉

These lexemes only occur in the South Asian English data and are absent from the British ICE component. A pairwise comparison of the distribution of PSA lexemes in the SLE and IndE data does not yield any significant differences and the respective correlation is weak (χ² ≈ 2.42, df = 2, p > 0.05, Cramer’s V ≈ 0.15).

In the newspaper data, 21 PSA lexemes occur at least five times in SAVE-SL and SAVE-IND respectively. Table 13 shows their absolute and normalised frequencies. There is a statistically highly significant difference and a strong correlation between the PSA lexemes and the Sri Lankan and Indian SAVE components (χ² ≈ 1492.26, df = 21, p < 0.001, Cramer’s V ≈ 0.61). The data points which deviate most strongly from their respective expected values are lakh, rupee, and tank, each of which will be examined in more detail in 4.2.3 to explain which linguistic means may be held accountable for these differences.

Table 13. Absolute and normalised (pmw) frequencies of PSA lexemes in the newspaper data

              SAVE-SL                    SAVE-IND                   BNC news
              abs. freq.   norm. freq.   abs. freq.   norm. freq.   abs. freq.   norm. freq.
ayurveda      37           12.07         29           9.44          0            0
bund          19           6.20          10           3.26          0            0
bungalow      21           6.85          55           17.91         24           2.70
cess          13           4.24          29           9.44          0            0
compound      14           4.57          13           4.23          9            1.01
felicitate    40           13.05         35           11.39         0            0
goon/goonda   9            2.94          29           9.44          2            0.23
gunny         6            1.96          10           3.26          0            0
jaggery       7            2.28          6            1.95          0            0
jungle        84           27.40         78           25.39         80           9.02
lakh          90           29.36         1,357        441.77        0            0
pandal        11           3.59          108          35.16         0            0
planter       73           23.81         18           5.86          4            0.45
pooja/puja    70           22.83         459          149.43        1            0.11
range         25           8.15          40           13.02         0            0
roti          12           3.91          7            2.28          0            0
rupee         470          153.30        178          57.95         5            0.56
sadhu         6            1.96          29           9.44          0            0
saree/sari    141          45.99         110          35.81         4            0.45
satyagraha    11           3.59          5            1.63          0            0
stupa         25           8.15          7            2.28          0            0
tank          135          44.03         36           11.72         1            0.11
MEAN          59.95        19.56         120.36       39.18         5.91         0.67

Additionally, planter, i.e. “the owner or manager of an estate (esp. tea or rubber)” (Meyler 2007: 202), as exemplified in (34) and stupa in (35) meaning “a dome-shaped Buddhist shrine” (Meyler 2007: 249) represent PSA lexemes that occur markedly more frequently in the Sri Lankan compared to the Indian newspaper texts.

(34) Though the innovative planters already started irrigating coconut like in India the conversion is not enough to face the growing demand for coconut. 〈SAVE-SL-DN_2004-06-16〉

(35) The legend says that two merchants engaged in trade between India and Lanka obtained a hair relic of the Buddha and a stupa was built and the hair relic was enshrined therein. 〈SAVE-SL-DN_2002-08-10〉

It is interesting to see that a small number of PSA lexemes (e.g. jungle, bungalow, compound, etc.) are, in fact, to be found in the BrE data. Jungle, as shown in (36), serves as a prime example of a PSA lexeme which is also in use in BrE. In the context of this vocabulary item, it seems that popular culture may exert an influence on the lexis of a variety as the usage of the lexeme jungle may have to a certain extent been influenced by the home media release of Walt Disney’s The Jungle Book, to which frequent reference is made in the BrE data as shown in (37).

(36) As we climbed higher the jungle became more marshy and exotic, with the sort of plants you usually only see as plastic decorations. 〈BNC AHC〉

(37) He has made an understandable decision to Americanise the story, just as Disney did with European fairy-tales like Pinocchio and English classics such as The Jungle Book. 〈BNC AK4〉

Table 14 summarises the results of the GAST searches. The complete overview of the absolute and normalised (pmw) frequencies of all 98 PSA lexemes investigated via GAST is provided in Appendix A3: Pan-South Asian English lexemes via the Google Advanced Search Tool.

Table 14. Mean absolute and normalised (pmw) frequencies of PSA lexemes in GAST

        GAST-SL                    GAST-IND                   GAST-GB
        abs. freq.   norm. freq.   abs. freq.   norm. freq.   abs. freq.   norm. freq.
MEAN    15,659.16    9.39          601,252.35   10.54         742,165.51   2.14

Of the total of 117 PSA lexemes, 98 can be expected to produce comparatively reliable results given that their word forms appear to exclusively represent the respective concepts described in Meyler (2007); these are thus scrutinised via GAST. The approach to selecting these PSA lexemes differs from that used for the formality markers. While almost all formality markers investigated in 4.1 are polysemous, this only holds for a limited number of PSA lexemes. PSA lexemes are checked against the PONS online dictionary accessible via 〈http://www.pons.eu〉 (17 October 2014). All PSA lexemes for which PONS lists meanings other than the one(s) described by Meyler (2007) are disregarded in the GAST searches. In GAST, nouns are searched for in their singular forms, verbs in their infinitive forms and compounds are entered in the search interface as given in Meyler (2007).¹⁴

14. In case spelling variants are provided in a dictionary entry by Meyler (2007), the first entry is chosen for GAST. However, if spelling variants are given in the same dictionary entry and one of the spelling variants is polysemous while the other is not, the latter alternative is used.

Figure 9 visualises the mean normalised frequencies of PSA lexemes across the ICE, newspaper and GAST texts. In the ICE components studied, the normalised numbers point to a more frequent use of PSA lexemes in IndE (49.20) than in SLE (37.41), mainly caused by the high frequency of rupee (83.32) in ICE-IND. PSA lexemes occur 19.56 times on average in SAVE-SL, 39.18 times in SAVE-IND and 0.67 times in BNC news. Thus, PSA lexemes occur about twice as frequently in SAVE-IND compared to SAVE-SL while BNC news features hardly any PSA lexemes at all. Compared to the mean normalised frequencies in the ICE components, the respective normalised frequencies for the newspaper data are systematically lower (with the BrE data being the exception).

Figure 9. Mean normalised (pmw) frequencies of PSA lexemes in the ICE, newspaper and GAST data
(Three bar-chart panels; y-axis: Norm. Freq. (pmw). ICE: ICE-SL 37.41, ICE-IND 49.20, ICE-GB 0. Newspapers: SAVE-SL 19.56, SAVE-IND 39.18, BNC news 0.67. GAST: GAST-SL 9.39, GAST-IND 10.54, GAST-GB 2.14.)

Generally, it can be stated that all PSA lexemes studied in the online data are attested in GAST-SL, GAST-IND and GAST-GB respectively (with the exception of cumbly, a typical garment worn by tea pluckers to cover their heads (cf. Meyler 2007: 66), in GAST-SL and dharmachakraya, the “Buddhist wheel symbol” (Meyler 2007: 75), in GAST-IND and GAST-GB). Still, Table 14 and Figure 9 indicate that there are clear quantitative differences when it comes to their frequencies in the domains investigated. With a mean normalised frequency of 10.54 PSA lexemes, the Indian data display the highest frequency of occurrence and GAST-SL displays a lower mean normalised frequency of 9.39. Given that GAST-GB has a mean normalised frequency of 2.14, the online data confirm the peripheral role of PSA lexemes in BrE and provide another indication that PSA lexemes are more frequent in IndE than in SLE.

It generally appears to be the case that IndE features more PSA lexemes than SLE and BrE, where PSA lexemes – and unsurprisingly so – are only relatively rarely attested. In the South Asian data, the occurrence of PSA lexemes also indicates that the ICE components feature more PSA lexemes than the newspaper and the online datasets. This may be related to the fact that e.g. in its instructional writing (W2D) section, ICE-IND also samples recipes with frequent reference to ingredients such as gram, which represents one of the PSA lexemes investigated.


It also needs to be pointed out here that the mean normalised frequencies for ICE are based on the occurrences of three PSA lexemes, which is why these frequencies need to be interpreted with a measure of caution: although these three PSA lexemes may be on average more frequent than PSA lexemes in the newspaper or GAST data, a much wider range of PSA lexemes can be attested in the latter two groups of databases.

In sum, then, frequency-related perspectives on PSA lexemes provide evidence that there is a common stock of vocabulary items in SLE and IndE which is almost completely absent from BrE. Independent of the fact that PSA lexemes – at least from a purely quantitative perspective – play a more peripheral role than formality markers in providing the South Asian Englishes concerned with their lexical character, these vocabulary items are used more frequently in IndE than in SLE. The less frequent use of PSA lexemes in Sri Lankan and Indian newspaper writing compared to the ICE data could be driven by genre conventions. It could well be that South Asian newspaper writing consciously attempts to project an international image to appeal to a wide readership, which would preclude extensive use of PSA items. Although they may not be as frequent as formality markers, PSA lexemes have the potential to more overtly characterise the text in which they occur as South Asian given that (a) formality markers are more often in use in varieties outside South Asia than PSA lexemes and (b) some PSA lexemes such as arrack referring to an alcoholic beverage (cf. Meyler 2007: 13) are borrowed from local languages and thus convey a strong local flavour.

4.2.2 Pan-South Asian English lexemes: Genre-specificity

The low overall occurrence of gram, rupee and saree/sari in the ICE data is also mirrored in their genre-specific frequency counts. Their absolute and normalised frequencies per genre in ICE-SL and ICE-IND are presented in Table 15 and Table 16 respectively and illustrated in Figure 10. For ICE-GB, no genre-specific distribution can be provided since these lexemes do not occur in ICE-GB as shown in Table 12. A pairwise comparison of PSA lexemes in the genres of ICE-SL and ICE-IND exhibits significant differences (p < 0.01).

Gram occurs in academic (W2A; 48.96) and non-academic writing (W2B; 25.23) in ICE-SL and in non-professional (W1A; 24.65), academic (W2A; 11.73) and instructional writing (W2D; 90.85) in ICE-IND. Given that gram stands for a particular kind of pea, it is understandable that the texts in which gram occurs are related to cultivating as well as cooking gram, the notions of which, in turn, tend to occur in particular text genres. Example (38) deals with the popularity of gram in the Sri Lankan crop market, which is a topic in academic writing, and Example (39) deals with how to use gram for cooking, with recipes being a typical text genre for instructional writing.


Table 15. Absolute and normalised (pmw) frequencies of three PSA lexemes in the genres of ICE-SL

                                  gram             rupee            saree/sari       TOTAL
                                  abs.    norm.    abs.    norm.    abs.    norm.    abs.    norm.
Non-professional writing (W1A)    0       0        1       23.87    1       23.87    2       47.73
Correspondence (W1B)              0       0        3       32.01    2       32.01    5       80.03
Academic writing (W2A)            4       48.96    2       24.48    1       12.24    7       85.67
Non-academic writing (W2B)        2       25.23    2       25.23    2       25.23    6       75.69
Reportage (W2C)                   0       0        2       49.66    0       0        2       49.66
Instructional writing (W2D)       0       0        3       73.39    0       0        3       73.39
Persuasive writing (W2E)          0       0        6       302.97   0       0        6       302.97
Creative writing (W2F)            0       0        0       0        15      342.27   15      342.27
TOTAL                             6       14.64    19      46.36    21      51.24    46      112.23

(38) 〈in-fo〉Cowpea〈/in-fo〉 and 〈in-fo〉greengram〈/in-fo〉 have lost their popularity slowly […]. 〈ICE-SL:W2A-032#57:1〉

(39) Add buttermilk, turmeric, salt, sugar, chilli powder, lemon juice and gram flour. 〈ICE-IND:W2D-016#261:1〉

Table 16. Absolute and normalised (pmw) frequencies of three PSA lexemes in the genres of ICE-IND

                                  gram             rupee            saree/sari       TOTAL
                                  abs.    norm.    abs.    norm.    abs.    norm.    abs.    norm.
Non-professional writing (W1A)    1       24.65    1       24.65    0       0        2       49.31
Correspondence (W1B)              0       0        8       127.73   2       31.93    10      159.66
Academic writing (W2A)            1       11.73    3       35.18    0       0        4       46.91
Non-academic writing (W2B)        0       0        4       48.08    0       0        4       48.08
Reportage (W2C)                   0       0        7       178.73   1       25.53    8       204.26
Instructional writing (W2D)       4       90.85    2       45.43    10      227.13   16      363.41
Persuasive writing (W2E)          0       0        1       45.10    0       0        1       45.10
Creative writing (W2F)            0       0        9       209.03   8       185.81   17      394.84
TOTAL                             6       14.28    35      83.32    21      49.99    62      147.59

Figure 10. Normalised (pmw) and absolute frequencies of PSA lexemes in the genres of ICE-SL and ICE-IND
(Bar chart; y-axis: Norm. Freq. (pmw); x-axis: Corpus (ICE-SL, ICE-IND); one bar per genre (W1A to W2F) plus TOTAL. The plotted values correspond to the TOTAL columns of Tables 15 and 16.)

In a similar vein, it might be argued that the comparatively frequent use of the PSA lexeme saree/sari in the creative writing (W2F) sections of both ICE-SL and ICE-IND may be connected to the sociocultural salience of the concept it refers to. In order to narrate the South Asian cultural experience in fiction, it could be that PSA lexemes such as saree/sari are used to evoke (maybe admittedly stereotypical) images of South Asia which set the reader in a South Asian scene.

 The Lexis and Lexicogrammar of Sri Lankan English

The comparatively high normalised frequencies for the creative writing texts in both ICE-SL and ICE-IND depicted in Figure 10 bear testimony to this. Examples (40) and (41) illustrate the contexts in which saree/sari is used in ICE-SL and ICE-IND.

(40) A fitful wind whipped the wet edges of her sari around her ankles, and her sari 〈indig〉pota〈/indig〉 billowed out behind her. 〈ICE-SL:W2F-002#143:1〉

(41) She wore her 〈indig〉 sari 〈/indig〉 in modern style, comfortably and not too tightly, with a nicely cut blouse, which focused on her attractive bosom and gave her a lean and youthful look. 〈ICE-IND:W2F-005#123:1〉

In contrast to gram and saree/sari, rupee is used in all genres in ICE-SL and ICE-IND with creative writing in ICE-SL as the only exception. The notion of money is probably not predisposed to certain genres as it is relevant in a wider range of communicative situations compared to gram and saree/sari, and, thus, also attested in all genres which are supposed to be representations of these communicative situations. From a more general perspective, no clear patterns regarding the occurrence of PSA lexemes across the text genres featured in ICE can be identified for SLE or IndE. On the one hand, this is certainly related to the low frequency of PSA lexemes in the data concerned. On the other hand, the distribution might indeed be a slight indication that PSA lexemes do not group universally along an informal/formal scale as represented by the written ICE genres. However, this interpretation must remain tentative as more genre-specific data need to be looked into.

4.2.3 Pan-South Asian English lexemes: Case studies

Lakh, standing for the numeric value of 100,000 (cf. Meyler 2007: 147), is one of the vocabulary items which make differences in the frequency of PSA lexemes in SLE and IndE particularly obvious since it occurs with a normalised frequency of 29.36 in SAVE-SL and 441.77 in SAVE-IND. The absence of lakh from the BrE data indicates that this clearly is a pan-South Asian lexical item. It is highly unlikely that the difference in the frequency of use of lakh in SAVE-SL and SAVE-IND results from an extremely rare occurrence of numeric values as big as or larger than 100,000 in SAVE-SL. That is to say, other realisations of these values must be expected to take the place of lakh in the Sri Lankan newspaper data. There are two basic alternatives to lakh as exemplified in (42), namely the corresponding numeric value(s) as in (43) and the alternative lexical realisation(s) hundred thousand(s) as in (44). Figure 11 displays the relative and absolute frequencies for lakh as well as its numeric and lexical alternatives in the newspaper data.15

(42) Dr. Ariyaratne said that he was able to muster the strength of six lakhs students in the recent past. 〈SAVE-SL-DN_2002-04-10〉

(43) Almost 900,000 young people are out of work – and the problem is worse in the South East. 〈BNC AJH〉

(44) President Chandrika Bandaranaike Kumaratunga has blocked a Government proposal to change the conditions of ownership of lands gifted to nine hundred thousand families under Swarnabhoomi and Jayabhoomi deeds […]. 〈SAVE-SL-DN_2003-01-04〉

[Figure 11: bar chart; x-axis: corpus; y-axis: relative frequency (%). SAVE-SL: lakh 13.08% (n = 79), numeric 83.44% (n = 504), lexical 3.48% (n = 21). SAVE-IND: lakh 88.71% (n = 959), numeric 11.19% (n = 121), lexical 0.09% (n = 1). BNC news: lakh 0% (n = 0), numeric 99.42% (n = 1728), lexical 0.58% (n = 10).]

Figure 11. Relative and absolute frequencies of lakh and its numeric/lexical alternatives in the newspaper data
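The relative frequencies in Figure 11 follow directly from the absolute counts given there. The following minimal Python sketch recomputes them; the dictionary layout and the function name `relative_frequencies` are illustrative choices and not part of the original study.

```python
# Absolute frequencies of lakh and its alternatives per corpus (Figure 11).
counts = {
    "SAVE-SL": {"lakh": 79, "numeric": 504, "lexical": 21},
    "SAVE-IND": {"lakh": 959, "numeric": 121, "lexical": 1},
    "BNC news": {"lakh": 0, "numeric": 1728, "lexical": 10},
}


def relative_frequencies(variant_counts):
    """Return each variant's share (in %) of all realisations in one corpus."""
    total = sum(variant_counts.values())
    return {
        variant: round(100 * n / total, 2)
        for variant, n in variant_counts.items()
    }


for corpus, variant_counts in counts.items():
    print(corpus, relative_frequencies(variant_counts))
```

Each corpus is treated separately, so the three percentages per corpus sum to 100 (up to rounding), which is what allows the preferences of the varieties to be compared despite the different corpus sizes.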

15. 71.73% of the values which are used in connection with lakh are divisible by 100,000 without remainder in the South Asian newspaper data under scrutiny. As this constitutes the vast majority of cases, it was decided that, independent of their structural realisation, only values which can be divided by 100,000 without remainder are included in the present analysis of lakh.

The data yield highly significant differences for the frequencies of lakh and its alternatives across the datasets and there is a moderately strong correlation between the databases and the different signifiers for 100,000 (χ² ≈ 2631.83, df = 4, p < 0.001, Cramer’s V ≈ 0.38).16 There seems to be a clear preference in IndE to lexically realise the value of 100,000 with lakh as this is the case in 88.71% of all instances. In contrast to this, lakh appears to be only a marginal option in SLE (13.08%) while it is absent from BrE (0%). In SLE, the preferred way of realising the value of 100,000 is numeric (83.44%), which is in line with the BrE default option. Thus, despite the fact that lakh is in use in SLE, it is only a minority variant compared to the numeric realisation and thus not as firmly entrenched as in IndE, which also finds reflection in the overall occurrence of lakh in SAVE-SL in comparison to SAVE-IND.

Rupee is another PSA lexeme for which the normalised frequencies in SAVE-SL (153.30) and SAVE-IND (57.95) strongly deviate from each other. However, rupee certainly is a special case among the PSA lexemes examined in that it denotes different concepts in the Sri Lankan and Indian data, namely the Sri Lankan and Indian currency respectively. A close examination of rupee in the Sri Lankan newspaper data reveals usage patterns which are by and large absent or infrequent in the Indian data as exemplified in (45) to (47).

(45) The appreciation of the Rupee against the United States Dollar to Rs. 94.30 as against Rs.96 just a couple of weeks back is another positive sign of a stronger economy. 〈SAVE-SL-DN_2003-09-23〉

(46) The situation has been further exacerbated by the depreciation of the rupee against major currencies which has forced importers to price imported tyres at a very high premium rate. 〈SAVE-SL-DM_2007-05-10〉

(47) The projected budget deficit of 6.8%, if achieved, will ensure that interest rates and inflation rates will continue to reduce and the value of the rupee will remain stable against the major currencies of the world. 〈SAVE-SL-DN_2003-11-21〉

Each of the above examples deals with the value of the Sri Lankan rupee (measured against other currencies). Some of the usage patterns that can be attested in the SAVE-SL data in the context of the value of the Sri Lankan currency are appreciation of the rupee (9 occurrences), depreciation of the rupee (14 occurrences) and value of the rupee (5 occurrences). None of these usage patterns can be found with rupee in the IndE newspaper data. A large number of these usage patterns can also be retrieved from the Sri Lankan top-level country domain as shown in Table 17.

16. All pairwise comparisons of lakh and its alternatives in the newspaper data are highly significant as well (p < 0.001).


Each of the usage patterns of rupee occurs less frequently in GAST-IND and (unsurprisingly) GAST-GB as opposed to GAST-SL. This goes to show that, although PSA lexemes might not be as characteristic of SLE as of IndE, there are nevertheless certain PSA lexemes (e.g. dharmachakraya, swabasha) as well as usage patterns such as those of rupee derived from these lexemes which are almost exclusive to SLE based on the need to verbalise distinct (nation-bound) realities of everyday life.

Table 17. Absolute and normalised (pmw) frequencies of three usage patterns of rupee in GAST

Usage pattern | GAST-SL abs. freq. | GAST-SL norm. freq. | GAST-IND abs. freq. | GAST-IND norm. freq. | GAST-GB abs. freq. | GAST-GB norm. freq.
appreciation of the rupee | 4,860 | 2.92 | 44,500 | 0.78 | 1,700 | 0.00
depreciation of the rupee | 4,840 | 2.90 | 12,800 | 0.22 | 515 | 0.00
value of the rupee | 6,590 | 3.95 | 87,200 | 1.53 | 9,140 | 0.03

The average inflation rate of the Sri Lankan rupee from 1999 till 2010 is 10.09% whereas the inflation rate of the Indian rupee (6.57%) is notably lower.17 Consequently, the Sri Lankan currency can be considered to be less stable than its Indian counterpart, which might be a reason as to why currency developments in Sri Lanka are covered more consistently in Sri Lankan newspapers and online texts than in their Indian equivalents. It is likely that in these contexts, the variety-specific SLE usage patterns of rupee exemplified in (45) to (47) have emerged.

17. The average inflation rates for the Sri Lankan and the Indian rupee were calculated on the basis of annual inflation rates for Sri Lanka and India retrieved from 〈http://www.indexmundi.com〉 (17 October 2014).

The PSA lexeme tank with the meaning of reservoir or artificial lake (cf. Meyler 2007: 254) occurs much more frequently in SAVE-SL (44.03) than in SAVE-IND (11.72) and BNC news (0.11). Although artificial lake as a lexical alternative to tank is absent from the South Asian data and occurs only once in BNC news, reservoir as shown in (48) seems to be a more viable substitute for tank as exemplified in (49).

(48) Poor inflows into the reservoir has forced the government not to release water to irrigate Krishna delta and farmers, for the first time in the last 150 years, were not able to sow paddy in the region. 〈SAVE-IND-TI_37865〉

(49) They mainly depended on tanks within the Maduru Oya National Park for their supply of water for the cultivations. 〈SAVE-SL-DN_2004-11-03〉

The occurrence of tank and reservoir varies considerably across the varieties. Their relative and absolute frequencies in the newspaper data are provided in Figure 12. The differences in the distribution of tank and reservoir across the three datasets are statistically highly significant and there is a moderately strong correlation between the corpora and the lexical alternatives under scrutiny (χ² ≈ 72.83, df = 2, p < 0.001, Cramer’s V ≈ 0.43).18 The data show that tank (59.21%) is more frequent than reservoir (40.79%) in SAVE-SL while SAVE-IND features fewer instances of tank (31.86%) than of reservoir (68.14%). In the British data, tank occurs only once (1.64%) as opposed to reservoir, which occurs 60 times (98.36%). Generally, the concept denominated by tank and reservoir is referred to more frequently in the Sri Lankan newspaper data than in its Indian or British counterparts.

[Figure 12: bar chart; x-axis: corpus; y-axis: relative frequency (%). SAVE-SL: tank 59.21% (n = 135), reservoir 40.79% (n = 93). SAVE-IND: tank 31.86% (n = 36), reservoir 68.14% (n = 77). BNC news: tank 1.64% (n = 1), reservoir 98.36% (n = 60).]

Figure 12. Relative and absolute frequencies of tank and reservoir in the newspaper data
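The test statistics reported for tank and reservoir can be retraced from the absolute frequencies in Figure 12. The sketch below is a plain-Python illustration of Pearson's chi-square test statistic and Cramér's V for a contingency table; the function names are hypothetical, and the original analysis may well have been run with a statistics package instead. It does not compute the p-value.

```python
import math


def chi_square(table):
    """Return Pearson's chi-square statistic and the degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of corpus and lexical choice.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def cramers_v(chi2, n, n_rows, n_cols):
    """Return Cramér's V, an effect size between 0 and 1."""
    return math.sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))


# Rows: SAVE-SL, SAVE-IND, BNC news; columns: tank, reservoir (Figure 12).
table = [[135, 93], [36, 77], [1, 60]]

chi2, df = chi_square(table)
n = sum(sum(row) for row in table)
v = cramers_v(chi2, n, len(table), len(table[0]))
print(f"chi2 = {chi2:.2f}, df = {df}, V = {v:.2f}")
```

With the counts above, the sketch arrives at the values reported in the text (χ² ≈ 72.83, df = 2, Cramér's V ≈ 0.43).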

Consequently, the PSA lexeme tank appears to be the preferred option to refer to an artificial lake for water storage in SLE, but IndE conforms to BrE in that reservoir is its default option. Still, the overall higher frequency of occurrence of tank in SAVE-SL does not seem to be triggered by any striking structural differences of tank across the varieties scrutinised. Tank is pre-modified with adjectives and numerals, it is used in compound nouns as the head or non-head element and it forms part of proper nouns in both SLE and IndE. Thus, it appears as if the concept […]

18. All pairwise comparisons of tank and reservoir show highly significant differences (p […]

[…] 0.05, Cramer’s V ≈ 0.22). The usage of madam in the SLE ICE texts is exemplified in (50) and (51) and that of sir in (52), where a university lecturer is approached by an applicant in relation to a potential job opportunity.

19. As archaism markers must be assumed to be rare in BrE (cf. Meyler 2007: xiv), the frequency of occurrence of archaism markers in the BrE data is not considered a criterion for the inclusion of given archaism markers in the analysis, which is why the BrE data are also excluded from statistical testing in 4.3.1.


(50) Honoured Madam, Let me introduce you myself first. I was a student of Prof. 〈@〉Nirmal Jayasinghe〈/@〉 at the University of Sabaragamuwa. 〈ICE-SL:W1B-021#54–56:3〉

(51) My dear 〈@〉Kanthi〈/@〉 madam, How are you after a long time〈space〉! Hope you are in merry spirits〈space〉!! 〈ICE-SL:W1B-007#148–150:11〉

(52) Sir, Having come to understand about your taking retired teachers of English Language to teach English to the students of Kelaniya University. 〈ICE-SL:W1B-023#58–59:4〉

(50) represents a typical case of the usage of madam in that the form occurs in the course of the opening lines of a letter as a general term of address to female addressees. As an aside, also note the interesting use of the verb INTRODUCE in the double-object construction introduce you myself in this example. Although it occurs in only three of the 20 examples of madam in the ICE-SL data, the combination proper noun + madam as in (51), where the anonymised proper noun Kanthi precedes madam, is certainly characteristic of SLE, but this structural combination is also shared with IndE, as shown in (53).

(53) Kindly tell my best regard to 〈indig〉 Sou 〈/indig〉 Shastri Madam and oblige. 〈ICE-IND:W1B-003#33:1〉

The more general pattern of proper noun + address term is also productive in SLE (and IndE) in the sense that terms of address other than madam such as auntie/aunty or sir can follow proper names. This pattern may be more productive in spoken than in written language, but even fictional names are sometimes constructed on the basis of this pattern as is the case with Radha Aunty – one of the characters in Shyam Selvadurai’s (1994) novel Funny Boy exemplified in (54). The story is told from the perspective of the boy Arjie Chelvaratnam and although Radha also is Arjie’s aunt in this story, in SLE auntie/aunty can just as well be used by younger people when addressing or referring to older women as “a term of respect/affection” (Meyler 2007: 15) independent of family relations.

(54) I was in my grandparents’ drawing room, dusting all their teak furniture when I heard Ammachi telling the aunts and uncles about it. The Nagendras were the family interested in Radha Aunty. (Selvadurai 1994: 41)

In the newspaper databases, seven out of 43 archaism markers occur at least five times in SAVE-SL and SAVE-IND. Table 19 illustrates the variety-specific distribution of these archaism markers across the newspaper corpora.


Table 19. Absolute and normalised (pmw) frequencies of archaism markers in the newspaper data

Archaism marker | SAVE-SL abs. freq. | SAVE-SL norm. freq. | SAVE-IND abs. freq. | SAVE-IND norm. freq. | BNC news abs. freq. | BNC news norm. freq.
cess | 13 | 4.24 | 29 | 9.44 | 0 | 0
fellow | 13 | 4.24 | 14 | 4.56 | 59 | 6.65
hail from | 30 | 9.79 | 51 | 16.60 | 15 | 1.69
lass | 5 | 1.63 | 8 | 2.60 | 14 | 1.58
madam | 11 | 3.59 | 10 | 3.26 | 22 | 2.48
parley | 15 | 4.89 | 23 | 7.49 | 0 | 0
sans | 22 | 7.18 | 19 | 6.19 | 2 | 0.23
MEAN | 15.57 | 5.08 | 22.00 | 7.16 | 16.00 | 1.80
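The MEAN row of Table 19 is the arithmetic mean over the seven archaism markers in each corpus. As a quick check, the following Python sketch recomputes the mean normalised frequencies from the individual values; the variable names are illustrative only.

```python
# Normalised (pmw) frequencies from Table 19, in the row order
# cess, fellow, hail from, lass, madam, parley, sans.
norm_freqs = {
    "SAVE-SL": [4.24, 4.24, 9.79, 1.63, 3.59, 4.89, 7.18],
    "SAVE-IND": [9.44, 4.56, 16.60, 2.60, 3.26, 7.49, 6.19],
    "BNC news": [0, 6.65, 1.69, 1.58, 2.48, 0, 0.23],
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


means = {corpus: round(mean(values), 2) for corpus, values in norm_freqs.items()}
print(means)
```

The rounded results correspond to the mean normalised frequencies given in the table (5.08, 7.16 and 1.80).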

If the frequencies of archaism markers in BNC news are excluded from statistical testing, no statistically significant distributional differences can be established for the absolute frequencies of archaism markers in SAVE-SL and SAVE-IND and the respective correlation is weak (χ² ≈ 6.72, df = 6, p > 0.05, Cramer’s V ≈ 0.16). The only archaism marker used more frequently in SAVE-SL (7.18) than in SAVE-IND (6.19) is sans, a borrowing from French meaning ‘without’ (cf. Meyler 2007: 229) as shown in (55).

(55) In view of the people’s mandate to the UNF [= United National Front] to create a free media environment sans all restrictions and inhibitions, the Government will be carrying out a massive media reform this year, Mass Communications Minister Imitiaz Bakeer Makar told Parliament yesterday. 〈SAVE-SL-DN_2002-04-10〉

Given that semantic post-processing is not possible via GAST, it is necessary to pre-select archaism markers for which all or at least most occurrences in the online data represent the meaning studied. 17 out of the 43 archaism markers can reasonably be expected to produce hits displaying the respective meanings documented in Meyler (2007). These archaism markers have been chosen for the domain-specific online searches.20 The domain-specific absolute and normalised frequencies of the archaism markers under scrutiny in GAST are displayed in Table 20.

20. The selection of the archaism markers for the GAST searches is analogous to the selection procedure of PSA lexemes for the GAST searches.


Table 20. Absolute and normalised (pmw) frequencies of archaism markers in GAST

Archaism marker | GAST-SL abs. freq. (norm. freq.) | GAST-IND abs. freq. (norm. freq.) | GAST-GB abs. freq. (norm. freq.)
brassiere | 568 (0.34) | 980,000 (10.69) | 237,000 (0.40)
cess | 26,400 (16.03) | 967,000 (10.55) | 1,850,000 (3.10)
coolie | 958 (0.58) | 134,000 (1.46) | 750,000 (1.26)
damsel | 2,970 (1.80) | 107,000 (1.17) | 3,390,000 (5.68)
hail from | 3,070 (1.86) | 72,100 (0.79) | 450,000 (0.75)
houseboy | 1,220 (0.74) | 41,800 (0.46) | 93,700 (0.16)
try your level best | 42 (0.03) | 1,810 (0.02) | 8,330 (0.01)
mater | 32,200 (19.55) | 1,140,000 (12.43) | 4,800,000 (8.04)
murder the king | 5 (0.00) | 139 (0.00) | 5,280 (0.01)
do the needful | 1,880 (1.14) | 321,000 (3.50) | 99,100 (0.17)
opening dose | 6 (0.00) | 7 (0.00) | 182 (0.00)
pow-wow | 13,300 (8.08) | 369,000 (4.02) | 421,000 (0.71)
sans | 140,000 (85.01) | 14,200,000 (154.86) | 21,100,000 (35.33)
sweetmeats | 1,930 (1.17) | 14,700 (0.16) | 106,000 (0.18)
toper | 65 (0.04) | 44,200 (0.48) | 87,100 (0.15)
undersigned | 11,900 (7.23) | 465,000 (5.07) | 563,000 (0.94)
yeoman service | 15,300 (9.29) | 6,890 (0.08) | 5,590 (0.01)
MEAN | 14,812.59 (8.99) | 1,109,685.06 (12.10) | 1,998,016.59 (3.35)

In relation to the mean normalised frequencies of archaism markers across the datasets as depicted in Figure 13, ICE-IND features the highest number of archaism markers (51.18), ICE-SL shows the second highest frequency (36.60) and ICE-GB displays the lowest number of archaism markers (3.48) among the components of ICE examined. In the newspaper datasets, most archaism markers can be found in SAVE-IND (7.16) and SAVE-SL (5.08) features the second largest number of archaism markers. BNC news features 1.80 archaism markers in mean normalised frequencies.

[Figure 13: three bar-chart panels (ICE, newspaper and GAST data); x-axis: corpus/data; y-axis: normalised frequency (pmw). ICE-SL 36.60, ICE-IND 51.18, ICE-GB 3.48; SAVE-SL 5.08, SAVE-IND 7.16, BNC news 1.80; GAST-SL 8.99, GAST-IND 12.10, GAST-GB 3.35.]

Figure 13. Mean normalised (pmw) frequencies of archaism markers in the ICE, newspaper and GAST data

The data, thus, clearly and unsurprisingly show that there is a higher density of archaism markers in the South Asian English as opposed to the BrE data, which verifies the initial hypothesis that archaism markers figure more prominently in South Asian Englishes than in BrE. With regard to differences among South Asian Englishes, the data indicate that IndE may use archaism markers to a larger extent than SLE. In comparison, the newspaper results also show that archaism markers can be found more frequently in the ICE components than in the newspaper corpora since, for each variety, the mean normalised frequencies are uniformly higher in the ICE components than in the respective newspaper databases. Still, this observation needs to be interpreted cautiously because only two archaism markers formed part of the ICE-based analysis. Although the mean normalised frequencies in ICE are certainly higher in the Sri Lankan and Indian data, the newspaper data feature a larger number of archaism markers, which, however, are on average not as frequently used as the two archaism markers in ICE. This constellation thus very much resembles the frequency-based results for PSA lexemes discussed in 4.2.1. Consequently, one might come to the conclusion that the archaism markers in ICE, madam and sir, are more rooted in the contexts in which they occur, i.e. correspondence for the most part, while the range of archaism markers used in newspaper writing may not be as tightly bound to particular textual settings. The GAST data confirm the findings based on the offline corpora since archaism markers are most frequent in GAST-IND (12.10). GAST-SL (8.99) displays a lower frequency and archaism markers are least frequent in GAST-GB (3.35).

All in all, it seems to be the case that, among the groups of lexical items investigated, archaism markers are – in terms of frequency – roughly as important as PSA lexemes in the characterisation or differentiation of SLE, IndE and BrE as these two sets of lexical items occur with largely comparable frequencies in the corpus environment (with the exception of the newspaper data, where PSA lexemes are more frequent). The formality markers investigated, however, are systematically more frequent than the PSA lexemes and archaism markers under scrutiny. That said, archaism markers occur more often in the South Asian than in the BrE data, which is why archaism markers can be considered to constitute lexical differences between the postcolonial varieties and BrE. In turn, IndE displays more archaism markers than SLE in the offline and online data although the differences in the offline data have been shown to be statistically not significant.

4.3.2 Archaism markers: Case studies

For an investigation of the prevalence of archaism markers in various communicative settings, the genres covered in the ICE components offer themselves. A closer look at the genre-specific distribution of madam and sir reveals that these lexemes seem to be associated with correspondence (W1B) as in (56) and creative writing (W2F) as in (57).

(56) Dear Madam, We the board members of the U.N.Y.A gladly invite you to give a lecture to the advanced level students of our college on the importance of English. 〈ICE-SL:W1B-024#44:4〉

(57) He stared at the headmaster hopefully. ‘Well, sir, what about the boy? […]’ 〈ICE-SL:W2F-018#67:1〉


In ICE-SL, madam is used in correspondence (W1B; 477.35 pmw) only while sir occurs in letter writing (W1B; 48.02 pmw) and creative writing (W2F; 159.73 pmw). In ICE-IND, madam is used in letter writing (207.56 pmw) and creative writing (116.13 pmw) and sir is restricted to letter writing (399.16 pmw). In contrast to this, ICE-GB features madam (32.16 pmw) and sir (16.08 pmw) in letter writing only. The occurrence of madam and sir in the letter writing section (W1B) of ICE-SL suggests that these lexical items are viable options for address terms in SLE. In order to establish the default and more peripheral options when addressing women or male teachers in SLE letters, a more in-depth look needs to be taken at the distribution of madam and sir in comparison to their respective lexical alternatives. For madam, miss and the abbreviations ms and mrs are considered while for sir, other lexical options are mister and mr.21 Figure 14 provides an overview of the distribution of madam, miss and ms/mrs in the correspondence sections (W1B) of the ICE components.

[Figure 14: bar chart; x-axis: corpus; y-axis: relative frequency (%). ICE-SL: madam 24.39% (n = 20), miss 7.32% (n = 6), mrs/ms 68.29% (n = 56). ICE-IND: madam 19.35% (n = 18), miss 5.38% (n = 5), mrs/ms 75.27% (n = 70). ICE-GB: madam 10% (n = 2), miss 30% (n = 6), mrs/ms 60% (n = 12).]

Figure 14. Relative and absolute frequencies of madam and its alternatives in the correspondence (W1B) sections in ICE

The distribution of madam and the additional lexical options for address terms for women yields significant differences (p < 0.05).22 The comparatively frequent occurrence of madam in ICE-SL and the relatively low frequency of this vocabulary item in ICE-GB are the values which deviate most strongly from the expected data distribution. The data suggest that the correspondence sections (W1B) of the South Asian data feature a larger number of address terms for women than ICE-GB. In normalised (pmw) frequencies, the letters featured in ICE-SL and ICE-IND hold 1312.44 and 1484.86 female address terms respectively whereas the letters in ICE-GB contain 321.63. This might be indicative of BrE leaving out female address terms in letters (and possibly opting for first or second names or combinations of these) in contexts in which SLE and IndE users choose to use these address forms. The most frequent address form for women in letters is the abbreviated one, i.e. either mrs or ms, in ICE-SL (68.29%), in ICE-IND (75.27%) and in ICE-GB (60.00%). When it comes to the second most prominent address term, however, variety-related differences can be attested. The South Asian data behave homogenously in that the second most frequently used address term for women in letters is madam in ICE-SL (24.39%) and ICE-IND (19.35%). In contrast, miss is the second most frequent option in ICE-GB (30.00%).

Two of the lexical alternatives to sir as an address form for male teachers in letters are mister and its abbreviation mr. The frequencies of these vocabulary items in the correspondence sections (W1B) of the ICE components studied are shown in Figure 15. Across ICE-SL, ICE-IND and ICE-GB, there are significant distributional differences in relation to the lexical alternatives sir, mister and mr (p < 0.01).23 The comparatively low normalised (pmw) frequency of address terms for male teachers in letters in ICE-GB (64.33) is in line with the low frequency of female address terms in this text category and an analogous explanation, i.e. the possible preference to address addressees with proper names, appears plausible. In ICE-SL (256.09 pmw) and ICE-IND (431.09 pmw), the respective address terms can be attested more regularly, which may be seen as an indication that other options such as first- and/or second-name sequences are more prominent in BrE.

21. The lexical item miss may be biased towards referring to younger women more frequently than to older women, which is why the lexemes under scrutiny are not interchangeable in every given context.

22. Based on the Bonferroni correction for multiple pairwise comparisons (cf. e.g. Field et al. 2012: 428–429), which has been applied to all instances of multiple pairwise comparisons of the present study, the differences between ICE-IND and ICE-GB (p < 0.01) are significant, but the comparisons between ICE-SL and ICE-IND (p > 0.05) as well as ICE-SL and ICE-GB (p > 0.01) do not produce significant differences.

23. The differences in the occurrence of sir and its alternatives are significant for the comparison between ICE-IND and ICE-GB (p < 0.01). There are no significant differences between ICE-SL and ICE-GB (p > 0.05) and between ICE-SL and ICE-IND (p > 0.01).
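The Bonferroni correction mentioned in the footnote on pairwise comparisons divides the overall significance level by the number of comparisons, so that with three pairwise comparisons and α = 0.05 each individual comparison has to reach p < 0.05 / 3 ≈ 0.0167. The following Python sketch illustrates this logic. The function names are illustrative and the p-values are hypothetical placeholders, since the study reports only thresholds (e.g. p < 0.01, p > 0.05) and not exact values.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold under the Bonferroni correction."""
    return alpha / n_comparisons


def significant_pairs(p_values, alpha=0.05):
    """Map each comparison to True if its p-value passes the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return {pair: p < threshold for pair, p in p_values.items()}


# Hypothetical p-values for illustration only (not taken from the study).
p_values = {
    ("ICE-SL", "ICE-IND"): 0.20,
    ("ICE-SL", "ICE-GB"): 0.03,
    ("ICE-IND", "ICE-GB"): 0.004,
}

print(round(bonferroni_threshold(0.05, len(p_values)), 4))
print(significant_pairs(p_values))
```

Under these assumed values, a comparison with p = 0.03 would count as significant at the uncorrected 0.05 level but not after the correction, which is the kind of constellation in which a result with p > 0.01 is reported as non-significant.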

[Figure 15: bar chart; x-axis: corpus; y-axis: relative frequency (%). ICE-SL: sir 62.5% (n = 10), mr 37.5% (n = 6), mister 0% (n = 0). ICE-IND: sir 92.59% (n = 25), mr 7.41% (n = 2), mister 0% (n = 0). ICE-GB: sir 25% (n = 1), mr 75% (n = 3), mister 0% (n = 0).]

Figure 15. Relative and absolute frequencies of sir and its alternatives in the correspondence (W1B) sections in ICE

In the Sri Lankan (62.50%) and in the Indian (92.59%) ICE components, sir seems to be the default address term for male teachers in letters. In ICE-GB, however, sir is less frequent (25.00%) and mr is used more frequently (75.00%). Mister is notably absent from all datasets. Although this interpretation needs to be supported by more empirical data, the distribution of sir and its alternatives suggests that the default way of addressing male teachers in letters differs across varieties in that sir is the conventional address term in SLE and in IndE, but the abbreviated mr is the BrE standard.

In addition to the occurrences in letter writing (W1B), archaism markers are also attested in the creative writing (W2F) sections of ICE-SL and ICE-IND. Sir occurs 159.73 times (pmw) in creative writing texts in ICE-SL and madam is featured 116.13 times (pmw) in the creative writing section of ICE-IND. Archaism markers are absent from the creative writing texts in ICE-GB. The fact that archaism markers are used in creative writing in both South Asian English databases, but missing from the respective section of ICE-GB calls attention to their importance for fictional plots in SLE and IndE. It could well be the case that the use of archaism markers helps authors of novels, short stories, etc. to narrate local cultural experience and to accommodate prospective readers in settings pronouncedly different from British ones. Example (58) illustrates this.

(58) He invited the headmaster to come into the shade, saying that he must have had a tiring journey. And why hadn’t he told him he was coming? He would have met him at the house. He hesitated in his embarrassment. What would he like? A little toddy perhaps? He had some fresh toddy. Or would he like some water to cool his face? No? He began drawing water from the well, washing himself, taking in the new situation. The headmaster, good heavens, the headmaster, come all the way to see him, he who had never been to school! ‘Well, sir, you are out of the sun at least. […]’ (Sivanandan 1997: 7)

In this short excerpt from a SLE novel, the local headmaster of a school visits the grandfather of a highly gifted pupil. While the syntactic structure of this text passage is not noticeably different from BrE syntax, it is the choice of words that helps put the story in a Sri Lankan setting since, apart from the use of the archaism marker sir, the use of the PSA item toddy also adds to situating the plot in a Sri Lankan context. This usage of archaism markers also holds true for examples drawn from ICE-IND, cf. (59).

(59) When your daughter comes home, we’ll all be there to do an arti for her … for isn’t she the Lakshmi of your house, madam. 〈ICE-IND:W2F-002#6:1〉

Here, the archaism marker madam is used alongside Hindu terminology, namely arti referring to a ritual of worship and Lakshmi, a Hindu goddess. Consequently, archaism markers, and, more generally, varietally-marked lexical items, may also be a tool for localising creative writing in non-British settings, which may account for the presence of archaism markers in SLE and IndE creative writing and for their absence in BrE creative writing.

According to Meyler (2007: 88), the archaism marker fellow may stand for “man, boy, chap, bloke, guy […] [and] can also refer to animals, e.g. a pet”. This prototypical usage, maybe with the exception of the reference to animals, can be shown for all varieties under scrutiny and is demonstrated in (60).

(60) One fellow even bought Albie a pint because he said his wife might leave him alone now she knew there was somebody more miserable than him. 〈BNC K3H〉

In this example, fellow could be replaced with e.g. man or guy without bringing about a substantial change of meaning. As mentioned above, the meaning of fellow is relatively homogenous across SLE, IndE and BrE, but one collocational particularity could be observed in the Sri Lankan data as shown in (61) and in (62).

(61) “George don’t be nasty, our son’s been challenged to a duel by that Saddam fellow.” 〈SAVE-SL-DN_2002-10-07〉

(62) “[…] Bring your own board with you though, Dads using mine and won’t give it back till I throw that Saddam fellow out.” 〈SAVE-SL-DN_2002-05-07〉

The above examples illustrate that, in the Sri Lankan data, fellow can be directly preceded by that + name of a person. Although fellow may follow other proper nouns in the Indian and British newspaper data as well, the syntagmatic combination of that + name of a person + fellow is attested twice exclusively in the Sri Lankan newspaper data. In order to look into this on a wider empirical basis, the combinations that Saddam fellow, that Obama fellow and that Clinton fellow are examined via GAST.24 The absolute and normalised frequencies of these patterns obtained via GAST are documented in Table 21. The pattern that + name of a person + fellow is attested in each of the varieties concerned via GAST. The absolute and normalised frequencies are, however, extremely low for each of the domains examined. Consequently, the pattern that + name of a person + fellow is probably not restricted to any of the varieties of English studied, but available in each of them. Nevertheless, the low absolute frequencies are indicative of a low productivity of this pattern.

Table 21. Absolute and normalised (pmw) frequencies of three usage patterns of that + name of a person + fellow in GAST

Usage pattern | GAST-SL abs. freq. | GAST-SL norm. freq. | GAST-IND abs. freq. | GAST-IND norm. freq. | GAST-GB abs. freq. | GAST-GB norm. freq.
that Saddam fellow | 2 | 0.00 | 0 | 0 | 2 | 0.00
that Obama fellow | 0 | 0 | 1 | 0.00 | 709 | 0.00
that Clinton fellow | 0 | 0 | 1 | 0.00 | 4 | 0.00

With regard to the archaism marker HAIL from, collocational differences can be attested for the varieties concerned. Example (63) shows the usage pattern of HAIL from which is shared by all varieties studied, namely the combination of HAIL from + place. (63) Stewart, whose father hailed from Glasgow, now travels the country on behalf of the National Cricket Association. 〈BNC K5A〉

24. The persons’ names have been chosen for the following reasons. The combination that Saddam fellow is attested twice in the SAVE data and, although Saddam Hussein may be of slightly more importance to Britain for political reasons, the person is not likely to be mentioned more frequently in British than in Sri Lankan or Indian writing. The same holds true for Barack Obama and Bill Clinton. It goes without saying that this exemplary study is a far cry from a comprehensive analysis of that + name of a person + fellow – in particular because the data cannot be cleaned of instances in which e.g. that Clinton fellow does not refer to Clinton himself but to one of his associates.


As shown in Figure 16, the combination HAIL from + place constitutes 87.5% of all occurrences of HAIL from in BNC news. In contrast to this, the syntagma HAIL from + place represents 60.71% of all instances of HAIL from in the Sri Lankan and 78.43% in the Indian newspaper data, which, however, still constitutes the majority of cases.

[Figure 16: bar chart. y-axis: Rel. Freq. (%); x-axis: Corpus.
SAVE-SL: HAIL from + place 60.71% (n = 17), HAIL from + family 7.14% (n = 2), HAIL from + other 32.14% (n = 9)
SAVE-IND: HAIL from + place 78.43% (n = 40), HAIL from + family 5.88% (n = 3), HAIL from + other 15.69% (n = 8)
BNC news: HAIL from + place 87.5% (n = 14), HAIL from + family 0% (n = 0), HAIL from + other 12.5% (n = 2)]

Figure 16. Relative and absolute frequencies of HAIL from + place/family/other in the newspaper data
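The relative frequencies in Figure 16 follow directly from the absolute counts shown there. The following is a minimal sketch of that arithmetic in Python; the variable and function names are illustrative and not part of the study.

```python
# Absolute frequencies of HAIL from + place/family/other as given in Figure 16.
counts = {
    "SAVE-SL": {"place": 17, "family": 2, "other": 9},
    "SAVE-IND": {"place": 40, "family": 3, "other": 8},
    "BNC news": {"place": 14, "family": 0, "other": 2},
}


def relative_frequencies(row):
    """Return each pattern's share of the corpus total in per cent (2 decimals)."""
    total = sum(row.values())
    return {pattern: round(100 * n / total, 2) for pattern, n in row.items()}


shares = {corpus: relative_frequencies(row) for corpus, row in counts.items()}
for corpus, row in shares.items():
    print(corpus, row)
```

Because the shares are computed per corpus, they are comparable across datasets even though the absolute totals (28, 51 and 16 occurrences) differ considerably.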

Yet, this lower relative frequency finds explanation in an additional usage pattern of HAIL from as exemplified in (64) and (65). This combination is absent from the BNC news data, but occurs repeatedly in the South Asian newspaper texts. (64) The headmistress of Bonhooghly Girls’ High School had served a notice to the teachers of the junior school barring them from using any kind of cosmetics to class on the plea that it was undesirable since most of the school’s students hail from economically backward families. 〈SAVE-IND-SM_2005-01-06〉 (65) He hails from a fishing family and, even though he gave up going to sea long ago, owned five fishing boats and a number of large nets. 〈SAVE-SL-DM_2005-01-08〉

The combination of HAIL from + family represents 7.14% in the SLE dataset and 5.88% in the IndE data and may thus be seen as a South Asia-specific complement in the collocational profile of HAIL from. Meyler (2007: 108) also illustrates this usage for SLE with “[h]e hails from a good family”. However, the fact that there are no significant differences in the occurrence of the combinations of HAIL from with place/family/other across the newspaper datasets (p > 0.05) suggests a certain degree of cross-varietal uniformity with regard to their usage.25 To follow up on this observation, two patterns of HAIL from + family, i.e. hails from a wealthy family and hails from a poor family and two patterns of HAIL from + place, i.e. hails from a small village and hails from a big city, are scrutinised via GAST. The results of the GAST searches are given in Table 22.

Table 22. Absolute and normalised (pmw) frequencies of patterns of hails from in GAST

                              GAST-SL                  GAST-IND                 GAST-GB
                              abs. freq.  norm. freq.  abs. freq.  norm. freq.  abs. freq.  norm. freq.
hails from a wealthy family   1           0.00         167         0.00         97          0.00
hails from a poor family      4           0.00         1,990       0.02         133         0.00
hails from a small village    4           0.00         2,390       0.03         361         0.00
hails from a big city         0           0            0           0            0           0

Except for hails from a big city, each of the patterns can be found in the country domains. Still, even though these patterns are attestable, each of them is extremely rare in the online data. There are three central characteristics of the distribution of the archaism marker HAIL from across the varieties investigated. First, in BrE, HAIL from is generally used less frequently than in the South Asian Englishes. Second, the combinability of HAIL from + place/family appears to be a feature of SLE as well as IndE, which, third, does not differentiate these varieties categorically, as could be speculated on the basis of the newspaper texts, but quantitatively from BrE.

In sum, the case studies of archaism markers point to parallels between SLE and IndE with regard to the preferences in the respective address form systems and the exploitation of archaism markers for expressing indigenous realities (in e.g. creative writing plots). The syntagmatic combinations of that + name + fellow and HAIL from + family are present in all three varieties covered. Against this background, however, at least the usage pattern HAIL from + family certainly adds to the lexicogrammatical profile of SLE in that it represents a (more likely) structural option in SLE in comparison to BrE, but not a categorical difference between the varieties scrutinised. Neither the ICE data nor the newspaper corpus environment yields any significant quantitative differences between SLE and IndE, which suggests that both varieties behave comparatively similarly with regard to the usage of archaism markers. Closer inspections of variety-specific address form systems, creative writing texts and South Asian English usage patterns featuring archaism markers further corroborate this alleged uniformity between IndE and SLE regarding the (frequency of) use of archaism markers.

With a view to lexical items which are no longer common in BrE, but may still be more widespread in younger varieties of English, Mukherjee puts forward that

from the point of view of the evolution of a New English variety, such cases of superstrate retention show that today these forms – once part of the input variety – are endonormatively stabilized and no longer based on contemporary native usage. From this perspective, superstrate retention is a reflection of a progressive force at the structural level in a similar vein to genuine innovations. (Mukherjee 2007: 174)

25. None of the pairwise comparisons of the different combinations of hail from exhibits significant differences either (p > 0.05).

It is certainly true that these lexemes can represent an emancipation from BrE since these vocabulary items have either largely fallen out of contemporary BrE usage or at least bring with them an archaic flavour in BrE while they are much more frequently in use in SLE (and IndE) and possibly less, if at all, stylistically marked. On the basis of the archaism markers studied here, however, it seems that related usage patterns are shared across SLE, IndE and BrE since closer examinations of the address form system and the patterns associated with fellow and HAIL from reveal that each variety has the same range of terms or constructions at its disposal. It appears that, at least in the context of archaism markers, it is the frequencies of occurrence of these individual forms and related constructions that shape the lexis (and lexicogrammar) of SLE (and IndE) rather than categorical differences reflected in the absence or presence of certain lexical items in comparison to native BrE usage.

4.4 Sri Lankan English lexis: An overview

Formality markers, PSA lexemes and archaism markers all contribute to the vocabulary of SLE. In principle, this also holds true for IndE as well as BrE, though to a more limited extent for the latter, which is why, with the approach taken in the present study, the variety-specific vocabularies can ultimately only be defined and differentiated in terms of general frequency and relative proportion of the above subsets of vocabulary items and in terms of the formal structures the respective lexical items occur in. Formality markers certainly are of central importance to SLE lexis due to their overall frequency and relative pervasiveness, which clearly helps to set off SLE from IndE and BrE lexis. Although a number of PSA lexemes are shared by SLE and IndE and, thus, also shape SLE lexis in a relatively marked fashion, the overall frequency of PSA lexemes in SLE is notably lower than that of formality markers. In addition to that, it is the Indian variety of English which generally displays the highest frequency of use of PSA vocabulary items. Due to their by and large lower frequency of occurrence, archaism markers are not as salient to SLE lexis as formality markers either, but with this subset of lexemes, several quantitative and structural parallels exist between the SLE and IndE vocabulary, which is why archaism markers undoubtedly differentiate the lexis of the South Asian Englishes from BrE lexis.26

Consequently, it may be the case that the distinctiveness of the SLE vocabulary is not as overtly perceptible as the unique character of its IndE counterpart. IndE frequently employs PSA lexemes, i.e. lexemes used predominantly in the South Asian region and often borrowed from the respective indigenous languages (cf. Meyler 2007). Among the different stocks of vocabulary analysed, it is probably the PSA lexemes which, at very first glance, openly characterise the varieties of English using them as South Asian. As a consequence of that, the (more frequent) use of PSA lexemes clearly provides IndE lexis with an unconcealed non-British and clearly South Asian character. This overt and easily noticeable IndE lexical emancipation from BrE could have been catalysed by a greater confidence as regards the status of IndE as a variety of English in its own right on the side of the speakers of that variety. Although SLE also employs a range of PSA lexemes in a number of contexts, they do not figure as prominently as in IndE, which implies that SLE lexis is not as overtly marked as a South Asian English set of vocabulary as its IndE equivalent. However, SLE uses more circumspect techniques to structurally accommodate its vocabulary and to provide it with its unique SLE character. The sets of formality and archaism markers, i.e. stocks of vocabulary which are not as easily perceptible as non-British as PSA lexemes because the respective lexical items are (still) part of BrE vocabulary, are salient in the lexical nativisation process of SLE. With these two sets of lexemes – and in particular with formality markers – SLE shapes its individual lexical character by specific collocations (e.g. internally displaced persons) and characteristic usage patterns (e.g. definite numerals + persons) on the basis of lexemes which, at face value and in isolation, might be considered to be BrE, but the structures and frequencies of which clearly characterise them as Sri Lankan or South Asian.

26. As indicated with the analyses concerned, some of the results just described have been obtained under operation of a threshold value of five occurrences per lexeme. However, it was possible to validate, on the whole, that the results based on this procedure are in line with the results that would be obtained without such a threshold value. The only difference would occur in the frequency-based analysis of archaism markers in the ICE data, in which ICE-SL would display a larger number of archaism markers in normalised terms than ICE-IND and ICE-GB without the threshold value of five.

Chapter 5. Sri Lankan English lexicogrammar

For the emergence of new varieties of English and the development of their distinctive structural profile, the interface between lexis and grammar is of pivotal importance because it has been argued that variety-specific constructions first come to light at the lexicogrammatical level of language organisation (cf. Schneider 2007: 46). Hundt (1998: 5), in her discussion of the norm development of New Zealand English, goes even one step further when she gives the lexis-grammar interface (in contradistinction to other structural levels) centre stage in descriptions of New Englishes by stating that indeed “[m]ost of the genuine NZE features are expected to be found at the interface of grammar and the lexicon”. In this regard, recent studies (e.g. Mukherjee 2008; Sedlatschek 2009; Mair & Winkle 2012; Nelson & Hongtao 2012) have convincingly shown that lexicogrammatical routines are fruitful grounds for the description of the norm-development of World Englishes in general and postcolonial varieties of English in South Asia in particular. However, lexicogrammatical characteristics of newly emerging varieties of English do not necessarily manifest themselves in categorical dissimilarities from other existing (and potentially already more established) varieties of English – quite the contrary. It has been noted that lexicogrammatical differences between varieties of English are typically frequency-related and not necessarily categorical (cf. Mukherjee 2007: 175). Against this background, the present study analyses three lexicogrammatical phenomena to delineate characteristics of the SLE lexis-grammar interface in comparison to IndE and BrE. In Chapter 5.1, particle verbs in SLE, their genre-specific distribution and potentially innovative verb-particle combinations will be at the centre of attention. In Chapter 5.2, the analysis will focus on different types of light-verb constructions and their occurrence in different communicative settings and Chapter 5.3 will present an in-depth study of the verb-complementational profiles of the three verbs HATE, LIKE and LOVE.

5.1 Particle verbs

At the interface between lexis and grammar, the verb phrase has repeatedly been focussed on in order to closely scrutinise processes of structural nativisation in and across various (postcolonial) varieties of English (cf. e.g. Hundt 1998; Olavarría de Ersson & Shaw 2003; Bresnan & Hay 2008; Mukherjee & Gries 2009; Bresnan & Ford 2010). Particle verbs (PVs), which have also been referred to as “multiword verbs” (Quirk et al. 1985: 1150) or “verb-particle combinations” (Schneider 2004: 230), qualify as such a lexicogrammatical feature of the verb phrase and may thus be expected to yield useful insights into the structural accommodation of varieties of English in distinct linguistic ecologies as elaborated by Schneider:

The lexicalized combination of verbs with particles, interacting in precisely circumscribed ways with other clause complements, was considered to be precisely such a lexico-grammatical borderline area that could be expected to lead to variable usage, potentially to be conventionalized with different options preferred from one variety to another. (Schneider 2004: 229)

A set of studies (cf. e.g. Bolinger 1971; Ahulu 1995; Leisi & Mair 1999; Schneider 2004; Xiao 2009) indeed shows that cross-varietal differences do surface in careful examinations of PVs across a number of World Englishes. In relation to SLE, analyses of PVs already conducted clearly point towards variety-specific profiles of verb-particle combinations. Mendis (2010) is interested in stylistic conventions in published writing in SLE in comparison to BrE. Based on the published texts in the written Sri Lankan and British ICE components, she (cf. 2010: 16–17) inter alia finds that a selection of PVs characteristic of SLE such as PASS out in the sense of graduate or MAKE out in the sense of pretend can be attested infrequently in published texts in SLE. Mukherjee (2012) shows the presence of COPE up with as an alternative to COPE with in SLE and IndE with the help of newspaper corpora representing acrolectal IndE and SLE. Kumara and Mendis (2010) verify the existence of a selection of distinct PVs in SLE newspaper data and, in comparison to BrE, categorise PVs into three groups, i.e. PVs with (a) a unique meaning (e.g. GET down foreign workers in the sense of attracting foreign workers), (b) an added particle (e.g. TAKE up an exam) or (c) a dropped particle (e.g. REFER your dictionary) in SLE. Zipp and Bernaisch (2012) investigate PVs with up as the particle in a subset of the written files of a number of ICE components representing first- and second-language varieties world-wide including BrE, IndE and SLE and in corresponding online data to delineate supra-regional genre conventions on the basis of the distribution of this set of PVs. The present study complements earlier approaches to PVs in SLE in that additional sets of PVs with other particles are scrutinised in a more complex corpus environment, which will allow the verification of findings across different sets of data and groups of PVs as well as more systematic cross-varietal comparisons of PV usage. 
From a methodological perspective, the present study understands PVs as a combination of a lexical verb and one or two particles without further restrictions as regards syntactic or semantic criteria. Earlier syntactic distinctions of multi-word verbs are based on whether the particle in a verb-particle combination is a preposition or a (spatial) adverb. In this vein, Quirk et al. (cf. 1985: 1150–1152) create separate groups for phrasal verbs, prepositional verbs and phrasal-prepositional verbs. However, due to the fuzziness of the word class assignment of the particle associated with the verb, this distinction is difficult to uphold, which is why more recent publications (e.g. Huddleston & Pullum 2002) have discarded this distinction altogether. Semantically, the degree to which the individual meanings of the component parts of a given PV add up to the meaning of the entire multi-word verb has been used to describe the transparency of PVs. Semantic transparency of PVs ranges from transparent combinations (e.g. WALK up the street) to opaque combinations (e.g. GIVE up drinking), all of which are taken into consideration in the present study. To operationalise the concept of PVs, the study at hand anchors the corpus searches in the three particles that occur most frequently in verb-particle combinations, i.e. up, out and off (cf. Sinclair 2002: 439), in order to comprehensively examine all occurrences of the respective groups of PVs in the data. Each example of up, out and off is extracted from the corpora and cases not instantiating PVs are then discarded. With ICE, the data are cleaned manually while POS-tags are used as an additional filter prior to manual cleaning with the newspaper datasets.1 As post-processing GAST data is extremely limited, this approach of establishing the frequencies of the three groups of PVs in the varieties under scrutiny is not feasible for the online data. GAST is rather geared towards examining the frequencies of concrete verb-particle combinations and will be employed accordingly.
Against this background, it needs to be pointed out that it may well be the case that when studying PVs in postcolonial varieties of English, new and formerly unrecorded verb-particle combinations are attested in addition to well-established and documented PVs. The (potentially) innovative character of a given PV is verified by checking whether the verb-particle combination concerned is recorded in two reference works for PVs, namely the Collins COBUILD Phrasal Verbs Dictionary (CPVD; Sinclair 2002) and the Longman Phrasal Verbs Dictionary (LPVD; Gadsby 2000). If the PV under scrutiny is not listed in either of the dictionaries, it is treated as so far unrecorded.

From earlier studies on South Asian Englishes and on SLE, two central hypotheses concerning the distribution of PVs can be extrapolated. First, the overall frequency of PVs can be expected to be lower in the SLE (and IndE) data than in the BrE data (cf. Schneider 2004: 235; Zipp & Bernaisch 2012: 176). Second, it is to be expected that a high degree of formality in a given genre leads to a comparatively low frequency of PVs and vice versa since PVs have generally been argued to be associated with informality (cf. Dempsey et al. 2007: 219). In 5.1.1, the overall frequencies of PVs with up (PVUs), out (PVOUs) and off (PVOFs) in the ICE and newspaper data are shown, in 5.1.2, genre-specific perspectives on the PVs under scrutiny are offered and so far unrecorded PVs featuring up, out and off are discussed in 5.1.3.

1. Also note that nominalised forms of PVs as well as adverbial and adjectival uses of PVs are not taken into consideration in the present study.

5.1.1 Particle verbs: Frequency

The ICE data can be used to exemplify the three groups of PVs concerned. (66) to (68) represent PVs with up, out and off as scrutinised in the present study. (66) The girls grew up and never asked why their father did not stay with them or why they had two mothers. 〈ICE-SL:W2F-017#80:1〉 (67) Nevertheless, despsite the lapse of 47 years since the achievement of Independence in 1947, the state bureaucracy has not taken firm measures to carry out any well-formulated plan in this immeasurably important field. 〈ICE-IND:W2D-004#46:1〉 (68) I saw him off at the station. 〈ICE-GB:W2F-014#76:1〉

Zooming in on the overall frequencies of these PVs across the ICE datasets, one becomes aware of an uneven distribution in the respective components of ICE. The normalised and absolute frequencies of PVs in the ICE components are given in the first row of Figure 17. The distribution of the three types of PVs in the ICE components shows highly significant differences and a weak correlation (χ² ≈ 33.60, df = 4, p < 0.001, Cramer’s V ≈ 0.07).2 From the total normalised frequencies of PVs in the ICE components, it becomes clear that PVs figure most prominently in ICE-GB (2972.32) whereas they occur less frequently in ICE-IND (2442.41) and ICE-SL (2420.28). While the distribution of PVUs and PVOFs reflects the general trend that PVs occur most frequently in the British data (followed by the Indian and finally the Sri Lankan data), the PVOUs deviate from this in that the highest frequency of PVOUs can be found in the Sri Lankan dataset (1295.53). Still, this divergence in ranking should not be overestimated given that the differences in frequency between the ICE components are not as pronounced with PVOUs as with PVUs and PVOFs. Consequently, the initial hypothesis stating that PVs generally figure less prominently in South Asian Englishes than in BrE finds empirical evidence in the present ICE-based analysis. With regard to the national varieties in South Asia, it seems to be the case that speakers of SLE generally use fewer PVs than their IndE counterparts.

[Figure 17: two bar-chart panels (first row: ICE components; second row: newspaper corpora). y-axis: Norm. Freq. (pmw); x-axis: Corpus; bars: PVUs, PVOUs, PVOFs, TOTAL.
ICE-SL: PVUs 897.85 (n = 368), PVOUs 1295.53 (n = 531), PVOFs 226.9 (n = 93), TOTAL 2420.28 (N = 992)
ICE-IND: PVUs 983.15 (n = 413), PVOUs 1171.21 (n = 492), PVOFs 288.04 (n = 121), TOTAL 2442.41 (N = 1026)
ICE-GB: PVUs 1276.5 (n = 551), PVOUs 1255.65 (n = 542), PVOFs 440.17 (n = 190), TOTAL 2972.32 (N = 1283)
SAVE-SL: PVUs 918.19 (n = 2815), PVOUs 876.76 (n = 2688), PVOFs 175.48 (n = 538), TOTAL 1970.44 (N = 6041)
SAVE-IND: PVUs 1521.62 (n = 4674), PVOUs 1352.98 (n = 4156), PVOFs 330.76 (n = 1016), TOTAL 3205.35 (N = 9846)
BNC news: PVUs 1625.81 (n = 14427), PVOUs 1571.72 (n = 13947), PVOFs 654.52 (n = 5808), TOTAL 3852.04 (N = 34182)]

Figure 17. Normalised (pmw) and absolute frequencies of PVs in the ICE and newspaper data

2. The pairwise comparison of the PV distribution between ICE-SL and ICE-GB shows highly significant differences (p < 0.001). The differences between ICE-SL and ICE-IND as well as ICE-IND and ICE-GB are not significant (p > 0.01 (Bonferroni correction for multiple pairwise comparisons)).
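The test statistics reported for the ICE components can be retraced from the absolute frequencies in Figure 17. The sketch below assumes a Pearson chi-square test of independence (the text does not name the procedure) and the standard definition of Cramér's V. The assignment of the absolute counts to PVUs, PVOUs and PVOFs is inferred from the normalised values in the figure, and all function names are illustrative.

```python
import math

# Absolute PV frequencies in the ICE components, read off Figure 17
# (columns: PVUs, PVOUs, PVOFs). The column assignment is an inference.
observed = {
    "ICE-SL": [368, 531, 93],
    "ICE-IND": [413, 492, 121],
    "ICE-GB": [551, 542, 190],
}


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


def cramers_v(stat, n, n_rows, n_cols):
    """Effect size for a chi-square test: sqrt(chi2 / (n * (min(r, c) - 1)))."""
    return math.sqrt(stat / (n * (min(n_rows, n_cols) - 1)))


table = list(observed.values())
stat, df = chi_square(table)
n = sum(sum(row) for row in table)
v = cramers_v(stat, n, len(table), len(table[0]))
print(f"chi2 = {stat:.2f}, df = {df}, Cramer's V = {v:.2f}")
```

Under these assumptions the result should land close to the values given in the text (χ² ≈ 33.60, df = 4, Cramer’s V ≈ 0.07). The same two functions can be applied to the newspaper counts in the second row of the figure.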


In order to approach PVs from another empirical angle, the newspaper data can be consulted. The PVs are retrieved via the (CLAWS C7) POS-tagged versions of the newspaper corpora by collecting all the examples in which any of the particles up, out or off occurs in either first, second, third, fourth or fifth position to the right of a verb. Examples (69) to (71) display the usage of the three types of PVs in the Sri Lankan newspaper texts. After discarding all irrelevant instances, the PVs are distributed across the newspaper datasets as shown in the second row of Figure 17. (69) These parties who are now trying to whip up an anti-US hysteria, fail to give credit to the Hon. Prime Minister who resisted the temptation of lucrative contracts for the country in the post-war reconstruction of Iraq, in return for support for the US to conduct its operations against Iraq. 〈SAVE-SL-DN_2003-11-04〉 (70) But suddenly, a severe drought and the sudden lack of grasses and other edible fauna drove the elephants to seek out the teak. They did not merely munch on a few leaves. The elephants tore down the trees to get at the new leafy shoots and also consumed the bark. 〈SAVE-SL-DN_2004-11-03〉 (71) The Sri Lanka Tourist Board will kick off a massive destination promotion campaign targeting China, Japan and France in the coming months. 〈SAVE-SL-DN_2002-09-25〉
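The retrieval step described above, a particle within five positions to the right of a verb, can be sketched as a simple window search. This is a minimal illustration, not the procedure actually used in the study: it assumes the tagged text is available as whitespace-separated `word_TAG` tokens (a hypothetical input format), and it operationalises "verb" as any CLAWS C7 lexical-verb tag beginning with `VV`, which is also an assumption. The function name `candidate_pvs` is illustrative.

```python
PARTICLES = {"up", "out", "off"}
WINDOW = 5  # the particle may occur up to five positions to the right of the verb


def candidate_pvs(tagged_sentence):
    """Yield (verb, particle, distance) for each lexical verb that is followed
    by up/out/off within WINDOW tokens.

    Expects well-formed 'word_TAG' tokens separated by whitespace.
    """
    tokens = [token.rsplit("_", 1) for token in tagged_sentence.split()]
    for i, (word, tag) in enumerate(tokens):
        if not tag.startswith("VV"):  # assumed filter: C7 lexical verbs (VV0, VVD, ...)
            continue
        for distance in range(1, WINDOW + 1):
            j = i + distance
            if j >= len(tokens):
                break
            if tokens[j][0].lower() in PARTICLES:
                yield word, tokens[j][0].lower(), distance


example = "I_PPIS1 saw_VVD him_PPHO1 off_RP at_II the_AT station_NN1"
print(list(candidate_pvs(example)))
```

A search of this kind deliberately over-generates: it returns candidates only, so instances that do not instantiate PVs would still have to be discarded by hand, as the study does.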

The normalised total values for the newspaper corpora illustrate that PVs occur most often in BNC news (3852.04) followed by SAVE-IND (3205.35) and SAVE-SL (1970.44). Despite the statistically highly significant differences with a weak correlation in the distribution of the three types of PVs across the datasets (χ² ≈ 466.93, df = 4, p < 0.001, Cramer’s V ≈ 0.07), there is a uniform pattern in that each type of PV is most frequent in BNC news and least frequent in SAVE-SL with the frequencies of SAVE-IND in between.3 These observations are generally very much in line with the findings described for the ICE dataset and emphasise that PVs are not used as frequently in South Asian Englishes as in BrE (cf. Schneider 2004: 246; Zipp & Bernaisch 2012: 176). When it comes to PVOUs, with which ICE-SL displays the highest normalised frequency in comparison to the other ICE components, the complementary analysis of the newspaper data shows that PVOUs are most frequent in BNC news (1571.72) and least frequent in SAVE-SL (876.76). Thus, the distribution of PVOUs in ICE must be evaluated as an exception to the general trend. A closer look at the individual PVOUs in the ICE data reveals that CARRY out, which is attested with comparable ranges of meanings such as that of performing scientific work as instantiated in (72) to (74) across the ICE datasets, may be one of the main factors in accounting for the comparatively high frequency of PVOUs in ICE-SL. (72) The experiment was carried out using 〈}〉〈-〉thiobarbitiuric〈/-〉 〈+〉thiobarbituric〈/+〉〈/}〉 acid reaction substances assay [...]. 〈ICE-SL:W2A-030#65:1〉 (73) Studies were carried out in an integrated aluminium complex with production capacity of 0.1 million tonne of aluminium and fabricated products per annum. 〈ICE-IND:W2A-035#27:1〉 (74) This is the period when the social scientist will need to carry out literature searches, make detailed analysis plans and become familiar with the database and the computing skills necessary to use it. 〈ICE-GB:W1B-025#29:3〉

3. All pairwise comparisons of PVs across the newspaper datasets are significant as well. SAVE-SL compared to SAVE-IND yields significant differences (p < 0.01) and SAVE-SL compared to BNC news and SAVE-IND compared to BNC news show highly significant differences (p < 0.001).

Given that the meanings and corresponding collocational candidates of CARRY out appear relatively alike in the three varieties studied, it may be that SLE, IndE and BrE prefer different lexical variants to refer to the act of carrying something out. Table 23 depicts the absolute and relative frequencies of CARRY out and two of its semantically equivalent single-word alternatives, i.e. PERFORM and CONDUCT, in the ICE data.

Table 23. Absolute and relative frequencies of CARRY out, PERFORM and CONDUCT in ICE

            ICE-SL                  ICE-IND                 ICE-GB
            abs. freq.  rel. freq.  abs. freq.  rel. freq.  abs. freq.  rel. freq.
CARRY out   96          37.21%      35          24.14%      46          37.10%
CONDUCT     94          36.43%      45          31.03%      29          23.39%
PERFORM     68          26.36%      65          44.83%      49          39.52%
TOTAL       258         100.00%     145         100.00%     124         100.00%
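The counts in Table 23 also lend themselves to pairwise comparisons between the ICE components. The sketch below assumes Pearson chi-square tests on 2 × 3 tables; the study does not state which test underlies its pairwise comparisons, so this is one plausible reading rather than the author's procedure. With three lexical variants each pairwise test has two degrees of freedom, for which the p-value has the closed form exp(−χ²/2). Function and variable names are illustrative.

```python
import math
from itertools import combinations

# Absolute frequencies from Table 23 (columns: CARRY out, CONDUCT, PERFORM).
table_23 = {
    "ICE-SL": [96, 94, 68],
    "ICE-IND": [35, 45, 65],
    "ICE-GB": [46, 29, 49],
}


def chi_square_2xk(row_a, row_b):
    """Pearson chi-square statistic for a 2 x k table of counts."""
    n_a, n_b = sum(row_a), sum(row_b)
    n = n_a + n_b
    stat = 0.0
    for a, b in zip(row_a, row_b):
        column_total = a + b
        for obs, row_total in ((a, n_a), (b, n_b)):
            expected = row_total * column_total / n
            stat += (obs - expected) ** 2 / expected
    return stat


def p_value_df2(stat):
    """Upper-tail probability of a chi-square distribution with 2 degrees of
    freedom, whose survival function reduces to exp(-x / 2)."""
    return math.exp(-stat / 2)


pairwise = {}
for (name_a, row_a), (name_b, row_b) in combinations(table_23.items(), 2):
    stat = chi_square_2xk(row_a, row_b)
    pairwise[(name_a, name_b)] = (stat, p_value_df2(stat))

for pair, (stat, p) in pairwise.items():
    print(pair, f"chi2 = {stat:.2f}", f"p = {p:.4f}")
```

The resulting p-values can then be compared against whatever corrected significance level is adopted for the three comparisons.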

The distribution of CARRY out, CONDUCT and PERFORM across the varieties exhibits statistically highly significant differences with a weak correlation (χ² ≈ 20.17, df = 4, p < 0.001, Cramer’s V ≈ 0.14).4 Only in SLE is CARRY out (37.21%) the most frequent choice, while PERFORM is the most frequent lexical option in IndE (44.83%) and BrE (39.52%), where CARRY out (37.10%), however, is also an important variant. In view of the total absolute frequencies, the process of carrying something out seems to be verbalised more often in the Sri Lankan than in the Indian or British data, which, in conjunction with the preference to express this process via CARRY out in SLE, may be held partially accountable for the high frequency of PVOUs in the Sri Lankan ICE data.

While BrE clearly stands out as featuring the largest number of PVs among the varieties investigated, the newspaper data provide further evidence of the tendency of SLE to employ fewer PVs than IndE. What is also striking is that in the IndE and the BrE data, PVs in general occur with a higher frequency in the newspaper data than in the complementary ICE components while the reverse is true for the Sri Lankan datasets. Could this be related to PVs being associated with the notion of informality and their subsequent avoidance in formal newspaper writing in SLE? Chapter 5.1.2 aims at providing answers to genre-related questions of this kind.

4. The pairwise comparison of the occurrences of CARRY out, CONDUCT and PERFORM between ICE-SL and ICE-IND displays highly significant differences (p < 0.001). The differences between ICE-SL and ICE-GB and ICE-IND and ICE-GB are not significant (p > 0.01 (Bonferroni correction for multiple pairwise comparisons)).

5.1.2 Particle verbs: Genre-specificity

Dempsey et al. (2007: 217) argue that “phrasal verbs significantly distinguish between both the spoken/written and formal/informal dimensions” and there are no grounds to suppose that this should be different for the less narrowly defined group of PVs under scrutiny here. As the present study focuses on written texts only, it is the formality scale that is of primary interest since the ICE components cover a range of genres positioned at different ends of the formality scale in that – as elaborated in Section 4.1.2 – e.g. academic writing (W2A) comprises relatively formal texts while e.g. creative writing (W2F) consists of comparatively less formal texts.
Prior to hypothesizing about the potential influence of formality on the distribution of PVs in given genres, it is revealing to study to what extent PVs are distributed unevenly across the genres in the single ICE components in the first place. Against this background, Gries’ (2008: 414) “deviation of proportions” (DP) measure is ideal in that it can be applied to corpora or components thereof consisting of unequally sized parts – a characteristic which matches the ICE components with their asymmetrical representation of text genres in terms of word counts: [DP] can theoretically range from approximately 0 to 1, where values close to 0 indicate that a [= an object of investigation] is distributed across the n corpus parts as one would expect given the sizes of the n corpus parts. By contrast, values close to 1 indicate that a is distributed across the n corpus parts exactly the opposite way one would expect given the sizes of the n corpus parts. (Gries 2008: 415)
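The DP measure quoted above can be written out in a few lines. Following Gries' definition, DP is half the sum of the absolute differences between each corpus part's share of the corpus and its share of the item's occurrences. The genre sizes and hit counts below are hypothetical values chosen for illustration only; they are not taken from the ICE components.

```python
def deviation_of_proportions(part_sizes, item_counts):
    """Gries' DP: 0.5 * sum(|observed share - expected share|) over corpus parts.

    part_sizes:  size of each corpus part (e.g. words per genre)
    item_counts: occurrences of the item in each part, in the same order
    """
    corpus_size = sum(part_sizes)
    total_hits = sum(item_counts)
    return 0.5 * sum(
        abs(hits / total_hits - size / corpus_size)
        for size, hits in zip(part_sizes, item_counts)
    )


# Hypothetical corpus with four unequally sized genres.
part_sizes = [80_000, 40_000, 40_000, 40_000]
even = [40, 20, 20, 20]   # hits exactly proportional to genre size
skewed = [0, 0, 0, 100]   # all hits concentrated in one small genre

dp_even = deviation_of_proportions(part_sizes, even)
dp_skewed = deviation_of_proportions(part_sizes, skewed)
print(dp_even, dp_skewed)
```

In this toy setting the proportional distribution yields a DP of 0, while the fully concentrated one yields 0.8. The theoretical maximum is 1 minus the share of the smallest corpus part, which is why the quotation speaks of a range from "approximately" 0 to 1.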

Chapter 5. Sri Lankan English lexicogrammar 

The individual ICE datasets exhibit noteworthy differences in relation to the DP values of the PVs under scrutiny. The values of DP for PVUs, PVOUs and PVOFs in ICE-SL, ICE-IND and ICE-GB are provided in Table 24.

Table 24. DP values of PVs in ICE

               ICE-SL    ICE-IND    ICE-GB
DP of PVUs     0.23      0.33       0.18
DP of PVOUs    0.16      0.19       0.18
DP of PVOFs    0.32      0.40       0.24

Two central tendencies can be observed in the data. First, for each variety-specific dataset, PVOFs exhibit the highest DP value followed by PVUs, and PVOUs generally show the lowest values with ICE-GB as the only exception, where the values for PVUs and PVOUs are equal. Consequently, the DP values indicate that the distribution of PVOUs – and that of PVUs to a more limited extent – is more bound to the differently-sized corpus parts than that of PVOFs or, to put it differently, that PVOUs as well as PVUs may not be as genre-sensitive as PVOFs.5 Second, the Indian data, displaying the highest DP value independent of PV subtype, can be suspected to exhibit the strongest genre-related variation. Thus, the DP values across the different verb-particle combinations stress the need to take a more fine-grained genre-specific look at the frequency of PVs in the individual ICE components. The three groups of PVs are analysed separately to possibly identify noteworthy differences. The normalised and absolute frequencies of PVUs, PVOUs and PVOFs across the various genres in the ICE components are documented in Figure 18.

Figure 18. Normalised (pmw) and absolute frequencies of PVUs, PVOUs and PVOFs in the genres of ICE-SL, ICE-IND and ICE-GB [figure not reproduced; three panels (PVUs, PVOUs, PVOFs), x-axis: corpus (ICE-SL, ICE-IND, ICE-GB), y-axis: normalised frequency (pmw), bars: non-professional writing (W1A), correspondence (W1B), academic writing (W2A), non-academic writing (W2B), reportage (W2C), instructional writing (W2D), persuasive writing (W2E), creative writing (W2F) and TOTAL]

The frequencies of PVUs in the genres of the ICE components concerned exhibit statistically highly significant differences, but the correlation between the frequency of PVUs and the genres in the ICE components is comparatively weak (χ² ≈ 67.62, df = 14, p < 0.001, Cramér’s V ≈ 0.16).6

5. Although the DP values for e.g. PVOUs may be numerically close to each other across the datasets, this does not mean that their genre-specific distributions are necessarily similar to one another. DP only indicates that there is a deviation from the expected distribution based on the sizes of the single genres in the ICE datasets, but does not provide any insights into which genres deviate from the respective expected frequencies.

6. All pairwise comparisons of PVUs in the genres of ICE-SL, ICE-IND and ICE-GB are statistically significant as well (p < 0.01).

The first trends can be deduced from academic (W2A) and creative writing (W2F). While academic writing features the fewest PVUs in ICE-SL (183.59), ICE-IND (281.45) and ICE-GB (619.22), creative writing contains most PVUs in each of the ICE datasets analysed, which may constitute a formality-induced distributional difference. The lower frequency of PVUs in academic writing seems particularly obvious in SLE. This may be explained by what Mendis (2010: 21) understands as


“prescriptive practices that dictate the avoidance of phrasal verbs to achieve or maintain a stylistic shift towards formality”. Mendis’ (cf. 2010: 20) study on a selection of PVs brings to the fore a universal trend of avoiding PVs in written scholarly discourse in SLE, which she explains with an argument put forward by Swales and Feak:

English often has two (or more) choices to express an action or occurrence. The choice is often between a phrasal verb (verb + particle) or a single verb, the latter with Latinate origins. Often in lectures or other instances of everyday spoken language, the verb + particle is used. However, in written academic style, there is a tendency for academic writers to use a single verb whenever possible. This is one of the most dramatic stylistic shifts from informal to formal style. (Swales & Feak 2004: 18)

According to this line of argumentation, PVs may be considered stylistic indicators of informality, which serves as an explanation as to why PVUs occur relatively infrequently in academic writing (W2A). As creative writing (W2F) features systematically higher than expected frequencies of PVUs, the stylistic signalling function of PVUs is further corroborated. Due to the relatively informal nature of most creative writing texts, which are generally more casual than any other written genre in the ICE design and “more akin to spoken registers (possibly because of the heavy use of fictional dialogues)” (Xiao 2009: 436–437), it is not surprising that PVUs figure prominently in this text category. Still, what are the origins of these prescriptive notions proscribing PV use in SLE, in particular in academic writing, and – maybe more importantly – what are the potential consequences for the evolution of SLE? Mendis draws attention to scholarly training abroad as well as the universal stylistic formality of academic writing as two possible sources of the stylistic choices of Sri Lankan scholars:

The question to ask here is if this is a result of Sri Lankan researchers and scholars being exposed to pedagogical practices in EAP of an overly prescriptive nature during undergraduate or graduate training in countries such as the UK or the US; or if there are certain features of written academic discourse that are accepted as universal – for instance, formality of tone. (Mendis 2010: 22)

It is certainly correct that formality of tone may be a factor in accounting for low frequencies of PVUs in scholarly discourse in general, but the formal education system through which speakers of SLE prototypically acquire English as a second language – particularly in contrast to the more informal acquisition of English at home in ENL contexts – may be another factor in accounting for comparatively few PVUs. However, it seems debatable whether the frequency of PVUs in the academic writing (W2A) section of ICE-SL represents the grip Western prescriptivism has on SLE scholars. True, ICE-SL does not feature more PVUs than ICE-GB, but the frequency of PVUs in the academic writing texts in ICE-SL (183.59) is also by no means close to that of ICE-GB (619.22) in that ICE-SL features noticeably fewer PVUs. This observation can be interpreted in different ways. On the one hand, if one considers the effects of Western prescriptivism to lead to structural and frequency-related convergence between SLE and BrE, one might argue that the data at hand do not offer evidence of Western prescriptive practices in SLE given that there is a notable difference in how often PVUs are employed in written academic discourse in ICE-SL and ICE-GB. On the other hand, if one regards Western prescriptivism as negatively affecting the usage of PVs in academic writing, SLE writers must be viewed as following these prescriptive practices more consistently than BrE writers themselves, which – opposing the very motivation of prescriptivism – in turn yields significant quantitative differences across the varieties. Consequently, in terms of academic writing, the empirical data show that prescriptive practices have certainly not led to SLE becoming indistinguishably similar to BrE; on the contrary, Western prescriptive practices may have discouraged PV use in SLE to such an extent that the resulting usage patterns allow differentiating SLE from BrE on quantitative grounds. In the light of this, one might provocatively hypothesise that, depending on the rigour with which it is applied in a given variety, prescriptivism can in fact function as an actuator of structural nativisation. Still, the extent to which Western prescriptivism, inter alia with a view to PV usage, is or has been in operation in SLE cannot be established on the basis of corpus data alone; interviews on the subject with SLE scholars and examinations of their writing process would certainly provide invaluable additional perspectives on the matter.

The findings for PVOUs are in general in line with those for PVUs. The normalised and absolute frequencies of PVOUs in the genres of ICE are given in Figure 18. The distribution of PVOUs in the genres of the ICE components examined yields highly significant differences, but only a relatively weak correlation (χ² ≈ 62.93, df = 14, p < 0.001, Cramér’s V ≈ 0.14).7 It becomes obvious that creative writing (W2F), as with PVUs, contains most PVOUs in each of the ICE components. However, when it comes to the genres with the smallest number of PVOUs, there does not appear to be a cross-varietally stable trend. While instructional writing (W2D) features the lowest frequency of PVOUs in ICE-SL (538.21), the correspondence section (W1B) contains the fewest PVOUs in the IndE data (574.79) and in ICE-GB, it is the non-professional writing (W1A) texts where PVOUs are attested with the lowest normalised frequency (625.84).

7. All pairwise comparisons of PVOUs in the genres of ICE-SL, ICE-IND and ICE-GB are statistically highly significant as well (p < 0.001).
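The effect sizes reported alongside the χ² values can be retraced with a short sketch of Cramér's V, i.e. the square root of χ² divided by N times (the smaller table dimension minus one). It rests on two assumptions that the text does not state explicitly: that N is the sum of the three corpus totals printed in Figure 18 (N = 368, 413 and 551 for PVUs; N = 531, 492 and 542 for PVOUs) and that the underlying table has 8 genres by 3 corpora, which matches df = 14. The function name is illustrative only.

```python
import math


def cramers_v(chi2: float, n: int, n_rows: int, n_cols: int) -> float:
    """Cramér's V = sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))."""
    k = min(n_rows, n_cols) - 1
    if n <= 0 or k <= 0:
        raise ValueError("n must be positive and the table at least 2 x 2")
    return math.sqrt(chi2 / (n * k))


# Assumption: N is the sum of the corpus totals shown in Figure 18;
# the table is 8 genres x 3 corpora, so df = (8 - 1) * (3 - 1) = 14.
v_pvu = cramers_v(67.62, 368 + 413 + 551, n_rows=8, n_cols=3)
v_pvou = cramers_v(62.93, 531 + 492 + 542, n_rows=8, n_cols=3)

print(f"PVUs: V = {v_pvu:.2f}; PVOUs: V = {v_pvou:.2f}")
```

Under these assumptions the results round to the values given in the running text (0.16 and 0.14), which illustrates why a highly significant χ² can still go hand in hand with a weak association when N is large.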


The group of PVOFs also behaves relatively uniformly in comparison to PVUs and PVOUs. The frequencies of PVOFs in the ICE genres are shown as normalised and absolute values in Figure 18 as well. The distributional differences of PVOFs across the genres in the national components of ICE are highly significant, but the correlation between the frequency of occurrence of PVOFs and the genres across the datasets is weak (χ² ≈ 38.63, df = 14, p < 0.001, Cramér’s V ≈ 0.22).8 As with PVUs and PVOUs, the genre of creative writing (W2F) uniformly exhibits the highest number of PVOFs in ICE-SL (479.18), ICE-IND (1184.53) and ICE-GB (968.21). In contrast, it is in the non-professional writing sections (W1A) and the academic writing texts (W2A) where the smallest number of PVOFs can be found and in ICE-IND, instructional writing (W2D) also contains a relatively small number of PVOFs (68.14). Despite the fact that PVOFs occur almost twice as frequently in ICE-GB (440.17) as in ICE-SL (226.90) (and ICE-IND (288.04)), which is the most pronounced difference among the three sets of PVs, the more frequent occurrence of PVOFs in ICE-GB compared to ICE-SL is not caused by more attestations in a single or a selection of genres, but by a consistently higher frequency of PVOFs in each of the ICE genres. With the exception of non-professional (W1A) and creative writing (W2F), this also holds true for the comparison of ICE-GB and ICE-IND. Thus, not only (a higher degree of) genre sensitivity in SLE and IndE as suggested by the respective DP values may affect the frequency of PVOFs; there also appears to be a genre-independent avoidance of PVOFs in SLE and IndE. Taken together, the lower genre-specificity of PVOFs in BrE allowing PVOFs to occur in genres in which they would not surface with comparable frequencies either in SLE or IndE (e.g. instructional writing (W2D)) along with a general reluctance to employ PVOFs in SLE (and IndE) may be held at least partially accountable for the lower frequencies of PVOFs in SLE and IndE.

8. When it comes to the pairwise comparisons of PVOFs in the genres of the three ICE components, the difference between ICE-SL and ICE-GB is not statistically significant (p > 0.05) while the differences between ICE-SL and ICE-IND as well as ICE-IND and ICE-GB are highly significant (p < 0.001).

While the most uniform trend of PV distribution in the genres of the ICE components seems to be the high frequency of PVs in the creative writing (W2F) sections concerned, the most notable differences in the occurrence of both PVUs and PVOUs manifest themselves in the correspondence sections (W1B). While ICE-SL and ICE-GB feature 1072.36 and 1559.91 PVUs in normalised frequencies in this genre, only 574.79 PVUs can be found in the letters collected in ICE-IND. Still, the correspondence section of ICE comprises two types of letters associated with different degrees of formality, i.e. social letters and business letters, with the latter being more formal than the former. Table 25 provides a more in-depth analysis with regard to PVUs in the social and business letters sections of the ICE databases.

Table 25. Absolute and relative frequencies of PVUs in social and business letters in ICE

                    ICE-SL                    ICE-IND                   ICE-GB
                    abs. freq.   rel. freq.   abs. freq.   rel. freq.   abs. freq.   rel. freq.
Social letters      53           79.10%       28           77.78%       73           75.26%
Business letters    14           20.90%       8            22.22%       24           24.74%
TOTAL               67           100.00%      36           100.00%      97           100.00%

For the distribution of PVUs in social and business letters in the ICE components, no statistically significant differences and a weak correlation can be attested (χ² ≈ 0.35, df = 2, p > 0.05, Cramér’s V ≈ 0.04).9 A closer examination of the relative frequencies reveals that, in each ICE component, the majority of PVUs can be found in the social letters section while PVUs are less frequent in business letters. Apparently, PVU usage in letters is associated with informality to comparable degrees across the varieties and PVUs figure less frequently in the letters in ICE-IND in general. It is also in the absolute frequencies of PVOUs in the correspondence texts (in ICE-IND and ICE-GB) that the most notable cross-varietal differences can be observed. Table 26 provides a more detailed rundown of the distribution of PVOUs in the letters of ICE-SL, ICE-IND and ICE-GB.

Table 26. Absolute and relative frequencies of PVOUs in social and business letters in ICE

                    ICE-SL                    ICE-IND                   ICE-GB
                    abs. freq.   rel. freq.   abs. freq.   rel. freq.   abs. freq.   rel. freq.
Social letters      46           57.50%       24           66.67%       66           62.26%
Business letters    34           42.50%       12           33.33%       40           37.74%
TOTAL               80           100.00%      36           100.00%      106          100.00%

9. None of the pairwise comparisons exhibits statistically significant differences either (p > 0.05).
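The test statistics reported for the letters data can be retraced from the absolute frequencies alone. The following sketch computes Pearson's χ², the degrees of freedom and Cramér's V for the PVU counts of Table 25 (social vs. business letters across the three ICE components). The function name is hypothetical, and the p-value line relies on the closed form of the χ² survival function that holds for df = 2 only; whether the original analysis used exactly this procedure is an assumption.

```python
import math
from typing import List, Tuple


def chi_square(table: List[List[int]]) -> Tuple[float, int, float]:
    """Return (chi2, df, Cramér's V) for a contingency table of absolute counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of letter type and corpus.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    n_rows, n_cols = len(table), len(table[0])
    df = (n_rows - 1) * (n_cols - 1)
    v = math.sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))
    return chi2, df, v


# Table 25 (PVUs): rows = social, business letters; columns = ICE-SL, ICE-IND, ICE-GB.
table_25 = [[53, 28, 73], [14, 8, 24]]
chi2, df, v = chi_square(table_25)

# Closed-form survival function of the chi-square distribution; valid for df = 2 only.
p_value = math.exp(-chi2 / 2)

print(f"chi2 = {chi2:.2f}, df = {df}, V = {v:.2f}, p = {p_value:.2f}")
```

The same function can be applied to the PVOU counts of Table 26; for tables with other degrees of freedom the p-value would have to come from a statistics library rather than from the closed form used here.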


Although the normalised frequencies in the correspondence texts (W1B) of the ICE components concerned strongly deviate from each other since ICE-GB (1704.65) and ICE-SL (1280.43) feature a much larger number of PVOUs than ICE-IND (574.79), no statistically significant differences can be established for the genre-internal distribution of PVOUs across the social and business letters and there is only a weak correlation (χ² ≈ 0.97, df = 2, p > 0.05, Cramér’s V ≈ 0.07).10 PVOUs are uniformly more frequent in social than in business letters, but the relative frequencies for social (57.50% for ICE-SL, 66.67% for ICE-IND and 62.26% for ICE-GB) and business letters (42.50% for ICE-SL, 33.33% for ICE-IND and 37.74% for ICE-GB) are fairly similar to one another. In line with the results for PVUs, it could be that an association of PVOUs with informal contexts is the reason for lower frequencies of PVOUs in the business than in the social letters. In relation to the DP values for the ICE components, it has been put forward that, at least in SLE and IndE, PVOUs may not be as genre-sensitive as PVUs. In a comparison of the relative frequencies for social and business letters of PVOUs and PVUs, it comes to the fore that the difference between social and business letters is more pronounced, i.e. that the difference between the relative frequencies of social and business letters is greater, with PVUs than with PVOUs in both ICE-IND and ICE-SL, which may be interpreted as another indication that PVOUs are indeed less genre-sensitive than PVUs in both SLE and IndE.

10. There are no significant differences with any of the pairwise comparisons (p > 0.05).

Quantitative genre-specific perspectives on PVs reveal notable parallels in terms of their occurrence in genres marked by different degrees of formality and their relatively consistent association with informality. PVs seem to be used more frequently in less formal contexts (e.g. creative writing (W2F)) than in formal ones (e.g. academic writing (W2A)), for which the closer investigations of PVUs and PVOUs in the letters sections of ICE could provide additional evidence. Still, it is remarkable that, despite the fact that this sensitivity to formality appears to be shared by PVs, the level of sensitivity can nevertheless vary, which finds reflection in PVOUs being less responsive to varying degrees of context-dependent formality than PVUs and PVOFs in the data at hand.

5.1.3 Particle verbs: Unrecorded particle verbs

Apart from the above frequency-oriented insights, it is also interesting to observe that, in each of the ICE components, a number of unrecorded verb-particle combinations are attested.11 These potentially innovative multi-word verbs are established by checking whether the PVs under scrutiny are listed in either the CPVD or the LPVD; if they are not, they are considered to be so far unattested. This group of unrecorded PVs can be subdivided into two broad categories. The first category features PVs with a supplementary (or semantically redundant) particle, the addition of which does not substantially alter the meaning of the original single-word verb. The second category comprises distinct forms created on the basis of what Mukherjee labels

semantico-structural analogy [, which] is a process by means of which nonnative speakers of English as a second language introduce new forms and structures into the English language on grounds of semantic and formal templates that already exist in the English language system. (Mukherjee 2007: 175–176)
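The lookup procedure just described amounts to a simple membership check, which the following minimal sketch illustrates. The two sets are toy stand-ins and do not reproduce the actual contents of the CPVD or the LPVD; the function name and labels are hypothetical. The entries are limited to verbs this section itself names as listed templates (WORK up, LIGHT up, CHALK up) or as unlisted but semantically transparent combinations (WALK up, RUN up).

```python
from typing import Set


def classify_particle_verb(pv: str, listed: Set[str], transparent: Set[str]) -> str:
    """Label a verb-particle combination according to the dictionary check."""
    if pv in listed:
        return "recorded"
    if pv in transparent:
        # Missing from the dictionaries because its meaning is compositional,
        # not because it is a novel formation.
        return "unlisted but semantically transparent"
    return "unrecorded"


# Toy stand-ins only; NOT the real dictionary contents.
LISTED_IN_CPVD_OR_LPVD = {"WORK up", "LIGHT up", "CHALK up"}
TRANSPARENT = {"WALK up", "RUN up"}

for pv in ["LIGHT up", "WALK up", "GLOW up"]:
    label = classify_particle_verb(pv, LISTED_IN_CPVD_OR_LPVD, TRANSPARENT)
    print(f"{pv}: {label}")
```

Only items that end up in the last category would qualify as potentially innovative; deciding whether a combination is semantically transparent remains a manual judgement that a lookup of this kind cannot replace.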

With regard to PVUs, two examples with supplementary particles are attested in each of the ICE datasets. Examples (75) to (77) illustrate a selection of multi-word verbs with up in which the additional particle at the very most only minimally alters the meaning of the corresponding single- or multi-word verb.

(75) Re-colonization of such lands with important local species can be accelerated up by appropriate silvicultural interventions. 〈ICE-SL:W2A-035#6:1〉

(76) But when they finally lurched up on to the grandiose four-lane highway which led from Mananga City to the air terminal, there was very little other traffic and Jean fancied she could hear sirens behind her in the distant streets. 〈ICE-GB:W2F-015#144:1〉

(77) When the oxygen level started rising, chemical and biological strategies to cope up with this new situation had to be adapted. 〈ICE-IND:W2B-024#53:1〉

11. Some verb-particle combinations (e.g. WALK up, RUN up) are recorded neither in the CPVD nor in the LPVD because they are semantically transparent in that the meaning of these PVs is directly deducible from their component parts, i.e. the meaning of the verb-particle combination is the sum of the meaning of the verb and that of the particle (cf. Sinclair 2002: vii). In the case of PVUs, the particle index of the CPVD describes the meaning of the particle up as having “to do with movement from a lower position or level to a higher one” (Sinclair 2002: 47). While these instances form part of the quantitative analyses of PVUs in Chapters 5.1.1 and 5.1.2, they are excluded from the discussions on unrecorded PVUs since these examples do not constitute particularly innovative combinations, but simply forms that are not listed in either the CPVD or the LPVD because these dictionaries restrict themselves to semantically opaque verb-particle combinations.


In each of the illustrations, the meanings of the single- or multi-word verbs ACCELERATE, LURCH and COPE with remain (largely) unchanged despite the supplementary particle up. In (75), the meaning of ACCELERATE up by and large matches that of ACCELERATE, i.e. “to start to go faster” (Hornby 2008: 7). The meaning of the single-word verb LURCH being “to make a sudden, unsteady movement forward or sideways” (Hornby 2008: 919) can be mapped onto that of LURCH up in (76) and in (77), the semantics of COPE up with and COPE with meaning “to deal successfully with sth difficult” (Hornby 2008: 339) strongly resemble each other. Although it has been argued that the addition of up to a verb may add an aspectual dimension of completion (cf. Zipp & Bernaisch 2012: 190), none of the examples in the study at hand would qualify as an instantiation of this.

Among the examples in (75) to (77), the additional particle usage in the multi-word verb COPE up with attested once in ICE-SL as well as in ICE-IND has probably received most attention in earlier descriptions of SLE (and IndE; cf. e.g. Nihalani et al. 2004; Mukherjee 2012; Mukherjee & Schilk 2012; Zipp & Bernaisch 2012). On the basis of empirical data, Mukherjee (cf. 2012: 204–205) and Zipp and Bernaisch (cf. 2012: 188) posit that, for SLE, COPE up with is the minority variant in comparison to its alternative COPE with. In order to scrutinise the status of COPE up with in the corpus environment at hand more carefully, Figure 19 offers an overview of the (potential) lexicogrammatical alternants COPE with and COPE up with in the ICE, newspaper and GAST data. In the ICE data, no significant distributional differences between the occurrence of COPE with and COPE up with can be attested (p > 0.05), but there are highly significant differences in the newspaper data (p < 0.001).12, 13, 14 For both the ICE and the newspaper data, the results of COPE up with in comparison to COPE with are similar in that COPE up with is relatively infrequent as a phraseological substitute for COPE with in the respective Sri Lankan and Indian datasets. COPE up with is slightly more pervasive in the Indian than in the Sri Lankan data while it is completely absent from the BrE datasets.

12. None of the pairwise comparisons of COPE with with COPE up with in ICE yields significant differences either (p > 0.05).

13. There are no significant distributional differences of COPE with and COPE up with between SAVE-SL and SAVE-IND (p > 0.05) and SAVE-SL and BNC news (p > 0.01 (Bonferroni correction for multiple pairwise comparisons)), but there are highly significant differences between SAVE-IND and BNC news (p < 0.001).

14. The absolute frequencies of GAST in Figure 19 constitute the sums of the domain-specific numbers of hits for the individual word forms of COPE with and COPE up with. This procedure also applies to all calculations of absolute frequencies in GAST-based searches in Chapter 5.1.3 unless otherwise stated.

Figure 19. Relative and absolute frequencies of COPE with and COPE up with in the ICE, newspaper and GAST data [figure not reproduced; three panels (ICE-SL, ICE-IND, ICE-GB; SAVE-SL, SAVE-IND, BNC news; GAST-SL, GAST-IND, GAST-GB), y-axis: relative frequency (%), bars: COPE with and COPE up with]
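The relative frequencies in Figure 19 express each alternant's share of all COPE (up) with tokens in a given dataset. A minimal sketch of this calculation, using the absolute ICE frequencies given in the original figure (13 vs. 1 in ICE-SL, 8 vs. 2 in ICE-IND and 19 vs. 0 in ICE-GB), is given below; the function and variable names are illustrative and not part of the study.

```python
from typing import Dict


def relative_frequencies(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert absolute frequencies of alternants into percentages of their sum."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one alternant must be attested")
    return {variant: round(100 * n / total, 2) for variant, n in counts.items()}


# Absolute ICE frequencies as given in Figure 19.
ice_counts = {
    "ICE-SL": {"COPE with": 13, "COPE up with": 1},
    "ICE-IND": {"COPE with": 8, "COPE up with": 2},
    "ICE-GB": {"COPE with": 19, "COPE up with": 0},
}

ice_shares = {corpus: relative_frequencies(counts) for corpus, counts in ice_counts.items()}

for corpus, shares in ice_shares.items():
    print(corpus, shares)
```

With counts this small, a single additional token shifts the percentages considerably, which is one reason why the larger newspaper and GAST datasets are consulted alongside ICE.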

The GAST searches verify the tendencies just presented. The only exception is that COPE up with is at least marginally frequent (and not entirely missing) in the British top-level country domain. The above results corroborate earlier findings on the status of COPE up with. COPE up with cannot even be considered a peripheral option in BrE, but it is a viable, though (still) comparatively infrequent lexicogrammatical alternative in SLE and IndE with the latter variety displaying a more frequent use of this novel combination of verb and two particles than the former (cf. Zipp & Bernaisch 2012: 188).

In addition to the group of unrecorded verb-particle combinations with supplementary up, the ICE data also yield a number of PVUs derived by semantico-structural analogy (cf. Mukherjee 2007: 175–176). This means that new PVUs are created on the basis of structural and/or semantic templates of existing PVUs. For each of the ICE components, three potentially innovative PVUs could be attested and Examples (78) and (79) illustrate two.

(78) Whenever this issue comes for discussion even, 〈}〉〈-〉〈w〉It’s〈/w〉〈/-〉〈+〉〈w〉 it’s〈/w〉〈/+〉〈/}〉 just a waste of my time and my colleagues just say things to rile me up, so 〈w〉I’ve〈/w〉 decided to take the 〈quote〉’ignorance is bliss’〈/quote〉 and 〈quote〉’pearls before swine’〈/quote〉 attitude with the lot of 〈w〉’em〈/w〉. 〈ICE-SL:W1B-009#23:2〉


(79) The eastern horizon was slightly glowing up. 〈ICE-IND:W2F-018#89:1〉

The PV construction rile me up in (78) may have been created on the basis of the structural and semantic templates of WORK up since rile me up and work me up are structurally similar and share the semantics of becoming upset. GLOW up in (79) may have been derived from the semantically related PVU LIGHT up, which means that something “shines light on the whole of it, making it easy to see” (Sinclair 2002: 199). None of the unrecorded and innovative PVUs just discussed could be attested in the newspaper corpus environment. However, a closer online-based examination of the PVUs reveals a certain currency of these novel creations in the varieties of English studied. The results of the GAST searches for RILE up and GLOW up are documented in Table 27.

Table 27. Absolute and normalised (pmw) frequencies of RILE up and GLOW up in GAST

           GAST-SL                     GAST-IND                    GAST-GB
           abs. freq.   norm. freq.   abs. freq.   norm. freq.   abs. freq.   norm. freq.
RILE up    239          0.10          28,235       0.26          175,010      0.25
GLOW up    12           0.01          7,841        0.07          142,845      0.20

In each variety studied, the novel PVUs can be found. Examples (80) and (81) illustrate instances of RILE up and GLOW up retrieved via GAST.

(80) I’ll pretend to be a hip-hopper and see how many rockers I can rile up! 〈http://www.rock.lk/forum/index.php?topic=3185.55;wap2〉 (11 February 2013)

(81) […] and his little teddy glows up my little boy loved this when he was little […][.] 〈http://fo.lk/for-sale-trade/selection-baby-toys-fisher-pricero/730638〉 (17 October 2014)

Another cross-domain similarity is that RILE up is more frequent than GLOW up. The normalised frequencies also suggest that RILE up and GLOW up are more frequent in GAST-IND and GAST-GB than in GAST-SL. The GAST searches covering RILE up and GLOW up thus essentially confirm that these innovative PVUs created on the basis of existing semantic and/or formal templates are indeed in use in SLE, IndE and BrE. Consequently, the (albeit limited) currency of these PVUs may be considered an attestation of the creative potential of competent ESL and ENL users to systematically enrich the structural make-up of the varieties of English they use. In this context, the potential of GAST as a gateway to huge amounts of data serving as complements to the smaller and well-defined offline corpora also takes centre stage since unrecorded PVs constitute the kind of low-frequency phenomena (the status of) which can be investigated via GAST.

A total of six novel PVOUs can be established in the ICE data.15 This group of potentially novel PVOUs can also be subdivided into those PVOUs with which the particle out does not significantly alter the verb-based meaning of the verb-particle combination and those created on semantico-structural templates of existing PVOUs. PVOUs with additional out, i.e. PVOUs with a largely single-word-verb-based meaning, are exemplified in (82) and (83).

(82) It is our protocol that every user at the time of purchase is demonstrated on how to do the test and discuss its limitations etc. even though it is detailed out in the user guide. 〈ICE-SL:W1B-029#113:3〉

(83) Hasthiagara is now leased out to visitors and 〈w〉they’ve〈/w〉 all been very forthcoming about it. 〈ICE-SL:W1B-012#38:4〉

The PVOU DETAIL out in (82) can be found once in ICE-SL and once in ICE-IND. The semantics of DETAIL out are very much rooted in the single-word verb DETAIL, the general meaning of which is “to give a list of facts or all the available information about sth” (Hornby 2008: 416). The fact that this PVOU occurs exclusively in the South Asian ICE datasets will have to be scrutinised further to establish whether this PVOU is particular to South Asian English(es). LEASE out as contextualised in (83) is attested once in ICE-SL and its meaning is derived from that of LEASE, i.e. “to use or let sb use sth, especially property or equipment, in exchange for rent or a regular payment” (Hornby 2008: 874). Both PVOUs with additional out just discussed will be subject to analyses in the newspaper data and via GAST to shed further light on their frequencies and contexts of usage. In addition to PVOUs with additional out, a number of PVOUs creatively extrapolated from existing templates of verb-particle combinations is featured in the ICE components as well. They are presented in Examples (84) to (87).

15. In line with the procedure for PVUs, those PVOUs which are semantically transparent combinations of the meaning of the verb and the particle out in the sense of “movement from the inside of an enclosed space or container to the outside of it” (Sinclair 2002: 35) are disregarded in this part of the analysis because the reason for their absence from the two dictionaries is not their novelty, but their semantic transparency (cf. Sinclair 2002: vii). However, they are included in the description of the frequency and genre-specificity of PVs in 5.1.1 and 5.1.2.


(84) The 〈indig〉 Kendra 〈/indig〉 could chalk out a course of training young researchers through direct participation, in these historic languages and scripts. 〈ICE-IND:W1B-019#69:1〉

(85) Each day it will net out their payments and work out how much they owe each other. 〈ICE-GB:W2C-016#38:2〉

(86) The power to reject the method of accounting has been split out in the proviso to Section 145 […]. 〈ICE-IND:W2A-015#64:2〉

(87) She locked the door, drew the curtains against the fog, switched out the lights and hauled her body up to bed. 〈ICE-GB:W2F-020#96:1〉

The combination of CHALK with the particle out as shown in (84) may be structurally licensed by other PVs with CHALK such as CHALK up (cf. Sinclair 2002: 48). CHALK out is semantically compatible with other multi-word verbs with out like PLAN out, which means that “you decide in detail what you are going to do” (Sinclair 2002: 248) with regard to certain tasks or events, but at the same time, CHALK out seems to entail a stronger presentational meaning component. The usage of CHALK out in IndE has also been described in an earlier corpus-based study on IndE (cf. Sedlatschek 2009: 156–159) highlighting its semantic similarities with CHART out. In addition to (84), there are two more instances of CHALK out in ICE-IND, which are exemplified in (88) and (89).

(88) So he quickly chalked out what could go wrong and in breaking his schedule for the day. 〈ICE-IND:W2F-006#229:1〉

(89) Thoughts which would hitherto have been considered blasphemous, such as a free market economy, are now being openly, unabashedly and forcefully expressed and elaborate programmes chalked out for their speedy attainment. 〈ICE-IND:W2A-012#8:1〉

Example (85) illustrates the usage of NET out – a PVOU semantically close to the single-word verb BALANCE meaning “to show that in an account the total money spent is equal to the total money received; to calculate the difference between the two totals” (Hornby 2008: 103) and structurally as well as semantically related to the PVOU WORK out in the sense of performing mathematical calculations (cf. Sinclair 2002: 432). NET out as shown in (85), (90) and (91) occurs three times in ICE-GB and illustrates that established PVs can serve as templates for novel ones in both ESL and ENL contexts. However, all three instances of NET out in the BrE texts were produced by one author only, which is why the recurrence of this PVOU should not be interpreted as a comparatively high degree of institutionalisation of NET out in BrE. (90) Chase keeps track of its members ’ deals and informs them how they net out. 〈ICE-GB:W2C-016#71:2〉


(91) This enables many banks, possibly several dozen, to net out their payments to each other through a central clearing house. 〈ICE-GB:W2C-016#33:2〉

The PVOU SPLIT out as provided in (86) is structurally facilitated by the verb SPLIT in other verb-particle combinations such as SPLIT off, SPLIT on and SPLIT up (cf. Sinclair 2002: 346–347). Its meaning may be perceived as relatively comparable to that of SPLIT up used to describe that something “is divided into several smaller sections or parts” (Sinclair 2002: 347). The example of SPLIT out in (86) is taken from ICE-IND and represents the only instance to be found in the ICE data. SWITCH out as in (87) seems structurally as well as semantically very close to SWITCH off, which is used to express inter alia that you stop an electrical device, engine, etc. by using a switch (cf. Sinclair 2002: 373). SWITCH out is attested once in ICE-GB.

In order to establish the degree of institutionalisation of the PVOUs described above, it may prove fruitful to observe whether, and with what frequency, these novel forms, either with additional out or derived by means of semantico-structural analogy, are attested in larger sets of data. Figure 20 provides the normalised and absolute frequencies of the unrecorded PVOUs in the newspaper and GAST data. Four out of six unrecorded PVOUs can be exemplified with the acrolectal newspaper data at hand. However, not every single text collection contains all of them, but each dataset features two out of the six PVOUs under scrutiny. LEASE out is the PVOU which is shared across the variety-specific data collections since it occurs with a frequency of 3.59 in SAVE-SL, 3.91 in SAVE-IND and 0.23 in BNC news. Thus, LEASE out is attested more frequently in the South Asian data than in the British counterpart. In addition to that, DETAIL out occurs 0.65 times in the Sri Lankan data, CHALK out 14.65 times in SAVE-IND and SWITCH out 0.34 times in BNC news.
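The normalised frequencies cited here are occurrences per million words (pmw). As a minimal sketch of that normalisation, the calculation might look as follows in Python. The function name is hypothetical, and the corpus size is a placeholder only, since the word counts of SAVE-SL, SAVE-IND and BNC news are not stated in this section; the result therefore does not reproduce the reported values.

```python
def per_million_words(count, corpus_size):
    """Normalise an absolute frequency to occurrences per million words (pmw)."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return count / corpus_size * 1_000_000


# Placeholder corpus size (an assumption, NOT the real size of any corpus
# discussed in the text).
HYPOTHETICAL_CORPUS_SIZE = 3_000_000

# 11 is the absolute frequency of LEASE out in SAVE-SL (cf. Figure 21).
normalised = per_million_words(11, HYPOTHETICAL_CORPUS_SIZE)
print(f"{normalised:.2f} pmw under the assumed corpus size")
```

Normalising in this way is what allows counts from corpora of different sizes to be compared directly.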
The absence of NET out and SPLIT out from the newspaper data to a certain degree indicates that these unrecorded PVs are rather one-off verb-particle combinations than formations making their way into the respective varieties of English. NET out occurs three times in ICE-GB, but each instantiation stems from the same subtext, i.e. the same author in one communicative setting, which is why the fact that several examples of NET out occur in ICE-GB does not provide any grounds to argue that NET out truly is a recurrent form used in the BrE speech community. It appears to be the case that CHALK out, DETAIL out and LEASE out as exemplified with newspaper data in (92) to (94) (and SWITCH out to a more limited extent) should not be considered nonce-formations, but verb-particle

[Figure 20: two bar-chart panels. Newspaper data: SAVE-SL, SAVE-IND, BNC news; GAST data: GAST-SL, GAST-IND, GAST-GB. x-axis: CHALK out, DETAIL out, LEASE out, NET out, SPLIT out, SWITCH out; y-axis: Norm. Freq. (pmw).]

Figure 20. Normalised (pmw) and absolute frequencies of unrecorded PVOUs in the newspaper and GAST data

combinations with the potential of establishing themselves more firmly in the varieties in which they occur. (92) The meeting chalked out details of contingency plans to meet any natural calamity, particularly floods during the ensuing monsoon. 〈SAVE-IND-SM_2003-05-19〉


(93) In what was seen as the first step towards implementing the United Peoples Freedom Alliance Manifesto, the first chapter of which details out UPFA’s pledges on constitutional reforms, President Kumaratunga in a three-hour preliminary round of discussions kick-started the proposed reforms at President’s House. 〈SAVE-SL-DM_2004-04-13〉 (94) The government, however, will have to block and lease out portions of seabed for exploratory drilling to international oil companies who feel that there may be oil or gas deposits. 〈SAVE-SL-DN_2002-11-06〉

CHALK out seems to have taken root in IndE while the status of this PVOU in the other varieties concerned remains opaque. With regard to DETAIL out in SLE and SWITCH out in BrE, it appears that they are still at the very beginning of the institutionalisation process. The distribution of LEASE out suggests that this verb-particle combination is more firmly rooted in SLE and IndE than in BrE. In Figure 21, the relative and absolute frequencies of LEASE out and its single-word verb alternative LEASE are presented.

[Figure 21: bar chart. x-axis: Corpus (SAVE-SL, SAVE-IND, BNC news); y-axis: Rel. Freq. (%); series: LEASE, LEASE out. SAVE-SL: LEASE 63.33% (n = 19), LEASE out 36.67% (n = 11); SAVE-IND: LEASE 36.84% (n = 7), LEASE out 63.16% (n = 12); BNC news: LEASE 96.23% (n = 51), LEASE out 3.77% (n = 2).]

Figure 21. Relative and absolute frequencies of LEASE and LEASE out in the newspaper data

There are statistically highly significant differences and a moderate correlation between the variants LEASE and LEASE out and the newspaper data sets (χ² ≈ 30.05, df = 2, p < 0.001, Cramer’s V ≈ 0.38).16 In relative terms, LEASE (63.33%) is more frequent than LEASE out (36.67%) in SAVE-SL while the contrary holds true for SAVE-IND since LEASE (36.84%) is used less frequently than

16. There are no significant differences between SAVE-SL and SAVE-IND (p > 0.05), but the differences between SAVE-SL and BNC news and SAVE-IND and BNC news are highly significant (p < 0.001).


LEASE out (63.16%). In BNC news, LEASE occurs in 96.23% of all cases whereas LEASE out can be found with 3.77% of the examples. In SLE, LEASE out apparently is a significant minority variant in comparison to the single-word form LEASE – the default option in SLE. In IndE, however, the multi-word verb is the standard variant to communicate that certain entities are (offered to be) let to particular people (cf. Hornby 2008: 874) while LEASE is the IndE minority variant. In BrE, LEASE out should probably not even be assigned minority status due to the scarcity of occurrences. Although the structural contexts of LEASE out and LEASE appear relatively comparable across the varieties covered, it seems to be the case that their combinability with the entity to be let, i.e. the direct object of LEASE (out), may be to some extent variety-specific. Despite the fact that the restricted data do not permit drawing any quantitatively reliable conclusions, abstract (e.g. quota 〈BNC K5H〉) as well as more concrete (e.g. territory 〈BNC AJD〉) patients can be combined with LEASE in the British data whereas LEASE out appears to be more restricted in that it only combines with rather abstract concepts (e.g. rights regarding soccer players 〈BNC CH3〉, milk quota 〈BNC K5H〉). In SAVE-IND, on the other hand, it is LEASE out which appears to enter combinations with abstract (e.g. management 〈SAVE-IND-SM_2003-03-01〉) as well as concrete entities (e.g. land 〈SAVE-IND-SM_2003-08-09〉) while LEASE seems to combine almost exclusively with more concrete entities (e.g. aircraft 〈SAVE-IND-SM_2005-01-10〉). In the Sri Lankan data, however, LEASE is used with both concrete (e.g. stations 〈SAVE-SL-DM_2007-01-13〉) and rather abstract (e.g. twenty-episode teledrama 〈SAVE-SL-DM_2003-03-14〉) direct objects, but so is LEASE out, the contexts of which display both concrete (e.g. tanks 〈SAVE-SL-DN_2002-08-01〉) as well as abstract (e.g. Peoplised Transport Boards 〈SAVE-SL-DM_2002-06-14〉) patients.
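The chi-square statistic reported for LEASE and LEASE out can be recomputed from the absolute frequencies shown in Figure 21. The following Python sketch is illustrative only: the helper names are not taken from the study, and the closed-form p-value used here is valid solely for two degrees of freedom.

```python
import math

# Absolute frequencies from Figure 21: (LEASE, LEASE out) per corpus.
counts = {
    "SAVE-SL": (19, 11),
    "SAVE-IND": (7, 12),
    "BNC news": (51, 2),
}


def relative_frequencies(row):
    """Return the percentage share of each variant within one corpus."""
    total = sum(row)
    return [100 * n / total for n in row]


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


stat, df = chi_square(list(counts.values()))
# For df = 2 the chi-square survival function reduces to exp(-x / 2).
p_value = math.exp(-stat / 2)

for corpus, row in counts.items():
    lease, lease_out = relative_frequencies(row)
    print(f"{corpus}: LEASE {lease:.2f}%, LEASE out {lease_out:.2f}%")
print(f"chi2 = {stat:.2f}, df = {df}, p = {p_value:.1e}")
```

With these counts the sketch yields χ² ≈ 30.05 at df = 2, in line with the values given in the text.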
In the light of these observations, it could be that the more restricted semantic profiles of the direct objects of LEASE out in BrE and LEASE in IndE lead to their status of minority variants in the respective varieties. In SLE, both LEASE and LEASE out seem to have comparable semantic scopes, which might be an explanation as to why the assignment of majority/minority status to LEASE and LEASE out might not be as clear-cut as in IndE and BrE. The GAST searches concerning the unrecorded PVOUs add two relevant insights. First, attestations for each potentially new verb-particle combination can be found with the help of top-level country domain searches in each of the varieties. Second, with the exception of SPLIT out in GAST-IND and LEASE out in GAST-GB, the unrecorded PVOUs occur systematically more frequently in the IndE and BrE than in the SLE data. The online texts thus provide evidence that the unrecorded PVOUs have cross-varietal currency and indicate that those PVOUs not attested in acrolectal


text material from the newspaper corpora may nevertheless be used in contexts marked by a lower degree of formality than traditional newspaper writing (and/or in non-acrolectal communicative settings). SPLIT out, for example, is absent from the newspaper datasets, but (95), taken from an online IT magazine, can illustrate its usage. (95) For example, “A valid color code shall be R for red” and “A valid color code shall be G for green” might be split out as separate requirements […]. 〈http://www.digit.lk/anupamaweerabahu〉 (10 January 2012)

More data cleaned of irrelevant hits are needed to establish the reasons as to why SPLIT out is the only verb-particle combination which occurs less frequently in GAST-IND than in GAST-SL. However, the online data show that LEASE out figures most prominently in GAST-IND, which may be considered further evidence of the importance of this variant in comparison to LEASE in IndE. When it comes to the British online data, it is noteworthy that the minority status of LEASE out as opposed to LEASE finds expression in LEASE out being the only PVOU examined in the online data where the normalised frequency is lower for GAST-GB than for GAST-SL. Although Sedlatschek (2009: 159) regards CHALK out “as a contemporary South Asianism”, neither the newspaper nor the online data fully warrant this conception given that this PVOU occurs comparatively infrequently in Sri Lankan texts. Further, the comparatively high frequency of SWITCH out in GAST-GB is yet another indication of the currency of this formerly unrecorded verb-particle combination in BrE.

It is possible to find one formerly unrecorded PVOF in the ICE data, namely WAIVE off.17 The only example of this PVOF semantically close to the single-word verb WAIVE in the sense of “to choose not to demand sth in a particular case, even though you have a legal or official right to do so” (Hornby 2008: 1713) can be found in ICE-IND and is shown in (96).

(96) 〈foreign〉 Communidade 〈/foreign〉 land cannot be sold to private parties or individuals except by auction. This rule is waived off in case of cooperative societies or landless persons. 〈ICE-IND:W2C-019#58–59:1〉

As WAIVE off constitutes a case of a verb-particle combination in which the particle does not have a substantial effect on the meaning of the structure, it may

17. In analogy to the procedure with PVUs and PVOUs, verb-particle combinations in which the meaning of off, i.e. having “to do with movement away from something or separation from it” (Sinclair 2002: 27), is a transparent semantic addition to the meaning of the verb are not considered as possibly innovative forms. These PVOFs are, however, part of the analyses of the frequency and the genre-specificity of PVs in 5.1.1 and 5.1.2.


be fruitful to contrast the frequency of the single-word verb WAIVE with that of the corresponding multi-word verb WAIVE off. The newspaper-based results of this comparison are given in Figure 22.

[Figure 22: bar chart. x-axis: Corpus (SAVE-SL, SAVE-IND, BNC news); y-axis: Rel. Freq. (%); series: WAIVE, WAIVE off. SAVE-SL: WAIVE 77.78% (n = 7), WAIVE off 22.22% (n = 2); SAVE-IND: WAIVE 75% (n = 18), WAIVE off 25% (n = 6); BNC news: WAIVE 100% (n = 40), WAIVE off 0% (n = 0).]

Figure 22. Relative and absolute frequencies of WAIVE and WAIVE off in the newspaper data

For the occurrences of WAIVE and WAIVE off in the newspaper data, statistically significant differences can be attested (p < 0.01).18 With a relative frequency of 22.22%, WAIVE off occurs in place of WAIVE in SAVE-SL whereas WAIVE off is chosen in preference over WAIVE in 25.00% of all cases in SAVE-IND. Examples (97) and (98) represent instances of WAIVE off in the Sri Lankan and Indian newspaper texts respectively. (97) Last month government waived off the VAT on diesel in a bid to control the escalating inflationary conditions. 〈SAVE-SL-DM_2005-08-27〉 (98) While army officials do waive off the stringent checks usually conducted on visitors in the cantonment, the temple caretakers graciously take due responsibility for any eventuality. 〈SAVE-IND-TI_37671〉

WAIVE off is absent from BNC news. Despite the fact that the absolute frequencies of WAIVE and WAIVE off are comparatively low, the data nevertheless suggest that WAIVE off might be a viable minority variant to the single-word verb WAIVE in SLE and IndE. The absolute and normalised frequencies of WAIVE off in the respective top-level domains in Table 28 partly verify this view.

18. As regards the pairwise comparisons of WAIVE and WAIVE off in the newspaper data, SAVE-SL compared to SAVE-IND (p > 0.05) and SAVE-SL compared to BNC news (p > 0.01 (Bonferroni correction for multiple pairwise comparisons)) do not exhibit significant differences while SAVE-IND compared to BNC news does (p < 0.01).
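Pairwise comparisons with a Bonferroni correction, as mentioned for WAIVE and WAIVE off, could be sketched as follows in Python. The text does not state which test underlies the pairwise p-values; Fisher's exact test is assumed here purely for illustration, as is the conventional threshold of 0.05 divided by the number of comparisons. The function name is hypothetical.

```python
from itertools import combinations
from math import comb

# (WAIVE, WAIVE off) counts per corpus, as shown in Figure 22.
counts = {"SAVE-SL": (7, 2), "SAVE-IND": (18, 6), "BNC news": (40, 0)}


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x first-column items in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at most as likely as the observed one.
    return sum(
        p for p in map(prob, range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


pairs = list(combinations(counts, 2))
alpha = 0.05 / len(pairs)  # Bonferroni-adjusted threshold (assumed alpha = 0.05)

results = {}
for first, second in pairs:
    p = fisher_exact_two_sided(*counts[first], *counts[second])
    results[(first, second)] = p
    print(f"{first} vs {second}: p = {p:.4f}, significant = {p < alpha}")
```

Under these assumptions only the comparison of SAVE-IND and BNC news falls below the adjusted threshold, which matches the pattern of significance described in the text, although the exact procedure used in the study may differ.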


In normalised terms, WAIVE off is most frequent in GAST-SL (2.91). GAST-IND displays a normalised frequency of 0.64 and GAST-GB of 0.01 for WAIVE off. The GAST data thus give further strength to the status of WAIVE off as a minority multi-word alternative to WAIVE in SLE and also indicate that WAIVE off is at best of marginal importance in BrE and by no means an additional option to WAIVE in this variety. While the newspaper-based analysis provides grounds to argue that WAIVE off is a valid substitute for WAIVE in IndE, the GAST data do not confirm this unconditionally, which is why the status of this verb-particle combination as a minority alternative to WAIVE in IndE should probably remain tentative based on the data studied.

Table 28. Absolute and normalised (pmw) frequencies of WAIVE off in GAST

            GAST-SL                   GAST-IND                  GAST-GB
            abs. freq.  norm. freq.   abs. freq.  norm. freq.   abs. freq.  norm. freq.
WAIVE off   7,564       2.91          72,430      0.64          10,625      0.01

Interestingly, in addition to WAIVE off, there are verbs with which the particle off occurs in positions where the preposition of would generally be expected. This is the case with BOAST and DISPOSE and more detailed investigations are needed to clarify the status of these marked uses as given in Examples (99) and (100). The meanings of BOAST of can be “to talk with too much pride about sth that you have or can do” (Hornby 2008: 158) or “to have sth that is impressive and that you can be proud of” (Hornby 2008: 158) and DISPOSE of can mean “to get rid of sb/sth that you do not want or cannot keep” (Hornby 2008: 441). It is also these meanings that BOAST off and DISPOSE off represent respectively. Based on the ICE-IND data, BOAST off could be seen as a case of a non-recurrent structure as it occurs only once, but the fact that DISPOSE off occurs five times in a number of contexts in academic writing (W2A) may hint at a difference in status between BOAST off and DISPOSE off.

(99) It has hot-air tumble drier which no other Indian machines can boast off. 〈ICE-IND:W2C-006#85:10〉
(100) In some situations, however, waste can be a big health hazard and must be disposed off properly, for example by sanitary land fill. 〈ICE-IND:W2A-031#20:1〉

The absence of BOAST off in ICE-SL and ICE-GB may be considered a tentative indication that this verb-particle combination is of little importance in SLE and BrE. Figure 23 displays the relative and absolute frequencies of BOAST of and BOAST off in the newspaper and GAST data.


There are no statistically significant differences in the distributions of BOAST of and BOAST off in the newspaper datasets (p > 0.05).19 BOAST off is absent from SAVE-SL and BNC news, but one attestation shown in (101) can be found in SAVE-IND.

(101) The women in the film include Binodini (Aishwarya Rai) a beautiful, intelligent widow who is not afraid of rebelling and Ashalata (Raima Sen), a young, naÃ¯ve [= naive] girl with no intellectual credentials to boast off. 〈SAVE-IND-TI_37900〉

[Figure 23: two bar-chart panels; y-axis: Rel. Freq. (%); series: BOAST of, BOAST off. Newspaper data (x-axis: Corpus): SAVE-SL 100% (n = 27) vs. 0% (n = 0); SAVE-IND 97.73% (n = 43) vs. 2.27% (n = 1); BNC news 100% (n = 21) vs. 0% (n = 0). GAST data (x-axis: Data): GAST-SL 99.91% (n = 18060) vs. 0.09% (n = 16); GAST-IND 99.78% (n = 1485100) vs. 0.22% (n = 3204); GAST-GB 96.54% (n = 2212000) vs. 3.46% (n = 79228).]

Figure 23. Relative and absolute frequencies of BOAST of and BOAST off in the newspaper and GAST data

This distribution clearly shows that BOAST off cannot be considered an alternative to BOAST of in any acrolectal texts drawn from the varieties under scrutiny. This interpretation is also warranted by the GAST data. BOAST off is in relative terms virtually absent from GAST-SL (0.09%) and GAST-IND (0.22%) and marginally frequent in GAST-GB (3.46%). As a consequence, BOAST off cannot be regarded as a lexicogrammatical substitute for BOAST of and the former appears to be a structure with a diminishingly low frequency and an unsystematic pattern of occurrence across the varieties covered.

19. None of the respective pairwise comparisons produces significant differences either (p > 0.05).


In a similar fashion, the status of DISPOSE off as opposed to DISPOSE of can also be established with the newspaper and the online data. Figure 24 shows the relative and absolute frequencies of DISPOSE of and DISPOSE off in the newspaper and the GAST data.

[Figure 24: two bar-chart panels; y-axis: Rel. Freq. (%); series: DISPOSE of, DISPOSE off. Newspaper data (x-axis: Corpus): SAVE-SL 91.3% (n = 21) vs. 8.7% (n = 2); SAVE-IND 80% (n = 40) vs. 20% (n = 10); BNC news 100% (n = 76) vs. 0% (n = 0). GAST data (x-axis: Data): GAST-SL 98.06% (n = 17446) vs. 1.94% (n = 346); GAST-IND 90.85% (n = 3327000) vs. 9.15% (n = 335060); GAST-GB 99.64% (n = 33520000) vs. 0.36% (n = 121181).]

Figure 24. Relative and absolute frequencies of DISPOSE of and DISPOSE off in the newspaper and GAST data

The distributions of DISPOSE of and DISPOSE off in the newspaper data yield highly significant differences (p < 0.001).20 DISPOSE of can be attested with a higher relative frequency than DISPOSE off in SAVE-SL (91.30%), SAVE-IND (80.00%) and BNC news (100.00%). No instance of DISPOSE off can be found in BNC news (0.00%), but it seems to recur in the Sri Lankan (8.70%) and, with a higher frequency, in the Indian (20.00%) newspaper data. Examples (102) and (103) show the usage of the default variant DISPOSE of and Examples (104) and (105) demonstrate DISPOSE off as used in Sri Lankan and Indian newspapers. Syntactically, DISPOSE off is similar to DISPOSE of in that both variants are prototypically followed by a noun phrase, which is why – at

20. As regards the respective pairwise comparisons, there are neither significant differences between SAVE-SL and SAVE-IND (p > 0.05) nor between SAVE-SL and BNC news (p > 0.05), but the distributional differences between SAVE-IND and BNC news are highly significant (p < 0.01 (Bonferroni correction for multiple pairwise comparisons)).


differences result from the frequencies of HAVE and TAKE in SAVE-IND and BNC news. Although TAKE is the default option with LVCs in BNC news (47.56), it is not as dominant as in SAVE-SL (61.97) or SAVE-IND (106.78) since GIVE (33.36) and HAVE (30.76) also display a high frequency in LVCs in BNC news. In SAVE-IND, in contrast to that, the normalised frequency of HAVE has the lowest value in a comparison across the datasets (24.74), which shows that HAVE is not as likely a candidate to occur in a LVC in SAVE-IND as in BNC news. The degree to which LVCs are marked by informality can be examined by looking at the communicative contexts in which the constructions are used. The normalised and absolute frequencies of LVCs with GIVE, HAVE, PUT and TAKE across the genres in ICE are shown in Figure 27.

[Figure 27: bar chart. x-axis: Corpus (ICE-SL, ICE-IND, ICE-GB); y-axis: Norm. Freq. (pmw); series: Non-professional writing (W1A), Correspondence (W1B), Academic writing (W2A), Non-academic writing (W2B), Reportage (W2C), Instructional writing (W2D), Persuasive writing (W2E), Creative writing (W2F), TOTAL.]

Figure 27. Normalised (pmw) and absolute frequencies of LVCs in the genres of ICE-SL, ICE-IND and ICE-GB

The differences in the distribution of the LVCs in the various genres of the individual ICE components are statistically significant, but the correlation is relatively weak (χ² ≈ 25.58, df = 14, p < 0.05, Cramer’s V ≈ 0.20).26 Figure 27 shows that, in normalised terms, most LVCs occur in ICE-SL in non-professional writing (W1A; 501.22) while fewest can be found in instructional writing (W2D; 73.39). For ICE-IND, the highest density of LVCs can be attested in persuasive writing (W2E; 631.43) and non-academic writing (W2B) is least marked by the occurrence of LVCs (180.31). In ICE-GB, yet another genre features the largest amount of LVCs, namely reportage (W2C; 352.43), and instructional (W2D; 139.35), academic (W2A; 160.54) as well as non-professional writing (W1A; 162.25) display a low normalised frequency of LVCs. While the distribution in ICE-GB just described provides evidence for the informal character of LVCs since they group in more informal genres such as creative writing (W2F; 337.75) and are infrequent in formal genres such as academic writing, the distributions of LVCs in ICE-SL and ICE-IND are relatively surprising. The genre of non-professional writing (W1A) displays a high density of LVCs in both ICE-SL (501.22) and ICE-IND (419.10) whereas the respective frequency for ICE-GB (162.25) is markedly lower. The British ICE data substantiate that non-professional writing is relatively similar to academic writing (W2A; cf. Zipp & Bernaisch 2012: 186) in that the normalised frequency of LVCs in non-professional writing is almost as low as that of academic writing (160.54). This, however, does not hold true for the Sri Lankan and Indian ICE data since there is a pronounced gap between the frequencies of LVCs in non-professional and academic writing in both datasets. Although the higher degree of formality in academic writing in comparison to non-academic writing finds reflection in lower frequencies of LVCs in the former in ICE-SL and ICE-GB, ICE-IND deviates from this trend.
In normalised frequency counts, academic writing in ICE-IND features 246.27 LVCs while the non-academic writing section of the component features 180.31 LVCs, which is unexpected given that non-academic writing is less marked by formality than academic writing. The cross-varietal comparison of LVCs in creative writing (W2F) does not yield a uniform picture either. In ICE-IND (487.75) as well as ICE-GB (337.75), LVCs are relatively frequent, but with ICE-SL (205.36), it is interesting to find that they occur less frequently than in the persuasive writing (W2E; 302.97) and reportage (W2C; 223.49) sections of the component – genres which are not per se markedly informal.

26. No pairwise comparison of the occurrence of LVCs in the ICE genres across the different ICE components yields statistically significant differences (p > 0.05).


Still, when it comes to the correspondence (W1B) section, a common trend across the datasets can be observed. Table 29 documents the absolute and relative frequencies of LVCs in business and social letters across the ICE components.

Table 29. Absolute and relative frequencies of LVCs in social and business letters in ICE

                   ICE-SL                   ICE-IND                  ICE-GB
                   abs. freq.  rel. freq.   abs. freq.  rel. freq.   abs. freq.  rel. freq.
Social letters     15          75.00%       16          72.73%       8           53.33%
Business letters   5           25.00%       6           27.27%       7           46.67%
TOTAL              20          100.00%      22          100.00%      15          100.00%
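The counts in Table 29 can be submitted to a chi-square test of independence, with Cramér's V as a measure of effect size. The following is a minimal Python sketch: the function names are illustrative, and V is computed with the common formula sqrt(χ² / (N · (k − 1))), where k is the smaller table dimension; that the study used exactly this formula is an assumption.

```python
import math

# Rows: social letters, business letters.
# Columns: ICE-SL, ICE-IND, ICE-GB (absolute frequencies from Table 29).
table = [
    [15, 16, 8],
    [5, 6, 7],
]


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = sum(
        (observed - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, observed in enumerate(row)
    )
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


def cramers_v(stat, table):
    """Cramér's V for a chi-square statistic computed on the given table."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return math.sqrt(stat / (n * (k - 1)))


stat, df = chi_square(table)
v = cramers_v(stat, table)
print(f"chi2 = {stat:.2f}, df = {df}, Cramér's V = {v:.2f}")
```

For the letter counts this gives χ² ≈ 2.17 at df = 2 and V ≈ 0.20, so the effect is weak and, with so small a sample, not significant.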

For the distribution of LVCs in social and business letters, no statistically significant differences can be attested and a weak correlation can be shown to exist (χ² ≈ 2.17, df = 2, p > 0.05, Cramer’s V ≈ 0.20).27 In each of the datasets, LVCs are attested more often in social letters than in business letters although this tendency may be stronger in ICE-SL and ICE-IND than in ICE-GB. Consequently, with correspondence texts, LVCs group according to formality across the varieties covered. With regard to the interplay between formality as represented by the genres covered in ICE and the frequency of LVCs, two central findings emerge from the data. In BrE, it appears to be the case that LVCs are indeed marked by a certain degree of informality (cf. Dixon 2005: 483) since they appear to have a tendency to accumulate in relatively informal genres such as e.g. creative writing (W2F). If formality increases, LVCs in BrE become more infrequent as is the case in e.g. academic writing (W2A). In relation to SLE and IndE, however, the data do not provide sufficient evidence to establish a similarly clear-cut association between informality and LVCs. True, there are some cases where comparatively informal genres feature a high number of LVCs (e.g. social letters in SLE) and formal genres do not (e.g. academic writing (W2A) in SLE), but LVCs are also frequent in texts at the rather formal end of the formality cline (e.g. in non-professional writing (W1A) in SLE) and they are rare in more informal genres (e.g. creative writing (W2F) in SLE). In sum, then, the informal character of LVCs as posited by Wierzbicka (1982), Live (1973) and others is more clearly visible in BrE than in SLE and IndE. In SLE and IndE, LVCs do not seem to be consistently stylistically marked by informality

27. There are no significant differences in pairwise comparisons of social and business letters across ICE-SL, ICE-IND and ICE-GB (p > 0.05).


and appear to be valid alternatives to simplex-verb constructions more independent of the communicative context in which they are employed. Apart from the frequency and distribution of LVCs, their internal structural realisation is also subject to variation within single and across varieties. The definition of LVCs at hand essentially allows for three variants of a combination of LV and noun, all of which are related to the realisation of the determiner. (114) If 〈w〉you’ve〈/w〉 never done it before, take the challenge and give it a try. 〈ICE-SL:W2D-014#74:2〉 (115) Our intrepid correspondent took the plunge, literally, and lived to tell the tale. 〈ICE-SL:W2D-012#225:9〉 (116) 〈bold〉 P〈/bold〉〈smallcaps〉ARTIES〈/smallcaps〉 concerned with financial reports should take note of the following recent developments in India: […]. 〈ICE-IND:W2A-020#3:1〉

In the construction LV + article + noun phrase, the article slot can be filled with the indefinite article a(n) as in (114), the definite article the as in (115) or the zero article Ø as in (116). Figure 28 provides an overview of the frequencies of the possible article realisations across the three ICE components and the newspaper datasets.28 The significant differences in the distribution of the article realisation in the three ICE components (p < 0.05) point to interesting tendencies.29 In relative terms, the indefinite article is the most frequent article realisation in ICE-SL (61.29%) and ICE-GB (49.21%) while the zero article occurs most frequently in the article slot in ICE-IND (56.00%). In each of the datasets investigated, the definite article occurs rarely in LVCs as it is the least frequent realisation in ICE-SL (8.06%), ICE-IND (5.00%) and ICE-GB (9.52%). Thus, the first row in Figure 28 shows similarities between SLE and BrE when it comes to article realisations in LVCs since the indefinite article is the dominant option and the zero article an important minority variant in both datasets. This preference in SLE supports Hoffmann et al.’s (cf. 2011: 270) stance on internal structural variation stating that LVCs with the definite or the zero article are less frequent alternatives to LVCs with the indefinite article in South Asian Englishes. Apparently, this trend also applies to BrE, but not to IndE.

28. As mentioned above, LVCs with HAVE in the ICE data have been retrieved via the search term HAVE a/an. Consequently, the LVCs with HAVE are not taken into consideration when investigating the internal structural variation of LVCs.

29. In the pairwise comparisons of article realisation in LVCs, ICE-SL and ICE-IND are significantly different from each other (p < 0.01), but neither the pair ICE-SL and ICE-GB nor ICE-IND and ICE-GB differ significantly from each other (p > 0.05).

[Figure 28: two bar-chart panels; y-axis: Rel. Freq. (%); series: Indefinite article, Definite article, Zero article. ICE data (x-axis: Corpus): ICE-SL indefinite 61.29% (n = 38), zero 30.65% (n = 19), definite 8.06% (n = 5); ICE-IND indefinite 39% (n = 39), zero 56% (n = 56), definite 5% (n = 5); ICE-GB indefinite 49.21% (n = 31), zero 41.27% (n = 26), definite 9.52% (n = 6). Newspaper data (x-axis: Corpus): SAVE-SL indefinite 34.44% (n = 145), zero 58.67% (n = 247), definite 6.89% (n = 29); SAVE-IND indefinite 38.06% (n = 228), zero 52.75% (n = 316), definite 9.18% (n = 55); BNC news indefinite 44.93% (n = 501), zero 42.42% (n = 473), definite 12.65% (n = 141).]

Figure 28. Relative and absolute frequencies of article variants in LVCs in the ICE and newspaper data

The newspaper data add another perspective on article realisation in LVCs in the varieties concerned. Between the internal article realisations in the newspaper datasets, there are highly significant differences with a weak correlation (χ² ≈ 40.66, df = 4, p < 0.001, Cramer’s V ≈ 0.10).30 The definite article is least common in SAVE-SL (6.89%), SAVE-IND (9.18%) and BNC news (12.65%), which indicates that this structural option is the least admissible alternative in

30. The pairwise comparisons of the article realisation in LVCs across the datasets show that there are no significant differences between SAVE-SL and SAVE-IND (p > 0.05) while there are statistically highly significant differences between SAVE-SL and BNC news and SAVE-IND and BNC news (p < 0.001).


BrE and in the South Asian Englishes and thus confirms the ICE-based findings. There are, however, discrepancies in relation to the preferred article realisation in the newspaper data in that the zero article figures most prominently in SAVE-SL (58.67%) and in SAVE-IND (52.75%). With BNC news, the indefinite article (44.93%) is slightly more frequent than the zero article (42.42%). In comparison to the trends delineated in the ICE data, the preferred article use in LVCs in IndE, i.e. the zero article, and BrE, i.e. the indefinite article, is corroborated, but in SLE, the zero article is the most frequent realisation in the newspaper data in relative terms while the indefinite article is the structural realisation with the highest relative frequency in ICE-SL. In the light of these findings, it can be stated that SLE, IndE and BrE share the tendency to feature the definite article least frequently in LVCs. As regards the most frequent article in LVCs, the ICE and newspaper datasets show stable tendencies for IndE preferring the zero article and BrE favouring the indefinite article. In SLE, the default article realisation in LVCs is less clear-cut in that both indefinite articles and zero articles are frequent structural options. The statement that “the more ‘exotic’ patterns like give Ø boost and take the walk are really minority variants in IndE and in the other South Asian Englishes” (Hoffmann et al. 2011: 270) thus needs to be discussed in more detail. While the present study can also show that the definite article in LVCs is infrequent across the varieties covered, zero articles are central structural options in the formation of LVCs (e.g. TAKE Ø care, TAKE Ø note). Therefore, zero article usage in LVCs is not at all out of the ordinary in SLE, IndE or BrE.
What primes GIVE Ø boost as out of the ordinary is the fact that this construction has a productive alternative with the indefinite article in the LVC GIVE a boost, which is not the case with most other LVCs using zero articles. The combinability of a given noun with a particular set of LVs has also been subject to cross-varietal examinations (cf. Hoffmann et al. 2011: 271) and is taken up here as well. Each combination of LV and noun which occurs five or more times in any of the ICE datasets is cross-checked in the remaining ICE components to examine whether the same noun is used with different LVs under the condition that the meaning of the entire LVC stays the same. The LVCs for which at least five examples in a single ICE component could be found are HAVE a(n)/the/Ø effect, TAKE a(n)/the/Ø account, TAKE a(n)/the/Ø care, HAVE a(n)/the/Ø impact, PUT a(n)/the/Ø end, TAKE a(n)/the/Ø look and TAKE a(n)/the/Ø note. On the basis of the ICE data, the picture that emerges with regard to LV variation with the same noun given semantically stable conditions is uniform across the varieties. No variety-specific variation with the LV can be attested with any

of the above LVCs. The ICE data indicate that the associations between LV and noun to produce a particular meaning are cross-varietally stable. The LVC TAKE a(n)/the/Ø look is complemented by HAVE a(n)/the/Ø look, but the latter form is productive in all three varieties as well. Investigating the above LVCs in the newspaper corpora gives further empirical support to the cross-varietal stability of the association between a LV and a given noun to convey certain semantic information. With none of the patterns examined can any manifest differences across SAVE-SL, SAVE-IND and BNC news be substantiated.

5.2.2 Light-verb constructions: Potentially innovative light-verb constructions

In addition to some of the variety-specific trends presented in 5.2.1, a number of LVCs described as potential candidates for structurally nativised forms may also shape the lexicogrammar of the varieties under scrutiny. These LVCs are

– HAVE a(n)/the/Ø glimpse
– TAKE a(n)/the/Ø benefit from
– TAKE a(n)/the/Ø lease
– TAKE a(n)/the/Ø call
– PUT a(n)/the/Ø chat
– PUT a(n)/the/Ø fight
– PUT a(n)/the/Ø nap
– PUT a(n)/the/Ø rest.

In the newspaper data, the searches are lexically anchored in the nouns and the corresponding LV can occur in a window of ten words to the left and ten words to the right of the node word. For the GAST searches, the search pattern features the LV, the indefinite article and the noun. The only exception here is TAKE benefit from, which does not take the indefinite article (cf. Hoffmann et al. 2011: 275–276) and was thus searched for with a zero article in GAST. The results of HAVE a(n)/the/Ø glimpse, TAKE a(n)/the/Ø benefit from and TAKE a(n)/the/Ø lease in the newspaper data are given in Table 30. Generally, the three candidates for lexicogrammatical nativisation shown above are extremely rare in the newspaper data and they are absent from the BrE dataset altogether. HAVE a(n)/the/Ø glimpse as in Examples (117) and (118) occurs twice in SAVE-SL and SAVE-IND respectively while TAKE a(n)/the/Ø benefit from is absent from the newspaper writing available and TAKE a(n)/the/Ø lease as in (119) is attested once in the Indian newspaper data.
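The window-based retrieval described for the newspaper data can be illustrated with a minimal sketch. It is not the study's actual search tool: the tokeniser, the function name `find_lvc_candidates` and the lists of inflected LV forms in `LV_FORMS` are assumptions made for illustration only. The sketch anchors the search in the noun and reports every LV form found within ten tokens to the left or right of it.

```python
import re

# Hypothetical lists of inflected forms per light verb (an assumption,
# not taken from the study).
LV_FORMS = {
    "HAVE": {"have", "has", "had", "having"},
    "TAKE": {"take", "takes", "took", "taken", "taking"},
    "PUT": {"put", "puts", "putting"},
}


def find_lvc_candidates(text, noun, lv, window=10):
    """Return (verb_index, noun_index) token pairs in which a form of
    `lv` occurs within `window` tokens to the left or right of `noun`."""
    # Very simple tokeniser: lower-cased alphabetic word forms only.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    forms = LV_FORMS[lv]
    hits = []
    for i, token in enumerate(tokens):
        if token != noun:  # the search is anchored in the node noun
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i and tokens[j] in forms:
                hits.append((j, i))
    return hits


# Example (118) from the SAVE-SL data.
example_118 = ("Thousands flocked to the streets of Kandy on that day "
               "to have a glimpse of Saradiel.")
print(find_lvc_candidates(example_118, "glimpse", "HAVE"))
```

A window search of this kind only yields candidates. Hits in which the verb and the noun co-occur without forming an LVC still have to be discarded by manual inspection.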

Table 30. Absolute and normalised (pmw) frequencies of HAVE a(n)/the/Ø glimpse, TAKE a(n)/the/Ø benefit from and TAKE a(n)/the/Ø lease in the newspaper data

                               SAVE-SL                  SAVE-IND                 BNC news
                               abs. freq.  norm. freq.  abs. freq.  norm. freq.  abs. freq.  norm. freq.
HAVE a(n)/the/Ø glimpse        2           0.65         2           0.65         0           0
TAKE a(n)/the/Ø benefit from   0           0            0           0            0           0
TAKE a(n)/the/Ø lease          0           0            1           0.33         0           0
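The normalised frequencies in Table 30 are absolute frequencies scaled to one million words. A minimal sketch of this normalisation follows. The corpus size used here is an assumption: the figure of 3,060,000 words is merely back-computed so that it reproduces the rounded values in the table (2 hits = 0.65 pmw, 1 hit = 0.33 pmw) and is not a size reported in this section.

```python
# Hypothetical corpus size in words, back-computed from Table 30
# (an assumption, not a figure stated in this section).
ASSUMED_CORPUS_SIZE = 3_060_000


def per_million_words(abs_freq, corpus_size):
    """Normalise an absolute frequency to occurrences per million words."""
    return abs_freq / corpus_size * 1_000_000


for label, abs_freq in [("HAVE a(n)/the/Ø glimpse", 2),
                        ("TAKE a(n)/the/Ø lease", 1),
                        ("TAKE a(n)/the/Ø benefit from", 0)]:
    pmw = per_million_words(abs_freq, ASSUMED_CORPUS_SIZE)
    print(f"{label}: {pmw:.2f} pmw")
```

Normalisation of this kind makes counts comparable across datasets of different sizes, which matters most for the GAST domains, where the absolute hit counts differ by orders of magnitude.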

(117) Locals flocked to the spot to have a glimpse of the idols. 〈SAVE-IND-SM_2005-01-06〉

(118) Thousands flocked to the streets of Kandy on that day to have a glimpse of Saradiel. 〈SAVE-SL-DN_2002-05-07〉

(119) Forcibly occupied land was distributed amongst CPI-M cadres who failed to plough economically. Now multinationals are invited, to take lease of such land for economic production. 〈SAVE-IND-SM_2003-03-01〉

This distribution could be taken as a very subtle indication that HAVE a(n)/the/Ø glimpse may be in use in SLE and IndE and TAKE a(n)/the/Ø lease in IndE. The GAST search results depicted in Table 31 may shed some additional light on the currency of the LVCs under scrutiny.

Table 31. Absolute and normalised (pmw) frequencies of HAVE a glimpse, TAKE Ø benefit from and TAKE a lease in GAST

                      GAST-SL                  GAST-IND                 GAST-GB
                      abs. freq.  norm. freq.  abs. freq.  norm. freq.  abs. freq.  norm. freq.
HAVE a glimpse        2,945       1.40         221,880     3.50         100,130     0.31
TAKE Ø benefit from   9           0.00         155,180     2.45         50,350      0.16
TAKE a lease          35          0.02         6,383       0.10         67,220      0.21

The normalised frequencies show that HAVE a glimpse occurs more frequently in GAST-SL and GAST-IND than in GAST-GB, but TAKE Ø benefit from and TAKE a lease are more frequent in GAST-IND and GAST-GB than in GAST-SL. The data for HAVE a glimpse support the claim that this LVC is “[a] likely candidate for a South Asian regionalism” (Hoffmann et al. 2011: 273) as it is relatively frequent in both SLE and IndE. TAKE Ø benefit from, however, is frequent in the IndE online data and occurs only rarely in SLE online texts. Given that this LVC is also marginally frequent in the BrE online data, it is probably the case that TAKE Ø benefit from is a novel lexicogrammatical construction, which, however, seems to be neither particularly South Asian nor Sri Lankan. To a more limited extent, this also holds true for TAKE a lease, which occurs in IndE and BrE online data, but is extremely rare in the respective SLE data. In the light of the by and large absence of these constructions from the newspaper data, however, it seems that these possibly recent instantiations of LVCs have not yet permeated into (variety-specific) acrolectal language use and are probably still nothing but candidates for structural nativisation in SLE (and IndE) or structural innovation in BrE.

The first noun to provide further insights into the productivity of PUT as a LV is call. The frequencies of the LVCs with call which have the same meaning as the simplex verb CALL (someone via phone) in the newspaper data are shown in Table 32.

Table 32. Absolute and relative frequencies of LVCs with call in the newspaper data

                       SAVE-SL                 SAVE-IND                BNC news
                       abs. freq.  rel. freq.  abs. freq.  rel. freq.  abs. freq.  rel. freq.
GIVE a(n)/the/Ø call   3           37.50%      1           10.00%      3           18.75%
MAKE a(n)/the/Ø call   5           62.50%      9           90.00%      13          81.25%
PUT a(n)/the/Ø call    0           0%          0           0%          0           0%
TAKE a(n)/the/Ø call   0           0%          0           0%          0           0%
TOTAL                  8           100.00%     10          100.00%     16          100.00%

The frequencies of the LVCs with call do not exhibit statistically significant differences (p > 0.05).31 In the newspaper data, PUT is not used as a LV with call. Instead, MAKE occurs most frequently with call in SAVE-SL (62.50%), SAVE-IND (90.00%) and BNC news (81.25%) and relevant examples are shown in (120) and (121).

(120) While police was busy at work, one of the masons who had also gone to Mumbai made a call to a resident of Bahin village and told the stories about police harassment. After getting the telephone call, the villagers managed to catch three of the killers. 〈SAVE-IND-SM_2004-08-17〉

31. The respective pairwise comparisons of the newspaper datasets concerned do not yield significant differences either (p > 0.05).

(121) It ‘s important to let your children know that when a person steps out of a social or familial situation to use a mobile phone, they keep themselves from experiencing the moment; cell phones can become a constant pending (and sometimes realized) distraction. […] You’ll need to teach your children to make a call by excusing himself or herself. 〈SAVE-SL-DM_2007-04-10〉
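The relative frequencies in Table 32 and the reported absence of significant differences can be retraced with a short sketch. The test chosen here is an assumption: this section does not name the significance test used, and a Pearson chi-square test is applied purely for illustration. With expected counts below five, as in these data, an exact test would normally be preferable. PUT and TAKE are left out of the test because they do not occur in any dataset.

```python
import math

# Absolute frequencies from Table 32.
table_32 = {
    "SAVE-SL": {"GIVE": 3, "MAKE": 5, "PUT": 0, "TAKE": 0},
    "SAVE-IND": {"GIVE": 1, "MAKE": 9, "PUT": 0, "TAKE": 0},
    "BNC news": {"GIVE": 3, "MAKE": 13, "PUT": 0, "TAKE": 0},
}


def relative_frequencies(counts):
    """Percentage share of each LV in the total for one dataset."""
    total = sum(counts.values())
    return {lv: round(100 * n / total, 2) for lv, n in counts.items()}


def chi_square(rows):
    """Pearson chi-square statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


for corpus, counts in table_32.items():
    print(corpus, relative_frequencies(counts))

# 3 x 2 table (three datasets, GIVE vs. MAKE): two degrees of freedom.
rows = [[c["GIVE"], c["MAKE"]] for c in table_32.values()]
statistic = chi_square(rows)
# For exactly two degrees of freedom the chi-square p-value has the
# closed form exp(-statistic / 2).
p_value = math.exp(-statistic / 2)
print(f"chi-square = {statistic:.2f}, p = {p_value:.2f}")
```

Under these assumptions the p-value lies well above 0.05, which is in line with the statement that the distribution of LVCs with call does not differ significantly across the newspaper datasets.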

There is one particularly interesting instance in which call combines with TAKE in the SAVE-SL texts as shown in Example (122). However, this construction does not meet the definition of LVCs in the present study since the noun is a compound.

(122) Saman Indrakumara who was with a girl was arrested there and at the time of the arrest the girl was taking a telephone call from a nearby telephone box and the accused was leaning to a lamp post. 〈SAVE-SL-DM_2002-06-14〉

In this example, the girl in company of Saman Indrakumara calls someone via phone and this situation is encoded as was taking a telephone call. The LVC TAKE a(n)/the/Ø call with this meaning, which contrasts with that of answering the phone, has been described as being markedly Sri Lankan (cf. Meyler 2007: 252; Mukherjee 2012: 206) and constitutes an example of a nativised LVC in SLE. However, on the basis of the SAVE-SL data, there is also no denying that the majority of the LVCs TAKE a(n)/the/Ø call stand for answering the phone. Supplementary findings from the GAST searches are provided in Figure 29.32 The most frequent LV in connection with call is MAKE in GAST-SL (97.74%), in GAST-IND (59.09%) and GAST-GB (86.20%) while GIVE a call and in particular PUT a call occur with markedly lower frequencies. Consequently, MAKE a call seems to be the cross-varietally stable default LVC variant to the simplex verb CALL in the sense of calling someone via phone whereas GIVE and PUT a call, potential alternatives to the default option, should probably not be regarded as viable alternants due to their low frequency with GIVE a call in GAST-IND as the only exception. Still, what needs to be considered here is that MAKE a call typically does not take an indirect object, but GIVE a call and PUT a call potentially can. The presence or absence of an indirect object may thus have an influence on which LV for call is chosen in a given syntactic environment.

32. TAKE as a LV was not included in the searches as it would not have been possible to semantically clean the data from instances in which TAKE a call does not mean calling someone via phone. It goes without saying that semantic variation can also be found with the other LVCs scrutinised here, but this does not distort the results with GIVE/MAKE/PUT a call as much as it does with TAKE a call.

[Figure 29: grouped bar chart. x-axis: Data (GAST-SL, GAST-IND, GAST-GB); y-axis: Rel. Freq. (%); series: GIVE a call, MAKE a call, PUT a call. Bar labels for MAKE a call: GAST-SL 97.74% (n = 117498); GAST-IND 59.09% (n = 311880); GAST-GB 86.2% (n = 898200). Bar label for GIVE a call in GAST-IND: 39.88% (n = 210450).]

Figure 29. Relative and absolute frequencies of LVCs with call in GAST

When it comes to LVCs with chat, the newspaper and the GAST data show relatively clear and uniform tendencies. In SAVE-SL, SAVE-IND and BNC news, all instances featuring chat as a noun in a LVC show that chat is combined with HAVE as in (123) and (124). With regard to the different LVCs in which chat is used, it is thus not surprising that there are no significant differences across the newspaper datasets (p > 0.05).33

(123) It is true that I had a chat with him, but I did not make such a proposal. 〈SAVE-SL-DM_2005-03-12〉

(124) It is not unusual to see late night revellers parking at night ‘Kades’ to fill themselves up with some real spicy ‘grub’, most of them park themselves having a chat over a cup of coffee or iced orange juice. 〈SAVE-SL-DN_2004-04-30〉

The GAST data give further backing to this association in that HAVE is used as a LV with chat in 97.09% of all cases in GAST-SL, in 98.47% of all cases in GAST-IND and in 99.39% of all cases in GAST-GB. Consequently, if chat is employed in a LVC, the default LV seems to be HAVE in SLE, IndE and BrE. There does not appear to be a significant minority variant in any of the varieties although PUT a chat can be attested in SLE and is even institutionalised in the title of the TV series

33. The respective pairwise comparisons do not yield significant differences either (p > 0.05).

‘Put a chat with auntie netta’ broadcast by ETV, a Sri Lankan English-medium TV station.

The tendencies in the newspaper texts with regard to fight in LVCs, however, are not as straightforward. While all occurrences of fight are complemented with HAVE in SAVE-SL as in (125), the highest relative frequencies in SAVE-IND (60.00%) and BNC news (50.00%) are found with GIVE a(n)/the/Ø fight as in (126) and (127).

(125) The underworld gangs we have often fight with each other as rivals and they often adhere to different political parties. 〈SAVE-SL-DN_2002-08-15〉

(126) The PDF is gearing up for the elections ready to give a tough fight to the ruling GNLF, who has boycotted the last three Parliamentary elections, “handing over” the seat to the CPI-M candidate. 〈SAVE-IND-SM_2004-02-17〉

(127) SPORTS enthusiasts representing Petersfield Chamber of Trade gave their opposition a tough fight in the four chamber sports competition on Saturday with Liphook emerging as victors in the final moments. 〈BNC C88〉

HAVE a(n)/the/Ø fight is a frequent minority variant in the Indian (40.00%) and British (37.50%) data. However, these differences are not statistically significant (p > 0.05).34 In contrast to this distribution, the GAST data yield much less ambiguous tendencies. HAVE a(n)/the/Ø fight is most frequent in GAST-SL (91.72%), GAST-IND (64.25%) and GAST-GB (93.61%) and only in GAST-IND could the alternative LVC with GIVE (22.07%) be attested with a noteworthy frequency. In the light of these findings, HAVE a(n)/the/Ø fight seems to be the default option in SLE independent of communicative context. In IndE and BrE, GIVE a(n)/the/Ø fight seems a viable alternative in (highly edited) newspaper texts, but it retains this status with less edited online texts in IndE only. BrE tends to employ HAVE a(n)/the/Ø fight in online texts. PUT as a LV with fight is extremely rare in SLE as well as IndE and BrE. While LVCs with nap occur only twice in the newspaper data, i.e. once in HAVE a(n)/the/Ø nap in SAVE-IND and once in TAKE a(n)/the/Ø nap in BNC news, on the basis of which no statistically significant trends can be established (p > 0.05), the GAST results seem more prolific.35 The variants of the LVCs with nap investigated and their frequencies in the variety-specific domains are shown in Figure 30.

34. None of the pairwise comparisons concerned produced statistically reliable differences either (p > 0.05).

35. The respective pairwise comparisons do not show significant differences either (p > 0.05).

The distribution shows shared tendencies among the SLE and IndE data while BrE usage of nap in LVCs appears to be markedly different. In GAST-SL (87.45%) and GAST-IND (70.23%), the most frequent LV in combination with nap is TAKE whereas HAVE a nap occurs in 12.40% of all cases in GAST-SL and in 29.51% of all cases in GAST-IND. In GAST-GB, the most frequent option is HAVE a nap (87.96%) and TAKE a nap is attested with a relative frequency of 12.02%. Although PUT is used in combination with nap in each variety, it is of marginal importance from a frequency-based perspective. SLE and IndE apparently prefer TAKE a nap as exemplified in (128) over HAVE a nap as shown in (129) while the reverse is true for BrE.

[Figure 30: grouped bar chart. x-axis: Data (GAST-SL, GAST-IND, GAST-GB); y-axis: Rel. Freq. (%); series: GIVE a nap, HAVE a nap, PUT a nap, TAKE a nap. Bar labels, HAVE a nap / TAKE a nap: GAST-SL 12.4% (n = 167) / 87.45% (n = 1178); GAST-IND 29.51% (n = 27840) / 70.23% (n = 66250); GAST-GB 87.96% (n = 3384400) / 12.02% (n = 462300). The GIVE a nap and PUT a nap bars are all below 0.3%.]

Figure 30. Relative and absolute frequencies of LVCs with nap in GAST

(128) One day, just like his grandfather, he passed by the same forest, it was very hot, and he took a nap under the same tree and left the hats on the floor. 〈http://www.elakiri.lk/forum/showthread.php?t=34468〉 (17 October 2014)

(129) Santa was all ready for Christmas and was having a nap. 〈http://www.landofmagic.co.uk/christmas_stories.htm〉 (17 October 2014)

A similar pattern of variety-specific LV preference can be attested with the noun rest. The frequencies of rest with the LVs under scrutiny in the newspaper data point towards shared preferences in SLE and IndE since TAKE occurs with rest in all examples in SAVE-SL and SAVE-IND as shown in (130) and (131). In contrast, 50.00% of the BrE text samples taken from BNC news show HAVE as a LV as in (132), and TAKE occurs in 29.17% of the BrE newspaper examples. The differences between the newspaper datasets are statistically significant (p < 0.05).36

(130) Recently while shooting on the sets of Mani Ratnamâs [= Ratnam’s] Guru, the beautiful Miss Rai had a minor accident while riding a bicycle for a scene. Even though the lass was advised to take rest, the actress decided to resume shooting after a dayâs [= day’s] rest. 〈SAVE-SL-DM_2006-06-05〉

(131) After 65 years of toil put into making idols of deities and models of the great, this octogenarian artist has been virtually compelled to take rest this year owing to old-ageand [= old-age and] illness. 〈SAVE-IND-SM_2004-10-05〉

(132) Corrie’s Original Easy Kneeler Stool (about Â£33 [= £33]) is robustly constructed and comes with a foam mat that can be used to kneel on when weeding or, turned over, to sit on while having a rest and a cup of tea. 〈BNC AHK〉

The findings based on the online searches are documented in Figure 31. They support the trends deduced from the newspaper texts. TAKE a rest is the LVC which occurs most frequently in relative terms in the Sri Lankan (73.11%) and the Indian (68.30%) online data. HAVE a rest is the second most frequent LVC in both domains. In the BrE data, HAVE a rest (73.37%) can be found most frequently and TAKE a rest occurs with a relative frequency of 19.05%. Thus, with rest as a noun in LVCs, SLE and IndE have a shared default option, namely TAKE a rest followed by HAVE a rest, but BrE deviates from this pattern in that HAVE a rest appears to be the preferred realisation in this variety. LVCs with PUT and rest as a noun are attested, but very infrequent in comparison to the other LVCs with rest. The findings in relation to frequency, distribution and types of LVCs in ICE and the newspaper data can be summarised as follows. Although IndE and SLE to a more limited extent may have a tendency to use more LVCs than BrE, these cross-varietal differences in terms of the overall frequency of LVCs are systematic for IndE in comparison to BrE only. While both HAVE and TAKE are characteristic (though not the most frequent) LVs in SLE (and BrE), TAKE generally appears to be the most typical LV in IndE. BrE favours the use of the indefinite article in LVCs while IndE displays zero articles most frequently. Preferred article use in SLE LVCs is not as clear-cut since both indefinite and zero articles figure prominently. The informal character of LVCs is relatively obvious in the BrE data, but there is enough evidence against a similarly strong association between LVCs

36. The pairwise comparisons, however, do not exhibit statistically significant differences (p > 0.05).

[Figure 31: grouped bar chart. x-axis: Data (GAST-SL, GAST-IND, GAST-GB); y-axis: Rel. Freq. (%); series: GIVE a rest, PUT a rest, HAVE a rest, TAKE a rest. Bar labels, HAVE a rest / TAKE a rest: GAST-SL 23.46% (n = 636) / 73.11% (n = 1982); GAST-IND 29.64% (n = 138390) / 68.3% (n = 318917); GAST-GB 73.37% (n = 1007300) / 19.05% (n = 261580). The GIVE a rest and PUT a rest bars are all below 6%.]

Figure 31. Relative and absolute frequencies of LVCs with rest in GAST

and informality in SLE and IndE. BrE, IndE and SLE are relatively uniform in that a given noun is combined with the same (set of) LV(s) to produce a particular meaning across all the varieties concerned. As regards potentially innovative LVCs, the newspaper and GAST data point towards HAVE a(n)/the/Ø glimpse as a novel South Asian LVC, which, however, has not been integrated into the respective acrolects. In contrast, TAKE a(n)/the/Ø benefit from and TAKE a(n)/the/Ø lease are neither particularly Sri Lankan nor Indian, but possible candidates for the integration into the stock of recurrent LVCs in each of the varieties examined including BrE. Although PUT is undoubtedly a productive LV in SLE, LVCs with PUT are generally infrequent and thus do not constitute default LVC variants in SLE (or IndE/BrE) with the nouns call, chat, fight, nap and rest. The data show shared tendencies between SLE and IndE in the use of LVCs with nap and rest (in juxtaposition to BrE; cf. Hoffmann et al. 2011: 272) while the LVCs MAKE a call, HAVE a chat and, though not as clearly, HAVE a fight are the default LVC patterns of the respective nouns across all varieties covered.

5.3 Verb complementation

Another structurally intriguing area of investigation located at the interface between lexis and grammar is verb complementation. It has been shown that (habitual) combinations of a given verb with its complements differ quantitatively

between (regional) varieties (cf. Mukherjee 2008 for SLE; Schilk 2011 for IndE) and may thus also be a fruitful starting point when delineating the development of newly-emerging Englishes (cf. Schneider 2007: 46). Olavarría de Ersson and Shaw (2003: 138) even go so far as to assign verb complementation supremacy over lexis in the characterisation of varieties of English when they state that “[v]erb complementation is an all-pervading structural feature of language and thus likely to be more significant in giving a variety its character than, for example, lexis”. The present study combines lexical with lexicogrammatical analyses to describe the emergence of SLE as a full-fledged variety, and thus avoids ascribing priority to either lexis or lexicogrammar in variety characterisation in favour of a complementary approach. Still, it is true that certain verbs (e.g. GIVE, OFFER, SEND, etc.) and their (deducible variety-specific preferences of) complementation patterns occur with relatively high frequencies in each variety while the analysis of lexical items may prove less informative in some cases since certain lexical items may occur only rarely. Investigations of high-frequency phenomena such as patterns of verb complementation therefore facilitate cross-varietal comparison and description although they do not necessarily characterise a particular variety as overtly as lexical items given that (variety-specific) verb-complementational profiles are more likely to escape speakers’ linguistic awareness (cf. Schneider 2007: 87). Verb complementation patterns may thus be considered more implicit variety markers and indicators of variety status than lexical items. When it comes to verb complementation in SLE, the verbs HATE, LIKE and LOVE have been described to display verb-complementational preferences different to the ones in BrE. 
Meyler (2007: 151) puts forward that “[v]erbs such as ‘like/love/hate’ can be followed either by the gerund (‘playing’) or by the infinitive (‘to play’). In BSE the gerund is more common, while in SLE the infinitive is more common”. Thus, it is certainly interesting to revisit this claim on an empirical basis to examine whether the suggested verb-complementational preferences in fact represent diverging lexicogrammatical trends between SLE and BrE (as well as IndE). Examples of the alternations between ‑ing complements and infinitive complementation of HATE, LIKE and LOVE are given from (133) to (138).

(133) There is good news for those who hate going to doctors and taking bitter medicines. 〈SAVE-IND-TI_38019〉

(134) Things men hate to hear: “My last boyfriend…[.]” 〈SAVE-SL-DM_2007-04-10〉

(135) I did not like sleeping alone in the dark – or even with a light on come to think of it. 〈SAVE-SL-DN_2002-11-04〉

(136) We would also like to explore other economic areas. 〈SAVE-SL-DN_2001-12-28〉

(137) Men love feeling useful, and seeking their advice definitely fits the bill. 〈SAVE-SL-DM_2007-04-10〉

(138) Esha who prides herself original, said she loved to design and tailor since she was very young. 〈SAVE-SL-DM_2004-12-23〉

The alternation between ‑ing and infinitival complements of certain (groups of) verbs (e.g. AVOID, PREVENT) has been subject to diachronic change in the sense that there is a (recent) tendency that some verbs prefer ‑ing over infinitival forms as far as non-finite complements are concerned (cf. Vosberg 2006: 273). This trend away from infinitival towards ‑ing forms with non-finite complements has been referred to as the Great Complement Shift (cf. e.g. Rohdenburg 2006: 159; Vosberg 2006: 18). In her diachronic corpus-based study on subject-control verbs, i.e. “verbs governing a complement clause whose unexpressed subject has the matrix subject as antecedent or ‘controller’” (Fanego 1996: 29), Fanego dates the onset of the Great Complement Shift to approximately 1650.

Verbal gerunds developed very slowly in object position, to the extent that only one was recorded in stage I (1400–1570) and just two more in stage II (1570–1640). It is only from the second half of the 17th century onwards that the verbal gerund is found in significant numbers as the complement of a subject-control verb and can thus be considered an established feature of English usage. (Fanego 1996: 55)

Except for the group of verbs of attempting and venturing (e.g. ATTEMPT, ADVENTURE), most verbs concerned display ‑ing complements with incrementally increasing frequency by the 18th century (cf. Fanego 1996: 49). Still, the reasons for this general rise in ‑ing complements are far from clear, but it may be that the increase of ‑ing forms as verbal complements may be a contemporary corollary of an expansion of ‑ing forms in other areas of grammar such as the progressive (cf. Fanego 1996: 56). Despite this general development, there are certain principles rooted in the structural surrounding which can exert an influence on the choice of one verbal complement over the other. First, the principle of horror aequi, which was probably coined by Brugmann (1909; cf. Rohdenburg 2003: 244), “involves the widespread (and presumably universal) tendency to avoid the use of formally (near-) identical and (near-)adjacent (non-coordinate) grammatical elements or structures” (Rohdenburg 2003: 236). In relation to ‑ing and to-infinitive verbal complements, this means that if the verb in the superordinate clause is realised with an ‑ing form, then the verbal complement is likely to be infinitival (e.g. The parents blamed their children for neglecting to help with the housework) and if the verb in the superordinate clause takes an infinitival form, the ‑ing complement is preferred (e.g. To neglect helping with the housework can lead to tension). There is corpus-based evidence that the principle of horror aequi is clearly at work as regards non-finite verbal complements

(cf. Vosberg 2006: 275). Second, Vosberg (cf. 2006: 276) is able to show that infinitival complementation is the more frequent option with discontinuous structures featuring non-finite complements (e.g. You should remember by all means to bring enough money rather than You should remember by all means bringing enough money). Third, with non-canonical clause element structures (e.g. with restrictive relative clauses as in The test I failed to pass was difficult), infinitives are preferred as verbal complements (cf. Vosberg 2006: 278). The degree to which the Great Complement Shift and the related principles have permeated into language use in South Asia has not yet been systematically explored. However, in accordance with Meyler (cf. 2007: 151), the research hypothesis for the complementation of HATE, LIKE and LOVE is that infinitival complementation patterns are more frequent than complementation patterns with ‑ing forms in the SLE data while the reverse is true for the BrE data. By implication, this would mean that the Great Complement Shift has (already) affected the structure of BrE, but not that of SLE. Due to their semantic characteristics, HATE, LIKE and LOVE belong to the group of what Quirk et al. (1985: 1192) refer to as “emotive verbs”. For this group of verbs, several selection criteria which supposedly guide the choice between ‑ing form and infinitive in the verbal complement have been posited. While to-infinitives have been argued to be chosen as complements of emotive verbs to express “‘potentiality’” (Quirk et al. 1985: 1192) or to refer to a concrete event (cf. Zandvoort 1965: 28–29; Poutsma 1904: 625), ‑ing complements are described to stand for factuality (cf. Quirk et al. 1985: 1192) or a general statement (cf. Zandvoort 1965: 28–29; Poutsma 1904: 625). In this context, Bladon (cf. 1968: 214) suggests an even more complex complement selection process under consideration of whether the complement expresses fulfilled or unfulfilled desire or actual or conditional enjoyment. Given that the functional profiles of the ‑ing and infinitival complements of emotive verbs delineated by Quirk et al. (1985) in comparison to Zandvoort (1965) and Poutsma (1904) could be regarded as contradictory, it is hardly surprising that Fanego’s (1996: 31) corpus data for emotive verbs contain “counterexamples showing the inadequacy of practically every criterion proposed in the literature”. In terms of the diachronic development of ‑ing as opposed to infinitival complements, Fanego’s (cf. 1996: 49) largely BrE historical data show that emotive verbs generally follow the trend of other verbs becoming more frequently complemented with ‑ing forms to the exception of LOVE, which – possibly due to its high frequency – appears to be reluctant to participate in the Great Complement Shift. The data at hand will be able to shed light on the present-day complementational profile of the three emotive verbs in BrE as well as the two South Asian Englishes studied.

With the offline datasets, the data for the ‑ing complements are retrieved via the search term HATE/LIKE/LOVE *ing while the combinations of the given verbs with to-infinitive constructions are searched for using the expression HATE/LIKE/LOVE to. In the GAST searches, the frequencies of ‑ing complements are established by the frequencies of the patterns HATE/LIKE/LOVE being/having/saying since BE, HAVE and SAY are the most frequent verbs in the offline datasets and may thus be considered decent candidates to frequently combine with HATE, LIKE and LOVE. Accordingly, infinitival complementation of HATE, LIKE and LOVE is examined via the expressions HATE/LIKE/LOVE to be/to have/to say. The verb-complementational profiles of the three verbs in SLE, IndE and BrE are closely examined in Chapters 5.3.1 to 5.3.3 and then discussed in relation to variety-formation of SLE.

5.3.1 Verb complementation: HATE

Variety-specific verb-complementational differences have been argued to exist with HATE. The relative and absolute frequencies of ‑ing complements (Ving) and infinitival complementation (to V) of HATE across the variety-specific offline and online datasets are given in Figure 32.

[Figure 32: three bar-chart panels (ICE-SL, ICE-IND, ICE-GB; SAVE-SL, SAVE-IND, BNC news; GAST-SL, GAST-IND, GAST-GB). x-axis: Corpus (ICE and newspaper panels) or Data (GAST panel); y-axis: Rel. Freq. (%); series: HATE Ving, HATE to V. No bar in the ICE-SL, ICE-IND, ICE-GB, SAVE-SL or SAVE-IND groups exceeds n = 3. Bar labels, HATE Ving / HATE to V: BNC news 53.7% (n = 29) / 46.3% (n = 25); GAST-SL 9.26% (n = 863) / 90.74% (n = 8453); GAST-IND 29.66% (n = 246061) / 70.34% (n = 583559); GAST-GB 22.42% (n = 543903) / 77.58% (n = 1882491).]

Figure 32. Relative and absolute frequencies of HATE Ving and HATE to V in the ICE, newspaper and GAST data

In the ICE and newspaper data, HATE in complementation with either an ‑ing form as in (139) or a to-infinitive construction as in (140) is generally infrequent. The offline data do not yield statistically significant differences in either the ICE (p > 0.05) or the newspaper data (p > 0.05).³⁷

(139) This is the best I can do, as I hate trying to understand all this stuff and that is why I moved from being a macroeconomist to a microeconomist who deals in labour/poverty/inequality/gender, but of course it is all related as people lose jobs when companies cut back investment. 〈ICE-SL:W1B-012#30:3〉
(140) I hate to sound so fed up, particularly because you are doing so much at your own level & at your centre; so I would like to hear from you. 〈ICE-IND:W1B-015#153:3〉
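The retrieval logic described for the offline datasets (HATE/LIKE/LOVE followed by an ‑ing form or by to) can be illustrated with examples (139) and (140). The sketch below is an assumption about how such a search might be approximated with regular expressions; it is not the concordancing software or query syntax actually used, and the names `VING`, `TO_V` and `find_complements` are hypothetical.

```python
import re

# Surface forms of the three lemmas (hate/hates/hated/hating and so on).
VERB = r"(?P<verb>(?:hat|lik|lov)(?:e|es|ed|ing))"
LEMMA = {"hat": "HATE", "lik": "LIKE", "lov": "LOVE"}

# HATE/LIKE/LOVE *ing: the verb directly followed by a word ending in -ing.
VING = re.compile(r"\b" + VERB + r"\s+(?P<comp>\w+ing)\b", re.IGNORECASE)
# HATE/LIKE/LOVE to: the verb directly followed by "to" and the next word.
TO_V = re.compile(r"\b" + VERB + r"\s+to\s+(?P<comp>\w+)", re.IGNORECASE)


def find_complements(text):
    """Return (lemma, pattern label, complement word) for each candidate hit."""
    hits = []
    for label, pattern in (("Ving", VING), ("to V", TO_V)):
        for match in pattern.finditer(text):
            lemma = LEMMA[match.group("verb")[:3].lower()]
            hits.append((lemma, label, match.group("comp")))
    return hits


examples = [
    "This is the best I can do, as I hate trying to understand all this stuff",
    "I hate to sound so fed up, particularly because you are doing so much",
]
for sentence in examples:
    print(find_complements(sentence))
```

A purely form-based search of this kind over-retrieves (for instance, like used as a preposition, or nouns ending in ‑ing), so hits would still need manual inspection before counting.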

Because HATE is so infrequent with either complement type in the offline data, it is not possible to deduce clear-cut verb-complementational trends for either SLE or IndE. The BNC news texts show that ‑ing forms (53.70%) and infinitival complementation (46.30%) are almost equally frequent options in BrE. The GAST data, however, provide a clearer picture. The complementation of HATE with ‑ing forms is consistently less frequent in GAST-SL (9.26%), GAST-IND (29.66%) and GAST-GB (22.42%) than infinitival complementation. The shared preference for to-infinitive constructions is apparently most pronounced in SLE while ‑ing complements appear to be minority variants in IndE and BrE. The data thus show that HATE in SLE seems to have a tendency to be followed by to-infinitive constructions, but data scarcity in the SLE offline datasets demands additional empirical evidence to delineate more reliable verb-complementational trends. The same preference for infinitival complementation can also be established for IndE and BrE, and ‑ing complements can be argued to be more important minority variants in the complementational profiles of HATE in IndE and BrE than in SLE.

37. None of the respective pairwise comparisons in the ICE and newspaper data shows statistically significant differences either (p > 0.05).

5.3.2 Verb complementation: LIKE

The second verb for which verb-complementational differences between SLE and BrE have been posited is LIKE. The frequencies of ‑ing and infinitival complementation of LIKE across the variety-specific text collections are plotted in Figure 33. In the ICE data, a cross-varietally stable trend for the complementation of LIKE can be shown since the to-infinitive construction as in (141) seems to be the default complement of LIKE in ICE-SL (95.95%), ICE-IND (93.22%) and ICE-GB (98.53%). There are no significant differences in comparisons across all datasets (p > 0.05),³⁸ which provides further evidence of the stability of the association between LIKE and infinitival complementation across the varieties. To-infinitive constructions also constitute the vast majority of LIKE complements in SAVE-SL (98.70%), SAVE-IND (95.88%) and BNC news (91.08%), but there are highly significant differences with a weak correlation in their quantitative distribution (χ² ≈ 21.36, df = 2, p < 0.001, Cramér’s V ≈ 0.12).³⁹ The low frequency of ‑ing complements of LIKE in SAVE-SL stands out and calls attention to the fact that, by implication, to-infinitives are particularly dominant in SAVE-SL. An example of the markedly infrequent ‑ing forms functioning as complements of LIKE in the Sri Lankan newspaper texts is given in (142).

[Figure 33: bar charts of relative frequency (Rel. Freq., %) with absolute frequencies (n) for LIKE Ving and LIKE to V; panels: ICE-SL, ICE-IND, ICE-GB (Corpus); SAVE-SL, SAVE-IND, BNC news (Corpus); GAST-SL, GAST-IND, GAST-GB (Data)]

Figure 33. Relative and absolute frequencies of LIKE Ving and LIKE to V in the ICE, newspaper and GAST data

38. There are no significant differences in any of the pairwise comparisons in ICE (p > 0.05).
39. With the pairwise comparisons of the newspaper data, SAVE-SL differs highly significantly from BNC news (p < 0.001), but there are no significant differences with the remaining pairs (p > 0.01 (Bonferroni correction for multiple pairwise comparisons)).


(141) Life is often too predictable, you like to be spontaneous and unscheduled. 〈ICE-SL:W2D-020#19:1〉
(142) Smell the roses. Nehru liked doing that. 〈SAVE-SL-DN_2002-08-15〉
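The test statistics reported for LIKE in the newspaper data (χ², df and Cramér’s V) can be recomputed from the absolute frequencies in Figure 33. The sketch below assumes a Pearson chi-square test of independence without continuity correction; the source does not state which software or exact procedure was used, so this is an assumption, and the names `LIKE_NEWS`, `chi_square` and `cramers_v` are hypothetical.

```python
import math

# Absolute frequencies (LIKE Ving, LIKE to V) in the three newspaper datasets.
LIKE_NEWS = {
    "SAVE-SL": (3, 227),
    "SAVE-IND": (12, 279),
    "BNC news": (95, 970),
}


def chi_square(table):
    """Pearson chi-square statistic, degrees of freedom and N for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df, n


def cramers_v(stat, n, n_rows, n_cols):
    """Effect size for a chi-square test on an r x c contingency table."""
    return math.sqrt(stat / (n * (min(n_rows, n_cols) - 1)))


table = [list(counts) for counts in LIKE_NEWS.values()]
chi2, df, n = chi_square(table)
v = cramers_v(chi2, n, len(table), len(table[0]))
# For df = 2 the chi-square survival function has the closed form exp(-x / 2).
p_value = math.exp(-chi2 / 2)
print(f"chi2 = {chi2:.2f}, df = {df}, p = {p_value:.2g}, Cramér's V = {v:.2f}")
```

Under these assumptions the computation agrees with the values given in the text (χ² ≈ 21.36, df = 2, p < 0.001, Cramér’s V ≈ 0.12). The pairwise comparisons mentioned in the footnotes are not reproduced here, since the test used for them is not specified.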

The online data support the tendencies established in the offline data in that to-infinitives also represent the default option in GAST-SL (89.60%), GAST-IND (88.23%) and GAST-GB (92.42%). Consequently, in contrast to HATE, where ‑ing forms play a minor role as verb-complementational alternatives to to-infinitive constructions at least in the British and Indian data, there is hardly any indication that ‑ing complements are of major importance in the verb-complementational profile of LIKE in any of the varieties covered. To-infinitives are the default option in the complementation of LIKE across the varieties and this tendency appears to be particularly pronounced in SLE.

5.3.3 Verb complementation: LOVE

LOVE is another verb which has been described as exhibiting cross-varietal differences in its verb-complementational profiles (cf. Meyler 2007: 155). Figure 34 presents the frequencies of ‑ing and to-infinitive complements of LOVE in the offline and online data.

[Figure 34: bar charts of relative frequency (Rel. Freq., %) with absolute frequencies (n) for LOVE Ving and LOVE to V; panels: ICE-SL, ICE-IND, ICE-GB (Corpus); SAVE-SL, SAVE-IND, BNC news (Corpus); GAST-SL, GAST-IND, GAST-GB (Data)]

Figure 34. Relative and absolute frequencies of LOVE Ving and LOVE to V in the ICE, newspaper and GAST data


Despite the low frequency of LOVE in ICE, the data indicate that to-infinitive complementation as in (143) is dominant in ICE-SL (80.00%) and ICE-GB (90.00%) while in ICE-IND, ‑ing complements (60.00%) as in (144) occur more frequently. However, as these differences in verb-complementational preferences do not reflect statistically significant trends (p > 0.05), the ICE-based insights have to be considered in the light of additional data.⁴⁰

(143) Apart from that I loved to do experiments even in my childhood. I wanted to see the logical side of any problem. 〈ICE-SL:W1A-014#158–159:12〉
(144) I am 〈O〉 drawing 〈/O〉 surprised perhaps. But 〈w〉 that’s 〈/w〉 all. Surprise is a nice thing. I love surprising people. One doesn’t get easily surprised these days. 〈ICE-IND:W2F-006#120–124:1〉

LOVE is generally more frequent in the newspaper corpora than in ICE, which renders the findings and resulting interpretations more reliable. To-infinitive constructions complement LOVE more frequently than ‑ing forms in SAVE-SL (68.00%), SAVE-IND (66.06%) and BNC news (70.87%). While the ICE data suggest differences in the complementational preferences of LOVE across the varieties, a more uniform picture emerges on the basis of the newspaper texts in that to-infinitive constructions seem to be the default complementational option of LOVE in SLE, IndE and BrE. The absence of statistically significant differences and a particularly weak correlation (χ² ≈ 0.85, df = 2, p > 0.05, Cramér’s V ≈ 0.05) give further support to to-infinitives as the cross-varietal standard complement of LOVE.⁴¹

This description is confirmed by the GAST data. LOVE to V occurs more frequently in relative terms than LOVE Ving in GAST-SL (92.35%), GAST-IND (66.82%) and GAST-GB (89.61%). Although to-infinitive constructions appear to be the default complements of LOVE in IndE, the relative frequencies in GAST-IND (as well as SAVE-IND and ICE-IND) hint at infinitival complementation being less dominant in IndE than in SLE and BrE.

The complementation of LOVE is thus comparable to that of HATE and LIKE. Infinitival complementation of LOVE apparently is the default option in SLE and BrE, with ‑ing complementation only of marginal importance. While to-infinitive constructions also constitute the conventional complementation of LOVE in IndE, ‑ing complements appear to be significant minority variants in IndE.

40. None of the pairwise comparisons of the ICE datasets yields statistically significant differences (p > 0.05).
41. The pairwise comparisons of the verb-complementational patterns of LOVE in the newspaper data do not yield significant differences either (p > 0.05).


In revisiting Meyler’s (cf. 2007: 151) statement that infinitival complementation of HATE/LIKE/LOVE is the default choice in SLE while BrE opts more frequently for ‑ing complements with said verbs, the empirical analysis of these tendencies calls for a more detailed description. For SLE, it is certainly true that to-infinitive constructions are the common complements of HATE, LIKE and LOVE. This tendency also holds true for IndE, where LIKE by default combines with infinitival complements. In IndE, to-infinitive constructions are also the majority variants in the complementational profiles of HATE and LOVE, but with these verbs, ‑ing complements may also be regarded as important minority variants. When it comes to BrE, the data clearly showed that to-infinitives are also the default complements of HATE, LIKE and LOVE and that HATE Ving appears to have a noteworthy minority status. Accordingly, while Meyler’s (cf. 2007: 151) verb-complementational description of HATE, LIKE and LOVE is accurate for SLE (and IndE), the present analysis indicates that BrE does not seem to differ in its verb-complementational preferences from the two South Asian varieties. The implications of these findings are twofold. First, there is no clear-cut indication that SLE (or IndE) has undergone “verb-complementational nativization” (Schilk 2011: 172) with either HATE, LIKE or LOVE since the respective verb-complementational profiles of SLE and IndE are comparable to those of BrE. Second, the Great Complement Shift leading to a preference of ‑ing over infinitival complements (cf. Vosberg 2006: 273) has not (yet radically) affected the complementation of HATE, LIKE or LOVE in any of the varieties covered given that their default variant continues to be the infinitive at the moment. 
5.4 Sri Lankan English lexicogrammar: An overview

With a bird’s-eye view on the lexicogrammatical analyses conducted, it becomes obvious that structural nativisation processes can manifest themselves in both frequency-related (cf. Mukherjee 2007: 175) and categorical aspects (cf. Lange 2007), which sometimes interrelatedly shape newly-emerging varieties of English. In the study of PVs, it is evident that despite the common sensitivity of PVs to formality, SLE features fewer PVs than BrE (and IndE). Lower frequencies of use, however, do not seem to inhibit the creative formation of previously unrecorded multi-word verbs such as LEASE out, WAIVE off or COPE up with. In relation to LVCs, SLE shows a higher frequency than BrE, but a lower frequency than IndE. A distinctive description of LVCs in SLE also entails that (a) HAVE and TAKE are typical LVs, (b) indefinite and zero articles both figure prominently and (c) the association between LVCs and informality is not as visible as in BrE. HAVE a glimpse constitutes a novel form in SLE (and IndE), but LVCs with PUT do not yet seem to have been institutionalised and thus still only play a marginal role in the acrolectal variety of SLE. The verb-complementational profiles of HATE, LIKE and LOVE do not appear to be marked by strong variety-specific differences. With each of the verbs, the to-infinitive construction is the default complement in SLE, IndE and BrE.

Chapter 6

A model of (the emergence of) distinctive structural profiles of semiautonomous varieties of English

This final chapter serves multiple interrelated purposes. It summarises the main results from the contrastive corpus-based investigations into SLE to re-illustrate its distinctive structural profile, which sets it apart from other World Englishes. Further, it depicts the scenario of multiple language contact and its major constituents which catalysed and continue to catalyse (the development of) this distinctive SLE structural profile. Against this background, it will be shown that the scenario of multiple language contact exerts conservative as well as progressive forces on the structures of SLE justifying the perception of SLE as a semiautonomous variety (cf. Mukherjee 2007: 182) developing in a complex area of tension. In a last step, a model (of the emergence) of distinctive structural profiles of semiautonomous varieties of English describing how these relatively abstract forces can have an impact on the concrete structural realisations of SLE is proposed and thus – with reference to and complementary suggestions on the process of structural nativisation (cf. Schneider 2003, 2007) – explains how this distinctive structural profile of SLE (and more generally semiautonomous varieties of English) comes into existence. The chapter ends with some concluding remarks.

The lexical and lexicogrammatical investigations of SLE in contrast to IndE and BrE in Chapters 4 and 5 highlight a variety-specific structural profile of SLE on the basis of synchronic data. The examination of formality markers in Chapter 4.1 formed the starting point of the empirical analyses. The vocabulary of SLE contains some items which tend to be restricted to formal contexts in BrE (cf. Meyler 2007: xiv). Given that this restriction to formal contexts had been described to be weaker in PCEs (cf. Meyler 2007: xiv; Mesthrie & Bhatt 2008: 114–116), it was examined whether (a) SLE (and IndE) generally showed a higher frequency of formality markers than BrE and (b) formality markers in BrE tended to be used in more formal contexts, while formality markers appeared more evenly across genres associated with different degrees of formality in SLE (and IndE). From a merely quantitative perspective disregarding genre-related distributions, formality markers were shown to occur systematically more frequently in South Asian


Englishes (and slightly more frequently in SLE than in IndE) than in BrE. With regard to the genre-related occurrence of formality markers, it became obvious that contextual formality appeared to be a more central prerequisite for the occurrence of formality markers in SLE and BrE than in IndE. Case studies on detrain, family member, persons and refrigerator revealed variety-specific collocational and colligational differences as well as preferences in the choice between variants of quasi-synonymous lexical pairs as potential sources of the frequency-related differences of formality markers in the varieties.

The group of lexical items characterising SLE as a South Asian variety of English (e.g. lakh, dhal, etc.), which is consequently not frequently used in varieties outside South Asia, was described in Chapter 4.2 as being partly shared with other South Asian Englishes. Thus, it was expected that this group of PSA lexemes would figure more prominently in South Asian Englishes than in BrE and, admittedly unsurprisingly, studies in the ICE, newspaper and GAST data verified this expectation, but also showed that PSA lexemes were slightly more frequent in IndE than in SLE. Based on this observation, it was argued that there were indications of a shared stock of local South Asian English vocabulary, which is in fact likely to characterise the varieties employing them more strongly as South Asian than formality markers or archaism markers and to indicate a lexical emancipation from BrE. Among the case studies conducted, rupee again provided evidence of distinct collocational preferences in the varieties studied, which, in this concrete case, could be related to differences in the extralinguistic realities of India and Sri Lanka, i.e. the stability of the currency to be more precise. 
With lakh, the state of the institutionalisation process of this PSA lexeme in comparison to other forms representing the value of 100,000 could be observed in SLE and IndE and it was shown that lakh had already been established as the standard variant in IndE while this PSA lexeme was still a minority variant in SLE. In line with descriptions of the vocabulary of IndE (cf. McArthur 2002: 323), it has been put forward that the lexis of SLE is marked by forms considered archaic from a BrE perspective (cf. Meyler 2007: xiv). Again, it was assumed that the group of archaism markers studied would be more frequent in South Asian Englishes than in BrE and the data as shown in Chapter 4.3 largely corroborated this assumption and provided indications that IndE featured more archaism markers than SLE. In this regard, variation in the degree of institutionalisation of different address terms for women and male teachers – with the archaism marker sir being the standard variant for male teacher in SLE and IndE – as well as cross-varietally shared constructions such as that + name of a person + fellow or HAIL from + family could be observed. In sum, the lexical studies provided ample evidence of quantitative as well as in some cases qualitative differences between SLE, IndE and BrE. From a merely


frequency-oriented perspective, formality markers may be regarded as characterising SLE (and IndE) more strongly than PSA lexemes or archaism markers, but it should not be overlooked that PSA lexemes are specific to the South Asian context. Consequently, in varieties outside South Asia, they were for the most part never in frequent use in the past and are currently also only marginally attestable, which is not the case with formality or archaism markers with a stronger tradition of use in e.g. BrE. Against this background, it seems reasonable to assume that PSA lexemes – although they occurred less frequently than formality markers – may nevertheless be stronger varietal markers for South Asian Englishes than the two other groups of lexemes, but corpus-linguistic evidence is only of little help here and cognitive approaches to the perception of PSA lexemes in the varieties covered can be expected to provide fruitful complementary insights. This also holds true for formality and archaism markers since it would be interesting to test with cognitive methods whether the groups of formality and archaism markers are perceived differently in South Asian Englishes than in BrE, e.g. to what extent do speakers of SLE judge lexical items considered formal by BrE speakers to be formal in their use of English as well? In Chapter 5.1, the study of PVs constituted the first lexicogrammatical investigation of SLE in the present study. In the light of previous research on PVs showing a generally lower frequency in South Asian Englishes than in BrE (cf. e.g. Schneider 2004: 246; Zipp & Bernaisch 2012: 176), the data at hand supported earlier findings in that BrE displayed the highest frequency of PVs followed by IndE and SLE – a finding that holds true across all three types of PVs studied, i.e. PVUs, PVOUs and PVOFs, in the newspaper data. 
The genre sensitivity of the three groups of PVs, among which PVOFs showed the highest degree of genre sensitivity and PVOUs the lowest, was inferred from Gries’ (2008) DP value. Despite their different degrees of genre sensitivity, the distribution of the individual groups of PVs in the ICE genres was relatively homogeneous and also held across the different varieties since the PVUs, PVOUs and PVOFs occurred uniformly less frequently than expected in academic writing (W2A), i.e. in a relatively formal context, and more frequently in creative writing (W2F), i.e. a less formal genre among the text categories in ICE. Consequently, it appeared to be the case that PVs would rather occur in informal than in formal contexts across the varieties studied. In comparison with reference works on PVs (Gadsby 2000; Sinclair 2002), unrecorded PVs evident from the corpus data were discussed in more detail. In the context of COPE up with featuring an additional particle in comparison to the common-core majority variant COPE with, it was observed that the nativised form COPE up with was a minority variant in SLE and IndE, proving that novel forms slowly establish themselves alongside their respective majority variants,


which can, however, eventually also be displaced by these novel forms as could be observed with LEASE out. The newspaper data provided evidence that this unrecorded form had become the dominant variant compared to the single-word alternative LEASE in IndE, while LEASE out still had minority status in SLE and was only marginally attested in BrE. It was also in this context that variety-specific semantic preferences were discussed as another factor in accounting for frequency-related differences of variants since LEASE had a broader semantic scope in BrE and LEASE out in SLE and IndE. Explanations for the creation of new verb-particle combinations focussed on the notion of semantico-structural analogy, i.e. resorting to existing formal and semantic templates to derive novel forms (cf. Mukherjee 2007: 175–176), and cognitive differences in terms of path lexicalisation resulting in corresponding differences in structural realisations (cf. Hartford 1989: 113).

In contrast to PVs, LVCs figured more prominently in South Asian Englishes than in BrE and this higher frequency was more pronounced in IndE than in SLE as shown in Chapter 5.2. Variation with LVCs could also be attested with regard to article realisation since the indefinite article was the most frequent form in LVCs in the BrE offline data, while the zero article was found most often in IndE. Given that both the indefinite and zero articles represent apparently almost equally productive structural options in SLE LVCs, article realisation with LVCs in SLE can be distinguished from that in BrE and IndE, but does not allow identifying one dominant realisation. The combinability of different LVs with certain nouns appeared to be stable across the varieties since a given noun in SLE, IndE and BrE was either realised with the same LV (e.g. PUT a(n)/the/Ø end) or the same set of LVs (e.g. TAKE a(n)/the/Ø look and HAVE a(n)/the/Ø look). 
In relation to the occurrence of LVCs in contexts with different degrees of formality, the BrE data showed a tendency of LVCs to occur more readily in informal contexts, while this association between informality and LVCs did not hold as consistently for SLE and IndE. In terms of innovative LVCs as examined with newspaper and GAST data, the status of HAVE a glimpse as a novel South Asian LVC could be verified, although it has apparently not (yet) been integrated into the respective acrolects. TAKE Ø benefit from and TAKE a lease can be attested in all varieties studied and PUT as a LV is cross-varietally productive as well, but none of the related constructions with the nouns call, chat, fight, nap or rest represents frequent combinations.

An analysis of the complementation of the verbs HATE, LIKE and LOVE in Chapter 5.3 showed that, as posited by Meyler (cf. 2007: 151), to-infinitive complements occurred more frequently than ‑ing complements in SLE, but this is also true of IndE and BrE. It thus appeared reasonable to assume that (a) verb-complementational nativisation (cf. Schilk 2011: 172) had so far not affected the verbs concerned to an extent that would produce differences in complementational preferences across SLE, IndE and BrE and (b) HATE, LIKE and LOVE had not yet participated in the Great Complement Shift (cf. Rohdenburg 2006: 159) away from infinitival towards ‑ing complements (cf. Vosberg 2006: 273) in any of the varieties concerned.

Consequently, the corpus-based examinations clearly highlight the distinctive structural profile of SLE differentiating it from other varieties of English around the world. Still, the question remains as to what possible origins this structural differentiation of SLE may have. Considering the continuous situation of multiple language contact in which SLE in particular, but also postcolonial Englishes (PCEs) more generally, emerge may be beneficial in the identification of potential sources of variety-specific structural profiles. Figure 35 depicts the languages and varieties of English a PCE generally comes into contact with and relates this contact situation to the concrete case of SLE.

[Figure 35: diagram relating PCE (SLE) to four contacts: Regional Epicentre (IndE), Common Core, Indigenous Language(s) (e.g. Sinhala, Tamil) and English as a Lingua Franca]

Figure 35. Multiple language contact situations of PCEs (SLE)

In the development of a PCE, there seem to be four major contacts that may have an effect on the structural (and functional) profile of a PCE. Regional epicentres, the “common core” (Quirk et al. 1985: 16) of English, indigenous languages of the respective speech communities and English as a lingua franca are the central linguistic factors which have the potential to influence PCEs.


Regional epicentres of English are varieties of English exercising influence over the varieties in their surroundings (cf. Peters 2009: 108). Hoffmann et al. elaborate on the sociolinguistic configuration of (emerging) epicentres:

On the one hand, an epicentre is marked internally by endo-normative stabilisation, i.e. by the wide-spread use, general acceptance and codification of the local norms of English. […] On the other hand, an epicentre should also have the potential to serve as a model of English for neighbouring countries, i.e. exert an influence on other speech communities in the region and, thus, challenge – to some extent at least – the all-encompassing gate-keeping function of British English […] and American English […]. (Hoffmann et al. 2011: 259)

Apart from BrE, American English and Australian English, i.e. L1 varieties which (may) serve as norm-providing models for other varieties of English, L2 varieties such as Singapore English, a variety relatively advanced in Schneider’s (2003, 2007) evolutionary developmental model and thus characterised by endonormative stabilisation, have accordingly been argued to (have the potential to) develop into such regional epicentres (cf. Leitner 1992: 225). While Singapore English may thus function as a model for Southeast Asian Englishes, IndE has been described as taking on this role for the South Asian Sprachraum (cf. Leitner 1992: 225) and IndE undoubtedly is and has been in contact with SLE via diverse scenarios.

In multilingual India, it is not surprising that English also takes on the role of a pan-regional and pan-ethnic link language (cf. Lange 2012: 7). For this reason, English figures prominently in various spheres of life in India such as business, education, the media, etc. This also entails that whenever a Sri Lankan conducts business or follows a sports event in India, for example, it is not unlikely that the language used will be English. Given that human character is inevitably social, these encounters constitute contact scenarios between SLE and IndE with the potential to trigger conscious or subconscious linguistic accommodation on the part of the speakers and thus to instantiate epicentral linguistic influence from IndE onto SLE. These epicentral influences, however, can also be of a more institutionalised nature. As described in Chapter 2.3.3, in the context of the recent campaign English as a Life Skill, the Sri Lankan government agreed to have Sri Lankan master teachers trained in Hyderabad, India, where a teacher guide meant to be disseminated among the teachers located in Sri Lanka was also devised. 
It is relatively probable that the Indian speakers of English conducting the training in Hyderabad had a direct influence on the English of the Sri Lankan master teachers as well as on the linguistic content of the teacher guide, which will represent a lasting epicentral influence of IndE on SLE if the teacher guide is put into practice in Sri Lanka. In the light of this, a bottom-up modelling of the norms underlying the dative alternation in six South Asian Englishes and BrE could provide first strictly


empirical evidence of IndE as the linguistic epicentre of South Asia (cf. Gries & Bernaisch fc.), but diachronic studies to come will have to complementarily establish the extent to which potential epicentres have shaped the regional varieties in their close proximity in the past.

L2 epicentres consequently fulfil a dual function for PCEs. While L2 epicentres provide structural models distinct from the respective historical input varieties and thus promote regional structures of English different from the ones used in BrE or American English, they, at the same time, expose the neighbouring varieties to (admittedly regional) exonormative influences. Thus, the label “postcolonial English squared” (Bernaisch & Lange 2012: 13) may capture a part of the sociolinguistic reality of these regional varieties under the influence of an L2 epicentre since, in addition to that of the historical input variety, another exonormative influence is provided by a local L2-variety of English which, by itself, is a product of the evolution from the respective historical input variety via nativisation processes. For SLE, this means that although IndE may catalyse the linguistic emancipation of SLE from BrE and foster the development of SLE as a markedly South Asian variety, its influence must be considered counterproductive in establishing SLE as a variety of English in its own right.

There is no doubt that contact with the varieties constituting the common core of English is the most salient factor in the development of a PCE. According to Quirk et al.,

[a] common core or nucleus is present in all the varieties so that, however esoteric a variety may be, it has running through it a set of grammatical and other characteristics that are present in all the others. It is this fact that justifies the application of the name ‘English’ to all the varieties. (Quirk et al. 1985: 16)

Without the linguistic interaction between mainly British or, in fewer cases, American speakers of English and local speech communities, most PCEs might never have come into existence. This is not meant to imply that BrE or American English or any other variety for that matter solely constitutes the common core of English – quite to the contrary. The common core is variety-unspecific in that it encapsulates the structures shared by all varieties of English world-wide, excluding structures occurring in only one variety or a selection of varieties. Still, the initial contact of Sri Lankans with certain varieties of English containing ipso facto structures of the common core is of utmost importance to the foundation of SLE. Well beyond this foundation phase and independence from the colonial power, contact with the varieties which brought the structures of the common core to a new territory in the first place remains salient to the development of PCEs. In this regard, BrE and American English have probably been crucial to the development of SLE. During the colonial administration of Sri Lanka,
both British (cf. Gunesekera 2005: 11) and American settlers (cf. Jayawardena 2003: 205), though the latter group to a much lower extent, were present and must be assumed to have brought different, but related models of English to Sri Lanka. The current positive attitudes towards both varieties of English in the Sri Lankan speech community (cf. Bernaisch 2012: 289) and the apparent “linguistic schizophrenia” (Kachru 1992: 67) on the part of some users of SLE (cf. Künstler et al. 2009: 69) provide testimony that both L1-models are (still) of relevance to SLE speakers. For this reason, it may be argued that exonormative forces of the varieties which represented the common core in the foundation phase of SLE have not ceased to exist.

The contact of PCEs with indigenous languages is also a salient factor in their structural development as constructions in the local languages may serve as templates for corresponding structures in the respective local variety of English (cf. e.g. Lange 2012 on the presentational focus marker itself in IndE). SLE is mainly in contact with two indigenous languages, namely Sinhala and Tamil, and the fact that constructions in these languages can serve as structural outlines for SLE has been exemplified with regard to the usage of PROVIDE in the double-object construction (cf. Koch & Bernaisch 2013: 83–84). Sinhala features structures in which the Sinhala lexeme corresponding to PROVIDE is used ditransitively in double-object constructions, which represents a syntactic constellation not attestable in BrE, and the occurrence of PROVIDE in the double-object construction in SLE could thus be a structural reflection of contact scenarios with the indigenous languages of Sri Lanka.
Still, diachronic text collections are needed to more closely scrutinise the emergence of PROVIDE in the double-object construction in SLE because this lexicogrammatical pattern could also be the result of structural influences from American English, where PROVIDE is often used in the double-object construction (cf. Koch & Bernaisch 2013: 83–84). Yet another contact situation that should not be underestimated is the one between a PCE and (the notion of) English as a lingua franca, i.e. “a fully formed ‘high’ language that […] serves as a medium for people who do not use it natively” (McArthur 2002: 2). English as a lingua franca facilitates global communication in English and is thus very much in demand not only in the corporate sector (cf. McArthur 2002: 416). Consequently, the scope of English as a lingua franca is necessarily international and its prime objective is that of at best global intelligibility, which is not as central for PCEs since they, in contrast, tend to linguistically represent local identities. In Sri Lanka, this global scope of English as a lingua franca has often placed obstacles in the developmental path of SLE, which may be, in the light of the disparate scopes just outlined, perceived as a threat to the international intelligibility and, thus, competitiveness of Sri Lankan speakers of English on the international labour market.

Chapter 6. A model of (the emergence of) distinctive structural profiles 

To tell the rural youth of Sri Lanka that they should speak English in whatever way they see as fit may be a momentarily empowering scenario. But what does it offer for their future? We as a nation have great wealth in terms of human capital that are very talented and with a wealth of ingenious skills. So why hamper the future of these skilled youth to develop effective English communicative skills in their childhood by offering this comfort zone that anything can pass? English after all is the number one international language, and the potential for the future of Sri Lanka is immense in the sphere of global commerce when equipped with effective English language skills for communication. (Boange 2010)

Acrolectal speakers of SLE have different styles of SLE at their disposal, can consequently switch competently between more local and more internationally-oriented forms of English (as outlined in Chapter 1) and are thus in a position to avoid these anticipated economic constraints of what is in Boange’s (2010) statement described as, but is not truly, a laissez-faire approach to (the teaching of) SLE. This statement was made in relation to the presidential media campaign Speak English Our Way, which may – in the light of Boange’s (2010) perspective on the matter – not have fully succeeded in communicating that the usage and teaching of SLE is also guided by locally-developed norms as established by various expert groups set up in the context of English as a Life Skill to standardise SLE on a number of structural levels as discussed in Chapter 2.3. It may be for this reason that the promotion of a local form of English has been mistaken by some as the adoption of a laissez-faire approach to the teaching and usage of English in Sri Lanka. However, the capability of switching between different context-related styles may indeed not as readily apply to speakers at the lower ends of the SLE dialect continuum. In general, the notion of English as a lingua franca clearly puts pressure on speakers of SLE not to drift too far away from, and to conform with, international standards of English or, in other words, to look towards non-local standard varieties of English for orientation. Consequently, contact situations between SLE and English as a lingua franca may result in the adoption/preference of structures of what are considered internationally more current varieties in contrast to locally evolved features of SLE. In the light of this, it is central to point out that although Figure 35 depicts the individual contact situations of a PCE in isolation from the others, the boundaries between the single contact situations are fuzzy.
The contact between a PCE and regional epicentres to some extent also entails a mediated contact between a PCE and the historical input variety, which is part of the common core (cf. Bernaisch & Lange 2012: 13). This is also the case with the contact situation between a PCE and English as a lingua franca. Thus, the multilayeredness of the contact of a PCE with its respective historical input variety is indicative of the salience of the input variety for the development of a PCE. However, what are the processes that eventually
facilitate the emergence and manifestation of local variety-specific structural profiles (and possible resulting norms) and culminate in the differentiation of a PCE from its historical input variety in this scenario of multiple language contact? Chapters 4 and 5 present insights into SLE (in contrast to IndE and BrE) lexis and lexicogrammar. The empirical observations of parts of the internal organisation of SLE (and IndE) serve as yardsticks to derive a model of (the emergence of) distinctive synchronic structural profiles of semiautonomous varieties of English. A semiautonomous variety of English is a nonnative variety that takes over to a very large extent – and includes – the common core of established native varieties of English, but it is also characterized both by interference (that is, forms and structures which can be traced back to speakers’ L1s) and by L2-internal creative autonomy (that is, by a potential for the development of new forms and structures in English as a second language). (Mukherjee 2007: 182)

As SLE shows traits of an endonormatively stabilised variety in Schneider’s (2003, 2007) model – a central prerequisite according to Mukherjee (cf. 2007: 181) – and as the three major forces shaping semiautonomous varieties, i.e. common core, L1 influence and autonomy, are at work in SLE, the notion of a semiautonomous variety can clearly be mapped onto SLE.[1] Meyler (2010) puts forward that SLE shares many more features with BrE than there are uniquely SLE formal structures. This may to a certain extent be interpreted as a sign of the pervasiveness of the common core, which is inter alia constituted by BrE, in SLE. The influence of the local speakers’ L1 background is also a central driving force in shaping SLE as can be exemplified with a number of vocabulary items (see Chapter 4.2) and with earlier lexicogrammatical studies (cf. Koch & Bernaisch 2013: 83–84). The currency of formerly unrecorded PVs in Chapter 5.1.3 is one attestation of the creative potential of and the autonomous forces at work in SLE. Still, what are the mechanisms by means of which these relatively abstract forces on a semiautonomous variety can have an impact on the concrete structural realisations characterising this particular variety? How can these structural realisations help to establish the developmental state of the variety concerned? The model (of the emergence) of distinctive structural profiles of semiautonomous varieties of English in Figure 36 offers some explanations as to the emergence of structural profiles particular to semiautonomous varieties.

[1] Mukherjee (2007: 182) uses the term “interference” to describe structures that may have emerged against speakers’ L1s. As this expression is, however, to a certain extent associated with foreign language acquisition and related prescriptive notions (cf. Kortmann 2005: 156), L1 influence may be a more neutral term for the description of PCEs and will thus be adopted here.

[Figure 36. A model (of the emergence) of distinctive structural profiles of semiautonomous varieties of English. Labelled elements: Forces on Semiautonomous Variety (Common Core, L1 Influence, Autonomy); Paths of Structural Nativisation (Lexical, Phraseological, Textual, Cognitive); Nativisation Indicators 1–3 in the Semiautonomous Variety and the References for Nativisation Indicators 1–3 in the Reference Variety; Distinctive Structural Profile of Semiautonomous Variety, with Endonormative Tendency and Exonormative Tendency.]

The model suggests four interwoven layers that shape and characterise the structural make-up of a semiautonomous variety. In the order from abstract to concrete, these layers are (a) forces on semiautonomous variety, (b) paths of structural nativisation, (c) nativisation indicators and (d) the distinctive structural profile of the semiautonomous variety. Each of these layers will be presented in detail, in the context of which the role of the reference variety will also be taken up.

The three general forces, i.e. common core, L1 influence and autonomy at the top of the model, found and expedite (the emergence of) variety-specific structural profiles of semiautonomous varieties (cf. Mukherjee 2007: 182) and to a large extent originate from the multiple language contact situation of PCEs depicted in Figure 35. The force of the common core originates from several (elements of) language contacts, namely those of PCEs with (a) the common core, (b) English as a lingua franca and (c) a regional epicentre. Even though English as a lingua franca may predominantly serve as a rather non-localised means of communication between speakers who do not share a first language (cf. McArthur 2002: 2) and a regional epicentre may in contrast display more locally evolved structures, their respective contact scenarios with a PCE eventually constitute contact situations
between PCEs and features shared by all varieties of English, i.e. the common core. The force of the L1 influence derives primarily from the contact between PCEs and their respective indigenous languages and can potentially, if PCEs share indigenous languages with their respective regional epicentres, also originate from the contact between PCEs and features of regional epicentres that have been induced via these shared indigenous languages. To exemplify this with regard to SLE, it needs to be pointed out that Sri Lanka and India, with the latter supposedly being the South Asian regional epicentre (cf. Leitner 1992: 225), both share (partly different varieties of) Tamil as an indigenous language. Consequently, the force of the L1 influence may directly originate from Tamil usage in Sri Lanka, but may also be rooted in the contact between SLE and features of IndE that have been induced via the Tamil language in India. Due to the prominence of independence in the very concept of autonomy, i.e. “the ability to act and make decisions without being controlled by anyone else” (Hornby 2008: 89), the force of autonomy cannot exclusively originate from language contact scenarios. The force of autonomy may thus rather be assumed to stem from the unique sociolinguistic circumstances including the situation of multiple language contact in which individual PCEs are used to verbalise distinct everyday realities and (thus) evolve.

In general, the three forces on semiautonomous varieties could be considered the developmental drive behind (the emergence of) semiautonomous varieties of English. However, as these forces are highly abstract notions, they need additional mechanisms – paths of structural nativisation – to exert an influence on actual language use. Paths of structural nativisation are abstract levels of language organisation via which the forces shaping semiautonomous varieties can permeate to impact concrete (structural) realisations.
Although they have the scope to cover a number of structural nativisation processes, the paths listed in Figure 36 are not to be understood as an exclusive list. Peters (2009), for instance, highlights morphological innovations in Australian and New Zealand English, which may be seen as preliminary evidence of a morphological path of structural nativisation given that also in relation to SLE, Gunesekera (cf. 2005: 147) elaborates on the salience of morphological processes in providing SLE with its distinct lexical characteristics. Only those paths of structural nativisation for which the present study could offer empirical evidence entered the model, but it would certainly profit from future studies adding further paths of structural nativisation. In turn, a number of abstract subprocesses, i.e. branches of these paths, constitute each of these paths. The lexical path of structural nativisation is concerned with the organisation of single-word units. It is at this point that the default variants of (partially) synonymous lexical pairs are established, which would result in concrete preferred lexical options to refer, for example, to an artificial lake, i.e. either tank reflecting a certain degree of autonomy or reservoir showing a certain tendency towards what
could be interpreted as common core variants as described in 4.2.3. It is also via the lexical path of nativisation that particular lexemes from L1s are allowed into the respective semiautonomous varieties of English while others are not. As with paths of structural nativisation, the branches presented here are not meant to be exhaustive, but merely reflect the mechanisms that the present study could delineate. In other words, investigations to come can certainly contribute additional branches to the respective paths. Multi-word units are in the focus of the phraseological path of structural nativisation. On this path, lexemes or larger syntactic structures receive their collocational and colligational profile and their semantic preference, i.e. their combinability with other lexemes, grammatical categories and sets of semantically-related words (cf. Stubbs 2001: 65) is set. The textual path takes into account texts as a whole or sets of texts. Genre sensitivity of given linguistic structures is one of the elements shaped by the textual path and the functional scope, i.e. the selection of communicative contexts in which a variety of English is used, another. Finally, the mindset of speakers of a semiautonomous variety is configured via the cognitive path of structural nativisation. This path shapes speakers’ perceptions of and attitudes towards linguistic structures and/or the entire semiautonomous variety of English constituted by these structures. It needs to be stressed again that both the three general forces as well as the paths of structural nativisation are abstract concepts. For that reason, there must be concrete linguistic structures on the basis of which these notions can be traced and this is where nativisation indicators become relevant. Nativisation indicators are defined (sets of) structural realisations with the potential of depicting (a combination of) one or several active path(s) of structural nativisation. 
In essence, any structural object of investigation can function as a nativisation indicator although lexicogrammatical objects have been argued to be most fruitful in studies of PCEs (cf. e.g. Hundt 1998: 5; Mukherjee & Schilk 2008: 163) and are thus promising nativisation indicators. The present study opted for three lexical nativisation indicators, i.e. formality and archaism markers as well as PSA lexical items, and three lexicogrammatical nativisation indicators, i.e. PVs, LVCs and patterns of verb complementation. For an adequate description of the distinctive structural profile of a semiautonomous variety, it is central to elaborate on the assumptions under which the model operates. First, only via a point of comparison – the reference variety at the bottom of Figure 36 – can shared or distinct structural preferences in varieties of English be delineated. Thus, the degree of structural nativisation and related exonormative and endonormative tendencies of a semiautonomous variety can only be measured contrastively, i.e. primarily in comparison to the relevant historical
input variety. As secondary points of reference, additional semiautonomous varieties and their structural profiles may offer relevant complementary insights. Second, all instantiations of structural nativisation eventually find reflection in frequencies of occurrence of given structural phenomena produced by members of a specific speech community in a postcolonial setting. Several studies (cf. e.g. Bauer 2002; Mair 2002; Sedlatschek 2009) have shown that many distinctive features of varieties of English can be found at the interface between lexicon and grammar, and […] these lexicogrammatical differences are usually quantitative in nature and not categorical. (Mukherjee 2007: 175)

Still, also categorical differences have the potential to add to the distinctive structural profile of semiautonomous varieties (cf. e.g. Lange 2012; Bernaisch & Lange 2012). As these categorical differences are rooted in the occurrence of a given phenomenon in one variety and the absence in another, they can, however, easily be translated into quantitative cross-varietal preferences at the extreme end. Third, the process of structural nativisation as suggested by Schneider (2003, 2007) is a generalisation of the subprocesses occurring at the level of nativisation indicators. Nativisation indicators are fine-grained units of language organisation on the basis of which endonormative or exonormative tendencies can be empirically investigated and modelled. Consequently, exonormative and endonormative trends on larger levels of language organisation, i.e. on structural levels such as lexis or syntax or on the level of entire semiautonomous varieties, are abstractions of the processes occurring and manifesting themselves at the level of nativisation indicators. This conceptualisation is necessary in order to account for instances where certain structural features of a semiautonomous variety are marked by locally-evolved norms while others rely on exonormative models. Against this background, this fine-grained modelling of structural nativisation would also accommodate processes of structural nativisation proceeding at different speeds in relation to different features. On a more abstract level, this conceptualisation also resonates with Mukherjee’s (2007: 157) description of semiautonomous varieties as being in “a stable, productive steady state in the evolutionary process in which there is an equilibrium between conflicting forces of progression and conservativism”. 
Against this background, the distinctive structural profile of a semiautonomous variety provides a synchronic snapshot of the degrees to which endonormative and exonormative tendencies characterise this semiautonomous variety on the basis of (a selection of) nativisation indicators. These tendencies rely on comparative frequency-based investigations of said nativisation indicators. The more structural facets of a nativisation indicator (such as overall frequency of occurrence or collocational/colligational profiles) are shared between the semiautonomous
and the reference variety, the stronger the exonormative tendency of the nativisation indicator concerned; the fewer shared structural facets, the stronger the endonormative tendency. In general, statistically significant differences in overall frequencies of occurrence (of e.g. LVCs) may be first indications that certain paths of structural nativisation have been active (e.g. due to autonomous forces channelled via the textual path of structural nativisation, formality-related quantitative differences with LVCs emerged in SLE). Still, if there are no significant differences in overall frequencies of occurrence with a certain nativisation indicator, this does not automatically mean that there are no nativised elements in the structural profile of the nativisation indicator concerned. With reference to the nativisation indicators formality markers (as discussed in Chapter 4.1) and verb-complementational patterns (as in Chapter 5.3), the mechanisms of the model can be exemplified. In this context, adding a secondary reference variety – in this case IndE as another semiautonomous variety – also provides relevant insights substantiating findings in relation to endonormative and exonormative tendencies in SLE. Formality markers occur more frequently in SLE than in BrE, which is thus one structural facet in relation to which differences between the semiautonomous variety and the reference variety can be attested. As this trend is only to some extent shared with IndE, it constitutes an endonormative tendency in SLE. This difference in overall frequency of a given nativisation indicator also suggests that there may be additional nativised elements in its structural profile possibly facilitating this divergence in frequency. 
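The comparison of overall frequencies of occurrence described above can be sketched in code. What follows is only a minimal illustration, assuming the log-likelihood statistic (G2) that is widely used in corpus linguistics for comparing an item's frequency across two corpora of different sizes; the passage itself does not name a particular test at this point. The function name, the counts and the corpus sizes are hypothetical placeholders, not figures from the study.

```python
import math


def log_likelihood(freq_a: int, freq_b: int, size_a: int, size_b: int) -> float:
    """Return the log-likelihood (G2) for one item observed in two corpora.

    freq_a, freq_b: observed frequencies of the item in corpus A and corpus B
    size_a, size_b: total number of tokens in corpus A and corpus B
    """
    total_freq = freq_a + freq_b
    total_size = size_a + size_b
    # Expected frequencies if the item were equally likely in both corpora.
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    g2 = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2.0 * g2


# Chi-square critical value for 1 degree of freedom at p < 0.05.
CRITICAL_VALUE_P05 = 3.84

# Hypothetical counts of one nativisation indicator in two one-million-word
# corpora (illustrative numbers only, labelled here as semiautonomous vs.
# reference variety).
g2 = log_likelihood(150, 100, 1_000_000, 1_000_000)
print(f"G2 = {g2:.2f}; significant at p < 0.05: {g2 > CRITICAL_VALUE_P05}")
```

A significant value would only flag a difference in overall frequency. In line with the argument above, a non-significant value does not rule out nativised elements in other facets of the indicator's profile.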
Forces of the common core (level of forces on semiautonomous variety (see Figure 36)) have been channelled via the lexical path of structural nativisation (level of paths of structural nativisation) since SLE, like BrE (and IndE to a more limited extent), prefers fridge over refrigerator to refer to a place to cool food (level of nativisation indicators). This shared preference of SLE and BrE/IndE represents an exonormative tendency in SLE (level of distinctive structural profile of semiautonomous variety). The phraseological path of structural nativisation is also active in the structural characterisation of formality markers in SLE, which is evident from the collocational and colligational profiles of persons: in SLE, this formality marker frequently combines with word forms (as in internally displaced persons) and grammatical categories (e.g. definite numerals) which can only more rarely be found in the context of persons in BrE and with lower structural variability in IndE. Consequently, autonomous forces have been channelled to formality markers via the phraseological path, thus instantiating another structural facet mirroring endonormative tendencies.
Forces of the common core, however, have permeated via the textual path of structural nativisation to formality markers and manifest an exonormative tendency in SLE since formality markers in SLE, BrE and IndE (though to a lesser extent) share sensitivity to the degree of formality of a given genre. As a result, although SLE in this case shows a certain adherence to exonormative standards, IndE to some extent provides evidence for the potential of the textual path of structural nativisation to profile autonomous forces so that they can find reflection in nativisation indicators. Autonomous forces have had an impact on formality markers via the cognitive path of structural nativisation to establish an endonormative tendency. This finds reflection in the fact that the mindset of speakers of SLE seems to differ from that of BrE speakers in that formality markers appear to be considered as markedly formal by BrE speakers (cf. Meyler 2007: xiv). (Differences in) frequency of use of formality markers across varieties may thus be a structural reflection of the activity of the cognitive path of structural nativisation. While there is some evidence of autonomous forces having been channelled through a number of paths of structural nativisation with regard to formality markers in SLE, this is not the case with another nativisation indicator, namely the verb-complementational profiles of HATE, LIKE and LOVE. Principally, forces of the common core have permeated through the phraseological path of structural nativisation resulting in a stable and default association between the verbs scrutinised and the infinitival complementation pattern. Consequently, with regard to this nativisation indicator, exonormative tendencies characterise the distinctive structural profile of SLE (and IndE). Still, establishing whether exonormative or endonormative tendencies dominate in the concrete synchronic structural profile of SLE as a semiautonomous variety is challenging. 
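The facet-by-facet reasoning applied to formality markers above can be summarised in a small sketch. It records, for each structural facet, whether SLE shares it with the reference variety BrE, and reports the proportion of shared facets. The class and function names are hypothetical, and weighting all facets equally is an assumption made purely for illustration; the model itself does not prescribe how facets should be weighted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Facet:
    path: str                    # path of structural nativisation involved
    description: str             # structural facet under comparison
    shared_with_reference: bool  # True if SLE patterns with the reference variety


def shared_facet_ratio(facets: list[Facet]) -> float:
    """Proportion of facets shared with the reference variety.

    Values near 1.0 point towards exonormative tendencies, values near 0.0
    towards endonormative ones (equal weighting is an illustrative assumption).
    """
    if not facets:
        raise ValueError("at least one facet is required")
    shared = sum(1 for facet in facets if facet.shared_with_reference)
    return shared / len(facets)


# Facets of the nativisation indicator "formality markers" as discussed in
# the text, with BrE as the reference variety.
FORMALITY_MARKERS = [
    Facet("overall", "overall frequency of formality markers", False),
    Facet("lexical", "preference for fridge over refrigerator", True),
    Facet("phraseological", "collocational/colligational profile of persons", False),
    Facet("textual", "sensitivity to the formality of a genre", True),
    Facet("cognitive", "speakers' perception of formality markers", False),
]

ratio = shared_facet_ratio(FORMALITY_MARKERS)
print(f"Shared facets: {ratio:.0%}")
```

Such a ratio is a descriptive summary at best. Which facets should count, and how heavily, remains an open question for the model.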
Above, this profile has generally been described as providing an objective picture of endonormative and exonormative tendencies on the basis of frequency-related characteristics of the individual nativisation indicators, but what should be the decisive facet(s) in establishing whether the profile of a nativisation indicator generally shows exonormative or endonormative tendencies? Should it be, for example, the frequency of occurrence of a given nativisation indicator or its collocational profile or both? And how much difference in frequency or what number of variety-specific collocations is needed to claim that endonormative forces outweigh the exonormative ones? Relating the synchronic structural profile of a given variety exclusively to one facet or a selection of facets ultimately involves the danger of failing to notice interesting potentially variety-specific structural intricacies. In addition to that, it should not be ignored that the present study approaches SLE (and IndE) from a variationist perspective in which characteristics demarcating one variety from another take centre stage
while the (large amount of) shared structural features are of secondary interest. Consequently, it seems plausible to assume that it is the systematic presence of endonormative structural facets across a set of nativisation indicators located at different levels of language organisation (e.g. lexis, syntax, etc.) that allows arguing in favour of endonormative tendencies in a given semiautonomous variety. In the light of the fact that SLE (and IndE) display variety-specific facets with all nativisation indicators with the exception of the verb-complementational profiles of HATE, LIKE and LOVE, it is valid to argue that endonormative structural tendencies prevail in SLE (and IndE).[2]

Local norms in structural language organisation are of pivotal importance in the evolution of PCEs. All (socio-)linguistic developments of varieties of English in postcolonial settings are considered to culminate in a unique structural profile as there is a monodirectional causal relationship operating between them: (1) Extralinguistic factors, like historical events and the political situation, result in (2) characteristic identity constructions on the sides of the parties involved. These, in turn, manifest themselves in (3) sociolinguistic determinants of the contact setting (conditions of language contact, language use, and language attitudes), which, consequently, cause specific (4) structural effects to emerge in the form(s) of the language variety/-ies involved. (Schneider 2007: 30–31)

Thus, the present study provides empirical evidence for structural effects in SLE that, in conjunction with local sociolinguistic parameters, would validate perceiving this South Asian English as an endonormatively stabilised variety in Schneider’s (2003, 2007) model of the evolution of PCEs. From a more fine-grained perspective, however, it should not be overlooked that, even though several locally evolved norms on different structural levels are in place, there are also many reflections of the common core in SLE. For this reason, the evolutionary progress SLE has made in Schneider’s (2003, 2007) model is probably more adequately represented by the notion of a semiautonomous variety, in which exonormative and endonormative trends have established a distinctive variety-specific structural profile of SLE in a complex area of tension.

[2] It should not go unmentioned here that reasons for the development of variety-specific structures that have been put forward earlier such as culture (cf. Olavarría de Ersson & Shaw 2003), simplification and (over)generalisation (cf. Williams 1987) as well as influences of particular language types (cf. Mesthrie 2006) can be reconciled with the model at hand. The notion of culture, for instance, is present in the force of L1 influence and autonomy, which can e.g. permeate via the phraseological path of structural nativisation to provide certain lexemes with their collocational profile reflecting culture-specific everyday experiences (as is e.g. the case with the collocation internally displaced persons (see Chapter 4.1.3)).

While SLE displays clear traces of structural (and sociolinguistic) emancipation from its historical input variety, its South Asian character is also undeniable. In addition to particularly South Asian English vocabulary (e.g. lakh), there are numerous parallels between SLE and IndE in terms of frequencies of given structures (e.g. with PVs and verb-complementational patterns) and related structural phenomena (e.g. constructions such as COPE up with and HAVE a glimpse). This, however, by no means validates the perception of SLE as a variant of IndE (cf. Kachru passim, Strevens 1980: 86–87) given their abundant structural differences delineated here (e.g. different overall frequencies of PSA lexemes, different collocational profiles with persons or different lexical default variants to refer to an artificial lake). Due to the synchronic nature of the empirical data, the present study refrains from formulating hypotheses about a possible epicentral influence of IndE (cf. Leitner 1992: 225) on SLE (or maybe vice versa with regard to some features) because only diachronic datasets will be able to provide an adequate perspective on the historical diffusion of linguistic structures and possibly related cross-varietal influences in the South Asian Sprachraum.

Based on sociolinguistic and structural considerations, SLE is a South Asian variety of English displaying characteristics that largely meet the criteria for endonormative stabilisation in Schneider’s (2003, 2007) dynamic model of the evolution of PCEs. However, earlier research (cf. e.g. Mukherjee 2007; Schilk et al. 2012) as well as the present study point to the continuing influence of the common core on PCEs even though a given variety may have already developed a number of local norms and this also holds true for SLE. Accordingly, the notion of a semiautonomous South Asian variety of English in its own right captures the current developmental status of SLE most adequately.
It was the central aim of this study to establish whether SLE could be considered a distinct, full-fledged South Asian English on the basis of empirical data; the lexical and lexicogrammatical investigations and the systematicity of the structural features of SLE outlined in them have offered substantial evidence for this conceptualisation. Consequently, the empirical findings at hand clearly validate a portrayal of SLE as a South Asian variety of English in its own right with an identity distinct from that of IndE. Further, the degree of institutionalisation of the SLE structures, i.e. their codification, recurrence and usage in a wide array of formal contexts, is also evidence against the perception, held inter alia locally, of SLE as a learner language or interlanguage (cf. Passé 1950: 133; Gunesekera 2000: 112; Fonseka 2003: 2). The variety-specific structural profile of SLE has been described as emerging via complex processes of structural nativisation, and it is in the light of this synchronic corpus-based investigation of written acrolectal SLE in comparison to IndE and BrE that a number of avenues for future research open up. Central topics that may be attractive for studies to come are the diachronic development of SLE,
spoken SLE, non-acrolectal variants of SLE and other structural levels of acrolectal SLE. While the linguistic products of structural nativisation apparent in present-day SLE data could be successfully identified, the cycles in which structural nativisation proceeds can only be adequately studied on the basis of diachronic corpus data, which are generally lacking for varieties of English other than American English and BrE. What renders diachronic data for L2-varieties in general and South Asian Englishes in particular all the more desirable is the fact that the structural influence from other varieties on SLE could then be studied on an empirical basis. IndE has been argued to constitute a South Asian English epicentre (cf. Leitner 1992: 225) and, by implication, SLE should be considered to potentially be subject to influence from it. As the present study is synchronic in nature, however, it can only attest that SLE and IndE have some structural features (e.g. the PVs COPE up with or LEASE out) in common, but it cannot evaluate if or to what extent these are results of diachronic cross-varietal influences in South Asia. If it were to be empirically substantiated that IndE exerted influences on the structures of SLE, one would expect the presence of a given structure in early IndE data first and the subsequent surfacing of this structure in SLE texts. Although the earlier presence of the structure concerned in the IndE data would not suffice to unequivocally identify IndE as the source of this structure in SLE (since it could still be the case that the given structure emerged in SLE via other contact scenarios or by means of coinciding, but independent structural developments), the absence of this structure in the earlier IndE data would clearly speak against an influence from IndE on SLE. Given that the potential influence of American English on SLE (cf.
Koch & Bernaisch 2013: 82–84) could be studied in a similar fashion, diachronic corpus data of SLE (and other (postcolonial) regional varieties) would undoubtedly be highly beneficial in advancing the understanding of the potential sources and related developments of given structural phenomena of SLE (and (postcolonial) regional varieties in general). The dialect continuum of SLE (see Chapter 1) is constituted by acrolectal, mesolectal and basilectal variants of SLE. The present study focussed on acrolectal SLE, but the role mesolectal and basilectal SLE play in the local sociolinguistic scenery should not be underestimated given their numbers of speakers. The majority of tuk tuk drivers, owners of small restaurants or businesses and cleaning personnel – to name but a few groups – certainly are not speakers of acrolectal SLE who competently and frequently resort to the English language in a number of domains. Still, the sheer number of speakers of mesolectal and basilectal SLE would certainly warrant investigations into the structures of these variants of SLE, as their speakers can be assumed to outnumber those of acrolectal SLE. However, the more limited contexts in which the mesolect and basilect of SLE
are used in speech and writing and the resulting difficulty in obtaining considerable amounts of spoken and/or written data certainly are interrelated factors that have so far hindered the systematic empirical study of these variants located at the lower ends of the SLE dialect continuum. Spoken SLE has so far not been in the limelight of empirical studies either (with the studies by Herat (2005, 2006) being laudable exceptions). At the time of writing, the spoken part of the Sri Lankan component of ICE is being compiled at Justus Liebig University Giessen, Germany, in collaboration with the University of Colombo, Sri Lanka – a project that aims at compiling the first representative collection of spoken SLE texts as used by competent speakers of SLE. On the basis of this authentic material, it will be possible to study more closely the sound level of SLE, but also other structural levels as well as the creativity of SLE speakers, all of which constitute promising fields of investigation for studies on SLE to come. Although the present study offers a contrastive in-depth perspective on a number of lexical and lexicogrammatical features, there are (a) other objects of investigation located at these structural levels that merit linguistic scrutiny in SLE (e.g. neologisms in SLE vocabulary (cf. e.g. Gunesekera 2005) or the distinction between count nouns/non-count nouns and related structural patterns (cf. e.g. Hundt 1998)) and (b) other levels of language organisation such as morphology or semantics that have so far largely been studied in isolation in SLE (cf. e.g. Werner & Mukherjee 2012). As alluded to earlier in this chapter, the model of (the emergence of) distinctive structural profiles of semiautonomous varieties of English would certainly profit from further empirical studies on the structure of SLE and other postcolonial varieties in general in that it can accommodate additional (e.g.
morphological or semantic) paths of structural nativisation (and branches thereof) that the present study did not investigate. Under consideration of the results of the present study and earlier research, SLE has been modelled as a distinct South Asian semiautonomous variety of English – a conceptualisation which does justice to what eventually constitutes SLE. It shares a large amount of vocabulary and structures with other varieties of English and mere categorical assessments of the presence or absence of lexical items or lexicogrammatical combinations would probably produce only a relatively restricted number of variety-exclusive features of SLE (cf. Meyler 2010). Still, the fact that there may be only a few truly variety-exclusive features does not invalidate the perception of SLE as a full-fledged South Asian English in its own right since it is not only variety-exclusive features that characterise and, thus, help to distinguish (postcolonial) Englishes. Careful examinations of occurrences, genre-specificity and usage patterns yielded a unique variety-specific frequency-related structural profile of SLE. In the framework of World Englishes, its recurrent and systematic structures undoubtedly profile SLE as a distinct semiautonomous South Asian English on a par with other (New) Englishes around the globe.
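The diachronic reasoning on a possible IndE influence on SLE set out earlier in this section can be restated as a small decision procedure. The following Python lines are only a sketch under stated assumptions: the function name, the return labels and the years of first attestation are hypothetical, since the diachronic corpora that would supply such dates are precisely what is said to be lacking.

```python
from typing import Optional


def assess_inde_influence(first_inde: Optional[int], first_sle: Optional[int]) -> str:
    """Classify the diachronic evidence for one structure.

    Both arguments are hypothetical years of first attestation in
    diachronic corpora; None means "unattested".
    """
    if first_sle is None:
        # The structure never surfaces in SLE, so there is nothing to explain.
        return "not applicable"
    if first_inde is None or first_inde >= first_sle:
        # No earlier IndE attestation: this speaks against IndE as the source.
        return "speaks against IndE influence"
    # An earlier IndE attestation is necessary but not sufficient: other
    # contact scenarios or independent parallel developments remain possible.
    return "compatible with IndE influence, but inconclusive"


# Invented years, for illustration only (not corpus findings).
print(assess_inde_influence(1900, 1950))
print(assess_inde_influence(None, 1950))
```

The asymmetry between the two outcomes is the point of the sketch: an earlier IndE attestation leaves the question open, whereas its absence is treated as counter-evidence.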

References

Ahulu, S. 1995. Variation in the use of complex verbs in international English. English Today 11(2): 28–34. DOI: 10.1017/S0266078400008233
Algama, D. 2008. Who’s collecting ‘bits and bobs’: Meyler or Dylan? Sunday Times, 23 March 2008. 〈http://mirisgala.net/Dilini_Algama_review.html〉 (17 October 2014).
Algeo, J. 2006. British or American English? A Handbook of Word and Grammar Patterns. Cambridge: CUP. DOI: 10.1017/CBO9780511607240
Ansaldo, U. 2008. Sri Lanka Malay revisited: Genesis and classification. In Lessons from Documented Endangered Languages [Studies in Language Companion Series 78], K.D. Harrison, D.S. Rood & A. Dwyer (eds), 13–42. Amsterdam: John Benjamins. DOI: 10.1075/tsl.78.02ans
Balasubramanian, C. 2009. Register Variation in Indian English [Studies in Corpus Linguistics 37]. Amsterdam: John Benjamins. DOI: 10.1075/scl.37
Bauer, L. 2002. Globality and locality in New Zealand. In From Local to Global English: Proceedings of Style Council 2001/2002, P. Peters (ed.), 54–67. Sydney: Macquarie University, Dictionary Research Centre.
Bernaisch, T. 2012. Attitudes towards Englishes in Sri Lanka. World Englishes 31(3): 279–291. DOI: 10.1111/j.1467-971X.2012.01753.x
Bernaisch, T. 2013. The verb-complementational profile of OFFER in Sri Lankan English. In Corpus Linguistics and Variation in English: Focus on Non-native Englishes, M. Huber & J. Mukherjee (eds), 〈http://www.helsinki.fi/varieng/series/volumes/13/bernaisch/〉 (17 October 2014). Helsinki: Research Unit for Variation, Contacts and Change in English.
Bernaisch, T. & Lange, C. 2012. The typology of focus marking in South Asian Englishes. Indian Linguistics 73(1–4): 1–18.
Bernaisch, T., Koch, C., Mukherjee, J. & Schilk, M. 2011. Manual for the South Asian Varieties of English (SAVE) Corpus: Compilation, Cleanup Process, and Details on the Individual Components. Giessen: Justus Liebig University.
Biber, D. 1988. Variation across Speech and Writing. Cambridge: CUP. DOI: 10.1017/CBO9780511621024
Biewer, C., Hundt, M. & Zipp, L. 2010. ‘How’ a Fiji corpus? Challenges in the compilation of an ESL ICE component. ICAME Journal 34: 5–23.
Bladon, R. 1968. Selecting the to- or -ing nominal after like, love, hate, dislike and prefer. English Studies 49(1–6): 203–214. DOI: 10.1080/00138386808597300
Boange, D. 2010. The bowl-or-ball dilemma of rubbishing English standards. Sunday Observer, 6 June 2010. 〈http://www.sundayobserver.lk/2010/06/06/mon08.asp〉 (17 October 2014).
Bolinger, D.L. 1971. The Phrasal Verb in English. Cambridge MA: Harvard University Press.
Bolton, K. (ed.). 2002. Hong Kong English: Autonomy and Creativity. Hong Kong: Hong Kong University Press.
Bresnan, J. & Ford, M. 2010. Predicting syntax: Processing dative constructions in American and Australian varieties of English. Language 86(1): 186–213.
Bresnan, J. & Hay, J. 2008. Gradient grammar: An effect of animacy on the syntax of give in New Zealand and American English. Lingua 118(2): 245–259. DOI: 10.1016/j.lingua.2007.02.007

Brinton, L.J. 1996. Attitudes towards increasing segmentalization: Complex and phrasal verbs in English. Journal of English Linguistics 24(3): 186–205. DOI: 10.1177/007542429602400304
Brugmann, K. 1909. Das Wesen der lautlichen Dissimilationen. Abhandlungen der philologisch-historischen Klasse der königlich-sächsischen Gesellschaft der Wissenschaften 27: 141–178.
Coperahewa, S. 2009. The language planning situation in Sri Lanka. Current Issues in Language Planning 10(1): 69–150. DOI: 10.1080/14664200902894660
Davies, M. 2013. Corpus of Global Web-based English: 1.9 Billion Words from Speakers in 20 Countries. 〈http://corpus2.byu.edu/glowbe/〉 (17 October 2014).
de Silva, K.M. 1981. A History of Sri Lanka. Berkeley CA: University of California Press.
Dempsey, K.B., McCarthy, P.M. & McNamara, D.S. 2007. Using phrasal verbs as an index to distinguish text genres. In Proceedings of the Twentieth International Florida Artificial Intelligence Research Society Conference, D. Wilson & G. Sutcliffe (eds), 217–222. Menlo Park CA: AAAI Press.
Dixon, R.M.W. 2005. A Semantic Approach to English Grammar. Oxford: OUP.
Fanego, T. 1996. The development of gerunds as objects of subject-control verbs in English (1400–1760). Diachronica 13(1): 29–62. DOI: 10.1075/dia.13.1.03fan
Fernando, C. 1977. English and Sinhala bilingualism in Sri Lanka. Language in Society 6(3): 341–360. DOI: 10.1017/S0047404500005054
Fernando, S. 1985. Changes in Sri Lankan English as reflected in phonology. University of Colombo Review 5: 41–53.
Fernando, S. 2003. The vocabulary of Sri Lankan English: Words and phrases that transform a foreign language into their own. Paper presented at the 9th International Conference on Sri Lankan Studies, Matara, Sri Lanka, 28–30 November 2003.
Field, A., Miles, J. & Field, Z. 2012. Discovering Statistics Using R. London: Sage.
Fisher, R.A. 1922. On the interpretation of χ² from contingency tables, and the calculation of P. Journal of the Royal Statistical Society 85(1): 87–94. DOI: 10.2307/2340521
Fletcher, W.H. 2007. Concordancing the web: Promise and problems, tools and techniques. In Corpus Linguistics and the Web, M. Hundt, N. Nesselhauf & C. Biewer (eds), 25–45. Amsterdam: Rodopi.
Fonseka, E.A.G. 2003. Sri Lankan English: Exploding the fallacy. Paper presented at the 9th International Conference on Sri Lankan Studies, Matara, Sri Lanka, 28–30 November 2003.
Gadsby, A. (ed.). 2000. Longman Phrasal Verbs Dictionary. Harlow: Longman.
Goonetilleke, D. 2005. Sri Lankan English Literature and the Sri Lankan People 1917–2003. Colombo: Vijitha Yapa Publications.
Greenbaum, S. 1988. Language spread and the writing of grammars. In Language Spread and Language Policy: Issues, Implications and Case Studies, P.H. Lowenberg (ed.), 133–139. Washington DC: Georgetown University Press.
Greenbaum, S. 1996. Introducing ICE. In Comparing English Worldwide: The International Corpus of English, S. Greenbaum (ed.), 3–12. Oxford: Clarendon.
Greenbaum, S. & Nelson, G. 1996. The International Corpus of English (ICE) Project. World Englishes 15(1): 3–15. DOI: 10.1111/j.1467-971X.1996.tb00088.x
Grefenstette, G. & Nioche, J. 2000. Estimation of English and non-English language use on the www. In Computer-Assisted Information Retrieval (Recherche d’Information et ses Applications) – RIAO 2000, 6th International Conference, College de France, France, April 12–14, 2000. Proceedings, J.-J. Mariani & D. Harman (eds), 237–246. CID.
Gries, S.Th. 2008. Dispersion and adjusted frequencies in corpora. International Journal of Corpus Linguistics 13(4): 403–437. DOI: 10.1075/ijcl.13.4.02gri

Gries, S.Th. 2009. Quantitative Corpus Linguistics with R. London: Routledge.
Gries, S.Th. & Bernaisch, T. Forthcoming. Exploring epicentres empirically: Focus on South Asian Englishes. English World-Wide 37(1).
Gunesekera, M. 2000. Morphosyntactic errors of fluent speakers of English in Sri Lanka. Vaag Vidya 7: 112–133.
Gunesekera, M. 2005. The Postcolonial Identity of Sri Lankan English. Colombo: Katha Publishers.
Gunesekera, M. 2006. Why teach Sri Lankan English in a multilingual environment? In English in the Multilingual Environment: Selected Papers from the 3rd International Conference of the Sri Lanka English Language Teachers’ Association, H. Ratwatte & S. Herath (eds), 29–45. Colombo: SLELTA.
Hartford, B.S. 1989. Prototype effects in non-native English: Object-coding in verbs of saying. World Englishes 8(2): 97–117. DOI: 10.1111/j.1467-971X.1989.tb00647.x
Herat, M. 2001. Speaking and writing in Lankan English: A study of native and non-native users of English. California Linguistic Notes 26(1), 〈http://hss.fullerton.edu/linguistics/CLN/spring01_articles/herat.pdf〉 (17 September 2010).
Herat, M. 2005. BE variation in Sri Lankan English. Language Variation and Change 17(2): 181–208. DOI: 10.1017/S0954394505050088
Herat, M. 2006. Substitute one in Sri Lankan English. Leeds Working Papers in Linguistics 11, 〈http://www.leeds.ac.uk/linguistics/WPL/WPL11.html〉 (11 February 2014).
Hilbert, M. & Krug, M. 2010. The compilation of ICE Malta: State of the art and challenges along the way. ICAME Journal 34: 54–63.
Hoffmann, S. 2007. From web page to mega-corpus: The CNN transcripts. In Corpus Linguistics and the Web, M. Hundt, N. Nesselhauf & C. Biewer (eds), 69–85. Amsterdam: Rodopi.
Hoffmann, S., Hundt, M. & Mukherjee, J. 2011. Indian English – An emerging epicentre? A pilot study on light verbs in web-derived corpora of South Asian Englishes. Anglia 129(3–4): 258–280. DOI: 10.1515/angl.2011.083
Hohenthal, A. 2003. English in India: Loyalty and attitudes. Language in India 3(5), 〈http://www.languageinindia.com/may2003/annika.html〉 (17 October 2014).
Hornby, A.S. 2008. Oxford Advanced Learner’s Dictionary of Current English. Oxford: OUP.
Huddleston, R. & Pullum, G.K. 2002. The Cambridge Grammar of the English Language. Cambridge: CUP.
Hundt, M. 1998. New Zealand English Grammar, Fact or Fiction? A Corpus-Based Study in Morphosyntactic Variation [Varieties of English Around the World G23]. Amsterdam: John Benjamins. DOI: 10.1075/veaw.g23
Hundt, M., Hoffmann, S. & Mukherjee, J. 2012. The hypothetical subjunctive in South Asian Englishes: Local developments in the use of a global construction. English World-Wide 33(2): 147–164. DOI: 10.1075/eww.33.2.02hun
Hundt, M., Nesselhauf, N. & Biewer, C. 2007. Corpus linguistics and the web. In Corpus Linguistics and the Web, M. Hundt, N. Nesselhauf & C. Biewer (eds), 1–5. Amsterdam: Rodopi.
Jayawardena, K. 2003. Nobodies to Somebodies: The Rise of the Colonial Bourgeoisie in Sri Lanka. Colombo: Social Scientists’ Association & Sanjiva Books.
Johansson, S., Leech, G. & Goodluck, H. 1978. Manual of Information to Accompany the Lancaster-Oslo/Bergen Corpus of British English, for Use with Digital Computers. Oslo: Department of English, University of Oslo.
Joseph, B.D. 2004. On change in Language and change in language. Language 80(3): 381–383. DOI: 10.1353/lan.2004.0132

Kachru, B.B. 1982. South Asian English. In English as a World Language, R.W. Bailey & M. Görlach (eds), 353–383. Ann Arbor MI: University of Michigan Press.
Kachru, B.B. 1992. Models for non-native Englishes. In The Other Tongue: English across Cultures, B.B. Kachru (ed.), 48–74. Urbana IL: University of Illinois Press.
Kandiah, T. 1981a. Disinherited Englishes: The case of Lankan English (Part II). Navasilu 4: 92–113.
Kandiah, T. 1981b. Lankan English schizoglossia. English World-Wide 2(1): 63–81. DOI: 10.1075/eww.2.1.05kan
Kandiah, T. 1984. Kaduva: Power and the English language weapon in Sri Lanka. In Honouring E.F.C. Ludowyk, P. Colin-Thomé & A. Halpé (eds), 117–154. Colombo: Tisara Prakasakayo.
Koch, C. & Bernaisch, T. 2013. Verb complementation in South Asian English(es): The range and frequency of ‘new’ ditransitives. In English Corpus Linguistics: Variation in Time, Space and Genre – Selected Papers from ICAME 32, G. Andersen & K. Bech (eds), 69–89. Amsterdam: Rodopi.
Kortmann, B. 2005. English Linguistics: Essentials. Berlin: Cornelsen.
Kortmann, B. & Schneider, E.W. 2008. General introduction. In Varieties of English: Africa, South and Southeast Asia, R. Mesthrie (ed.), 1–7. Berlin: Mouton de Gruyter. DOI: 10.1515/9783110208429
Körtvelyessy, M., Bernaisch, T., Mukherjee, J. & Mendis, D. 2012. Manual to the Written Component of the International Corpus of English – Sri Lanka ICE-SL [W200]. Giessen: Justus Liebig University.
Kumara, S.M.D.S. & Mendis, D. 2010. Making out SLE: A corpus based study of Sri Lankan English phrasal verbs. Paper presented at the 6th International SLELTA Conference, Colombo, Sri Lanka, 15–17 October 2010.
Kumarasamy, S. 2007. Understanding Language Policy and Planning in Sri Lanka. MA thesis, University of Duisburg-Essen.
Künstler, V., Mendis, D. & Mukherjee, J. 2009. English in Sri Lanka: Language functions and speaker attitudes. Anglistik: International Journal of English Studies 20(2): 57–74.
Labuhn, U. 2001. Von Give a Laugh bis Have a Cry. Zu Aspektualität und Transitivität der V + N-Konstruktionen im Englischen. Frankfurt: Peter Lang.
Lange, C. 2007. Focus marking in Indian English. English World-Wide 28(1): 89–118. DOI: 10.1075/eww.28.1.05lan
Lange, C. 2012. The Syntax of Spoken Indian English [Varieties of English Around the World G45]. Amsterdam: John Benjamins. DOI: 10.1075/veaw.g45
Leech, G. 2007. New resources, or just better old ones? The Holy Grail of representativeness. In Corpus Linguistics and the Web, M. Hundt, N. Nesselhauf & C. Biewer (eds), 133–149. Amsterdam: Rodopi.
Leech, G., Hundt, M., Mair, C. & Smith, N. 2009. Change in Contemporary English: A Grammatical Study. Cambridge: CUP. DOI: 10.1017/CBO9780511642210
Leisi, E. & Mair, C. 1999. Das heutige Englisch: Wesenszüge und Probleme. Heidelberg: Winter.
Leitner, G. 1992. English as a pluricentric language. In Pluricentric Languages: Differing Norms in Different Nations, M. Clyne (ed.), 179–237. Berlin: Mouton de Gruyter.
Lim, L. (ed.). 2004. Singapore English: A Grammatical Description [Varieties of English Around the World G33]. Amsterdam: John Benjamins. DOI: 10.1075/veaw.g33
Live, A.H. 1973. The TAKE-HAVE phrasal in English. Linguistics 95: 31–50.

Lüdeling, A., Evert, S. & Baroni, M. 2007. Using web data for linguistic purposes. In Corpus Linguistics and the Web, M. Hundt, N. Nesselhauf & C. Biewer (eds), 7–24. Amsterdam: Rodopi.
Mair, C. 2002. Three changing patterns of verb complementation in Late Modern English: A real-time study based on matching text corpora. English Language and Linguistics 6(1): 105–131. DOI: 10.1017/S1360674302001065
Mair, C. 2007. Change and variation in present-day English: Integrating the analysis of closed corpora and web-based monitoring. In Corpus Linguistics and the Web, M. Hundt, N. Nesselhauf & C. Biewer (eds), 233–247. Amsterdam: Rodopi.
Mair, C. & Winkle, C. 2012. Change from to-infinitive to bare infinitive in specificational cleft sentences: Data from World Englishes. In Mapping Unity and Diversity World-Wide: Corpus-Based Studies of New Englishes [Varieties of English Around the World G43], M. Hundt & U. Gut (eds), 243–262. Amsterdam: John Benjamins. DOI: 10.1075/veaw.g43.10mai
McArthur, T. 1998. Concise Oxford Companion to the English Language. Oxford: OUP.
McArthur, T. 2001. World English and world Englishes: Trends, tensions, varieties, and standards. Language Teaching 34(1): 1–20. DOI: 10.1017/S0261444800016062
McArthur, T. 2002. The Oxford Guide to World English. Oxford: OUP.
Mendis, D. 2002. Language planning and ethnicity: Attitudes and perceptions from the education sector. The Sri Lanka Journal of the Humanities 17–18: 161–184.
Mendis, D. 2010. Formality in academic writing: The use/non-use of phrasal verbs in two varieties of English. In English for Professional and Academic Purposes, M.F. Ruiz-Garrido, J.C. Palmer-Silveira & I. Fortanet-Gomez (eds), 11–24. Amsterdam: Rodopi.
Mendis, D. & Rambukwella, H. 2010. Sri Lankan Englishes. In The Routledge Handbook of World Englishes, A. Kirkpatrick (ed.), 181–196. London: Routledge.
Mesthrie, R. 2006. Anti-deletions in an L2 grammar: A study of Black South African English mesolect. English World-Wide 27(2): 111–145. DOI: 10.1075/eww.27.2.02mes
Mesthrie, R. & Bhatt, R.M. 2008. World Englishes: The Study of New Linguistic Varieties. Cambridge: CUP.
Meyler, M. 2007. A Dictionary of Sri Lankan English. Colombo: Mirisgala.
Meyler, M. 2009. Sri Lankan English: A distinct South Asian variety. English Today 25(4): 55–60. DOI: 10.1017/S0266078409990447
Meyler, M. 2010. A snooty English speaker’s reply. Groundviews, 3 June 2010. 〈http://www.groundviews.org/2010/06/03/a-snooty-english-speaker%E2%80%99s-reply/#more-3538〉 (17 October 2014).
Mukherjee, J. 2007. Steady states in the evolution of New Englishes: Present-day Indian English as an equilibrium. Journal of English Linguistics 35(2): 157–187. DOI: 10.1177/0075424207301888
Mukherjee, J. 2008. Sri Lankan English: Evolutionary status and epicentral influence from Indian English. In Anglistentag 2007 Münster: Proceedings, K. Stierstorfer (ed.), 359–368. Trier: Wissenschaftlicher Verlag Trier.
Mukherjee, J. 2009. The lexicogrammar of present-day Indian English: Corpus-based perspectives on structural nativisation. In Exploring the Lexis-Grammar Interface [Studies in Corpus Linguistics 35], U. Römer & R. Schulze (eds), 117–135. Amsterdam: John Benjamins. DOI: 10.1075/scl.35.9muk

Mukherjee, J. 2010. Corpus-based insights into verb-complementational innovations in Indian English: Cases of nativised semantico-structural analogy. In Grammar between Norm and Variation, A.N. Lenz & A. Plewnia (eds), 219–241. Frankfurt: Peter Lang.
Mukherjee, J. 2012. English in South Asia – Ambinormative orientations and the role of corpora: The state of the debate in Sri Lanka. In English as an International Language in Asia: Implications for Language Education, A. Kirkpatrick & R. Sussex (eds), 191–208. Heidelberg: Springer. DOI: 10.1007/978-94-007-4578-0_12
Mukherjee, J. & Gries, S.Th. 2009. Collostructional nativisation in New Englishes: Verb-construction associations in the International Corpus of English. English World-Wide 30(1): 27–51. DOI: 10.1075/eww.30.1.03muk
Mukherjee, J. & Hoffmann, S. 2006. Describing verb-complementational profiles of New Englishes: A pilot study of Indian English. English World-Wide 27(2): 147–173. DOI: 10.1075/eww.27.2.03muk
Mukherjee, J. & Schilk, M. 2008. Verb-complementational profiles across varieties of English: Comparing verb classes in Indian English and British English. In The Dynamics of Linguistic Variation: Corpus Evidence on English Past and Present [Studies in Language Variation 2], T. Nevalainen, I. Taavitsainen, P. Pahta & M. Korhonen (eds), 163–181. Amsterdam: John Benjamins. DOI: 10.1075/silv.2.14muk
Mukherjee, J. & Schilk, M. 2012. Exploring variation and change in New Englishes: Looking into the International Corpus of English (ICE) and beyond. In The Oxford Handbook of the History of English, T. Nevalainen & E.C. Traugott (eds), 189–199. Oxford: OUP.
Mukherjee, J., Schilk, M. & Bernaisch, T. 2010. Compiling the Sri Lankan component of ICE: Principles, problems, prospects. ICAME Journal 34: 64–77.
Nelson, G. 1996. The design of the corpus. In Comparing English Worldwide: The International Corpus of English, S. Greenbaum (ed.), 27–35. Oxford: Clarendon.
Nelson, G. & Hongtao, R. 2012. Particle verbs in African Englishes: Nativization and innovation. In Mapping Unity and Diversity World-Wide: Corpus-Based Studies of New Englishes [Varieties of English Around the World G43], M. Hundt & U. Gut (eds), 197–213. Amsterdam: John Benjamins. DOI: 10.1075/veaw.g43.08nel
Nesselhauf, N. 2009. Co-selection phenomena across New Englishes: Parallels (and differences) to foreign learner varieties. English World-Wide 30(1): 1–26. DOI: 10.1075/eww.30.1.02nes
Nihalani, P., Tongue, R.K., Hosalim, P. & Crowther, J. 2004. Indian and British English: A Handbook of Usage and Pronunciation. New Delhi: OUP.
Olavarría de Ersson, E. & Shaw, P. 2003. Verb complementation patterns in Indian Standard English. English World-Wide 24(2): 137–161. DOI: 10.1075/eww.24.2.02ers
Parakrama, A. 1995. De-Hegemonizing Language Standards: Learning from (Post) Colonial Englishes about ‘English’. London: Macmillan. DOI: 10.1057/9780230371309
Passé, H.A. 1943. The English language in Ceylon. University of Ceylon Review 1(2): 50–65.
Passé, H.A. 1950. Common errors in Ceylon English. University of Ceylon Review 8(3): 133–160.
Pearson, K. 1900. X. On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine 50(5): 157–175. DOI: 10.1080/14786440009463897
Peters, P. 2009. Australian English as a regional epicentre. In World Englishes – Problems, Properties and Prospects [Varieties of English Around the World G40], T. Hoffmann & L. Siebers (eds), 107–124. Amsterdam: John Benjamins. DOI: 10.1075/veaw.g40.09pet

Poutsma, H. 1904. A Grammar of Late Modern English: For the Use of Continental, Especially Dutch, Students, Part I: The Sentence. Groningen: Noordhoff.
Quirk, R., Greenbaum, S., Leech, G. & Svartvik, J. 1985. A Comprehensive Grammar of the English Language. London: Longman.
Raheem, R. 2006. Configuring the mosaic: Investigating language use and attitude in Sri Lanka. In English in the Multilingual Environment: Selected Papers from the 3rd International Conference of the Sri Lanka English Language Teachers’ Association, H. Ratwatte & S. Herath (eds), 13–27. Colombo: SLELTA.
Renouf, A., Kehoe, A. & Banerjee, J. 2005. The WebCorp Search Engine: A Holistic Approach to Web Text Search. Birmingham: University of Birmingham.
Renouf, A., Kehoe, A. & Banerjee, J. 2007. WebCorp: An integrated system for web text search. In Corpus Linguistics and the Web, M. Hundt, N. Nesselhauf & C. Biewer (eds), 47–67. Amsterdam: Rodopi.
Renský, M. 1966. English verbo-nominal phrases: Some structural and stylistic aspects. Travaux Linguistiques de Prague 1: 289–299.
Rohdenburg, G. 2003. Cognitive complexity and horror aequi as factors determining the use of interrogative clause linkers in English. In Determinants of Grammatical Variation in English, G. Rohdenburg & B. Mondorf (eds), 205–249. Berlin: Mouton de Gruyter. DOI: 10.1515/9783110900019
Rohdenburg, G. 2006. The role of functional constraints in the evolution of the English complementation system. In Syntax, Style and Grammatical Norms: English from 1500–2000, C. Dalton-Puffer, D. Kastovsky & H. Schendl (eds), 143–166. Frankfurt: Peter Lang.
Samarakkody, M. & Braine, G. 2005. Teaching English in Sri Lanka: From colonial roots to Lankan English. In Teaching English to the World: History, Curriculum, and Practice, G. Braine (ed.), 147–157. Mahwah NJ: Lawrence Erlbaum Associates.
Schilk, M. 2011. Structural Nativization in Indian English Lexicogrammar [Studies in Corpus Linguistics 46]. Amsterdam: John Benjamins. DOI: 10.1075/scl.46
Schilk, M., Bernaisch, T. & Mukherjee, J. 2012. Mapping unity and diversity in South Asian English lexicogrammar: Verb-complementational preferences across varieties. In Mapping Unity and Diversity World-Wide: Corpus-Based Studies of New Englishes [Varieties of English Around the World G43], M. Hundt & U. Gut (eds), 137–165. Amsterdam: John Benjamins. DOI: 10.1075/veaw.g43.06sch
Schneider, E.W. 2003. The dynamics of New Englishes: From identity construction to dialect birth. Language 79(2): 233–281. DOI: 10.1353/lan.2003.0136
Schneider, E.W. 2004. How to trace structural nativization: Particle verbs in world Englishes. World Englishes 23(2): 227–249. DOI: 10.1111/j.0883-2919.2004.00348.x
Schneider, E.W. 2007. Postcolonial English: Varieties Around the World. Cambridge: CUP. DOI: 10.1017/CBO9780511618901
Sedlatschek, A. 2009. Contemporary Indian English: Variation and Change [Varieties of English Around the World G38]. Amsterdam: John Benjamins. DOI: 10.1075/veaw.g38
Selvadurai, S. 1994. Funny Boy. Toronto: McClelland and Stewart.
Senaratne, C.D. 2009. Sinhala-English Code-Mixing in Sri Lanka: A Sociolinguistic Study. Utrecht: LOT.
Seneviratne, M. 2010. The waylaying ways of ‘English Our Way’. Daily Mirror, 7 August 2010. 〈http://print2.dailymirror.lk/opinion1/17853.html〉 (11 February 2014).
Shastri, S.V. 1988. The Kolhapur Corpus of Indian English and work done on its basis so far. ICAME Journal 12: 15–26.

Shastri, S.V., Patilkulkarni, C.T. & Shastri, G.S. 1986. Manual of Information to Accompany the Kolhapur Corpus of Indian English, for Use with Digital Computers. Kolhapur: Shivaji University.
Sinclair, J. (ed.). 2002. Collins COBUILD Phrasal Verbs Dictionary. Glasgow: HarperCollins.
Sivanandan, A. 1997. When Memory Dies. London: Arcadia Books.
Smith, A. 2009. Light verbs in Australian, New Zealand and British English. In Comparative Studies in Australian and New Zealand English: Grammar and Beyond [Varieties of English Around the World G39], P. Peters, P. Collins & A. Smith (eds), 139–155. Amsterdam: John Benjamins. DOI: 10.1075/veaw.g39.09smi
Stein, G. 1991. The phrasal verb type ‘to have a look’ in Modern English. International Review of Applied Linguistics in Language Teaching 29(1): 1–29. DOI: 10.1515/iral.1991.29.1.1
Strevens, P. 1980. Teaching English as an International Language: From Practice to Principle. Oxford: Pergamon Press.
Stubbs, M. 2001. Words and Phrases: Corpus Studies of Lexical Semantics. Oxford: Blackwell.
Swales, J.M. & Feak, C.B. 2004. Academic Writing for Graduate Students. Ann Arbor MI: University of Michigan Press.
Thirumalai, M.S. 2002. Sri Lanka’s language policy: A brief introduction. Language in India 1(9), 〈http://www.languageinindia.com/jan2002/srilanka1.html〉 (17 October 2014).
Trudgill, P. 2003. A Glossary of Sociolinguistics. Edinburgh: EUP.
Vosberg, U. 2006. Die Große Komplementverschiebung: Außersemantische Einflüsse auf die Entwicklung satzwertiger Ergänzungen im Neuenglischen. Tübingen: Gunter Narr.
Werner, J. & Mukherjee, J. 2012. Highly polysemous verbs in New Englishes: A corpus-based study of Sri Lankan and Indian English. In Corpus Linguistics: Looking back, Moving forward, S. Hoffmann, P. Rayson & G. Leech (eds), 249–266. Amsterdam: Rodopi.
Wickramasinghe, W. 1999. British English, American English and Sri Lankan English, Vol. 3. Nugegoda: Wimal Wickramasinghe.
Wickramasuriya, C. 1961. Some common mistakes in written English. Journal of the National Education Society of Ceylon 10(1): 34–54.
Wickramasuriya, C. 1962. Mistakes in vocabulary and grammar resulting from difficulties with phonemes of English. Journal of the National Education Society of Ceylon 11(2): 32–39.
Wierzbicka, A. 1982. Why can you Have a Drink when you can’t *Have an Eat. Language 58(4): 753–799. DOI: 10.2307/413956
Williams, J. 1987. Non-native varieties of English: A special case of language acquisition. English World-Wide 8(2): 161–199. DOI: 10.1075/eww.8.2.02wil
Wright, L. 2000. Introduction. In The Development of Standard English, 1300–1800: Theories, Descriptions, Conflicts, L. Wright (ed.), 1–8. Cambridge: CUP. DOI: 10.1017/CBO9780511551758.001
Xiao, R. 2009. Multidimensional analysis and the study of world Englishes. World Englishes 28(4): 421–450. DOI: 10.1111/j.1467-971X.2009.01606.x
Yogasundram, N. 2008. A Comprehensive History of Sri Lanka: From Prehistory to Tsunami. Colombo: Vijitha Yapa Publications.
Zandvoort, R.W. 1965. A Handbook of English Grammar. London: Longmans.
Zipp, L. 2014. Educated Fiji English: Lexico-Grammar and Variety Status [Varieties of English Around the World G47]. Amsterdam: John Benjamins. DOI: 10.1075/veaw.g47
Zipp, L. & Bernaisch, T. 2012. Particle verbs across first and second language varieties of English. In Mapping Unity and Diversity World-Wide: Corpus-Based Studies of New Englishes [Varieties of English Around the World G43], M. Hundt & U. Gut (eds), 167–196. Amsterdam: John Benjamins. DOI: 10.1075/veaw.g43.07zip

Appendix

A1. List of formality markers

| Dictionary entry in Meyler (2007) | Meaning as provided in Meyler (2007) | Page number in Meyler (2007) |
| --- | --- | --- |
| above mentioned | mentioned above, as mentioned earlier | 1 |
| admit | take to hospital, admit to hospital | 2 |
| apply | put, spread, rub on (e.g. paint or ointment) | 11 |
| attend | go to (a school, etc.) | 15 |
| benumbed | numb | 25 |
| cease | stop | 50 |
| commence | begin, start | 62 |
| consult | see, go to (a doctor, etc.) | 63 |
| convey | send (wishes, regards, etc.) | 63 |
| correct | right | 65 |
| detrain | alight, get out of a train | 73 |
| earlier | before (e.g. The minister earlier told he will come.) | 81 |
| encash | cash (a cheque) | 84 |
| enter | go in, go to, get in (to something) | 84 |
| entrain | board, get into a train | 84 |
| expire | die, pass away | 86 |
| family member | a member of somebody’s family | 88 |
| fare | do, get on (well/badly, e.g. in an exam) | 88 |
| fillip | boost, incentive | 89 |
| forthwith | immediately | 92 |
| fully | really, completely, totally, absolutely | 93 |
| gift | give, donate | 98 |
| hail from | come from (e.g. He hails from a good family.) | 108 |
| hence | so | 112 |
| hereafter | after this, from now on | 112 |
| herewith | (e.g. Please find enclosed herewith …) | 112 |
| highly | very, extremely | 112 |
| indent (for something) | put in an official order (for something) | 120 |
| instant, inst. | this month | 120 |
| leave | a holiday, a day off | 150 |
| meet: to meet with an accident | to have an accident | 166 |
| moneys, monies | sums of money | 169 |
| notify | announce, publish (e.g. Results will be notified later.) | 181 |
| persons | people | 198 |
| peruse | read | 198 |
| previous; previously | last, before | 210 |
| proceed | go, continue, carry on, go on | 211 |
| pugilist | boxer | 212 |
| purchase | buy | 213 |
| refrigerator | fridge | 220 |
| regarding | about (e.g. What is it regarding?) | 220 |
| remove | take off (e.g. clothes) | 220 |
| reside | live (in a place) | 221 |
| residence | house, home | 221 |
| same | (e.g. Please acknowledge receipt of same.) | 228 |
| seated | sitting (e.g. He is seated over there.) | 233 |
| seem: it seems | apparently, they say, he said, I heard, etc. (a common way of passing on reported information) | 234 |
| subsequently | then, next, after that | 249 |
| thereafter | after that, from then on | 262 |
| ultimately | finally, in the end, at last, eventually | 274 |
| undermentioned | mentioned below | 275 |
| undersigned | (e.g. Please feel free to contact the undersigned.) | 275 |
| vomit | be sick, throw up | 283 |
| whom, to whom, from whom | who, who to, who from | 288 |


A2. List of Pan-South Asian English lexemes

| Dictionary entry in Meyler (2007) | Meaning as provided in Meyler (2007) | Page number in Meyler (2007) |
| --- | --- | --- |
| almirah | wardrobe | 5 |
| anicut | an irrigation channel created by damming a river | 10 |
| arecanut | a type of palm tree | 12 |
| arrack | a popular alcoholic drink distilled from coconut toddy | 13 |
| atta flour | wholewheat flour | 15 |
| ayah | nanny | 16 |
| ayurveda | traditional indigenous herbal medicine | 17 |
| bandicoot | a very large rat | 20 |
| banian, banyan | vest, singlet | 20 |
| batta (1) | an extra allowance paid for going outstation on work | 23 |
| beedi | a small cheap hand-made cigarette | 24 |
| betel | a combination of betel leaf, arecanut and chunam (lime paste), which is chewed as a stimulant | 25 |
| boondi | a type of sweet made with gram flour | 32 |
| brinjal | aubergine | 38 |
| bulbul | a small black bird with a crest on its head | 41 |
| bund | dam, dyke, the wall of a tank; also the low mud dyke around a paddy field | 42 |
| bungalow | a large house | 42 |
| cadjan | dried palm leaves woven together and used to make roofs, fences, etc. | 46 |
| cadju, cadjunut | cashew, cashew nut | 46 |
| cess | a type of tax levied for a particular purpose | 50 |
| chappals | Indian style leather slippers | 52 |
| choli | saree blouse | 54 |
| chummery | male boarding house | 54 |
| chunam | lime paste which is used in the preparation of betel; also used for whitewashing houses, etc. | 54 |
| coir | coconut fibre, used to make ropes, brushes, mats, etc. | 59 |
| compound | garden, yard, the enclosure round a house | 62 |
| coolie | unskilled labourer, esp. toilet cleaner or estate labourer | 63 |
| copra | dried coconut kernel, used to make coconut oil | 64 |
| cumbly | a woollen blanket used as a head cover by tea pluckers | 66 |
| Deepavali | Diwali, the Hindu festival of lights | 72 |
| detenu | detainee | 73 |
| dhal | lentils | 74 |
| dharmachakraya | Buddhist wheel symbol | 75 |
| dhobi | a person who washes clothes | 75 |
| dicky | boot (of a car) | 76 |
| dosai | a type of pancake made with slightly fermented batter made of rice flour and kurrakkan flour | 78 |
| eversilver | stainless steel, a silver-coloured metal alloy, commonly used for plates, cups, etc. | 85 |
| faluda | a sweet drink made with milk and rosewater | 87 |
| felicitate | honour somebody with a formal ceremony or presentation | 88 |
| gingelly, gingili | sesame | 98 |
| godown | shed, store, warehouse | 101 |
| good name | (e.g. And what is your good name?) | 102 |
| goon, goonda | thug | 102 |
| gram | chick peas | 104 |
| gunny | jute, a coarse material used to make sacks | 106 |
| hackery | a small bullock cart used for carrying people, smaller and lighter than a buggy | 107 |
| hartal | an unofficial strike, organised closing of shops, etc. as a mark of political protest | 110 |
| hill station | a town in the hills | 113 |
| hotel | a small restaurant serving cheap meals | 116 |
| iddly, idli | a steamed rice flour cake | 119 |
| jaggery | a coarse brown sugar made from the sap of the coconut tree or kitul tree | 122 |
| jak | a very large green fruit, eaten either unripe (like a vegetable) or ripe (like a fruit) | 123 |
| jambu | a small pink pear-shaped fruit with crisp white flesh | 125 |
| jungi | (coll.) underwear, knickers, pants (used when addressing children) | 126 |
| jungle | any wild, uncultivated land | 126 |
| kabaddi | a traditional team game in which players have to cross a line into opposition territory while holding their breath and chanting the word ‘kabaddi-kabaddi-kabaddi’ | 128 |
| kachcheri | office of local government administration | 128 |
| kesari | a Tamil sweet made with milk, semolina, cashew nuts and sultanas, and cut into square and diamond shapes | 137 |
| krait | a poisonous snake; two varieties are found in Sri Lanka: common krait and Sri Lankan krait | 144 |
| kurtha | a long loose shirt usually worn by men | 146 |
| laddu | a type of sweet made with semolina, milk, cashews and raisins | 147 |
| lakh (1,00,000) | one hundred thousand (100,000) | 147 |
| lungi | a length of material worn round the waist, esp. by women; also used to refer to the outfit, a combination of cloth and blouse | 156 |
| Mahasivarathri | a Hindu festival | 159 |
| mahout | a man whose job is to look after an elephant | 159 |
| mammoty | an agricultural tool with a wooden handle and metal blade, used for digging and hoeing | 162 |
| Masoor dhal | red split lentils, the most common variety of dhal | 164 |
| monitor lizard | a large lizard | 169 |
| mudalali | businessman, merchant, trader, shopkeeper; also used colloquially to refer to someone who is good with money, or to someone who is rather fat | 171 |
| must, musth | an annual condition affecting male elephants, making them dangerous and unpredictable | 174 |
| mutton | goat’s meat (as opposed to sheep’s meat) | 174 |
| Mysore dhal | red split lentils, the most common variety of dhal | 175 |
| needful: to do the needful | to do what’s required, to take the necessary action | 177 |
| pakora | a deep fried savoury snack | 188 |
| palmyrah, palmyrah tree | a type of palm tree which is common in the dry zone, especially associated with Jaffna and the North | 189 |
| pandal | a temporary structure (usually decorated with Buddhist scenes and flashing lights) erected for religious festivals (esp. Vesak) | 190 |
| pandan | obsequious, grovelling, subservient | 190 |
| papadam | a crisp deep-fried wheat flour bread which accompanies rice and curry or thali meals | 191 |
| payasam | a Tamil dessert made with milk, semolina, cashew nuts and sultanas | 196 |
| peon | office aide, office assistant, messenger | 196 |
| pindrop silence | absolute silence | 200 |
| plantain | banana | 202 |
| planter | the owner or manager of an estate (esp. tea or rubber) | 202 |
| planter’s chair | a large rattan armchair with folding leg-rests | 203 |
| pooja, puja | a Hindu or Buddhist ritual offering | 208 |
| poori | a small round bread like a deep-fried roti | 208 |
| pukka | great, superb | 212 |
| range | an administrative district in the police | 218 |
| reeper | a small piece of wood which supports the tiles on a roof | 220 |
| resthouse, rest house | a government-run hotel/guest house | 221 |
| roti | a type of bread made with wheat flour and water | 224 |
| rupee | the currency of Sri Lanka | 225 |
| sadhu | monk, priest, holy man | 226 |
| sambhur | a large species of deer | 228 |
| saree, sari | a garment worn by women, consisting of a length of material wrapped around the body | 229 |
| sarpina | harmonium | 230 |
| satyagraha | a non-violent protest or demonstration | 231 |
| shalwar kameez | an outfit worn by women (esp. Muslims) consisting of loose trousers […], a long shirt […], and a shawl | 237 |
| sherbet | a sweet drink made with rosewater | 238 |
| sherwani | a formal knee-length tunic worn by men | 238 |
| shroff | cashier | 240 |
| stupa | a dome-shaped Buddhist shrine | 249 |
| swabasha | an indigenous language (Sinhala or Tamil as opposed to English) | 251 |
| tabla | a pair of drums used in Indian and Sri Lankan music | 252 |
| tank | an artificial lake or reservoir | 254 |
| tat | a screen or blind made of bamboo or cane slats, which is rolled down to protect a window, doorway or open verandah from sun, rain, etc. | 255 |
| teapoy | a small three-legged table | 256 |
| thali (1) | a gold necklace given to the bride in a Tamil wedding | 260 |
| thali (2) | a vegetarian rice meal usually served on a metal plate with separate compartments for different curries | 260 |
| thosai | a type of pancake made with slightly fermented batter made of rice flour and ulundu flour | 264 |
| thug | a politician or other influential person who is prepared to threaten, beat up or murder his opponents, or one of his henchmen hired to do the dirty work; in BSE, ‘thug’ is used more loosely to refer to any violent person | 266 |
| tiffin | a light mid-morning or mid-afternoon meal | 267 |
| toddy | a sweet white drink extracted from the flower of the coconut, kitul or palmyrah tree, which is drunk slightly fermented, or distilled to make arrack; also used to make honey and jaggery | 269 |
| vadai | a small, deep-fried savoury snack | 278 |
| veena | a traditional Indian stringed instrument | 280 |
| vihara, viharaya | Buddhist temple | 282 |
| wedding hall | a reception hall usually hired out for weddings and homecomings | 286 |

A3. Pan-South Asian English lexemes via the Google Advanced Search Tool

| Lexeme | GAST-SL abs. freq. | GAST-SL norm. freq. | GAST-IND abs. freq. | GAST-IND norm. freq. | GAST-GB abs. freq. | GAST-GB norm. freq. |
| --- | --- | --- | --- | --- | --- | --- |
| almirah | 2,190 | 1.31 | 373,000 | 6.54 | 16,100 | 0.05 |
| anicut | 9,450 | 5.67 | 12,400 | 0.22 | 2,720 | 0.01 |
| arecanut | 906 | 0.54 | 57,600 | 1.01 | 1,330 | 0.00 |
| arrack | 4,900 | 2.94 | 9,410 | 0.16 | 13,300 | 0.04 |
| atta flour | 88 | 0.05 | 4,290 | 0.08 | 792 | 0.00 |
| ayah | 2,070 | 1.24 | 247,000 | 4.33 | 5,590,000 | 16.09 |
| ayurveda | 407,000 | 244.16 | 4,140,000 | 72.56 | 2,740,000 | 7.89 |
| bandicoot | 4,350 | 2.16 | 210,000 | 3.68 | 1,110,000 | 3.20 |
| banian | 355 | 0.21 | 155,000 | 2.72 | 7,350 | 0.02 |
| batta | 7,000 | 4.20 | 50,300 | 0.88 | 159,000 | 0.46 |
| beedi | 1,780 | 1.07 | 136,000 | 2.38 | 7,810 | 0.02 |
| betel | 28,200 | 16.92 | 197,000 | 3.45 | 258,000 | 0.74 |
| boondi | 1,400 | 0.84 | 15,300 | 0.27 | 4,900 | 0.01 |
| brinjal | 18,300 | 10.98 | 666,000 | 11.67 | 301,000 | 0.87 |
| bulbul | 1,190 | 0.71 | 238,000 | 4.17 | 220,000 | 0.63 |
| bund | 51,200 | 30.71 | 1,550,000 | 27.17 | 5,730,000 | 16.49 |
| cadjan | 1,870 | 1.12 | 223 | 0.00 | 1,070 | 0.00 |
| cadju | 384 | 0.23 | 341 | 0.01 | 124 | 0.00 |
| cess | 24,300 | 14.58 | 834,000 | 14.62 | 1,330,000 | 3.83 |
| chappals | 151 | 0.09 | 120,000 | 2.10 | 3,120 | 0.01 |
| choli | 655 | 0.39 | 1,670,000 | 29.27 | 253,000 | 0.73 |
| chummery | 56 | 0.03 | 373 | 0.01 | 576 | 0.00 |
| chunam | 134 | 0.08 | 16,000 | 0.28 | 661 | 0.00 |
| coir | 21,300 | 12.78 | 1,050,000 | 18.40 | 1,300,000 | 3.74 |
| coolie | 1,010 | 0.61 | 147,000 | 2.58 | 723,000 | 2.08 |
| cumbly | 0 | 0 | 28 | 0.00 | 1,250 | 0.00 |
| Deepavali | 5,950 | 3.57 | 197,000 | 3.45 | 124,000 | 0.36 |
| detenu | 80 | 0.05 | 4,000 | 0.07 | 1,540 | 0.00 |
| dhal | 14,300 | 8.58 | 1,530,000 | 26.81 | 321,000 | 0.92 |
| dharmachakraya | 4 | 0.00 | 0 | 0 | 0 | 0 |
| dhobi | 904 | 0.54 | 806,000 | 14.13 | 119,000 | 0.34 |
| dosai | 355 | 0.21 | 24,700 | 0.43 | 13,400 | 0.04 |
| eversilver | 3 | 0.00 | 2,100 | 0.04 | 28 | 0.00 |
| faluda | 321 | 0.19 | 16,000 | 0.28 | 2,960 | 0.01 |
| felicitate | 14,300 | 8.58 | 35,100 | 0.62 | 7,030 | 0.02 |
| gingelly | 2,830 | 1.70 | 34,300 | 0.60 | 945 | 0.00 |
| godown | 298 | 0.18 | 1,220,000 | 21.38 | 76,100 | 0.22 |
| goonda | 255 | 0.15 | 36,700 | 0.64 | 3,050 | 0.01 |
| gunny | 6,090 | 3.65 | 382,000 | 6.69 | 1,160,000 | 3.34 |
| hackery | 130 | 0.08 | 2,560 | 0.04 | 60,000 | 0.17 |
| hartal | 2,530 | 1.52 | 18,500 | 0.32 | 10,800 | 0.03 |
| hill station | 9,900 | 5.94 | 642,000 | 11.25 | 1,220,000 | 3.51 |
| iddly | 8 | 0.00 | 481 | 0.01 | 2,450 | 0.01 |
| jaggery | 2,080 | 1.25 | 166,000 | 2.91 | 19,700 | 0.06 |
| jak | 38,300 | 22.98 | 1,050,000 | 18.40 | 9,600,000 | 27.63 |
| jambu | 1,990 | 1.19 | 212,000 | 3.72 | 131,000 | 0.38 |
| jungi | 375 | 0.22 | 4,890 | 0.09 | 5,800 | 0.02 |
| kabaddi | 6,730 | 4.04 | 208,000 | 3.65 | 69,400 | 0.20 |
| kachcheri | 70,400 | 42.23 | 6,440 | 0.11 | 150 | 0.00 |
| kesari | 265 | 0.16 | 357,000 | 6.26 | 35,100 | 0.10 |
| krait | 353 | 0.21 | 50,500 | 0.89 | 69,400 | 0.20 |
| kurtha | 30,300 | 18.18 | 55,900 | 0.98 | 29,500 | 0.08 |
| laddu | 413 | 0.25 | 239,000 | 4.19 | 88,700 | 0.26 |
| lakh | 37,400 | 22.44 | 8,510,000 | 149.14 | 348,000 | 1.00 |
| lungi | 540 | 0.32 | 153,000 | 2.68 | 297,000 | 0.85 |
| Mahasivarathri | 732 | 0.44 | 536 | 0.01 | 49 | 0.00 |
| mahout | 1,180 | 0.71 | 13,400 | 0.23 | 76,100 | 0.22 |
| mammoty | 382 | 0.23 | 1,710 | 0.03 | 347 | 0.00 |
| monitor lizard | 177 | 0.11 | 13,700 | 0.24 | 194,000 | 0.56 |
| mudalali | 3,080 | 1.85 | 896 | 0.02 | 87 | 0.00 |
| musth | 117 | 0.07 | 1,430 | 0.03 | 4,790 | 0.01 |
| do the needful | 1,990 | 1.19 | 240,000 | 4.21 | 73,400 | 0.21 |
| pakora | 95 | 0.06 | 45,200 | 0.79 | 488,000 | 1.40 |
| palmyrah | 4,240 | 2.54 | 43,900 | 0.77 | 1,440 | 0.00 |
| pandal | 943 | 0.57 | 124,000 | 2.17 | 3,650 | 0.01 |
| pandan | 867 | 0.52 | 89,500 | 1.57 | 171,000 | 0.49 |
| papadam | 817 | 0.49 | 4,280 | 0.08 | 22,200 | 0.06 |
| payasam | 123 | 0.07 | 55,300 | 0.97 | 4,370 | 0.01 |
| peon | 6,890 | 4.13 | 775,000 | 13.58 | 1,160,000 | 3.34 |
| pindrop silence | 40 | 0.02 | 150 | 0.00 | 91 | 0.00 |
| planter’s chair | 250 | 0.15 | 47 | 0.00 | 234 | 0.00 |
| pooja | 77,800 | 46.67 | 8,760,000 | 153.53 | 1,490,000 | 4.29 |
| poori | 310 | 0.19 | 85,700 | 1.50 | 95,200 | 0.27 |
| pukka | 2,070 | 1.24 | 70,100 | 1.23 | 3,490,000 | 10.05 |
| reeper | 119 | 0.07 | 4,750 | 0.08 | 37,200 | 0.11 |
| resthouse | 2,350 | 1.41 | 6,900 | 0.12 | 21,000 | 0.06 |
| roti | 10,100 | 6.06 | 1,310,000 | 22.96 | 2,180,000 | 6.28 |
| rupee | 350,000 | 209.96 | 3,250,000 | 56.96 | 17,900,000 | 51.53 |
| sadhu | 3,050 | 1.83 | 1,110,000 | 19.45 | 239,000 | 0.69 |
| sambhur | 685 | 0.41 | 254 | 0.00 | 672 | 0.00 |
| saree | 29,900 | 17.94 | 8,250,000 | 144.59 | 2,460,000 | 7.08 |
| sarpina | 6 | 0.00 | 34 | 0.00 | 8 | 0.00 |
| satyagraha | 3,340 | 2.00 | 98,700 | 1.73 | 90,900 | 0.26 |
| shalwar kameez | 989 | 0.59 | 25,100 | 0.44 | 101,000 | 0.29 |
| sherwani | 1,440 | 0.86 | 645,000 | 11.30 | 371,000 | 1.07 |
| shroff | 17,900 | 10.74 | 1,120,000 | 19.63 | 344,000 | 0.99 |
| stupa | 12,200 | 7.32 | 291,000 | 5.10 | 497,000 | 1.43 |
| swabasha | 576 | 0.35 | 7 | 0.00 | 41 | 0.00 |
| tabla | 6,530 | 3.92 | 964,000 | 16.89 | 3,660,000 | 10.54 |
| teapoy | 53 | 0.03 | 9,540 | 0.17 | 15,300 | 0.04 |
| thali | 2,640 | 1.58 | 631,000 | 11.06 | 1,280,000 | 3.68 |
| thosai | 756 | 0.45 | 4,360 | 0.08 | 785 | 0.00 |
| tiffin | 2,500 | 1.50 | 683,000 | 11.97 | 1,100,000 | 3.17 |
| toddy | 8,640 | 5.18 | 153,000 | 2.68 | 699,000 | 2.01 |
| vadai | 668 | 0.40 | 117,000 | 2.05 | 10,300 | 0.03 |
| veena | 7,820 | 4.69 | 2,000,000 | 35.05 | 754,000 | 2.17 |
| vihara | 131,000 | 78.59 | 29,800 | 0.52 | 46,800 | 0.13 |
| wedding hall | 1,590 | 0.95 | 31,000 | 0.54 | 58,100 | 0.17 |
| MEAN | 15,659.16 | 9.39 | 601,252.35 | 10.54 | 742,165.51 | 2.14 |

A4. List of archaism markers

| Dictionary entry in Meyler (2007) | Meaning as provided in Meyler (2007) | Page number in Meyler (2007) |
| --- | --- | --- |
| blackguard | scold | 30 |
| boarded (passive form) | boarding, staying in a boarding house (the passive form is dated in BSE) | 30 |
| brassiere | bra | 36 |
| bugger | chap, bloke, guy | 40 |
| cad | a womaniser | 46 |
| cess | a type of tax levied for a particular purpose | 50 |
| confab | conference, meeting; in BSE, ‘confab’ is a dated word for a private chat | 62–63 |
| coolie | unskilled labourer, esp. toilet cleaner or estate labourer | 63 |
| damsel | young woman | 70 |
| dicky | boot (of a car); in BSE, ‘dicky’ is an archaic word meaning an extra folding seat at the back of a two-seater car | 76 |
| expire | die, pass away | 86 |
| fellow | man, boy, chap, bloke, guy; can also refer to animals, e.g. a pet | 88 |
| hail from | come from | 108 |
| houseboy | male servant | 117 |
| instant, inst. | this month | 120 |
| jerkin | jacket, anorak; in BSE, ‘jerkin’ is a dated word for a short sleeveless jacket or waistcoat | 125 |
| lass | girl, young woman | 149 |
| level best: to try your level best | to try/do your (very) best | 151 |
| madam | (term of respect); also used (sometimes ironically) to refer to President Chandrika Kumaratunga | 158 |
| mater | mother | 195 |
| muffler | scarf | 172 |
| murder the King/Queen | speak incorrect English | 173 |
| needful: to do the needful | to do what’s required, to take the necessary action | 177 |
| one | (with names) a certain (e.g. He is one Lionel Gunasekera.) | 184 |
| opening dose (OD) | laxative | 185 |
| parley | conference, meeting | 194 |
| pater | father | 195 |
| pow-wow | conference, meeting | 210 |
| purge | have diarrhoea; in BSE, ‘purging’ normally refers to cleansing the system as part of a ‘detox’ treatment. It is dated in the sense of ‘emptying the bowels’ | 213 |
| rogue | thief, crook, devil; in BSE, ‘rogue’ is rather dated; it may still be used jokingly in the sense of a crook or devil (‘a likable rogue’), but never in the sense of a thief or robber | 223 |
| same | (e.g. Please acknowledge receipt of same.) | 228 |
| sans | without | 229 |
| sherbet | a sweet drink made with rosewater; in BSE, ‘sherbet’ is a type of powder used in sweets or (dated) to make a sweet fizzy drink | 238 |
| sir | male teacher | 243 |
| sweetmeats | traditional sweets prepared for festive occasions | 251 |
| tavern | a bar selling toddy | 255 |
| tinker | a skilled worker who repairs the bodywork of vehicles, etc.; in BSE, ‘tinker’ is an archaic word for an itinerant craftsman who repaired pots, kettles, etc. | 268 |
| toper | drinker, drunkard | 271 |
| topping | excellent | 271 |
| undersigned | (e.g. Please feel free to contact the undersigned.) | 275 |
| worry | pester, hassle, harass | 291 |
| wristlet | watch, wristwatch; in BSE, ‘wristlet’ is a dated word for a bracelet | 292 |
| yeoman service | sterling service, faithful and devoted service | 293 |

Index A above mentioned 92–93, 233 academic writing see writing accelerate up 153, 169 acrolect 5–6, 10–13, 16–17, 21, 55–56, 59, 61, 66, 73–74, 78, 138, 158, 161, 165, 167, 187, 193, 203, 208, 213, 222–223 see also dialect continuum admit 89–92, 233 America 9, 20, 23, 25, 60, 68, 119, 168, 173, 210–212, 223 American English 25, 68, 168, 173, 210–212, 223 apply 89, 91, 233 association plot 98–99, 177–178 attitudes 20, 23, 26, 27, 38, 41–42, 48, 52–53, 154, 212, 217, 221 see also linguistic schizophrenia see also prescriptivism auntie 121, 190 aunty see auntie autonomy 32, 35, 214–216, 219–221 see also semiautonomous variety B Bandaranaike, S.W.R.D. 30, 33–35 Bangladesh 1, 66 Bangladeshi English 2 basilect 10–11, 13, 21, 55, 73, 223 see also dialect continuum boast off 164–165 Borahs 3 British Raj 1 Buddhism 2, 6, 7, 34, 108, 110 Burghers 3–4, 6–7, 9–10, 35, 40 business letters see letters

C carry out 140, 143–144 chalk out 157–160, 162 chi-square test 85 Christians 6, 27 codification 2, 48–49, 52–53, 55, 62, 66, 210, 222 Colebrooke Report 26–27 see also Colebrooke-Cameron Commission Colebrooke-Cameron Commission 26–28 see also Colebrooke Report colligation 102, 105, 206, 217–219 collocation 21, 90, 102–103, 105, 129–131, 135, 143, 173, 206, 217–222 Colombo 2, 6, 11–12, 14, 29, 48, 224 colony 1, 8–9, 17, 23–27, 31, 33, 40–41, 43, 45, 84, 211 colonisation 4, 6, 25 coloniser 8, 30, 38, 40 see also Dutch see also Portuguese see also settlers commence 86–89, 93, 233 common core 47, 60, 207, 209, 211–217, 219–222 conservative forces see forces constitution 13–15, 31, 35–37, 39–40 consult 84, 89, 233 cope up with 46, 78, 138, 152–154, 169, 202, 207, 222–223 see also cope with cope with 46, 51, 138, 153–154, 169, 207 see also cope up with Coperahewa, S. 2, 7–8, 11, 15–16, 25–27, 29–33, 35–37, 40, 42, 52, 67 Corpus of Global Web-based English 77–78

correspondence 28, 59, 63, 95–99, 112–113, 125–128, 146, 148–151, 179, 181 see also letters Cramer’s V 85 creative writing see writing D de facto 14, 24, 38 de jure 13–14 de Silva, K.M. 7, 24–35, 40, 57 detail out 156, 158–160 detrain 93–94, 103–105, 206, 233 dialect continuum 10, 12–13, 16, 55, 73, 213, 223–224 see also acrolect see also basilect see also mesolect dispose off 164, 166–167 domain 2, 17, 56, 69–71, 73–74, 78, 92, 104, 110, 116, 122, 130, 132, 153–155, 161, 163, 190, 192 DP 144–145, 149, 151, 207 Dravidian 2 Dutch 4, 24, 40, 45 see also colony E earlier 86–87, 89–90, 233 East India Company 1, 24 elections 32–34 elite 6, 16, 26, 28–29, 36, 51–52 endonormative 20, 23, 39–40, 48–49, 52–53, 119, 133, 210, 214–215, 217–222 see also evolution see also nativisation English as a Life Skill 49–51, 53, 210, 213 English as a second language 10–11, 23, 57–58, 60, 62–63, 68, 73, 79, 155, 157 enter 86–90, 233

epicentre 5, 57, 209–211, 213, 215–216, 222–223 ESL see English as a second language ethnicity 2–4, 6–7, 15–17, 28, 31, 33–35, 37, 40–42, 53, 210 evolution 17, 19–23, 39, 48, 52–53, 57, 78, 133, 147, 210–211, 218, 221–222 see also endonormative see also exonormative see also nativisation exonormative 8–9, 20, 28, 48, 211–212, 215, 217–221 see also evolution see also nativisation F family member 89, 93–94, 103–105, 206, 233 fare 89, 91, 233 fellow 122, 129–130, 132–133, 206, 243 Fisher’s (1922) exact test 85 forces conservative forces 22, 47 progressive forces 18, 22–23, 42, 133, 205 G genre sensitivity 145, 149, 151, 169, 207, 217 glow up 155 GloWbE see Corpus of Global Web-based English Google 17, 56, 69–70, 75–77, 92, 109, 239–242 Goonetilleke, D. 29, 33–37, 40, 43 gram 106–107, 110–114, 236 Great Complement Shift 195–196, 202, 209 Greenbaum, S. 58–60, 62, 65, 79 Gunesekera, M. 5, 7, 11, 16, 24–25, 27–29, 33, 35–38, 42, 44–49, 57, 212, 216, 222, 224 H hail from 89, 93, 122–123, 130–133, 206, 233, 243 hate 18, 137, 194, 196–198, 200–203, 208–209, 220–221

have a(n)/the/Ø glimpse 185–186, 193 Herat, M. 47–48, 67–68, 224 Hindus 6–8, 129 Hornby, A.S. 153, 156–157, 161–162, 164, 216 horror aequi 195 Hundt, M. 47, 68, 71, 73–74, 76, 79, 137–138, 217, 224 I IDPs see internally displaced persons independence 9, 17, 19, 23, 30–31, 34, 38–41, 43, 53, 211, 216 Indo-Aryan 2 institutionalisation 1, 9–10, 23, 28, 39, 43–46, 49, 157–158, 160, 189, 203, 206, 210, 222 instructional writing see writing internally displaced persons 102–103, 135, 219, 221

J Jaffna 25 jungle 107–109, 236

K Kandy 2, 24–25, 27 Künstler, V. 14, 23, 38, 42–43, 48, 57, 212

L L1 influence 7, 214–216, 221 lakh 107, 114–116, 206, 222, 237, 241 lease out 156, 158–162, 169–170, 202, 208, 223 leave 86, 88–90, 234 letters business letters 59, 64–65, 88, 96, 98–100, 150–151, 181 social letters 58–59, 63, 65, 87–88, 96, 100, 150–151, 181 see also correspondence Liberation Tigers of Tamil Eelam 37, 40 like 18, 137, 194, 196–203, 208–209, 220–221 lingua franca 15, 209, 212–213, 215 linguistic schizophrenia 212 see also attitudes link language 13–16, 37, 52, 210 love 18, 137, 194–197, 200–203, 208–209, 220–221 LTTE see Liberation Tigers of Tamil Eelam lurch up 152–153

M madam 120–122, 125–129, 243 make out 138 Malay 1, 3, 40 Maldives, the 1–2, 66 marriage advertisement 61–62 Memons 3 Mendis, D. 4, 8–9, 14–15, 29, 43, 47, 138, 146–147, 168 mesolect 6, 10–13, 21, 55, 73, 223 see also dialect continuum missionary school 1, 15, 25, 27 moneys 88–89, 92–93, 234 monies see moneys Moors 3, 40 Mukherjee, J. 7, 10, 14–15, 18, 20–23, 31, 38–39, 46–48, 53, 59, 62, 64–66, 72, 80–81, 83, 133, 137–138, 152–154, 169, 188, 194, 202, 205, 208, 214–215, 217–218, 222, 224 Muslims 3–4, 6, 27, 35

N national language 30–32, 35–37 nativisation 5, 18, 20–23, 28, 38–39, 57, 73, 78, 81, 135, 137, 148, 170–171, 173, 175, 185, 187–188, 202, 205, 207–208, 211, 215–224 nativisation indicator 215, 217–221 see also structural facet structural nativisation 5, 18, 20–22, 28, 57, 73, 81, 137, 148, 171, 187, 202, 205, 215–224 see also endonormative see also evolution see also exonormative

Nepal 1, 66 net out 157–159 New Englishes 10, 20, 55, 67, 81, 133, 137, 170, 224 newspaper 5, 8, 17, 42, 44, 50–51, 56–57, 61–62, 64–69, 77–78, 84–85, 88, 90–92, 94–95, 101–102, 105–111, 114–118, 121, 124–125, 130–133, 138–140, 142, 144, 153, 155–156, 158, 160, 162–167, 173, 175–178, 182–185, 187, 189–193, 198–199, 201, 206–208 non-academic writing see writing non-professional writing see writing O official language 13–15, 24–26, 29–38 P Pakistan 1, 66 Pakistani English 2 pass out 138 Passé, H.A. 5–6, 44–46, 222 path lexicalisation 169, 208 path of structural nativisation 215–217, 219–221, 224 cognitive path of structural nativisation 217, 220 lexical path of structural nativisation 216, 219 phraseological path of structural nativisation 217, 219–221 textual path of structural nativisation 219–220 PCEs see postcolonial Englishes persons 86, 89, 93, 101–103, 105, 135, 206, 219, 221–222, 234 persuasive writing see writing planter 107–108, 238 pluricentric 44 Portuguese 4, 7, 24, 40, 45 see also colony POS-tags 77, 139, 175 postcolonial English squared 211

postcolonial Englishes 17–22, 39, 43, 52–53, 205, 209, 211–217, 221–222, 224 prescriptivism 11, 46–48, 147–148, 214 see also attitudes proficiency 2, 6, 10–12, 16, 27–29, 38, 48, 66, 73–74 progressive forces see forces provide 212 put a(n)/the/Ø chat 173, 175, 185, 189–190, 193, 208 put a(n)/the/Ø fight 173, 175, 185, 190, 193, 208 put a(n)/the/Ø nap 173, 175, 185, 191, 193, 208 put a(n)/the/Ø rest 173, 175, 185, 192–193, 208 R refrigerator 89, 93–94, 103–105, 206, 219, 234 regarding 86–89, 93 remove 91, 234 reportage 59, 61, 96–97, 99, 112–113, 146, 179–180 representativeness 61, 66, 69, 71, 78 rile up 155 rupee 106–109, 111–114, 116–117, 206, 238, 241 S saree 106–108, 111–114, 238, 242 sari see saree SAVE Corpus see South Asian Varieties of English (SAVE) Corpus Schilk, M. 1, 4, 17, 47, 67–69, 80, 84, 153, 194, 202, 208, 217, 222 Schneider, E.W. 5, 16–24, 28, 38–42, 44, 48, 52–53, 80, 137–138, 140, 142, 167–168, 194, 205, 207, 210, 214, 218, 221–222 semantic profile 161 semantic scope 161, 208 semantic transparency 139, 156 semantico-structural analogy 152, 154, 158, 169, 208 semiautonomous variety 18, 205, 214–221, 224

forces on semiautonomous variety 215–216, 219 see also autonomy settlers 4, 20, 23–24, 28, 38, 40–41, 53, 212 see also colony simplex verb 88, 169–172, 174, 182, 187–188 Sinhala 2, 4–5, 7–11, 13–17, 29–37, 42, 45–46, 48, 52, 83, 209, 212 Sinhala Only Bill 34 Sinhala Only Policy 9, 30, 34–35 Sinhalese 2, 6–7, 14–15, 17, 27–30, 32–36, 40–41 sir 120–121, 125–129, 206, 244 SLFP see Sri Lanka Freedom Party social letters see letters South Asia 1, 5, 43, 46, 57, 61, 105–106, 111, 113, 131, 137, 141, 172, 196, 206–207, 211, 223 South Asian English(es) 1, 4, 18, 21, 23, 43, 47, 66, 78–80, 83–84, 94, 105–107, 109, 111, 119, 124, 128, 132–134, 139, 141–142, 156, 167–168, 170, 173, 182, 184, 196, 205–208, 210, 221–224 South Asian Varieties of English (SAVE) Corpus 56, 66 Speak English Our Way 51, 213 split out 157–159, 161–162 Sri Lanka Freedom Party 32–34 structural facet 218–219, 221 see also nativisation indicator structural profile, distinctive 18, 137, 205, 209, 214–215, 217–220, 224 stupa 108, 238, 242 switch out 157–160, 162 T take a(n)/the/Ø benefit from 173–175, 185–187, 193, 208 take a(n)/the/Ø call 173–175, 185, 187–188 take a(n)/the/Ø lease 173–175, 185–187, 193, 208

Tamil 2–11, 13–17, 25, 27–37, 40–42, 48, 52, 83, 209, 212, 216 tank 107–108, 117–119, 216, 238 thereafter 86–90, 234 U United National Party 32–34 UNP see United National Party W waive off 162–164, 169, 202 war, civil 32, 37, 41–42, 90, 103 WebCorp 77 writing academic writing 47, 58–59, 63, 95–97, 99–100, 111–113, 144–149, 151, 164, 167–168, 179–181, 207 creative writing 30, 43, 59, 95–99, 112–114, 125–126, 128–129, 132–133, 144–149, 151, 168, 179–181, 207 instructional writing 59, 97, 99, 110–111, 113, 146, 148–149, 179–180 non-academic writing 59, 63, 95–97, 99–100, 112–113, 146, 179–180 non-professional writing 59, 63, 95, 97, 99, 112–113, 146, 148–149, 179–181 persuasive writing 59, 96–97, 99, 112–113, 146, 179–180

X Xiao, R. 95–98, 138, 147, 168 Y Yogasundram, N. 15, 25–28, 31–32, 34–37 Z Zipp, L. 46, 80, 138, 140, 142, 153–154, 167–168, 180, 207