The Oxford Handbook of Endangered Languages 9780190610036, 9780190610043, 9780190877040, 9780190610029, 0190610034

The endangered languages crisis is widely acknowledged among scholars who deal with languages and indigenous peoples as


English Pages 776 [977] Year 2018

Table of contents :
Cover
The Oxford Handbook of Endangered Languages
Copyright
Contents
Foreword
Biographical note
Introduction: Endangered languages
Part I Endangered Languages
1. The status of the world’s endangered languages
2. Assessing degrees of language endangerment
3. Language contact and language endangerment
4. Indigenous language rights—Miner’s canary or mariner’s tern?
Part II Language Documentation
5. The goals of language documentation
6. Documentation, linguistic typology, and formal grammar
7. The design and implementation of documentation projects for spoken languages
8. Endangered sign languages: An introduction
9. Design and implementation of collaborative language documentation projects
10. Tools and technology for language documentation and revitalization
11. Corpus compilation and exploitation in language documentation projects
12. Writing grammars of endangered languages
13. Compiling dictionaries of endangered languages
14. Orthography design and implementation for endangered languages
15. Language archiving
16. Tools from the ethnography of communication for language documentation
17. Language documentation in diaspora communities
18. Ethics in language documentation and revitalization
Part III Language Revitalization
19. Approaches to and strategies for language revitalization
20. Comparative analysis in language revitalization practices: Addressing the challenge
21. The linguistics of language revitalization: Problems of acquisition and attrition
22. New media for endangered languages
23. Language recovery paradigms
24. Myaamiaataweenki: Revitalization of a sleeping language
25. Language revitalization in kindergarten: A case study of Truku Seediq language immersion
26. Māori: Revitalization of an endangered language
27. Language revitalization in Africa
28. Planning minority language maintenance: Challenges and limitations
Part IV Endangered Languages and Biocultural Diversity
29. Congruence between species and language diversity
30. Sustaining biocultural diversity
31. Traditional and local knowledge systems as language legacies critical for conservation
32. Climate change and its consequences for cultural and language endangerment
33. Interdisciplinary language documentation
34. Why lexical loss and culture death endanger science
Part V Looking to the Future
35. Funding the documentation and revitalization of endangered languages
36. Teaching linguists to document endangered languages
37. Training language activists to support endangered languages
38. Designing mobile applications for endangered languages
39. Indigenous language use impacts wellness
Afterword
Index



Author / Uploaded
Kenneth L. Rehg
Lyle Campbell


The Oxford Handbook of Endangered Languages

Edited by Kenneth L. Rehg and Lyle Campbell

Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and certain other countries.

Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America.

© Oxford University Press 2018

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by license, or under terms agreed with the appropriate reproduction rights organization. Inquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above.

You must not circulate this work in any other form and you must impose this same condition on any acquirer.

Library of Congress Cataloging-in-Publication Data
Names: Rehg, Kenneth L., 1939– editor. | Campbell, Lyle, editor.
Title: The Oxford handbook of endangered languages / edited by Kenneth L. Rehg and Lyle Campbell.
Description: New York, NY : Oxford University Press, [2018] | Includes bibliographical references and index.
Identifiers: LCCN 2017041411 (print) | LCCN 2017043115 (ebook) | ISBN 9780190610036 (updf) | ISBN 9780190610043 (online course) | ISBN 9780190877040 (epub) | ISBN 9780190610029 (cloth : acid-free paper)
Subjects: LCSH: Endangered languages—Handbooks, manuals, etc.
Classification: LCC P40.5.E53 (ebook) | LCC P40.5.E53 O94 2018 (print) | DDC 408.9—dc23
LC record available at https://lccn.loc.gov/2017041411

1 3 5 7 9 8 6 4 2

Printed by Sheridan Books, Inc., United States of America

Contents

Foreword
   Michael Krauss
Biographical Note
Introduction: Endangered languages
   Lyle Campbell and Kenneth L. Rehg

Part I Endangered Languages
1. The status of the world’s endangered languages
   Anna Belew and Sean Simpson
2. Assessing degrees of language endangerment
   Nala H. Lee and John R. Van Way
3. Language contact and language endangerment
   Sarah G. Thomason
4. Indigenous language rights—Miner’s canary or mariner’s tern?
   Teresa L. McCarty

Part II Language Documentation
5. The goals of language documentation
   Richard A. Rhodes and Lyle Campbell
6. Documentation, linguistic typology, and formal grammar
   Keren Rice
7. The design and implementation of documentation projects for spoken languages
   Shobhana Chelliah
8. Endangered sign languages: An introduction
   James Woodward
9. Design and implementation of collaborative language documentation projects
   Racquel-María Sapién
10. Tools and technology for language documentation and revitalization
   Keren Rice and Nick Thieberger
11. Corpus compilation and exploitation in language documentation projects
   Ulrike Mosel
12. Writing grammars of endangered languages
   Amber B. Camp, Lyle Campbell, Victoria Chen, Nala H. Lee, Matthew Lou-Magnuson, and Samantha Rarrick
13. Compiling dictionaries of endangered languages
   Kenneth L. Rehg
14. Orthography design and implementation for endangered languages
   Michael Cahill
15. Language archiving
   Andrea L. Berez-Kroeker and Ryan E. Henke
16. Tools from the ethnography of communication for language documentation
   Simeon Floyd
17. Language documentation in diaspora communities
   Daniel Kaufman and Ross Perlin
18. Ethics in language documentation and revitalization
   Jeff Good

Part III Language Revitalization
19. Approaches to and strategies for language revitalization
   Leanne Hinton
20. Comparative analysis in language revitalization practices: Addressing the challenge
   Gabriela Pérez Báez, Rachel Vogel, and Eve Koller
21. The linguistics of language revitalization: Problems of acquisition and attrition
   William O’Grady
22. New media for endangered languages
   Laura Buszard-Welcher
23. Language recovery paradigms
   Alan R. King
24. Myaamiaataweenki: Revitalization of a sleeping language
   Daryl Baldwin and David J. Costa
25. Language revitalization in kindergarten: A case study of Truku Seediq language immersion
   Apay Ai-yu Tang
26. Māori: Revitalization of an endangered language
   Jeanette King
27. Language revitalization in Africa
   Bonny Sands
28. Planning minority language maintenance: Challenges and limitations
   Sue Wright

Part IV Endangered Languages and Biocultural Diversity
29. Congruence between species and language diversity
   David Harmon and Jonathan Loh
30. Sustaining biocultural diversity
   Luisa Maffi
31. Traditional and local knowledge systems as language legacies critical for conservation
   Will C. McClatchey
32. Climate change and its consequences for cultural and language endangerment
   Christopher P. Dunn
33. Interdisciplinary language documentation
   Gary Holton
34. Why lexical loss and culture death endanger science
   Ian Mackenzie and Wade Davis

Part V Looking to the Future
35. Funding the documentation and revitalization of endangered languages
   Susan Penfield
36. Teaching linguists to document endangered languages
   Carol Genetti
37. Training language activists to support endangered languages
   Nora C. England
38. Designing mobile applications for endangered languages
   Steven Bird
39. Indigenous language use impacts wellness
   Alice Taff, Melvatha Chee, Jaeci Hall, Millie Yéi Dulitseen Hall, Kawenniyóhstha Nicole Martin, and Annie Johnston

Afterword
   David Crystal
Index

Foreword Michael Krauss

I am deeply reassured to see the progress the Handbook of Endangered Languages represents, the progress made over the last quarter century, in our concern for the survival of our intellectual heritage, our diverse languages. Twenty-five years ago it seemed we had completely forgotten the lesson of Babel, and even linguists seemed altogether oblivious to the threat of unprecedented mass language extinction. American linguistics in the tradition of Boas, Sapir, and Bloomfield had at least considered documentation important and urgent; as anyone could see, American languages were vanishing. Helping them not vanish was probably impossible in those days, but making written record of them was still an ancient tradition central to linguistics. My own training was in the 1950s, the very end of that period. Then came Chomsky, say 1957 (Chomsky 1957) or 1962 (Chomsky 1964), whose intellect and personality, both, virtually redefined linguistics. That an extraterrestrial would find Earthese interesting though essentially uniform, with but trivial variation, was the new perspective. English, maybe with some other language for double-checking, would do for a sample. It might be said that up to 1957 theory in linguistics was languishing, but Chomskyian insight brought such impetus to theory in linguistics as to relegate documentation to the dustbin. I spent 1969–1970 at MIT, “to learn how to document languages better.” That puzzled Chomsky, understandably, though it did not puzzle Ken Hale, who was not only tolerated but well appreciated at that very citadel. In fact, at MIT in those days, documentation was called “fact-grubbing,” and the one course on the linguistics of the forerunners, such as Sapir and Bloomfield, was called the “bad guys” course. That terminology was half-satirical though, and I felt perfectly welcome there. 
The sad point is that the effect more generally elsewhere became such that mere “fact-grubbing” had to fall seriously beneath the dignity of True Linguists. The Chomsky effect pulled the pendulum so mightily in the direction of theory as opposed to documentation that it got stuck there for thirty years. It was Ken Hale who unstuck the pendulum by organizing the panel on Language Endangerment at the Linguistic Society of America (LSA) 1991 annual meeting that may be considered the turning point, and who invited me to speak on the scope of it. The only person in the whole audience who said anything to me right after my tirade there was in fact Morris Halle: “I hope you weren’t blaming us [MIT] for the overkill.” His eloquent choice of the term “overkill,” in itself made me answer something like “Not really.” An even better reason for not blaming MIT is the fact that MIT harbored Ken Hale, which made the difference in 1992. After all, there is a sociology to linguistics too.

There were complaints about the published version of that paper (Krauss 1992), about my comparing the dire endangerment rate of languages to biological endangerment rates (far higher than the rates for birds and mammals, appealing creatures). It was complained that the comparison was a “cheap shot.” Fair enough, if we don’t want to compare the value of languages with that of biological species, but the comparison of linguistics with biology has multiple important facets, and one of these, even a trivial one in fact, is the sociology of the field and the chronology thereof. It was also in the 1950s that biologists discovered DNA. The interest and importance of that galvanized the field of microbiology so dramatically that there were fears that what we might call macrobiology, the study of the biosphere, “butterfly-collecting,” might be eclipsed, and “ecology” was almost becoming a bad word. Any such imbalance was short-lived in biology. After all, our lives depend on the biosphere, “why save [even] the snail-darter” was hardly a question, and Rachel Carson’s 1962 Silent Spring was merely a clincher. Biology is a big profession, and we all need to breathe. Linguistics is tiny, and has the Chomsky effect. Small wonder, perhaps, that linguistics has taken so long even to start regaining some balance. Bad timing for imbalance! The comparison extends all too well to the issue of climate change and to how well we act at the brink of disaster.

The rate of endangerment may be a controversial issue, but now at least it is an issue. In 1992 it was of course mainly guesswork. We had only our own experience along with what was by far our best broad source of information or worldwide perspective, namely SIL’s Ethnologue.
I unhesitatingly brandished that source at the LSA. Hardly irrelevant, but in fact symptomatic, was that only missionary rather than strictly academic linguists had cared enough to undertake a basic inventory of the world’s languages. I had myself been on a National Science Foundation panel that granted support during the 1980s to advance the Ethnologue project. As for personal experience about the endangerment rate itself, I remember talking about that especially with Ken Hale, Steve Wurm, and Robert Austerlitz. Their “impressions” basically reinforced my 90% rate. Granted, we were strong in America and Australia, also Asia, weaker on Africa, tending to skew my guess somewhat upward, or failing to lower it.

It seemed to me far easier and safer to guess what proportion of the world’s languages were “Safe” rather than endangered, i.e., would still be neither extinct nor moribund for the foreseeable future. I defined that future explicitly as the year 2100, arbitrarily the end of the coming century; such languages being, I now add, still spoken then by a viable or sustainable proportion of children. The main criteria were sheer number of speakers (over a million), and some kind of governmental support. Those two groups of course greatly overlap and could total only in the hundreds of languages, even the low hundreds. My best guess for “Safe” in that sense was at most 10%, so the rest were “unsafe” or endangered, at least 90%. At the same time, I must confess I feel it better by far to err by being too careful than too careless, and would rejoice to find the 90% too high. Given the position of linguistics, at least in the United States at the time, it was hard for me to feel that my estimate could be too alarmist.

It is indeed gratifying to see the response in the progress of linguistics since then to the catastrophe looming, including serious research on the rate of language endangerment, especially now by the Catalogue of Endangered Languages Project. That project currently defines the rate at 45.9% (this volume), and that figure is uncannily close to recent SIL figures. I’m afraid I remain skeptical about a figure that optimistic. If it is at all true that the world’s median-sized language has 7,000 speakers (a figure available still only from Ethnologue), that implies that languages averaging well below 7,000, at the 46th percentile on the population curve, are “Safe.” The 46th percentile on Ethnologue’s curve is around 5,000. It seems quite impossible or counterintuitive to me that an ordinarily situated language of 5,000 in today’s world could in fact be safe. This would seem to be the case even in the still diverse parts of best-off/less unstable Africa, e.g., Cameroon, where perhaps no local languages widely dominate, with only ex-colonial English versus French doing that job, perhaps allowing 250 languages to last longer than they would elsewhere; but for how long? How long can that “stability,” such as it is, last in the ever-more-rapidly changing world? And if Africa is relatively stable or safe for languages, what then of New Guinea, Brazil, or China, for example? How many nation-states positively value all their indigenous languages, take active measures to support them all, or even allow them all to be valued? At last facing the imminent loss of human language diversity, whatever the precise proportion of its massiveness may be, linguistics could hardly have picked a worse time to be caught so long off balance.
Thinking again in terms of percentages, I would guess that from maybe 2% of linguists in academe concerned with language endangerment and prioritizing to do something appropriate about it in 1992, we now might have 25%, optimistically. That is significant progress toward a balance, whatever that is, considering the urgency. Theorizing can wait; documentation and activism can’t, so it could be argued that the appropriate balance for that in these times should be not 50% of linguists but maybe 98%, until the situation is under control and we’re ready for another Chomsky. The ironies in all this are enormous. Microbiology, e.g., DNA, and macrobiology, e.g., ecology, biosphere, coexist nicely in biology as a science, even though macrobiology is “tainted” with issues of our survival as living creatures, and so is linked inevitably to environmentalism, to our benefit. Chomsky is thought by many to have made linguistics a True Science, while the other half of him is more than “tainted” with concern for the human condition. How ironic then that linguistic science is so separated from concern for the human condition instead of being inextricably linked by language, the very essence of our humanity. As humans, have we not evolved beyond mere “survival of the fittest” for language? Things are still backward in academe for languages, unilaterally. Too often, languages must serve linguistics, and not the reverse. The National Science Foundation (NSF), in addition to its support for linguistics, now has a program specifically for documenting endangered languages, partly as a result of the movement in linguistics starting in 1992.

Yet NSF’s guidelines require a grant proposal to show how that documentation will be of value to Linguistic Science, presumably because NSF’s charter is for science alone; taxpayers are paying the NSF for Science, not for good deeds, or even just for language facts that may be going away. Sadly, care is needed for fact-based prioritization as to what is truly an endangered language, but we should not have to show that documentation proposed for a truly endangered language will contribute to linguistic theory before it is even done, as though good documentation of a truly endangered language were not of sufficient value in itself. Even unskilled or random documentation could be of what I call scientific as well as humanistic value, and of course the more skilled the better, as documentation, again, is a science as well as an art.

The field of “conservation” of endangered languages has burgeoned dramatically, as this handbook shows. This movement needs of course to keep doing so, but with unity as well as vigor. The relation between the two obvious branches of preservation, i.e., documentation (including secure archiving) and support (maintenance, revitalization), is bound to be a big subject of concern for us. Any tension there should be kept healthy and productive. My own experience in Alaska is typical enough, that there is a natural tension between the need to document languages as well as possible (particularly for any which will become dormant before that happens), and the needs or wishes of the community for language support. (The latter too often for revitalization rather than maintenance, unfortunately, as community awareness is all too liable to come only after its language is moribund.)
The priority for documentation is inescapable if we rigorously consider posterity over the immediate desires of the community, not just for science but even for the community itself, if they want more of their moribund language to be left for future generations of their children to learn.

One quibble here with terminology tendencies. Let’s not get carried away with euphemisms. “Moribund” is worse than “endangered” or “severely endangered.” Certainly a living species that had lost all reproductive capacity would be definitively doomed to extinction. Continuing the biological comparison, “extinct” (of biological species) is much worse than “dormant” or “sleeping” (of languages) unless some day we can resurrect an extinct species from its DNA. A language with no living speakers is certainly extinct, not dormant, unless we have documentation. Insofar as we have documentation and a community, the language can indeed be “reawakened,” insofar as Hebrew or Cornish or Miami can be considered living examples. Even ignoring those examples, archived documentation in principle not only provides for possible future resurrection but also provides knowledge for linguistic science, not to mention for other fields, history for one.

This handbook shows how the movement has begun to burgeon, not only for both the documentation and revitalization but more broadly to include the close relation it has with our concern for our biosphere and its essential diversity, which most people know we need to preserve. So should it be with the heritage of our linguistic diversity, essential to our humanity.

We’ve made real headway in academe, discussed already in simplistic terms of percentages. But, to repeat a 1992 question still unanswered, where can one get a PhD with something like a good dictionary of an endangered language for a dissertation? There is a dramatically increasing need for help, and hopefully a demand, in endangered-language communities, but that also requires people, linguists, and/or community members with training in appropriate language pedagogy as well as analysis. Where in academe is such training to be had?

Beyond academe and the language communities themselves, there are further domains we need to consider, some addressed in this handbook, others that may not be addressed here. One domain is the general public. We have written eloquently about the threat to our linguistic diversity, there is the lesson of Babel, and there have been pulses of coverage in the press, but much more is needed for public awareness even to approach that for biodiversity. To what extent has it been broached, for example, in our educational system?

This brings up the administrative domain, government. I have already asked the question of what governments positively value their linguistic diversity. We should know and act, not only at the nation-state level, but perhaps the more local the better. Just to take the case I know best, the United States in general and Alaska in particular, is a sad one. Highly negative until late last century, federal policy officially recognized indigenous languages as a national asset at last in 1992, with over 90% of its indigenous languages extinct or moribund. In Alaska, also as devastated as that, we got the legislature to allow indigenous languages in its schools in 1972, with a university center for documentation and support. The public voted for English Only in a 1991 referendum, a landslide, but in 2016 the legislature, voting almost unanimously, recognized all twenty Alaska Native Languages as official state languages, including Eyak with no living native speakers. The effects of these vicissitudes, of course, remain to be seen.

References

Carson, Rachel. 1962. Silent Spring. Boston: Houghton Mifflin.
Chomsky, Noam. 1957. Syntactic Structures. The Hague: Mouton.
Chomsky, Noam. 1964. “The Logical Basis for Linguistic Theory.” Proceedings of the 9th International Congress of Linguists, Cambridge, MA, 1962, edited by Horace Lunt, 914–978. The Hague: Mouton.
Krauss, Michael. 1992. “The World’s Languages in Crisis.” Language 68: 4–10.

Biographical Note

Kenneth L. Rehg is an Associate Professor of Linguistics at the University of Hawai‘i at Mānoa (retired) and an authority on the languages of Micronesia, a region in which he has conducted fieldwork over the course of the past five decades. He is the (co)author of three books and numerous papers on these languages, founding editor of Language Documentation & Conservation, and the 2009 Chair of the Linguistic Society of America’s Committee on Endangered Languages and their Preservation. His interests include language documentation, lexicography, phonology, historical linguistics, and the application of linguistics to the formation of educational policies and practices in the developing nations of the Pacific.

Lyle Campbell (PhD, UCLA) is professor emeritus at the University of Hawai‘i Mānoa. His specializations include language documentation, historical linguistics, indigenous languages of the Americas, and typology. He was director of the Catalogue of Endangered Languages project at the University of Hawai‘i 2009–2016. He is a linguist but has also held appointments in Anthropology, Latin American Studies, Linguistics, and Spanish. His publications include 23 books and approximately 200 articles; he won the Linguistic Society of America’s “Bloomfield Book Award” twice, for American Indian Languages (Oxford University Press, 1997) and Historical Syntax in Cross-Linguistic Perspective (with Alice Harris, Cambridge University Press, 1995).

Daryl Baldwin is a citizen of the Miami Tribe of Oklahoma and has been engaged with his family and community in Myaamia language and cultural revitalization since the early 1990s. He received an MA in English (linguistics) from the University of Montana in 1999 and in 2001 became the founding director of the Myaamia Center (formerly Myaamia Project) at Miami University. The Myaamia Center is recognized for its research, planning, and implementation of community language and cultural revitalization programs and initiatives. In 2016 Baldwin received the MacArthur Award for his work in language, culture, and community revitalization.

Anna Belew is a PhD candidate in the Department of Linguistics at the University of Hawai‘i at Mānoa. Her research interests include language documentation, sociolinguistics, and language endangerment, particularly in African contexts. Her doctoral research combines language documentation with qualitative and quantitative sociolinguistic approaches to language shift in Iyasa, a Bantu language of southern Cameroon. She served as a project coordinator for the Catalogue of Endangered Languages from 2011–2016, and continues to be actively involved with the Catalogue and the Endangered Languages Project.

Andrea L. Berez-Kroeker is associate professor in the Department of Linguistics at the University of Hawai‘i at Mānoa, where she teaches classes in the language documentation and conservation stream. She has been the director of the Kaipuleohone University of Hawai‘i Digital Language Archive since 2011, and she served as the president of the Digital Endangered Languages and Music Archiving Network (DELAMAN) from 2015 to 2017. Her research interests include Athabaskan languages and reproducibility in linguistic science. She recently coedited the volume Language Contact and Change in the Americas (with Carmeny Jany and Diane Hintz).

Steven Bird is a linguist and computer scientist. He divides his time between Darwin, Australia’s most culturally diverse city, and a remote Aboriginal community where he is learning to speak Kunwinjku. His language work has taken him to West Africa, South America, Central Asia, and Melanesia. He has a PhD in computational linguistics and is professor in the College of Indigenous Futures, Arts, and Society at Charles Darwin University. He serves as Linguist at Nawarddeken Academy in West Arnhem, and Senior Research Scientist at the International Computer Science Institute, University of California Berkeley.

Laura Buszard-Welcher is Director of Operations and The Long Now Library at the Long Now Foundation where she develops projects on human languages that seek to preserve and promote global linguistic diversity. These include the Rosetta Project, an archive of all human languages, and the Rosetta Disk, a microscopic analog backup of the archive designed to last and be readable for thousands of years. She has a PhD in linguistics, and maintains research interests in endangered language documentation, description, and archiving, as well as in developing tools to enable the use of the world’s languages in the digital domain.

Michael Cahill is Orthography Services Coordinator for SIL International. He lived in Ghana and worked for several years on the Konni language project of northern Ghana, for which he helped develop an orthography, and from which developed his dissertation, “Aspects of the Morphology and Phonology of Konni.” He has published on the linguistic and sociopolitical aspects of orthography (co-editing Developing Orthographies for Unwritten Languages), cross-cultural communication, tone languages, and labial-velars. He periodically teaches tone analysis at the Graduate Institute of Applied Linguistics, and has served as the Chair of the LSA Committee on Endangered Languages and their Preservation.

Amber B. Camp is a PhD student in the Department of Linguistics at the University of Hawai‘i at Mānoa. Her primary research interests include phonetics, psycholinguistics, and first- and second-language acquisition, focusing on underdocumented and understudied languages. Her current projects include investigations of various phonetic and phonological phenomena in Thai, Hawai‘i Creole, and Lakota.

Melvatha Chee earned her PhD in linguistics from the University of New Mexico and accepted an assistant professor position at the University of Alberta, Edmonton. Her dissertation, “A Longitudinal Cross-Sectional Study on the Acquisition of Navajo Verbs in Children Aged 4 Years 7 Months Through 11 Years 7 Months,” analyzed Navajo child language data she collected. Her research interests are in the areas of morphophonology, polysynthesis, semantics, and acquisition. Her clans are Tsé Nahabiłnii, Kin Łichíi’nii, Hooghan Łání, and Áshįįhí. She maintains a connection to her culture, which enriches her Navajo language skills and knowledge.

Shobhana Chelliah is professor of linguistics at the University of North Texas. Her research focuses on the documentation of the Tibeto-Burman languages of Northeast India. A documentation project on Lamkang (Kuki-Chin) focuses on health communication. Her publications include A Grammar of Meithei (Mouton de Gruyter, 1997) and The Handbook of Descriptive Linguistic Fieldwork (Springer, 2011) as well as articles on differential case marking and language contact in Tibeto-Burman. As a program director at the National Science Foundation from 2012 to 2015, she ushered in a data management plan for the Documenting Endangered Languages grant program to encourage long-term preservation and access of language data.

Victoria Chen is a PhD candidate in the Department of Linguistics at the University of Hawai‘i at Mānoa. Her primary research interest lies in the comparative morphosyntax of Western Austronesian languages, in particular the core syntax of Philippine-type Austronesian languages. Her ongoing dissertation, “A Reexamination of the Philippine-Type Voice System and Its Implication for Austronesian Primary-Level Subgrouping,” investigates the synchronic syntax of Philippine-type languages and their implications for the primary-level subgrouping of the Austronesian language family.

David J. Costa is the program director for the Language Research Office at the Myaamia Center at Miami University of Ohio. In this capacity he conducts continuing research on the Miami-Illinois language and helps design language curricula. Costa is also now involved in a long-term project to analyze and annotate the data from the Miami-Illinois language manuscripts that have been uploaded into MIDA (the Miami-Illinois Digital Archive). In addition to his work on Miami-Illinois, Costa has also done extensive research on the Shawnee language, the Algonquian languages of southern New England, and comparative Algonquian.

David Crystal is honorary professor of linguistics at the University of Bangor, and works from his home in Holyhead, North Wales, as a writer, lecturer, and broadcaster. He read English at University College London, specialized in English-language studies, then joined academic life as a lecturer in linguistics, first at Bangor, then at Reading, where he became professor of linguistics. He received an OBE for services to the English language in 1995. His books include The Cambridge Encyclopedia of the Language and Language Death. His current research is chiefly in applied historical English phonology, with particular reference to Shakespearean original pronunciation.

Wade Davis is professor of anthropology and the BC Leadership Chair in Cultures and Ecosystems at Risk at the University of British Columbia. Between 1999 and 2013 he served as Explorer-in-Residence at the National Geographic Society. Author of twenty books, including One River, The Wayfinders, and Into the Silence, and winner of the 2012 Samuel Johnson prize, he is the recipient of eleven honorary degrees, as well as the 2009 Gold Medal from the Royal Canadian Geographical Society, the 2011 Explorers Medal, the 2012 David Fairchild Medal for botanical exploration, and the 2015 Centennial Medal of Harvard University. In 2016 he was made a Member of the Order of Canada.

Christopher P. Dunn is the Elizabeth Newman Wilds Executive Director of Cornell Botanic Gardens. He has published widely in plant conservation and has long been interested in the relationships between peoples and places. More recently, he has been studying how environmental change alters the connections between local and indigenous communities and their landscapes and how such communities might adapt to climate change. He established the Biocultural Initiative of the Pacific while at the University of Hawai‘i. At Cornell University, he is developing the Biocultural Botanic Gardens Network, a new global network to promote botanic gardens as biocultural resources. He serves as Chair of the IUCN National Committee for the USA.

Nora C. England is the Dallas TACA Centennial Professor in the Humanities and Professor of Linguistics at the University of Texas at Austin. She works on the description and documentation of contemporary Mayan languages spoken in Guatemala and Mexico. She has also taken a leadership role in several different models of teaching linguistics to native speakers of these languages, including in an NGO and an undergraduate program in Guatemala and a graduate program in the United States. Her most recent publication is The Mayan Languages, co-edited with Judith Aissen and Roberto Zavala, and published in 2017 by Routledge.

Simeon Floyd is professor in the Department of Anthropology of the Universidad San Francisco de Quito, Ecuador and affiliated researcher at the Max Planck Institute for Psycholinguistics, the Netherlands, Programa Prometeo, Secretaría Nacional de Educación Superior, Ciencia, Tecnología e Innovación, Ecuador. He is a linguistic anthropologist specializing in South American indigenous languages in areas including language description and documentation, the ethnography of communication, linguistic typology, multimodality, conversation analysis, and field psycholinguistics. His recent publications include articles in journals such as Language, Language in Society, and Discourse Processes. His current research project is the documentation of Highland Ecuadorian Quichua in regions of high language endangerment with support from the Endangered Languages Documentation Programme.

Carol Genetti is a professor of linguistics and dean of the Graduate Division at the University of California, Santa Barbara. Her research interests include Himalayan languages, Tibeto-Burman linguistics, phonology, prosody, grammar, language change, and language documentation. Her books include A Grammar of Dolakha Newar (Walter
de Gruyter, 2007) and How Languages Work (Cambridge University Press, 2014). She was the founding director of InField/CoLang in 2008, a biennial institute which brings together linguists and members of endangered-language speech communities for shared research and teaching in techniques of language documentation, conservation, and revitalization. She currently holds the Anne and Michael Towbes Graduate Dean Chair at UC Santa Barbara.

Jeff Good is associate professor of linguistics at the University at Buffalo, State University of New York. His research interests include comparative Niger-Congo linguistics, morphosyntactic typology, and language documentation. His documentary work focuses on a group of Bantoid languages of Cameroon, and, in collaboration with colleagues in Cameroon and elsewhere, he is currently examining patterns of rural multilingualism in this part of the world. His typological work has examined patterns of grammatical complexity, and, in particular, variation in templatic constructions.

Jaeci Hall is currently working on a PhD in linguistics at the University of Oregon. She works on language revitalization of her heritage language, Tututni, an Athabaskan language from Southern Oregon. Her research interests include language revitalization theory and methodology, syntactic and morphological reconstitution of languages that have lost their first-language speakers, as well as Athabaskan language reconstruction. She is employed as a graduate researcher at the Northwest Indian Language Institute (NILI) in Eugene, Oregon.

Millie Yéi Dulitseen Hall comes from the traditional inland Tlingit territory of Teslin, Yukon and has lived in the Yukon Territory for most of her life. She started her studies of the Tlingit language in Juneau, Alaska in 2013. Her intention is to use the first language of her grandmother (her name was Jiyil.axhch Mabel Johnson) daily and in meaningful ways, for the rest of her life. She is pleased and proud that one of her sons, Timothy Shkooyéil Hall, is also learning and interested in the revitalization and teaching of the Tlingit language.

David Harmon is an independent researcher who writes about biocultural and linguistic diversity, place-based conservation, and secular values. He co-founded the NGO Terralingua, which is devoted to biocultural diversity. With his collaborator Jonathan Loh, he developed the Index of Biocultural Diversity and the Index of Linguistic Diversity; the latter is one of the indicators used by the Biodiversity Indicators Partnership. With Loh, he co-authored Biocultural Diversity: Threatened Species, Endangered Languages (WWF Netherlands, 2014). His most recent book is A Naturalistic Afterlife: Evolution, Ordinary Existence, Eternity (Palgrave Macmillan, 2017).

Ryan E. Henke is a PhD student in linguistics at the University of Hawai‘i at Mānoa. His research interests center on language documentation, revitalization, and acquisition. His work focuses on the indigenous languages of North America, particularly those in the Algonquian and Siouan language families. He is currently investigating the first-language acquisition of nominal morphology in Northern East Cree and contributing
to the documentation and revitalization of an underdocumented variety of Nakota (Stoney) spoken near Edmonton, Alberta, Canada.

Leanne Hinton is professor emerita of linguistics at the University of California at Berkeley and advisory member of the Advocates for Indigenous California Language Survival. She has written, edited, and co-edited numerous books and articles on Native American languages and language revitalization, including her most recent Bringing Our Languages Home and The Routledge Handbook of Language Revitalization (2018). She works with endangered languages as an advocate and practicing trainer in the field of language revitalization. With other language activists, she has helped found organizations devoted to language revitalization, and helped design language learning methods that are now used worldwide.

Gary Holton has published on a variety of topics related to holistic language documentation, including the documentation of knowledge domains which are expressed through language. His current research explores the way traditional ecological knowledge—particularly knowledge of landscape—is encoded in human language, with particular focus on the languages of Eastern Indonesia and Alaska. Holton is currently professor of linguistics and co-director of the Biocultural Initiative at the University of Hawai‘i at Mānoa.

Annie Johnston, whose Tlingit name is Ḵa’yaadéi, is from the Kóoḵhittaan Clan (Raven Children’s Clan) and the Kóok Hit (Pit House). She lives in Teslin, Yukon Territory, Canada. She lost the use of her Tlingit Language at the Indian residential school she attended for ten years. Today, she continues to learn the language through cultural and traditional activities.

Daniel Kaufman specializes in historical, descriptive, and theoretical issues in Austronesian languages with a focus on the languages of the Philippines and Indonesia. He is co-founder and executive director of the Endangered Language Alliance, a nonprofit organization dedicated to documenting and conserving the endangered languages of New York City’s immigrant communities and is assistant professor at the Department of Linguistics and Communication Disorders at Queens College, CUNY.

Alan R. King is a freelance linguist, language teacher, and specialist translator with a focus on issues relating to minority, endangered, and indigenous languages. His interests include grammatical analysis and practical/typological description, language standardization, and recovery strategies and techniques for endangered and extinct languages, taking into account social dimensions and the creative use of social media to promote learning, awareness, and effective action. His languages of specialization include Basque and several indigenous languages of Central America. After working mainly on Nawat (El Salvador) for a decade, he is currently developing materials for Honduran Lenca. King resides in the Basque Country.

Jeanette King is a professor in Aotahi: School of Māori and Indigenous Studies at the University of Canterbury, Christchurch, New Zealand, where she also heads the bilingualism theme of the New Zealand Institute of Language, Brain and Behaviour (NZILBB). She has published on language revitalization, the phrasal lexicon of Māori, and Māori English. As part of the Māori and New Zealand English (MAONZE) project she has studied sound change in Māori language over the last 100 years. Her most recent work focuses on the Māori language in Māori immersion education settings and the intergenerational transmission of minority languages in New Zealand.

Michael Krauss did doctoral fieldwork in Irish Gaelic. Since 1960 his career has been with Alaska Native Languages, endangered minority languages being his lifetime concern. His political work for Alaskan languages succeeded in legislation allowing those languages in Alaska’s schools since 1972, with a university center for their support and documentation in an archive for their permanent future. He would also believe he has helped to start a rebalance of the priorities of the linguistics discipline by drawing attention to the global level of language endangerment. Since his “retirement” in 2000 he has been lucky enough to complete a full-scale Grammar, Dictionary, and Texts for Eyak.

Nala H. Lee is an assistant professor of linguistics at the National University of Singapore. She is interested in the spectrum of language change brought about by multilingualism. Specifically, her research interests include language endangerment, language death, and creole studies. She wrote a grammar of Baba Malay for her PhD dissertation, and is a co-developer of the Language Endangerment Index, which is used by the Catalogue of Endangered Languages (www.endangeredlanguages.com). She has published in Language in Society, Journal of Pidgin and Creole Languages, Language Documentation & Conservation, and Language.

Jonathan Loh is an independent scientist specializing in the conservation, monitoring, and assessment of biological and biocultural diversity. He has devised, developed, and published many indicators of the changing state of global, regional, and national biodiversity, ecosystems, languages, and culture. He has a PhD in ethnobiology and is an Honorary Research Fellow of the School of Anthropology and Conservation at the University of Kent, Canterbury, UK.

Matthew Lou-Magnuson is a PhD candidate (expected 2017) at Nanyang Technological University, Singapore, in the program for Linguistics and Multilingual Studies. He is a founding member of the Language Evolution Acquisition and Plasticity lab (LEAP), where his research focuses on the computational modeling of diachronic linguistic processes, such as grammaticalization and language change. His dissertation work combines complex-network and information-theoretic methods to investigate underlying mechanisms behind the correlation between social structure and language typology.

Ian Mackenzie is a linguist, author, photographer, and film maker from Vancouver, Canada. He has conducted linguistic and ethnographic fieldwork with the Penan hunter-gatherers of Borneo since 1993. He created a dictionary and grammar of Eastern Penan, and has investigated the lexical semantics of the language. He is co-author with Wade Davis of Nomads of the Dawn (1995); his work is featured in the 2008 documentary The Last Nomads. He collected, edited, and is translating into English a four-volume autobiography of one of these traditional nomads; the first volume has been published for Penan readers. He holds a BA from University of British Columbia (1978), an MA from Université de Montréal (1985), and is a Fellow of the Royal Canadian Geographical Society and a Fellow of the Explorers Club and recipient of their Lowell Thomas Award (2010).

Luisa Maffi is director of Terralingua, an international nonprofit organization she co-founded in 1996, and editor of its publication Langscape Magazine. A linguist and anthropologist, she pioneered the concept of biocultural diversity—the interconnectedness and interdependence of biological, cultural, and linguistic diversity. She has published widely on that topic, including the books On Biocultural Diversity: Linking Language, Knowledge, and the Environment (Smithsonian Institution Press, 2001) and Biocultural Diversity Conservation: A Global Sourcebook (Earthscan, 2010). She spearheads Terralingua’s program of work and has collaborated with research and academic institutions and international organizations worldwide.

Kawenniyóhstha Nicole Martin received a recognition of completion at the Onkwawénna Kentyóhkwa Mohawk Adult Immersion Program, Six Nations, Ontario in May 2008. She is now a language instructor at the DeadiwÓnöhsnye’s Gëjohgwa’ Seneca Adult Immersion program located in Coldspring in the Country of the Seneca Nation. Her grassroots interests include the 2004 International Indigenous Elders Summit, 2005–2010 Haudenosaunee (Iroquoian) Unity Run, Indigenous Youth United Nations Declaration presentation, and more currently assisting in Haudenosaunee (Iroquoian) language revitalizing efforts where she continues assisting with immersion curriculum development and delivery techniques, and teaching basic conversational language to various groups within the Haudenosaunee (Iroquoian) Confederacy homelands.

Teresa L. McCarty is the GF Kneller Chair in Education and Anthropology, and Faculty in American Indian Studies, at the University of California, Los Angeles, USA. Her research, teaching, and outreach focus on Indigenous education, language planning and policy, and ethnographic studies of education in and out of schools. She has published extensively on these topics, including 20 books and edited volumes. In 2010 she received the George and Louise Spindler Award from the American Anthropological Association for lifetime contributions to educational anthropology. Her current research, funded by the Spencer Foundation, is a US-wide study of Indigenous-language immersion schooling.

Will C. McClatchey is manager and co-owner of Woodland Valley Meadows Farm near Eugene, Oregon, USA. He is a long-term-care pharmacist and botanical consultant
on projects such as the Flora of Oregon. He is former professor of botany, University of Hawai‘i at Mānoa where he developed educational and research programs in ethnobotany. His past research has largely taken place in the Western Pacific region with emphasis on plant systematics and conservation of traditional plant and ecosystem management strategies. His current research investigates resilience of artificial ecosystems such as traditional Mediterranean orchards and Central European woodland meadows as transported landscapes in other parts of the earth.

Ulrike Mosel, professor emerita of general linguistics at the University of Kiel, Germany, wrote her PhD thesis on classical Arabic grammaticography (1974, University of Munich), but then became interested in the grammar and lexicon of Oceanic languages. She was one of the initiators of the Dokumentation-Bedrohter-Sprachen program (DoBeS) funded by the Volkswagen Foundation, and is currently finalizing The Teop Language Corpus and a corpus-based grammar and dictionary of the Teop language spoken in Bougainville, Papua New Guinea. Her publications include Tolai Syntax and its Historical Development (Australian National University, 1984) and Samoan Reference Grammar (Scandinavian University Press, 1992, with Even Hovdhaugen). For further publications see https://www.isfas.uni-kiel.de/de/linguistik/mitarbeitende/prof.-dr.-ulrike-mosel/.

William O’Grady is professor of linguistics at the University of Hawai‘i at Mānoa. Drawing on his expertise in first- and second-language acquisition, he has written several articles on the conditions that must be met if endangered languages are to be learned and maintained. He is currently working on Jejueo, the critically endangered language of Korea’s Jeju Island. He and two co-authors have recently published the first of a four-volume series of textbooks to support teaching of the language in high schools and colleges. A reference grammar of the language, to be published by the University of Hawai‘i Press, is forthcoming.

Eve Koller received a PhD in linguistics from the University of Hawai‘i at Mānoa in 2017, where she worked on the Catalogue of Endangered Languages (ELCat) team and taught an introductory course in linguistics as a graduate student. She is currently a postdoctoral researcher in linguistics at the University of Hawai‘i at Mānoa. Her research interests include historical linguistics, language documentation and conservation, and language typology. Prior to her work at the University of Hawai‘i, she worked at the Smithsonian Institution’s National Museum of the American Indian. Her research interests include the typology of numeral systems, ancient writing systems, and Afroasiatic languages.

Susan Penfield is affiliate faculty in linguistics at the University of Arizona and the University of Montana. She received a PhD in linguistic anthropology from the University of Arizona in 1980 where she was later an instructor for the Second Language Acquisition and Teaching Program (SLAT) and for the American Indian Language Development Institute (AILDI). From 2008 to 2011, Penfield directed the Documenting Endangered Languages Program at the National Science Foundation (NSF). She was a research associate for the Smithsonian’s Museum of Natural History (Native American Programs)
and currently teaches grant writing to community linguists in support of indigenous languages. Dr. Penfield specializes in language documentation, indigenous languages and technology, language revitalization and community-based language/linguistic training.

Gabriela Pérez Báez is Curator of Linguistics at the National Museum of Natural History, Smithsonian Institution. She holds a doctorate in linguistics from the University at Buffalo. Pérez Báez works on Zapotec languages and has published on the impact of migration on language vitality, on verbal inflection and derivation, and on semantic typology and the relationship between language and cognition. She has compiled two dictionaries of Diidxazá (Isthmus Zapotec) within a participatory, interdisciplinary model. Pérez Báez has been director of the Recovering Voices initiative and is co-director of the National Breath of Life Archival Institute for Indigenous Languages.

Ross Perlin is co-director of the Endangered Language Alliance in New York City. He has written on language endangerment and revitalization for The Guardian, Dissent, and n+1, among other publications. His research interests include the documentation and description of Himalayan languages, urban linguistic diversity and multilingualism, and the sociology of Jewish languages. He received his PhD from the University of Bern. Perlin is currently working on a book about the languages of New York.

Samantha Rarrick is a postdoctoral fellow with the National Science Foundation’s Directorate for Social, Behavioral & Economic Sciences. She is currently affiliated with the University of Hawai‘i at Mānoa, where she was awarded her PhD in 2017. Her research focuses on a holistic approach to language documentation in the Pacific, addressing both signed and spoken languages, primarily in Papua New Guinea.

Richard A. Rhodes is an associate professor of Linguistics at the University of California, Berkeley and an internationally recognized expert in Algonquian studies. He is the author of a major dictionary of Ojibwe, an important Algonquian language. His recent work has focused on descriptive syntax and nineteenth-century Ojibwe/Ottawa documents. He has also worked on the documentation of Michif, a mixed language of western Canada, and of Sayula Popoluca, a Mixe-Zoquean language of southern Mexico.

Keren Rice is a University Professor in the Department of Linguistics at the University of Toronto. She has done documentary, descriptive, and theoretical research, focusing on Dene languages, phonology, and morphology. She has been involved in work on language revitalization, and has written on fieldwork and the ethics of fieldwork. Her publications appear in journals such as Phonology, International Journal of American Linguistics, Language, and Language Documentation & Conservation. She has published with Cambridge University Press and Mouton de Gruyter, among others.

Bonny Sands is an adjunct professor at Northern Arizona University. She is a leading expert on the phonetics and historical linguistics of African languages. Her research looks at how clicks and other rare sounds provide a window into the linguistic prehistory of
Africa. As a principal investigator on grants from the National Science Foundation of the United States, she has researched the lexicons and sound systems of endangered languages Nǀuu, ǂHoan and ǃXuun. Her fieldwork with minority languages spoken by hunter-gatherers in the Kalahari and in East Africa has led to her current interest in language revitalization.

Racquel-María Sapién has been working with speakers in Konomerume, Suriname (since 2005) to document, describe, preserve, and revitalize the Aretyry dialect of Kari’nja. Prior to that, she had served the community as a Peace Corps Volunteer in Rural Community Development from 1995 to 1998. In her work with community members, she seeks to develop collaborative projects that are of balanced mutual benefit. Her academic work includes research foci in community-inclusive field research methodologies, Cariban morphosyntax, and methods and materials development for endangered languages revitalization.

Sean Simpson is a PhD candidate in computational linguistics at Georgetown University. His dissertation research is centered around the integration of sociolinguistic principles, findings, and theory into systems for Automated Speaker Profiling. Simpson also serves as a research consultant at the Center for Advanced Study of Language, where he is currently developing methods for the automatic detection of novel and emerging illicit-substance terminology in geographically targeted streaming social media corpora. His research interests lie primarily in computational sociolinguistics, language variation, and sociophonetics.

Alice Taff works to foster Alaskan language continuity by engaging language community members to document their languages, re-establish situations for language use, and create materials in their languages. Examples of such materials are Deg Xiyanʼ Xidhoy: Deg Xinag narratives (http://www.uas.alaska.edu/arts_sciences/humanities/alaska-languages/deg-xinag.html); Woosh Een áyá Yoo X̱ʼatudli.átk: Tlingit Conversations (http://www.uas.alaska.edu/arts_sciences/humanities/alaska-languages/cuped/video-conv/); and the Unangam Tunuu (Aleut language) conversation corpus (https://elar.soas.ac.uk/Collection/MPI78647). She is past president of the Society for the Study of the Indigenous Languages of the Americas. Her current research interest is finding links between ancestral Indigenous language use and health.

Apay Ai-yu Tang is an associate professor with the Department of Indigenous Languages and Communications, College of Indigenous Studies, National Dong Hwa University, Taiwan where she began teaching in 2012. Her primary interest is in endangered-language revitalization and conservation, in particular for the indigenous languages spoken in Taiwan. She is a semi-speaker of Truku Seediq. Her other interests that have led to collaborative presentations and publications include a
sentence production and comprehension study, university-community partnerships, and participatory action research, as well as culturally responsive teaching for indigenous language learners.

Nick Thieberger established the Pilbara Aboriginal Language Centre in Port Hedland in the late 1980s. He went on to write a grammar of South Efate (Nafsan), a language from central Vanuatu that pioneered methods in citing primary data from a media corpus. He helped establish the Pacific and Regional Archive for Digital Sources in Endangered Cultures (http://paradisec.org.au), and is now its director. He is the editor of Language Documentation & Conservation. He taught in the Department of Linguistics at the University of Hawai‘i at Mānoa and is now an Australian Research Council Future Fellow at the University of Melbourne, Australia where he is a Chief Investigator in the ARC Centre of Excellence for the Dynamics of Language.

Sarah G. Thomason, after receiving her PhD from Yale University, taught Slavic linguistics at Yale and then general linguistics at the University of Pittsburgh and, currently, the University of Michigan. Since 1981 she has worked with elders at the Salish & Pend d’Oreille Culture Committee in Montana, compiling a dictionary and other materials for the tribes’ language program. Her research focuses on contact-induced language change, endangered languages, and Salishan linguistics. A few of her publications are Language Contact, Creolization, and Genetic Linguistics (with Terrence Kaufman, University of California Press, 1988, 1991), Language Contact: An Introduction (University of Edinburgh Press & Georgetown University Press, 2001), and Endangered Languages: An Introduction (Cambridge University Press, 2015).

John R. Van Way is a doctoral candidate in the Department of Linguistics at the University of Hawai‘i at Mānoa. In his dissertation, he is documenting Nyagrong Minyag, an understudied and endangered language of western China. This language is experiencing a unique and imminent threat to its livelihood due to construction of a hydroelectric dam which will displace all of its speakers. John also served as project coordinator for the Catalogue of Endangered Languages during the initial phase of the project.

Rachel Vogel joined the linguistics PhD program at Cornell University in fall 2017. Her research interests include phonetic and phonological documentation of endangered languages and language revitalization strategies and practices. She holds a degree in linguistics from Swarthmore College, with a thesis on the phonetics and phonology of Bantawa, an endangered Tibeto-Burman language. Rachel has carried out extensive research at the Smithsonian Institution’s Recovering Voices initiative on language revitalization efforts worldwide. She also served as the Program Assistant for the 2017 National Breath of Life Archival Institute for Indigenous Languages.

James Woodward holds appointments as adjunct professor in the Center for Sign Linguistics and Deaf Studies at The Chinese University of Hong Kong (CUHK) and in the Department of Linguistics at the University of Hawai‘i at Mānoa (UHM). Through
CUHK, he provides in-country training in sign language documentation to Culturally Deaf individuals in Southeast Asian countries. He is currently working with Culturally Deaf individuals in Myanmar to develop teaching materials and companion dictionaries for Yangon Sign Language. At UHM, his efforts are focused on the documentation, conservation, and revitalization of Hawai‘i Sign Language, a critically endangered language isolate.

Sue Wright is emeritus professor at the University of Portsmouth, UK. Her research focuses on the political and social contexts which affect language choices, spanning investigation at regional, national, supranational, and international level. Her most recent book is Language Policy and Language Planning: From Nationalism to Globalisation (Palgrave, 2017). She is co-editor (with Jeroen Darquennes and Ulrich Ammon) of the journal, Sociolinguistica. She is co-editor (with Helen Kelly-Holmes) of the Palgrave book series, Language and Globalisation. She is a member of the International Panel on Social Progress (Belonging and Solidarity sub-panel).

Introduction: Endangered Languages Lyle Campbell and Kenneth L. Rehg

1. Endangered Languages Scholarship involving endangered languages has both an empirical and an applied/advocacy character. Some scholars focus more on the empirical—research aimed at contributing to knowledge of value to linguistics and related fields. Others concentrate on the applied— advocacy activities aimed at supporting and fostering maintenance and revival of endangered languages. An increasing number of scholars attend to both. The thirty-nine chapters of The Oxford Handbook of Endangered Languages reflect these various approaches. Its purposes are (1) to provide a reasonably comprehensive reference volume for endangered languages, with the scope of the volume as a whole representing the breadth of the field, (2) to highlight both the range of thinking about language endangerment and the variety of responses to it, and (3) to broaden understanding of language endangerment, language documentation, and language revitalization, and, in so doing, to encourage and contribute to fresh thinking and new findings in support of endangered languages. Many linguists, perhaps especially those working in North America, have long been concerned about the endangerment and loss of indigenous languages. During the latter half of the twentieth century, however, these concerns largely receded into the background as attention was turned to the exciting developments that were taking place in linguistic theory. It was not until the publication of Michael Krauss’s “The world’s languages in crisis” in 1992 that linguists once again began to take note of what was happening to the foundation of their discipline.1 Krauss (1992, 8) considered it “a plausible calculation that—at the rate things are going—the coming century will see 1 The history of concern for endangered languages did not, of course, begin with Krauss’s 1992 publication. 
Franz Boas, founder of American anthropology and American linguistics, was very concerned that the American Indian languages (and cultures) were so rapidly being lost and instructed his students

either the death or the doom of 90% of mankind’s languages”; only 10%, he argued, appeared to be safe. He consequently urged linguists to work responsibly to address this issue, observing that “if we do not act, we should be cursed by future generations for Neronically fiddling while Rome burned” (1992, 8). Today, the endangered-languages crisis is widely acknowledged among linguists and other scholars who deal with languages and indigenous peoples as one of the most pressing problems facing humanity, posing moral, practical, and scientific issues of enormous proportions. The endangered-languages crisis is becoming increasingly well-known to the general public as well. For example, the media now commonly report “language obituaries,” in which the death of a language is reported as part of the obituary of that language’s last known speaker. Some recent examples are: Tommy George, the last fluent speaker of Awu Laya, an aboriginal language [of Australia], died August 12, 2016, at the age of 87. He had been awarded an honorary doctorate by James Cook University for help in documenting language and traditional fire management of land (The Australian, Aug. 13, 2016)2 Doris Jean Lamar-McLemore, the last remaining Wichita speaker, passed away on August 30, 2016; she was 89.3 Edwin Benson, last known native speaker of Mandan, died in December of 2016 at 85 years of age.4

Some widely reported cases are quite famous. The death of the last speaker of Eyak (of Alaska) was widely reported. Eyak lost its last speaker when Marie Smith Jones died at the age of 89, January 21, 2008. Another often cited case is that of Ubykh, a Northwest Caucasian language, reported to have “died at daybreak, Oct. 8, 1992” when Tevfik Esenç, its last speaker, passed away (Crystal 2000, 2). Such language obituaries give a personal and dramatic face to language extinction. They help to spotlight the widespread problem of language endangerment and language loss, though actually for most extinct languages, the names of the last speakers are not known. It is nevertheless certain

in the urgency of describing them before they became extinct; that sense of urgency persisted. At about the time of Krauss’s important paper, others also expressed concern about endangered languages. Among others we can mention the examples of Joshua Fishman, who wrote of reversing language loss (for example, Fishman 1990, 1991) and the plenary panel on sociolinguistics of language endangerment at the 15th International Congress of Linguists in Ottawa in 1992 with the presentations published in advance in Robins and Uhlenbeck (1991). Nevertheless, it is our belief that it was Michael Krauss’s contribution at the Linguistic Society of America symposium on Endangered Languages in 1991 and the publication of his article (Krauss 1992) that more than anything else captured the attention of linguists generally and contributed most to launching modern concern for documenting and revitalizing endangered languages. 2 http://www.theaustralian.com.au/news/nation/language-lost-with-the-passing-of-great-elder-tommy-george/news-story/3fca836f8f19e249e18437bc2732415f. 3 http://www.indiancountrynews.com/index.php/news/19-educational-news-and-programs/5915-doris-jean-lamar-mclemore-the-last-to-speak-wichita-indian-language. 4 http://www.kfyrtv.com/content/news/Edwin-Benson-last-known-fluent-speaker-of-Mandan-passes-away-at-85-405723515.html.

that, unless determined and concerted action is taken, languages will continue to be lost and obituaries such as these will become increasingly common.

2. How many languages are endangered? How many of the world’s languages are endangered? How do we know which ones are endangered, and how endangered they are? We turn to these questions and to the context of language endangerment here.5

2.1. What is an endangered language? Criteria for determining language endangerment The main criteria used to determine whether a language is endangered are:
The absolute number of speakers—the fewer the number of speakers, the less likely the language’s long-term survival.
Intergenerational transmission—if a language is not being learned by children in the traditional way, passed on from one generation to the next, it is essentially doomed to extinction unless revitalization efforts prove successful. The greater the intergenerational transmission, the more likely the language’s survival.
Decreasing number of speakers—the more the number of speakers decreases, the more endangered the language is.
Decrease in domains of use—the more the domains in which the language is used are reduced, the greater its endangerment becomes.
(See Lee and Van Way, Chapter 2, this volume.)

The Catalogue of Endangered Languages (www.endangeredlanguages.com) reports the vitality of the various endangered languages, i.e., their degree of endangerment. Based on these criteria, its Language Endangerment Index (LEI) gives a score for the degree of endangerment of each endangered language. (See Lee and Van Way, Chapter 2, this volume.)
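A score of this kind can be pictured as a weighted combination of ratings on the four criteria listed above. The following is only a minimal sketch under stated assumptions: the 0 to 5 rating scale, the double weight on intergenerational transmission, and every function and key name are chosen here for illustration and are not taken from this chapter. Chapter 2 gives the actual definition of the LEI.

```python
# Illustrative sketch only: scale, weights, and names are assumptions,
# not the Catalogue's published definition (see Chapter 2).

# Assumed weights: intergenerational transmission counts double.
WEIGHTS = {
    "intergenerational_transmission": 2,
    "absolute_speakers": 1,
    "speaker_trend": 1,
    "domains_of_use": 1,
}
MAX_RATING = 5  # assumed scale: 0 (safe) to 5 (most endangered)


def endangerment_percent(ratings):
    """Return the weighted rating as a percentage of the maximum
    attainable for the criteria that were actually rated."""
    if not ratings:
        raise ValueError("at least one criterion must be rated")
    for name, value in ratings.items():
        if name not in WEIGHTS:
            raise KeyError(f"unknown criterion: {name}")
        if not 0 <= value <= MAX_RATING:
            raise ValueError(f"rating out of range for {name}: {value}")
    earned = sum(WEIGHTS[name] * value for name, value in ratings.items())
    possible = sum(WEIGHTS[name] * MAX_RATING for name in ratings)
    return 100 * earned / possible


# Hypothetical language: no child learners, few speakers, shrinking use.
example = {
    "intergenerational_transmission": 5,
    "absolute_speakers": 4,
    "speaker_trend": 4,
    "domains_of_use": 3,
}
print(endangerment_percent(example))
```

Dividing by the maximum attainable for the rated criteria, rather than by a fixed total, lets a language be scored even when information on one of the criteria is missing.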

2.2. Language endangerment in context Language extinction is hardly new. Language loss has been going on throughout human history, with many well-known cases from antiquity, e.g., Akkadian, Ancient Egyptian, Etruscan, Gothic, Hittite, Sumerian, and so on. So, some might ask, why the alarm? The sense of crisis is due to the strikingly accelerated rate of language extinction in recent times.

5 Some of the discussion in this section parallels the discussion in Campbell and Belew (in press).

The sharp increase in language loss is seen in the fate of languages everywhere. For example, California had some 100 American Indian languages at the time of the Gold Rush, c.1850, but only eighteen are still spoken today; none is being learned by children in the conventional way; all are highly endangered. Of the some 314 languages spoken in what is now the United States and Canada when Europeans first arrived, 152 no longer have native speakers (48%). Of 280 languages at the time of first European contact in present-day US territory, 117 are still spoken (42%); however, this number is misleading—all the remaining 117 are endangered, most of them critically so, and only about a dozen are still being passed on to children. Many will soon be extinct, unless language revitalization efforts succeed. The statistics for extinct languages and critically endangered languages in Australia, much of Latin America, and swaths of northern Eurasia are similarly dire. No area of the world is free from language endangerment. As of April 28, 2017, the Catalogue of Endangered Languages lists 430 languages as “critically endangered”; 299 languages have fewer than ten speakers. The Catalogue also includes sixty-eight sign languages, six mixed languages, and twenty-four pidgin and creole languages. Altogether, the Catalogue of Endangered Languages lists 3,150 currently endangered languages6; that is 46% of the 6,879 living languages in the world as listed by Ethnologue.7 To see the crisis from a different perspective, we can turn to whole language families. Of the world’s c.407 independent language families (including language isolates, language families which have only one member), 96 are extinct—no language belonging to any of these families has any remaining native speakers—24% of the linguistic diversity of the world, calculated in terms of language families, has been lost (Barlow and Campbell in press). 
Related to this, 59 of the world’s 159 language isolates are already extinct (37%) (Okura in press). (See also Belew and Simpson, Chapter 1, this volume.) Of all the millennia in which languages could have disappeared, two-thirds of these extinct language families became extinct in only the last sixty years, dramatically underscoring the accelerating rate of language loss in recent times. Many other languages and language families are on the brink of losing their last native speakers and will soon follow, which will result in a drastic change in the world’s linguistic diversity and in the numbers cited here.
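The percentages quoted in this section follow directly from the counts given alongside them. As a quick check, the short sketch below recomputes each one; the counts are those stated in the text and in note 7, while the labels and the helper name `percent` are introduced here only for illustration.

```python
# Counts as quoted in the text: (part, whole, percentage reported).
figures = {
    "US/Canada languages with no native speakers": (152, 314, 48),
    "US languages still spoken": (117, 280, 42),
    "endangered share of living languages": (3150, 6879, 46),
    "extinct language families": (96, 407, 24),
    "extinct language isolates": (59, 159, 37),
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


for label, (part, whole, reported) in figures.items():
    computed = percent(part, whole)
    print(f"{label}: {part}/{whole} = {computed}% (text reports {reported}%)")

# Note 7: Ethnologue's total, minus languages with no known speakers,
# minus one constructed language, gives the living-language count.
living_languages = 7099 - 219 - 1
print(living_languages)
```

Each recomputed value rounds to the percentage reported in the text, and the adjustment described in note 7 yields the 6,879 living languages used as the denominator for the endangered share.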

6 The total number of languages listed in the Catalogue of Endangered Languages (ELCat) is 3,409, but this includes 254 dormant or awakening languages, that is, languages with no known native speakers. For forty-eight, the number of speakers is unknown. 7 Ethnologue’s total number of languages is 7,099; adjusted for the 219 with no known speakers and one constructed language, their number of living languages is 6,879 (https://www-ethnologue-com.eres.library.manoa.hawaii.edu/statistics/size, accessed April 18, 2017). The percentage of the world’s languages that are currently endangered, calculated based on Ethnologue’s total number of living languages compared to the Catalogue of Endangered Languages’ total number of endangered languages, is only an approximation, since Ethnologue contains language entries which ELCat does not accept and lacks some languages which ELCat does recognize. However, since ELCat concerns only endangered languages, Ethnologue is the best source that lists the non-endangered languages of the world, even where the two do not match with respect to a good number of endangered languages.

What does this mean? Though not really comparable, the loss of a specific language may be likened in gravity to the loss of a single species, say the Bengal tiger or the Right whale. However, the extinction of whole families of languages is a tragedy similar in magnitude to the loss of whole branches of the animal kingdom, say to the loss of all felines or all cetaceans. Just imagine the distress of biologists attempting to understand the animal kingdom with major branches missing. Yet what confronts us is the staggering loss of a quarter of the linguistic diversity of the world, gone forever.

3. Causes of language endangerment Why do languages become endangered and die? The obvious answer is that languages undergo attrition and die because people stop speaking them. This response, while true, leaves unmentioned the many things that lead speakers to abandon their language and to shift to another language. Nevertheless, it serves a useful purpose. It reminds us that language loss is a consequence of historically contingent events and circumstances that impact the speakers of a language (see Rogers and Campbell 2015 [2017]). As Swadesh (1948, 235) pointed out long ago, “the factors determining the obsolescence of languages are non-linguistic.” Some of the factors that can impact speakers and result in language loss are listed here, grouped into five broad (sometimes overlapping) categories (see also Campbell in press). 1. Economic factors, often considered the most important: lack of economic opportunity, rapid economic transformation, ongoing industrialization, shifts in work patterns, migration and migrant labor, resource depletion, forced changes in subsistence patterns, communication with outside regions, resettlement, destruction of habitat, globalization, etc. 2. Political, geographical, demographic, and sociocultural factors: discrimination, repression, ethnic cleansing, official language policies, level of education available, population dispersal, rapid population collapse, marriage patterns, birth rates, access to education, refugee status, ethnic identity, the role of language in religion, religious proselytizing, military service, lack of linguistic and other human rights, number and concentration of speakers, range and distribution of the language, low socioeconomic status of speakers, lack of revitalization or revival efforts, degree of acculturation, lack of social cohesion among speakers, lack of physical proximity among speakers, war, slavery, famine, epidemics, natural disasters, etc. 3. 
Lack of institutional support: absence of institutional support as represented in the roles the language plays in education, government, churches, the media, recreational activities (sports events, popular culture, music, etc.), military service, the judicial system, etc.; other causal factors include lack of official recognition of the language, lack of or a very limited degree of autonomy and self-determination, etc.

4. Aspects of language, language use, and language choice: influences from language contact, code-switching, different kinds of multilingualism, language instruction, lack of recognition of linguistic (and other human) rights, the nature of language transmission in the community, limited literacy in the minority language, restricted degree of competence in the minority language, etc. 5. Subjective attitudes (motivations): attitudes of the speakers toward the languages under threat and toward the official national language(s) and the dominant languages that surround them, attitudes of members of mainstream society toward minorities and their languages; the symbolic value of the dominant language (e.g., as a symbol of a nation, civilization, progress, the future, affluence, upward mobility, etc.) vs. the symbolic value of the endangered language (e.g., as a symbol of the past, poverty, lack of opportunity, backwardness, etc.); the relative prestige of the language (as a cultural symbol, symbolically connected with notions of being international and urban vs. local and rural); the stigmatization of a local language (low prestige of the endangered language); language loyalty; the minority language as a marker of ethnic identity and group membership, etc. Each of these various factors might operate simultaneously, in tandem, or play no role whatsoever in the decline and death of a particular language. Consequently, the task of providing an adequate explanation for why languages are lost is an undertaking of great complexity.

4. Why should we care? Why does it matter that languages are becoming extinct? The chapters of this Handbook provide a range of positive answers to this question. Some of the main reasons that have been offered for why language endangerment matters follow. 1. Social justice and human rights. Language loss is often not voluntary; it frequently involves violations of human rights, pushed by political or social repression, oppression, aggression, prejudice, violence, and at times by ethnic cleansing and genocide. This is simply a matter of right and wrong—one that should be important to everyone. Language loss is typically experienced as a crisis of social identity. For many communities, work toward language revitalization is not about language in isolation, but is part of a “larger effort to restore personal and societal wellness” (Pfeiffer and Holm 1994, 35). Some scholars and community activists insist that ongoing language loss leads to damaged communities and dysfunctional behaviors. They argue that one’s psychological, social, and physical well-being is connected with one’s native language; it shapes

one’s values, self-image, identity, relationships, and ultimately success in life. Indigenous voices testify to the crucial role language plays in their cultural and personal identity. The following, though few, are representative: How can I believe the foolish idea That my language is weak and poor If my mother’s last words Were in Evenki? (Alitet Nemtushkin, Evenki poet)8 For centuries our languages have been a reflection of those cultural distinctions that have made us who we are as a people, and in a sense have been an element of the many things that have made us strong. (Stephen Greymorning [Arapaho] 1999, 6.) Why save our languages, since they now seem to have no political, economic, or global relevance? That impression is exactly the reason why we should save our languages, because it is the spiritual relevance deeply embedded in our own languages that makes them relevant to us as American Indians today. (Richard Littlebear [Northern Cheyenne] 1999). Each language still spoken is fundamental to the personal, social and—a key term in the discourse of indigenous peoples—spiritual identity of its speakers. (Zepeda [Tohono O’odham nation] and Hill 1991, 1) I can’t stress enough the importance of retaining our tribal languages, when it comes to the core relevance or existence of our people . . . You could argue that when a tribe loses its language, it loses a piece of its inner-most being, a part of its soul or spirit . . . When it comes to native languages, the situation is simple: Use it or lose it. (Sonny Skyhawk [Sicangu Lakota, Hollywood actor] 2012)

2. Human concerns. Languages are treasure houses of information for history, literature, philosophy, art, and the wisdom and knowledge of humankind. Their stories, ideas, and words help us make sense of our own lives and of the world around us—of the human experience, of the human condition in general. When a language goes extinct without documentation, we lose incalculable amounts of human knowledge. We illustrate this by mention of only two areas, literature and history.

8 http://www.unesco.org/new/en/culture/themes/endangered-languages/faq-on-endangered-languages/.

Literature: The life-enriching value of literature is well understood—“by studying literature, we learn what it means to be human.”9 This is equally true of the oral literatures of the indigenous peoples of the world. They, too, have grappled with the complexities of their world and the problems of life, and the insights and discoveries represented in their literatures—whether written or oral—are of no less value to us. When a language becomes extinct without documentation, taking into oblivion all its oral literature, oral tradition, and oral history with it, all of humanity is diminished. History: We study history “to gain access to the laboratory of human experience.”10 Great reservoirs of historical information are contained in languages. The classification of related languages teaches us about the history of human groups and how they are related to one another, and we gain understanding of contacts and migrations, the original homelands where languages were spoken, and past cultures from the comparison of related languages and the study of language change—all irretrievably lost when a language is lost without adequate documentation. 3. Loss of knowledge. The world’s linguistic diversity is one of humanity’s most valuable treasures. This means that the loss of the many hundreds of languages that have already become extinct is a cataclysmic intellectual disaster, on many different levels. To cite a single example, encoded in each language is knowledge about the natural and cultural world it is used in. This knowledge is often not known outside the small speech communities where the majority of endangered languages are spoken. When a language dies without adequate documentation, it takes with it this irreplaceable knowledge. Loss of such knowledge in principle, it is argued, could have devastating consequences even for humankind’s very survival. 
Reduction of language diversity diminishes the adaptational strength of the human species because it lowers the pool of knowledge from which we can draw. A telling example comes from the Seri (of Sonora, Mexico, with c.700 speakers). The Seri have knowledge of “eelgrass” (Zostera marina L.) and “eelgrass seed,” which they call xnois. It is “the only known grain from the sea used as a human food source . . . eelgrass has considerable potential as a general food source . . . Its cultivation would not require fresh water, pesticides, or artificial fertilizer” (Felger and Moser 1973, 355–356). Seri has a whole set of vocabulary items dealing with eelgrass and its use. According to the argument, it is all too plausible to imagine a future in which some natural or human-caused disaster might compromise land-based crops, leaving human survival in jeopardy because of the loss of knowledge of alternative food sources such as the knowledge of eelgrass reflected in the Seri language.

9 http://www.cliffsnotes.com/cliffsnotes/literature/why-should-literature-be-studied. Accessed November 27, 2014. 10 American Historical Association, http://www.historians.org/teaching-and-learning/why-study-history. Accessed November 27, 2014.

Speculation aside, it is clear that documentation of the languages of small-scale societies and of the knowledge they hold has significantly benefited humanity. Other examples come from medicine. It has been reported that 75% of plant-derived pharmaceuticals were discovered by examining traditional medicines, where the language of curers often played a key role (see, for example, Bierer, Carlson, and King 1996). If these languages had become extinct and knowledge of the medicinal plants and their uses had been lost, all of humanity would be impoverished and human survival would be left less secure. So, it is argued that reduction of language diversity diminishes the adaptive strength of humans as a species because it lowers the pool of knowledge from which we can draw for survival.11 (See also Part IV of this volume.) 4. Consequences for understanding human language. A major goal of linguistics is to understand human cognition and human language capacity through the study of what is possible and impossible in human languages. The discovery of previously unknown linguistic features and traits as we document languages contributes to achieving this goal and advances knowledge of how the human mind works. Conversely, language extinction is a horrendous impediment to achieving this goal. The following example illustrates this well. The discovery of the existence of languages with OVS [Object-Verb-Subject] and OSV [Object-Subject-Verb] basic word orders forced abandonment of a previously postulated language universal. Greenberg (1978, 2) had proposed that “whenever the object precedes the verb the subject does likewise.” However, it was discovered that Hixkaryana (a Cariban language of Brazil, with only 350 speakers) has OVS basic word order, as seen in the following sentence:

toto yonoye kamura
man ate jaguar
“The jaguar ate the man.”

We now know that several languages have OVS or OSV basic word order; most of them are spoken in small communities in the Amazon. Discovery of languages with these basic word orders not only forced abandonment of the postulated universal but also required revision of numerous other theoretical claims about language. It is all too plausible, given what has happened to indigenous languages at

11 In principle it is possible for a society to give up its language and shift to another, and find ways to talk about this sort of knowledge in the new language. What we observe, however, in case after case, is that when a language is not passed on to the next generation, the knowledge of the natural and cultural world encoded in that language fails to be transmitted as well.

the hands of unscrupulous loggers, miners, and ranchers, that the few languages with these word orders could have become extinct before they were documented, leaving us forever with erroneous assumptions about what is possible and impossible in human language and how that reflects on our understanding of human cognition. Documentation of endangered languages has frequently and repeatedly demonstrated the importance of obtaining adequate descriptions of these languages; the discovery of previously unknown linguistic traits is helping linguists to comprehend the full range of what is possible and impossible in human language (for numerous examples, see Palosaari and Campbell 2011). There are, of course, objections to sustaining minority and endangered languages, the most common one having to do with the role of language in geopolitical conflicts. It is often said that if we had fewer languages, we would understand each other better and live in greater harmony. But this is far from true. That monolingualism does not guarantee nor even foster greater “understanding” is attested throughout history. As David Crystal (2000, 27) reminds us, “all of the large monolingual countries of the world have had their civil wars.” It is also shown by the many recent and ongoing armed conflicts among groups speaking the same language, e.g., in Darfur, Egypt, Iraq, Libya, Syria, Yemen, Colombia, Northern Ireland, Thailand, or the 1994 genocide in Rwanda that involved Hutu and Tutsi, both speakers of the same language, Kinyarwanda. This contrasts with lack of such conflicts in relatively peaceful, officially multilingual Belgium, Canada, Finland, Luxembourg, Singapore, Switzerland, Tanzania, etc. 
Multilingual and multicultural countries need to recognize that national unity and understanding are not fostered by monolingualism or ethnic cleansing, but that recognition of minority language rights can bring about mutual trust, peace, and ultimately national stability. We need to expose the erroneous assumption that people and countries cannot be both multilingual and successful, and show, rather, that there are significant benefits from multilingualism. Several recent research papers report that bilingual children tend to grow up to be more tolerant citizens than monolinguals. In short, there is no evidence that fewer languages might lead to greater harmony.

5. Responses to language endangerment Work on and in support of endangered languages has risen to prominence since Krauss’s 1992 call to action, primarily in two areas—language documentation and language revitalization. There are, however, some differences of opinion concerning the nature of the activities encompassed by these two labels, as discussed here.


5.1. Language documentation What is “language documentation”? There is a range of opinions (as well as disagreements) about what language documentation is. Many have followed Himmelmann’s (1998, 2006) definition. Himmelmann contrasted language description and language documentation, saying that language documentation “aims at the record of the linguistic practices and traditions of a speech community” (Himmelmann 1998, 9–10), and that “language documentation may be characterized as radically expanded text collection” (Himmelmann 1998, 2). Himmelmann (2006, 1) defines language documentation “as a field of linguistic inquiry and practice in its own right which is primarily concerned with the compilation and preservation of linguistic primary data and interfaces between primary data and various types of analyses based on these data.” He more generally observes that “a language documentation is a lasting, multipurpose record of a language.” Woodbury (2011, 159) provides a similar definition, stating: “Language documentation is the creation, annotation, preservation and dissemination of transparent records of a language.” Similarly, the Hans Rausing Endangered Languages Project website says that language documentation “emphasises data collection methodologies, in two ways: first, in encouraging researchers to collect and record a wide range of linguistic phenomena in genuine communicative situations; and secondly, in its use of high quality sound and video recording to make sure that the results are the best possible record of the language.”12 With statements such as these, it is little wonder that some linguists, as noted by Himmelmann (2012, 187) himself, have misinterpreted this approach to mean:
Documentary linguistics is all about technology and (digital) archiving.
Documentary linguistics is just concerned with (mindlessly) collecting heaps of data without any concern for analysis and structure.
Documentary linguistics is actually opposed to analysis.

Himmelmann goes on to note that “the fact that language documentation and language description can be separated fairly clearly on methodological and epistemological grounds does not mean that they can be separated in actual practice” (2012, 188). Many linguists believe both that they cannot be and that they should not be separated and consequently follow the Boasian tradition, wherein language documentation includes a grammar, a dictionary, and texts/recordings representing a large range of genres. Rehg (2007, 15) says language documentation “involves the development of high-quality grammatical materials and an extensive lexicon based on a full range of textual genres and registers, as well as audio and video recordings, all of which are fully annotated, of archival quality, and publicly accessible.” Rhodes et al. (2007, 3) list as necessary for adequate language documentation all of the following:

12 http://www.hrelp.org/documentation/. Accessed October 18, 2015.

All the basic phonology
All the basic morphology
All the basic syntactic constructions
A lexicon which (a) covers all the basic vocabulary and important areas of special expertise in the culture, and (b) provides at least glosses for all words/morphemes in the corpus
A full range of textual genres and registers.
(Rhodes et al. 2007; see also Rhodes and Campbell, Chapter 5, this volume.)
Clearly, opinions differ about the demarcation between language documentation and description or analysis, but as Himmelmann (2012) explains, in spite of misunderstandings, there is broad agreement, though with differences of emphasis. Most agree that documentation can and probably should include a rich corpus of recordings, a grammar, and a dictionary, though some place greater emphasis on a large number of recordings representing many genres and on the technology for recording and archiving, while others give more attention to the description, to the analysis which includes a grammar and dictionary. (See Rhodes and Campbell, Chapter 5, this volume.) We, the editors of this volume, subscribe to the view that reconciles these different views, that adequate language documentation has as its goal the creation, annotation, preservation, and dissemination of transparent records of a language where that record is understood explicitly to include language analysis and the production of a grammar and a dictionary, along with a rich corpus of recordings adequately archived.13

13 Himmelmann’s (1998, 2006) call for a documentary linguistics has had many positive outcomes. With its greater focus on recording languages in their sociocultural settings with attention to many genres and speech events, an approach later linked with the ethnography of communication, modern language documentation projects typically provide a much broader set of data on the uses and functions of language beyond the traditional narratives that were often the focus of earlier language documentation text recordings. (See Floyd, Chapter 16, this volume.) However, despite this link, we have not so far seen well-developed accounts of the ethnography of communication coming out of language documentation projects. In fact, for many projects we encounter abundant recordings, but of a more etic sort (from an outsider’s view). Adequate ethnography of communication, however, requires analysis, that is, emic description (from an insider’s point of view). Thus, even under the expanded view that language documentation should record a great deal about everything, without some analysis/description we can end up with many recordings that are of little true relevance to the ethnography of communication. This is in fact true of the materials archived from some language documentation projects: video and audio recordings aimed at many etic things but with little emic understanding.

5.2. Language documentation and linguistic theory

Sometimes language documentation and linguistic theory are thought to be in tension with or even openly hostile to one another. This might be true if language documentation were held to be only “radically expanded text collection” (see above). However, today, most do not subscribe to such a view. As seen in the examples mentioned above and others in various chapters of this volume, language documentation can and does contribute directly to linguistic theory. It has brought many formerly unknown language traits to our attention, and these discoveries have impacted general claims about language and have allowed new generalizations to be made. Clearly, it is inaccurate to see language documentation as somehow unrelated to and even perhaps hostile to linguistic theory (see Rice, Chapter 6, and Rhodes and Campbell, Chapter 5, this volume).

Some scholars documenting languages may have seemed to be against theory (no doubt some actually have been) in their positions about the presentation of the descriptive results of language documentation. It is commonly advised that such descriptions should be as accessible as possible to a wide audience of users, including members of the communities whose languages are described. This is understood to mean avoiding description in terms of a formal theoretical framework that uses notations and technical jargon that would limit its accessibility. Some advocate theory-neutrality while acknowledging that it is not possible to be utterly theory free in the description of language. We, the editors, subscribe to the recommendation to make the results of language documentation as accessible as possible, thus avoiding those aspects of formal theories in the presentation of the results of language documentation that limit access. However, aspects of linguistic theory can provide insights into and explanations for things we encounter in the languages we document; they can make us aware of what to expect and of how different constructions typically interact with others. A primary value of linguistic theory for language documentation is that it helps us to know what questions to ask.
Our belief is that (1) the products of language documentation should be as accessible as possible and therefore not encumbered by discussion that only the technically trained can follow; (2) findings in language documentation research can and should be written up for publications aimed at and accessible to broad audiences, most of whom will not have the training to understand formal linguistic theories; and (3) where the findings are able to contribute to theoretical issues, it is best to write them up and publish them in professional journals whose audiences have the technical background to appreciate them. They might also be presented as “notes to linguists” that address important theoretical issues in appendices to the more general description.

5.3. Language revitalization, language conservation

The terms “language revitalization” and “language conservation” are often used interchangeably; however, they differ in scope. Language revitalization commonly refers to efforts to rescue endangered languages, or to revive dead languages, also commonly called “sleeping” or “dormant” languages. The term “language conservation” is more inclusive. It encompasses not only all the activities subsumed under language revitalization but language maintenance as well. Consideration of language maintenance is essential because there are many small languages that are not yet endangered but that might well soon be in the absence of definitive actions undertaken to maintain them. Because the focus of this Handbook is on currently endangered languages, it employs the term “language revitalization.”

There are additional reasons, however, why the label “language conservation” might be preferable. Conservation can be defined as “a careful preservation and protection of something; especially: planned management of a natural resource to prevent exploitation, destruction, or neglect.”14 Language conservation thus encompasses all the activities that are undertaken to sustain (that is, to conserve) minority languages. Further, the term “conservation” links the efforts of linguists and language activists to the broader undertakings of conservation in general. Endangered languages do not exist in a vacuum. They are commonly found in locations where the environment and culture have been degraded as well. Consequently, linguists and other scholars concerned with the languages, cultures, and environments of minority groups are paying increasing attention to the kinds of interdisciplinary activities now subsumed under the label “biocultural studies,” most notably championed by Terralingua15 and discussed in some depth in Part IV of this Handbook.

The relationship between language conservation and language documentation, and between both of these and assessing language endangerment, is often also misunderstood. Language conservation and language documentation are not in opposition but are interrelated. Revitalization efforts can contribute to language documentation, just as language documentation can obviously serve the interests of language revitalization. Too often the two are seen as opposed, even antagonistic, to one another. Some believe language documentation serves only the interests of academics and neglects the language communities that want to revitalize their languages.
In reality, modern language documentation projects rarely lack a language revitalization component. Most scholars involved in language documentation feel an obligation to make their documentation efforts serve the language community whose language is being documented. Without language documentation, what is needed for preparing teaching and learning materials for the language is simply unavailable. Language revitalization depends heavily on the availability of language documentation. Moreover, language documentation and language revitalization often are not conducted as independent activities but are intertwined and mutually supportive. For example, many documentation projects train community members who become part of the documentation team and who at the same time often use their training in service of the communities’ language conservation efforts. To cite one example, in the SYLAP program (Shoshone-Goshute Youth Language Apprenticeship Program) (see http://shoshoniproject.utah.edu/sylap/), young people helped to prepare a talking dictionary, part of the documentation of this language. They were trained in the technology and then recorded elders who were native speakers for sound clips for the dictionary. In the process, these youth not only learned much more Shoshone (part of the revitalization effort); their interest and documentation activity also brought the generations together and awakened greater enthusiasm in the community for the language, resulting in much more support for and interest in revitalizing the language among community members. This is but one example, but it shows clearly that language documentation and revitalization need not be considered separate activities and are not at odds with one another.

More and more, language documentation is not just in the hands of a single scholar who is an outsider to the speaker community. Very often, the documentation is done by teams with at least some members of the language community as team members. Increasingly, those doing the documentation are themselves linguistically trained members of the community who also are concerned with revitalization of their language. Very often decisions about what is actually documented and how the documentary materials will be used are in the hands of community leaders and elders. (See also Hinton, Chapter 19, and Sapién, Chapter 9, this volume.)

14 Merriam-Webster’s Collegiate Dictionary, 10th ed. Springfield, MA, 1995. https://www.merriam-webster.com/dictionary/conservation

15 http://www.terralinguaubuntu.org.

6. The structure of this Handbook

This Handbook includes thirty-nine chapters, organized into five parts: (I) Endangered Languages, (II) Language Documentation, (III) Language Revitalization, (IV) Endangered Languages and Biocultural Diversity, and (V) Looking to the Future.

Part I, Endangered Languages, addresses some of the fundamental issues that are essential to understanding the nature of the endangered languages issue. It consists of four chapters that deal with such matters as the challenges of determining how many of the world’s languages are endangered, how language vitality can be assessed, language contact and its potential consequences for language endangerment, and the significance of indigenous language rights in combating language endangerment and language loss.

Part II, Language Documentation, contains fourteen chapters that provide an overview of the issues and activities of concern to linguists and others in their efforts to record and document endangered languages. It includes discussions of the goals of language documentation, the relationship between documentation and linguistic theory, the design and implementation of documentation projects for both spoken and signed languages, the tools and technology for documenting and revitalizing endangered languages, the products of language documentation (corpora, grammars, dictionaries, orthographies), the role of archiving in endangered-language scholarship, the ethnographic tools that can be employed to document the sociocultural contexts of endangered languages, the documentation of languages in urban diaspora communities, and the consideration of ethical practices in language documentation and revitalization.

Part III, Language Revitalization, with ten chapters, encompasses a diverse range of topics, including approaches and strategies for revitalizing endangered and sleeping (“dormant”) languages, a project to analyze revitalization projects globally and comparatively across cultures, an examination of the conditions under which language acquisition can take place, the three technologies that are essential to enabling a language in the digital domain, the stages of successful language recovery, three examples of language revitalization programs (Myaamiaataweenki, Truku Seediq, and Māori), a discussion of language revitalization activities in Africa, and the challenges and limitations of planning the maintenance of minority languages.

Part IV, Endangered Languages and Biocultural Diversity, extends the discussion of language endangerment beyond its conventional boundaries to consider the interrelationship of language, culture, and environment. It includes six chapters that deal with such issues as the striking congruence between the global distributions of species and languages, the concept of “biocultural diversity” and the goals and nature of efforts to maintain it, the value of collaborative efforts between linguists and other scientists in documenting traditional and local knowledge about biodiversity, the ramifications of climate change with respect to languages and cultures, the benefits and challenges of interdisciplinary language documentation, and the vital but commonly undervalued importance of the lexicon as a repository of cultural data and its potential to contribute to our understanding of human cognition.

Part V, Looking to the Future, consists of five chapters that address a variety of topics that are certain to be of consequence in future efforts to document and revitalize endangered languages. Included are discussions of the strategies for locating and obtaining funding, the teaching of linguists to document endangered languages and the training of language activists to support them, the design of a new generation of software that will enable linguists to collaborate with speakers to produce high-quality large-scale documentation, and, finally, the impact of indigenous languages on the well-being of their users.

References

Barlow, Russell and Lyle Campbell. In press. “Language Classification and Cataloguing Endangered Languages.” In Cataloguing Endangered Languages, edited by Lyle Campbell and Anna Belew. London: Routledge.

Bierer, Donald E., Thomas J. Carlson, and Steven R. King. 1996. “Shaman Pharmaceuticals: Integrating Indigenous Knowledge, Tropical Medicinal Plants, Medicine, Modern Science, and Reciprocity into a Novel Drug Discovery Approach.” http://www.retreatayahuasca.com/Ethnobotanique/feature11.html. Accessed April 27, 2017.

Campbell, Lyle. In press. “On How and Why Languages Become Endangered: Reply to Mufwene.” Language.

Campbell, Lyle and Anna Belew. In press. “Introduction: Why Catalogue Endangered Languages?” In Cataloguing of Endangered Languages, edited by Lyle Campbell and Anna Belew. London: Routledge.

Crystal, David. 2000. Language Death. Cambridge: Cambridge University Press.

Felger, Richard and Mary Beck Moser. 1973. “Eelgrass (Zostera marina L.) in the Gulf of California.” Science 181(4097): 355–356.

Fishman, Joshua A. 1990. “What Is Reversing Language Shift (RLS) and How Can It Succeed?” Journal of Multilingual & Multicultural Development 11: 5–36.

Fishman, Joshua A. 1991. Reversing Language Shift: Theoretical and Empirical Foundations of Assistance to Threatened Languages. Clevedon, UK: Multilingual Matters.

Greenberg, Joseph H. 1978. Introduction to Universals of Human Language, vol. 2, edited by Joseph H. Greenberg, Charles A. Ferguson, and Edith A. Moravcsik, 1–6. Palo Alto, CA: Stanford University Press.

Greymorning, Stephen. 1999. “Running the Gauntlet of an Indigenous Language Program.” In Revitalizing Indigenous Languages, edited by Jon Reyhner, Gina Cantoni, Robert N. St. Clair, and Evangeline Parsons Yazzie, 6–16. Flagstaff: Northern Arizona University.

Himmelmann, Nikolaus. 1998. “Documentary and Descriptive Linguistics.” Linguistics 36: 161–195.

Himmelmann, Nikolaus. 2006. “Language Documentation: What Is It and What Is It Good for?” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus Himmelmann, and Ulrike Mosel, 1–30. Berlin: Mouton de Gruyter.

Himmelmann, Nikolaus. 2012. “Linguistic Data Types and the Interface Between Language Documentation and Description.” Language Documentation & Conservation 6: 187–207. http://hdl.handle.net/10125/4503.

Krauss, Michael. 1992. “The World’s Languages in Crisis.” Language 68: 4–10.

Littlebear, Richard. 1999. “Some Rare and Radical Ideas for Keeping Indigenous Languages Alive.” In Revitalizing Indigenous Languages, edited by Jon Reyhner, Gina Cantoni, Robert N. St. Clair, and Evangeline Parsons Yazzie, 1–5. Flagstaff: Northern Arizona University.

Okura, Eve. 2018. “Endangerment of Language Isolates.” In Language Isolates, edited by Lyle Campbell, 344–371. London: Routledge.

Palosaari, Naomi and Lyle Campbell. 2011. “Structural Aspects of Language Endangerment.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 100–119. Cambridge: Cambridge University Press.

Pfeiffer, Anita and Wayne Holm. 1994. “Laanaa Nisin: Diné Education in the Year 2004.” Journal of Navajo Education 11: 35–43.

Rehg, Kenneth L. 2007. “The Language Documentation and Conservation Initiative at the University of Hawai‘i at Mānoa.” In Documenting and Revitalizing Austronesian Languages, edited by D. Victoria Rau and Margaret Florey, 13–24 (Language Documentation & Conservation Special Publication No. 1). Honolulu: University of Hawai‘i Press. http://hdl.handle.net/10125/1350.

Rhodes, Richard, Lenore A. Grenoble, Anna Berge, and Paula Radetzky. 2007. “Adequacy of Documentation.” (A preliminary report to CELP [the Committee on Endangered Languages and their Preservation]) Washington DC: Linguistic Society of America, presented at the 2007 annual meeting, Anaheim, CA. 2006.

Robins, Robert Henry and Eugenius M. Uhlenbeck, eds. 1991. Endangered Languages. Oxford: Berg.

Rogers, Christopher and Lyle Campbell. 2015. “Endangered Languages.” In Oxford Research Encyclopedia of Linguistics, edited by Mark Aronoff. New York: Oxford University Press. http://linguistics.oxfordre.com/view/10.1093/acrefore/9780199384655.001.0001/acrefore-9780199384655-e-21?rskey=OwOKqb&result=1 [Revised ed. 2017.]

Skyhawk, Sonny. 2012. “Why Should We Keep Tribal Languages Alive?” Indian Country, April 6, 2012 (http://indiancountrytodaymedianetwork.com/2012/04/06/why-shouldwe-keep-tribal-languages-alive-99182).

Swadesh, Morris. 1948. “Sociologic Notes on Obsolescent Languages.” International Journal of American Linguistics 14: 226–235.

Woodbury, Anthony C. 2011. “Language Documentation.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 159–186. Cambridge: Cambridge University Press.

Zepeda, Ofelia and Jane H. Hill. 1991. “The Condition of Native American Languages in the United States.” In Endangered Languages, edited by R. H. Robins and E. M. Uhlenbeck, 135–155. Oxford: Berg.

Part I

Endangered Languages

Chapter 1

The Status of the World’s Endangered Languages

Anna Belew and Sean Simpson

1. Introduction

Of the estimated 7,097 living languages currently spoken in the world (Lewis, Simons, and Fennig 2016), a significant proportion are currently at risk of extinction.1 Estimates regarding the number of endangered languages vary widely, from 90% of all languages at the direst end of the spectrum (Krauss 1992) to roughly 34% at the most optimistic (Lewis et al. 2016). What is clear is that the world’s languages are in crisis, and that language endangerment and death are now occurring at an unprecedented pace. In the past fifty years alone, roughly 220 languages are known to have lost their last speakers (Campbell et al. 2013), and 220 is almost certainly a drastic underestimate given how many languages have likely slipped away without ever being known to linguists. However, since the call to action by Krauss and others in the early 1990s, efforts both to document and to revitalize threatened and endangered languages worldwide have increased dramatically, and the information we currently have about these languages is better than ever before. This chapter is designed to provide a broad overview of what we know at present about the status of the world’s endangered languages, both on a regional level and as a whole. First, though, it is important to point out some of the difficulties in assessing language endangerment, especially on a large scale, and to acknowledge some of the limitations inherent to any such overview.

1 The following people have contributed to building the Catalogue of Endangered Languages database, from which much of our information is drawn, and deserve credit and thanks: Lwin Moe, Russell Barlow, Anna Belew, Amy Brunett, Yen-ling Chen, Jacob Collard, Kristen Dunkinson, Katie Gao, Bryn Hauk, Raina Heaton, Uliana Kazagasheva, Nala Lee, Marcus Nero, Eve Okura, Sara Oldaugh, Christianne Ono, Sean Simpson, Alexander Smith, Kaori Ueki, John Van Way, Stephanie Walla, and Brent Woo. We also thank the Catalogue’s Board of Regional Directors, whose expertise guided the curation of language data for each geographic region, and the project’s Principal Investigators.

2. Difficulties in determining the status of the world’s endangered languages

A major hurdle to listing endangered languages and their status is the act of “listing” itself. Enumeration of languages (that is, identifying, naming, and counting speech varieties) is difficult at best.2 For one, what counts as a “language”? The line between language and dialect is frequently blurry, and in the case of endangered and underdocumented languages, it is often based on limited information. As an example, the Koro language of India was until very recently undescribed, and noted in the literature only as a dialect of the neighboring Hruso language. However, Anderson and Murmu (2010) revealed that Koro was in fact a separate language, one likely unrelated to the Hrusish languages. Cases such as this are not necessarily restricted to isolated locales. Long known to locals in the Hawaiian Islands, though only spottily referenced in the academic literature, Hawaiʻi Sign Language (HSL) was until very recently assumed to be a dialect of American Sign Language (ASL). Only in 2013 did research demonstrate that not only is HSL a language in its own right, distinct from ASL, but the two in fact do not even belong to the same family of sign languages (Lambrecht et al. 2013). Hawaiʻi is hardly isolated in the same sense as the hinterlands of India. In this case, the paucity of academic information available on the variety was less a result of geographic isolation and more a result of the scant academic attention that has been paid to endangered and underdescribed sign languages worldwide (see Woodward, Chapter 8, this volume). Even for well-described varieties, establishing definitive boundaries between “languages” can be challenging.
The primary linguistic criterion of mutual intelligibility may point to two varieties being separate languages, but political or cultural criteria may consider them a single language, as in the case of Mandarin and Cantonese “Chinese.” Another example is the Jejueo (제주어) language, spoken on Jeju Island, South Korea. Though well-known, the variety spoken on Jeju had been considered a “dialect” of Korean until very recently, and South Korea was thought to be a monolingual nation speaking a linguistic isolate. However, recent mutual intelligibility testing has shown that by linguistic standards, Jejueo is in fact a highly endangered sister language of Korean (O’Grady 2014). Koreanic is now recognized as a small language family, and South Korea as a multilingual nation. Any list of endangered languages produced as recently as 2010 would not have included Jejueo, HSL, or Koro, yet today the Catalogue of Endangered Languages and Ethnologue include all three. Similar situations of languages misidentified as dialects, or vice versa, are common around the world, and these are just the languages that linguists know about. In addition to those languages that have been misclassified or misidentified, it is likely that there remain languages as yet completely unknown to linguists. In light of all this, it is evident that any count of “languages” must be taken as an approximation.

Aside from the difficulties inherent in simply identifying potentially endangered languages, actually assessing the endangerment levels of these languages presents its own host of challenges. As comparability of vitality assessment across language varieties is typically one of the goals of any large-scale overview of endangered languages, it is necessary that the criteria on which such assessments are based be internally consistent. However, the complexity and variety of the circumstances faced by each individual language make the application of broad, “one-size-fits-all” vitality assessment heuristics fraught with difficulty.

2 At worst, it has been argued that such enumerative activity is actively harmful to speaker communities, in that such quantification reinforces essentialization and commodification of languages (see Dobrin, Austin, and Nathan 2007; Hill 2002; Moore, Pietikäinen, and Blommaert 2010). While we acknowledge that catalogue-style enumeration of languages and their vitality is necessarily imprecise and reductive, we believe that such quantitative treatments nevertheless have valid uses (see Belew and Simpson 2018).
Nearly all of the vitality assessment schemas in broader use today (e.g., GIDS (Graded Intergenerational Disruption Scale): Fishman 1991; UNESCO’s nine factors: UNESCO 2003; EGIDS (Extended GIDS): Lewis and Simons 2010; LEI (Language Endangerment Index): Lee and Van Way 2016) operate on the principle of scoring a language in several categories thought to be indicative of linguistic vitality (speaker numbers, intergenerational transmission, domains of use, etc.), and combining the scores on each measure in some way in order to come up with a composite vitality rating for the language as a whole. Which categories are included, how languages are “scored” in each category, and the method of calculating the composite vitality rating differ among assessment techniques (for an overview of language vitality assessment, see Lee and Van Way 2016), but all ultimately boil down to a methodology designed to sort the languages of the world into a finite number of categories according to degree of endangerment. However, each language is ultimately unique in terms of the sociocultural-political-economic niche it occupies and the challenges to its vitality that it faces, and thus any attempt at categorization must be seen as an approximation. For instance, the degree of monolingualism in the target language may be a strong indicator of language vitality for relatively secluded populations, or in areas of the world where multilingualism is not common, yet may be completely uninformative for languages in areas such as the highlands of Papua New Guinea or the Vaupés River basin of Brazil and Colombia, where multilingualism is the norm.

As the examples of Koro and HSL suggest, one of the largest obstacles to the enumeration, classification, and assessment of endangered languages is simply the paucity of information available about such languages. Many endangered languages are known to researchers exclusively from a single scholarly source, which may be decades old, may not contain all (or any) of the information required to assess vitality with any degree of accuracy, and/or may not in fact be a primary source at all, but rather a secondary source based on other, unpublished and inaccessible work or hearsay reports. Even when multiple sources of information are available for a threatened language, these sources often contain conflicting information that must be sorted through and resolved prior to assessing language vitality, and often the multiple sources end up citing each other, all based on an uncertain original source.

The inherent fuzziness of language enumeration, the distortion introduced by forcing all languages into a one-size-fits-all vitality assessment scale, and the sheer lack of information available on many of the languages that are endangered today all converge to make large-scale overviews of language endangerment particularly tricky. By nature, such enterprises are prone to a certain degree of imprecision, and all one can do is attempt to control for such imprecision as best one can. In an effort to do this, we draw the information presented below primarily from what may be considered the most accurate and complete compendium of information on language endangerment available today: the Catalogue of Endangered Languages. As the bulk of the information presented is drawn from the Catalogue, we shall briefly outline its structure and advantages over other comparable tools prior to delving into our regional and global overviews of language endangerment.

3. The Catalogue of Endangered Languages

The Catalogue of Endangered Languages (ELCat, at www.endangeredlanguages.com) is an academic database developed jointly by the linguistics departments at the University of Hawaiʻi at Mānoa (UHM) and Eastern Michigan University (EMU), now maintained and hosted at UHM. Its aim is to compile the most accurate available information regarding the vitality, status, and context of each endangered language in the world, and to make that information freely available to the public. The initial framework for ELCat was designed at the 2009 Endangered Languages Information and Infrastructure (ELIIP) workshop, sponsored by the U.S. National Science Foundation and held at the University of Utah. During this workshop, roughly fifty specialists in endangered-language documentation, revitalization, and archiving developed a template for the types of information that should be included in an online catalogue of the world’s endangered languages. In 2011, work began to construct the Catalogue of Endangered Languages, again supported by a grant from the National Science Foundation. From August 2011 until June 2012, the initial version of the Catalogue was compiled by research teams at both universities, and in June 2012, the ELCat database was released to the public as part of the Endangered Languages Project website (see Heaton and Simpson 2018 for more information about the Endangered Languages Project). The database has been constantly updated and expanded since its launch, and now contains information about 3,346 endangered and dormant languages compiled from more than 1,600 bibliographic sources.

ELCat is a structured database for the compilation of specific types of information regarding endangered languages. The information collected may be divided into three general categories: speakers and vitality, location, and context (sociolinguistic, political, etc.). ELCat also includes a metric for rating a language’s level of endangerment: the Language Endangerment Index, or LEI. The LEI provides a composite language endangerment rating based on four factors, where that information is available: intergenerational transmission, number of native speakers, domains of use, and trends in decreasing or increasing speaker numbers. See Appendix 1 for a complete list of the data fields collected for ELCat. For additional detail regarding the LEI, see Lee and Van Way (Chapter 2, this volume).

All the data presented in ELCat are compiled from other primary sources; the Catalogue team does not normally conduct surveys, fieldwork, or other firsthand assessments of language vitality.3 Rather, the project aims to gather all relevant data available in the current endangered-language literature into a structured database format which may easily be accessed, searched, and analyzed by anyone. The original source of data collected is cited, and data from all known bibliographic sources is presented on the language’s web page. For example, ELCat reports fewer than 2,000 native speakers of Lokono; this data comes from Rybka (2015), and users see this citation when viewing the 2,000 figure.
Given the recency of this publication and the quality of the information, Rybka (2015) is set as the “preferred” source, i.e., that which is displayed first when a user arrives at the Lokono web page in ELCat. However, ELCat also contains Lokono speaker counts of 2,505 from Crevels (2012) and 2,750 from Moseley (2010), among others. ELCat users may browse through the data from all available sources and compare the reported speaker counts, vitality levels, and other information provided by each author. For each endangered language, a thorough search of the existing literature has been conducted by the project’s research team; the vast majority of publications regarding the vitality of specific endangered languages have been consulted and relevant information incorporated into the project over the past five years. The information available in the ELCat database may thus be viewed as a reasonable representation of the discipline’s current knowledge regarding endangered languages; below, we present an overview of the current status of the world’s endangered languages, as seen in the Catalogue’s data.4 We first present an overview of the world’s endangered languages by geographic region5 and level of vitality. Next, we discuss the “known unknowns” of the status of the world’s endangered languages: what information do we know to be missing from our knowledge base, and what types of information necessary to assessments of linguistic vitality are most and least prevalent in the current endangered-languages literature? Finally, we present concluding remarks about overall trends in global language endangerment, including important correctives to some erroneous claims about loss of linguistic diversity.

3 Though members of the ELCat research team have contributed firsthand information to several language entries based on their own fieldwork.

4. Language endangerment: Regional trends

In the following sections, we provide an overview of language endangerment in each of the Catalogue’s geographic regions, ordered alphabetically.

4.1. Africa

Africa is home to an estimated 2,139 languages (Lewis et al. 2016), nearly one-third of all living languages on Earth. Of these, 606 non-dormant languages are currently included in ELCat. Overall, slightly more than one-quarter (28.3%) of Africa’s languages are currently known to be endangered to some degree—a rate of endangerment which is quite low compared to the global average of c.47.1% (see section 6). Africa’s endangered languages are also rated more vital, on average, than endangered languages in most other regions. More than two-thirds of African endangered languages (67.9%) fall in the three least-endangered categories of the LEI, and only 6.1% of them are rated “Critically Endangered” as compared to the global average of 12.6% (see Table 1.1). While twelve African languages are known to have lost their last speakers in roughly the past fifty years, there are currently no known efforts to revive any African languages that have become dormant; however, language maintenance and revitalization efforts are becoming increasingly commonplace in many African countries, particularly in the realm of mother-tongue-medium education.

Table 1.1 Overview of endangered languages in Africa by LEI rating

| LEI level | Total languages (Africa) | % of ELCat languages (Africa) | % of ELCat languages (global) |
|---|---|---|---|
| 0: At Risk | 14 | 0.2 | 2.5 |
| 1: Vulnerable | 156 | 25.2 | 16.2 |
| 2: Threatened | 250 | 40.4 | 26.9 |
| 3: Endangered | 91 | 14.7 | 22.2 |
| 4: Severely Endangered | 47 | 7.6 | 10.9 |
| 5: Critically Endangered | 38 | 6.1 | 12.6 |
| 6: Dormant | 12 | 1.9 | 5.2 |
| 7: Awakening | 0 | 0 | 1.6 |
| Unknown | 10 | 1.6 | 1.6 |

4 It should be noted that the data provided in this chapter is accurate as of June 28, 2016; however, numbers of languages and speakers found in the ELCat database at any given point in the future may differ from those presented here, as the database is constantly updated as new information becomes available.

5 The geographic divisions used by the Catalogue of Endangered Languages are similar to, e.g., the United Nations geoscheme, which divides the world into five primary regions; however, ELCat’s twelve regional divisions are a reflection of both linguistic areas and the particular expertise of its personnel. For example, Australia is a “subregion” of the Pacific in the UN geoscheme but a primary region in ELCat; it constitutes a distinct linguistic area and is under the supervision of an expert in Australian languages, not languages of the wider Pacific. It should also be noted that some languages are spoken in multiple countries which straddle regional divisions. For this reason, some languages are counted toward the total figures for more than one region, and the sum of the regional language totals will exceed the number of languages in the Catalogue.

4.2. Australia

Australia, like North America, is one of the areas of most severe language endangerment and loss in the world. Overall, ELCat currently recognizes 353 endangered and dormant languages in Australia, of which 256 have remaining native speakers. Australia is the only region in the world aside from North America where languages rated “Critically Endangered,” the most severe endangerment level in the LEI, outnumber any other endangerment category, comprising more than 40% of all endangered languages on the continent. Indeed, just over 5% of all Australian languages fall into the three least severe endangerment categories; a handful of languages, such as Alyawarr and Pitjantjatjara, are still in fairly vigorous use, but most Australian languages are faced with far more urgent endangerment situations. However, on a more optimistic note, Australia is also home to more than one-third of all languages in the world classified as “Awakening”; community-based revival efforts, language centers, and school programs have gained enormous traction in recent years, creating new speakers of languages which are endangered or were previously dormant. Table 1.2 presents a summary of Australia’s endangered languages by LEI rating.

Table 1.2 Overview of endangered languages in Australia by LEI rating

| LEI level | Total languages (Australia) | % of ELCat languages (Australia) | % of ELCat languages (global) |
|---|---|---|---|
| 0: At Risk | 0 | 0 | 2.5 |
| 1: Vulnerable | 5 | 1.4 | 16.2 |
| 2: Threatened | 14 | 3.9 | 26.9 |
| 3: Endangered | 28 | 7.9 | 22.2 |
| 4: Severely Endangered | 64 | 18.1 | 10.9 |
| 5: Critically Endangered | 142 | 40.2 | 12.6 |
| 6: Dormant | 75 | 21.2 | 5.2 |
| 7: Awakening | 22 | 6.2 | 1.6 |
| Unknown | 3 | 0.07 | 1.6 |

Table 1.3 Overview of endangered languages in the Caucasus by LEI rating

| LEI level | Total languages (Caucasus) | % of ELCat languages (Caucasus) | % of ELCat languages (global) |
|---|---|---|---|
| 0: At Risk | 8 | 6.7 | 2.5 |
| 1: Vulnerable | 16 | 13.5 | 16.2 |
| 2: Threatened | 27 | 22.8 | 26.9 |
| 3: Endangered | 21 | 17.7 | 22.2 |
| 4: Severely Endangered | 22 | 18.6 | 10.9 |
| 5: Critically Endangered | 20 | 16.9 | 12.6 |
| 6: Dormant | 4 | 3.3 | 5.2 |
| 7: Awakening | 0 | 0 | 1.6 |
| Unknown | 0 | 0 | 1.6 |

4.3. The Caucasus

The area surrounding the Caucasus Mountains, at the cultural and geographic intersection of Europe and Asia, is among the most linguistically diverse regions in the world. Endangered languages in the Caucasus vary in vitality, and are fairly evenly distributed across the range of degrees of endangerment. In the Caucasus region, 118 languages are currently identified as being endangered; of these, slightly more than half (53.3%) fall into the three most severe endangerment categories, while just under half fall on the less-endangered side of the spectrum. In addition, four languages are known to have become dormant since 1960; we know of no active revival efforts for previously dormant languages in this area. Table 1.3 summarizes the endangered languages of the Caucasus by LEI rating.

Table 1.4 Overview of endangered languages in East Asia by LEI rating

| LEI level | Total languages (E. Asia) | % of ELCat languages (E. Asia) | % of ELCat languages (global) |
|---|---|---|---|
| 0: At Risk | 10 | 3.7 | 2.5 |
| 1: Vulnerable | 43 | 15.9 | 16.2 |
| 2: Threatened | 72 | 26.6 | 26.9 |
| 3: Endangered | 52 | 19.2 | 22.2 |
| 4: Severely Endangered | 45 | 16.6 | 10.9 |
| 5: Critically Endangered | 42 | 15.5 | 12.6 |
| 6: Dormant | 4 | 1.4 | 5.2 |
| 7: Awakening | 2 | 0.7 | 1.6 |
| Unknown | 0 | 0 | 1.6 |

4.4. East Asia

East Asia (a region which in the Catalogue includes China, Japan, Mongolia, eastern Russia, South Korea, and Taiwan) is home to some of the world’s largest language families, and great linguistic diversity. Overall, the Catalogue currently lists 264 endangered languages in East Asia, spanning the range of endangerment levels. Few East Asian endangered languages (only 3.7%) are considered “At Risk,” the least severe LEI rating; the largest number of languages fall into the “Threatened” category (26.6%), but roughly comparable numbers of languages are rated “Vulnerable,” “Severely Endangered,” and “Critically Endangered.” An additional four languages are known to have become dormant in the past fifty years, and two formerly dormant languages (Siraya and Soyot) are the subject of revival efforts. Table 1.4 gives an overview of East Asian endangered languages by LEI level.

4.5. Europe

Europe is home to some of the world’s most widely publicized endangered languages, such as Basque, Welsh, and Irish. The Catalogue currently includes 197 endangered languages of Europe, a slight majority of which (52.6%) fall on the less-endangered half of the LEI scale. An additional five languages are known to have become dormant in the past fifty years. However, there are a growing number of initiatives to support the maintenance and revival of minority languages, such as the Cornish and Manx revivals, the Romani language revitalization movement, and Saami language and culture activism; three European languages which had lost their last speakers are now “Awakening,” or engaged in language revival. Table 1.5 presents a summary of Europe’s endangered languages by endangerment level.

Table 1.5 Overview of endangered languages in Europe by LEI rating

| LEI level | Total languages (Europe) | % of ELCat languages (Europe) | % of ELCat languages (global) |
|---|---|---|---|
| 0: At Risk | 15 | 7.3 | 2.5 |
| 1: Vulnerable | 40 | 19.6 | 16.2 |
| 2: Threatened | 53 | 25.9 | 26.9 |
| 3: Endangered | 36 | 17.6 | 22.2 |
| 4: Severely Endangered | 29 | 14.2 | 10.9 |
| 5: Critically Endangered | 24 | 11.7 | 12.6 |
| 6: Dormant | 5 | 2.4 | 5.2 |
| 7: Awakening | 3 | 1.4 | 1.6 |
| Unknown | 0 | 0 | 1.6 |

Table 1.6 Overview of endangered languages in Mexico, Central America, and the Caribbean by LEI rating

| LEI level | Total languages (M/CA/C) | % of ELCat languages (M/CA/C) | % of ELCat languages (global) |
|---|---|---|---|
| 0: At Risk | 14 | 8.0 | 2.5 |
| 1: Vulnerable | 52 | 29.7 | 16.2 |
| 2: Threatened | 58 | 33.1 | 26.9 |
| 3: Endangered | 18 | 10.2 | 22.2 |
| 4: Severely Endangered | 7 | 4.0 | 10.9 |
| 5: Critically Endangered | 10 | 5.7 | 12.6 |
| 6: Dormant | 4 | 2.2 | 5.2 |
| 7: Awakening | 2 | 1.1 | 1.6 |
| Unknown | 10 | 5.7 | 1.6 |

4.6. Mexico, Central America, and the Caribbean

Overall, in Mexico, Central America, and the Caribbean, ELCat currently recognizes 175 endangered and dormant languages; 169 of these have remaining speakers. On the whole, languages in this region trend toward the less-endangered end of the spectrum, with the majority (70.8% of the region’s endangered languages) falling into the three least severe endangerment categories: “At Risk,” “Vulnerable,” or “Threatened” (see Table 1.6). Four languages of this region are known to have become dormant since 1960, and two others which had previously lost their last speakers (Boruca and Chiquimulilla Xinka) are now “Awakening,” or undergoing revival efforts.

Table 1.7 Overview of endangered languages in the Near East by LEI rating

| LEI level | Total languages (Near East) | % of ELCat languages (Near East) | % of ELCat languages (global) |
|---|---|---|---|
| 0: At Risk | 9 | 7.4 | 2.5 |
| 1: Vulnerable | 33 | 27.2 | 16.2 |
| 2: Threatened | 34 | 28.0 | 26.9 |
| 3: Endangered | 17 | 14.0 | 22.2 |
| 4: Severely Endangered | 7 | 5.7 | 10.9 |
| 5: Critically Endangered | 4 | 3.3 | 12.6 |
| 6: Dormant | 3 | 2.4 | 5.2 |
| 7: Awakening | 0 | 0 | 1.6 |
| Unknown | 14 | 11.5 | 1.6 |

4.7. Near East

The Near East is home to a relatively small number of languages, as compared to highly linguistically diverse regions such as Melanesia and West-Central Africa; correspondingly, the number of endangered languages in the region is somewhat lower than in most parts of the world (121 languages). Political instability has, in many areas, prevented linguistic survey work in recent decades; consequently, the available information about languages of the Near East is considerably sparser than that which is available for most regions. There are fourteen languages in the Near East for which no vitality information at all is available, representing more than one-quarter (25.9%) of the fifty-four languages in the world with a LEI rating of “Unknown.” Of those Near Eastern languages with enough data to calculate an endangerment rating, the majority (62.8%) fall in the three least severe endangerment categories. Only four languages of the region are critically endangered; however, an additional four languages are known to have become dormant since 1960, and none to our knowledge are currently the subject of revival efforts. Table 1.7 gives an overview of the area’s endangered languages by LEI rating.

4.8. North America

North American6 languages are, by sheer volume, among the most endangered in the world. Ethnologue (Lewis et al. 2016) estimates that 256 languages are spoken in North America today; of these, 199 are classified in ELCat as being endangered (not dormant)—this is roughly three-quarters of the languages in North America. However, given that the Ethnologue’s count of living languages in North America includes large non-indigenous and immigrant languages such as Spanish, French, and Tagalog, it is more accurate to say that all of North America’s indigenous languages are endangered. Nineteen North American languages have become dormant within the past fifty years and are not to our knowledge being revived; another twenty-six languages are in the “Awakening” category, meaning that they have lost their last native speaker but concerted revival efforts are under way. Incidentally, North America is the only region of the world in which a greater number of dormant languages have active revival efforts than not, and nearly half of known languages being revived worldwide are in North America—a testament to the success of the growing language revitalization movement across the continent. Similarly, North America is the only global region with no languages of unknown vitality; there is far more available information on North American languages than for languages in most global regions. While a handful of languages remain in fairly vigorous use, though still endangered (e.g., Navajo and Cree), the majority of North American languages are rated at least at the vitality level of “Endangered” by LEI criteria. Table 1.8 provides an overview of North American languages by their LEI endangerment level, with comparisons to global averages.

Table 1.8 Overview of endangered languages in North America by LEI rating

| LEI level | Total languages (N. America) | % of ELCat languages (N. America) | % of ELCat languages (global) |
|---|---|---|---|
| 0: At Risk | 5 | 2.3 | 2.5 |
| 1: Vulnerable | 10 | 4.7 | 16.2 |
| 2: Threatened | 24 | 11.3 | 26.9 |
| 3: Endangered | 28 | 13.2 | 22.2 |
| 4: Severely Endangered | 31 | 14.6 | 10.9 |
| 5: Critically Endangered | 67 | 31.7 | 12.6 |
| 6: Dormant | 19 | 9.0 | 5.2 |
| 7: Awakening | 26 | 12.3 | 1.6 |
| Unknown | 0 | 0 | 1.6 |

6 The “North America” region in ELCat covers the languages of the United States, Canada, and Greenland; the languages of Mexico are treated with those of Central America and the Caribbean.

4.9. Pacific

The Pacific, a division which here refers to all of Melanesia, Polynesia,7 and Micronesia, is a region of enormous linguistic diversity. Of the 492 Pacific languages currently recognized as endangered, a large majority (72.5%) fall in the middle of the endangerment scale, rated “Threatened” or “Endangered.” In addition to the endangered languages which still have native speakers, there are four languages in the Pacific which are known to have lost their last speaker since 1960. There is a single language being revived, Sîshëë of New Caledonia. Table 1.9 provides an overview of the endangered languages of the Pacific by LEI rating.

Table 1.9 Overview of endangered languages in the Pacific by LEI rating

| LEI level | Total languages (Pacific) | % of ELCat languages (Pacific) | % of ELCat languages (global) |
|---|---|---|---|
| 0: At Risk | 1 | 0.02 | 2.5 |
| 1: Vulnerable | 41 | 8.2 | 16.2 |
| 2: Threatened | 135 | 27.1 | 26.9 |
| 3: Endangered | 222 | 44.6 | 22.2 |
| 4: Severely Endangered | 66 | 13.2 | 10.9 |
| 5: Critically Endangered | 25 | 5.0 | 12.6 |
| 6: Dormant | 4 | 0.08 | 5.2 |
| 7: Awakening | 1 | 0.02 | 1.6 |
| Unknown | 2 | 0.04 | 1.6 |

7 With the exception of the languages of Hawaiʻi, which are included with the languages of the United States.

4.10. South America

South America has frequently been identified as one of the world’s “hotspots” for biocultural diversity. Ethnologue (Lewis et al. 2016) estimates that 457 living languages (including non-indigenous languages) are found on the continent, of which 342 are currently endangered by ELCat’s criteria—a sobering endangerment rate of roughly 75%. All of South America’s still-spoken indigenous languages are endangered. Most of these cluster toward the middle of the endangerment scale; almost half (49%) of endangered South American languages are rated either “Threatened” or “Endangered.” An additional forty-six languages of South America have become dormant within the past fifty years. While revival of formerly dormant languages is not currently as prevalent in South America as in North America or Australia, there is growing interest in language maintenance and revitalization; for example, the Umutina language of Mato Grosso State, Brazil, is now being taught in schools after having lost its last native speaker in 1988. Table 1.10 provides an overview of South America’s endangered languages by LEI rating.
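The endangerment rate of roughly 75% quoted for South America is a simple ratio: ELCat's count of endangered, non-dormant languages divided by Ethnologue's count of living languages for the same region. The short sketch below reproduces that arithmetic for four regions whose counts and rates are both given in this chapter. The function and variable names are illustrative assumptions, not part of ELCat or Ethnologue.

```python
def endangerment_rate(endangered: int, living: int) -> float:
    """Return the share of living languages that are endangered, in percent."""
    if living <= 0:
        raise ValueError("living must be a positive count")
    return 100 * endangered / living


# (endangered non-dormant languages in ELCat, living languages per Ethnologue),
# as quoted in the regional sections of this chapter.
REGION_COUNTS = {
    "Africa": (606, 2139),
    "North America": (199, 256),
    "South America": (342, 457),
    "South Asia": (280, 711),
}

for region, (endangered, living) in REGION_COUNTS.items():
    rate = endangerment_rate(endangered, living)
    print(f"{region}: {rate:.1f}% of living languages are endangered")
```

Because the two counts come from different catalogues with different inclusion criteria (Ethnologue's totals include non-indigenous languages, for instance), the resulting percentages are best read as approximations.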

Table 1.10 Overview of endangered languages in South America by LEI rating

| LEI level | Total languages (S. America) | % of ELCat languages (S. America) | % of ELCat languages (global) |
|---|---|---|---|
| 0: At Risk | 9 | 2.3 | 2.5 |
| 1: Vulnerable | 57 | 14.8 | 16.2 |
| 2: Threatened | 105 | 27.3 | 26.9 |
| 3: Endangered | 83 | 21.6 | 22.2 |
| 4: Severely Endangered | 28 | 7.2 | 10.9 |
| 5: Critically Endangered | 52 | 13.5 | 12.6 |
| 6: Dormant | 40 | 10.4 | 5.2 |
| 7: Awakening | 1 | 0.5 | 1.6 |
| Unknown | 8 | 2.0 | 1.6 |

4.11. South Asia

South Asia is remarkably linguistically diverse, with 711 living languages (Lewis et al. 2016) estimated in the six countries that compose the region (Bangladesh, Bhutan, India, Nepal, Pakistan, and Sri Lanka). Of these languages, 280, or roughly 39%, are currently identified by the Catalogue as being endangered—a relatively low percentage of endangerment as compared to the global average, especially if compared to regions such as Australia and North America. The majority of South Asian endangered languages also cluster toward the more vital end of the spectrum. Of the 283 endangered and dormant languages in the region, more than three-quarters (77.3%) fall into LEI categories 0–2, the three least severe endangerment levels. Only twenty languages are rated either “Severely” or “Critically Endangered.” There are also three languages known to have become dormant since 1960. Table 1.11 presents an overview of the endangered languages of South Asia by LEI rating.

Table 1.11 Overview of endangered languages in South Asia by LEI rating

| LEI level | Total languages (S. Asia) | % of ELCat languages (S. Asia) | % of ELCat languages (global) |
|---|---|---|---|
| 0: At Risk | 25 | 7.4 | 2.5 |
| 1: Vulnerable | 87 | 30.7 | 16.2 |
| 2: Threatened | 107 | 37.8 | 26.9 |
| 3: Endangered | 39 | 13.7 | 22.2 |
| 4: Severely Endangered | 10 | 3.5 | 10.9 |
| 5: Critically Endangered | 10 | 3.5 | 12.6 |
| 6: Dormant | 3 | 1.0 | 5.2 |
| 7: Awakening | 0 | 0 | 1.6 |
| Unknown | 2 | 0.07 | 1.6 |

Table 1.12 Overview of endangered languages in Southeast Asia by LEI rating

| LEI level | Total languages (SE Asia) | % of ELCat languages (SE Asia) | % of ELCat languages (global) |
|---|---|---|---|
| 0: At Risk | 6 | 0.1 | 2.5 |
| 1: Vulnerable | 69 | 15.5 | 16.2 |
| 2: Threatened | 108 | 24.3 | 26.9 |
| 3: Endangered | 161 | 36.2 | 22.2 |
| 4: Severely Endangered | 55 | 12.3 | 10.9 |
| 5: Critically Endangered | 38 | 8.5 | 12.6 |
| 6: Dormant | 4 | 0.09 | 5.2 |
| 7: Awakening | 0 | 0 | 1.6 |
| Unknown | 3 | 0.06 | 1.6 |

4.12. Southeast Asia

Southeast Asia8 is another highly linguistically diverse area, home to an estimated 1,740 languages (Lewis et al. 2016). The number of living endangered languages in the region is correspondingly high (440 in ELCat). Southeast Asia is the global region with the third-highest raw number of endangered languages, after Africa (606) and the Pacific (492). Of the 440 endangered languages in Southeast Asia, more than half (61.1%) cluster at the middle of the endangerment scale, rated “Threatened” or “Endangered.” An additional four languages have lost their last speaker since 1960, but we are currently aware of no revival efforts for dormant languages in Southeast Asia. Table 1.12 presents a summary of Southeast Asian languages by endangerment level.

8 Note that the “Southeast Asia” region comprises both island and mainland Southeast Asia, and includes the languages of Brunei, Cambodia, Indonesia, Laos, Malaysia, Myanmar, the Philippines, Singapore, Thailand, and Vietnam.

5. “Known unknowns” in data on endangered languages

While the available data about the world’s endangered languages is more comprehensive today than ever before, there remain significant gaps in our knowledge. For many endangered languages, the types of information necessary for a thorough assessment of linguistic vitality are unavailable; for example, the only available information regarding the Bakole language of Cameroon is a 1982 estimate of 300 speakers, with nothing known about intergenerational transmission, age of youngest speakers, domains of use, other languages used by the speaker community, or any other facets of its sociolinguistic context. However, one advantage of a structured database like the Catalogue of Endangered Languages is the ability to identify these gaps—to pinpoint the data which are missing from our knowledge base. Armed with the information that very little is currently known about the vitality of Bakole, or any number of other “lacuna languages,” linguists may choose to direct time and attention to the languages, language families, or regions about which the least information is available (see Hauk and Heaton 2018, for further discussion of “triaging” resources for endangered-language research). Below, we outline some “known unknowns,” or gaps, which have been identified in the existing information on the world’s endangered languages.

In many cases, the only primary sources available for a given language do not include information on all four factors9 used by the LEI to calculate an endangerment rating (see Lee and Van Way, Chapter 2, this volume). For this reason, ELCat includes, along with the endangerment rating for each language, a rating of the level of confidence for that rating. If a language’s endangerment rating is based on a single LEI factor, such as speaker numbers, the level of confidence associated with that vitality estimate would be 20%. If information is available for all four factors, the level of confidence associated with that vitality rating would be 100%.10 Currently, only 600 of the 3,346 languages listed in the Catalogue (roughly 18%) have a vitality rating that is considered 80% confident or higher, while nearly two-thirds (2,065) have vitality levels estimated with only 20% confidence (Hauk and Heaton 2018).
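The confidence weighting described above can be made concrete with a short sketch. It assumes the weights stated in this chapter's note on the confidence scale: intergenerational transmission counts for 40%, and each of the other three factors counts for 20%. The function name and the factor keys are hypothetical labels chosen for illustration; they are not taken from ELCat's own data model.

```python
# Confidence weights (in percentage points) as described in the chapter:
# intergenerational transmission is weighted double relative to the others.
# The key names are hypothetical labels, not ELCat field names.
CONFIDENCE_WEIGHTS = {
    "intergenerational_transmission": 40,
    "absolute_speakers": 20,
    "speaker_trends": 20,
    "domains_of_use": 20,
}


def lei_confidence(available_factors):
    """Sum the confidence weights of the LEI factors that have data."""
    factors = set(available_factors)  # ignore accidental duplicates
    unknown = factors - set(CONFIDENCE_WEIGHTS)
    if unknown:
        raise ValueError(f"unrecognized factors: {sorted(unknown)}")
    return sum(CONFIDENCE_WEIGHTS[name] for name in factors)


# A rating based only on a speaker count, as in the text's 20% example.
print(lei_confidence(["absolute_speakers"]))

# A rating with data for all four factors.
print(lei_confidence(CONFIDENCE_WEIGHTS))
```

Under these weights, a language with both a speaker count and transmission data would sit at 60% confidence, which shows how much a single observation about whether children are acquiring the language adds to the reliability of a rating.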
9 The LEI calculates language endangerment scores based on absolute number of speakers, trends in speaker numbers, degree of intergenerational transmission, and domains of use. Other information regarding endangerment factors is collected but not factored into LEI scores.

10 Each factor except intergenerational transmission is worth 20% on the confidence scale. Intergenerational transmission is weighted double with respect to the other factors, and thus counts for 40% on the confidence scale.

While this statistic may seem to suggest that ELCat has less reliable data than other large-scale language catalogues, it merely reflects the fact that the LEI is the first vitality-assessment metric to quantify missing information—and that, for most endangered languages, a great deal of information is unavailable or unknown. This is to be expected, given that many endangered languages are spoken by populations that are in some way isolated, marginalized, or actively suppressed by their governments. Even for endangered-language communities that are in intensive contact with language organizations, academics, census boards, missionary groups, or other organizations that collect the type of demographic and sociolinguistic information necessary to assess linguistic vitality, there are limited resources available to conduct this type of work. Gathering accurate data about intergenerational transmission, domains of use, speaker demographics, and so on is time-consuming and labor-intensive, and so it is unsurprising that such information is still unavailable for a great many languages.

In addition to the levels of certainty assigned by the LEI’s vitality ratings, we may also examine which types of data are largely absent from the existing literature. Our hope is that by targeting the largest gaps in the available data, linguists and other researchers may help improve the discipline’s understanding of language endangerment on a global scale. Table 1.13 presents a regional overview of the types of data which are either present or missing from the currently available information about the world’s endangered languages.

Table 1.13 Data-type availability for endangered languages of each region (percentage of endangered languages for which each data type is available)

| Region | Speaker number (%) | Intergenerational transmission (%) | Trends in speaker numbers (%) | Domains of use (%) | Other languages used by speakers (%) | Language attitudes (%) |
|---|---|---|---|---|---|---|
| Africa | 98.3 | 24.9 | 15.0 | 18.9 | 58.6 | 36.6 |
| Australia | 100 | 23.2 | 21.2 | 15.2 | 50.4 | 29.7 |
| Caucasus | 100 | 72.0 | 51.6 | 36.4 | 88.1 | 42.3 |
| East Asia | 99.2 | 62.8 | 59.8 | 40.5 | 82.1 | 53.5 |
| Europe | 98.5 | 62.2 | 45.0 | 28.9 | 74.0 | 39.7 |
| Mexico, Central America, Caribbean | 97.6 | 16.8 | 8.7 | 8.1 | 57.5 | 49.4 |
| Near East | 87.6 | 33.8 | 25.6 | 27.2 | 71.9 | 58.6 |
| North America | 99.5 | 79.4 | 72.2 | 27.7 | 91.8 | 90.9 |
| Pacific | 100 | 22.7 | 17.5 | 14.8 | 75.4 | 55.9 |
| South America | 99.2 | 30.7 | 25.7 | 11.8 | 81.5 | 80.5 |
| South Asia | 98.5 | 30.3 | 24.0 | 31.4 | 78.7 | 74.2 |
| Southeast Asia | 99.3 | 31.5 | 49.7 | 17.3 | 66.8 | 36.4 |
| Globally | 98.4 | 33.0 | 29.8 | 19.8 | 69.3 | 52.0 |

By far the most prevalent type of data for endangered languages is a raw speaker count. Of the languages included in the Catalogue, estimates of the number of speakers are available for 98.4%. Given that speaker counts are generally the first and most prominent piece of information presented as evidence of language endangerment, and represent an easily digested data point for the public, funding bodies, and governments to conceptualize language loss, it is unsurprising that speaker numbers are fairly ubiquitous in the existing literature (see Dobrin et al. 2007; Moore et al. 2010 for further discussion of enumeration in the discourse of language endangerment). Note, though, that endangered languages in the Near East region have much less data available on speaker populations—only 87.6% of Near East languages have any estimate of speaker numbers, as compared to the 97% or more in most regions, perhaps due to the great difficulty of conducting language survey work in areas which have sustained political conflict for prolonged periods.

However, the availability of speaker counts should not be conflated with the accuracy of speaker counts; in many cases, estimates of speaker populations should be taken with several grains of salt. Some speaker counts are grounded in thorough, recent research, as in the case of Jalkunan of Burkina Faso: a 2012 survey was conducted by linguist Vu Truong specifically to ascertain the number of Jalkunan speakers, as well as to assess factors impacting the language’s vitality. Given that the Jalkunan community is relatively small (perhaps 1,000–1,200 people), and only approximately 600 members of the community speak Jalkunan (Truong, personal communication, October 25, 2012), it was possible for a single researcher to obtain a fairly accurate number of speakers within a relatively short span of time. However, this is among the best-case scenarios for the production of speaker counts. As discussed in section 2, even determining who counts as a “speaker” presents its own set of challenges; this difficulty is compounded by the time, effort, and resources required to conduct thorough language surveys. Many linguists who do research on endangered languages are neither trained in demographic survey methods nor eager to devote substantial amounts of their limited fieldwork time to refining speaker counts. And the larger the speaker population, the greater the practical challenge of obtaining an accurate count. While it is easy to say with confidence that the Akuntsú language of Brazil has only five speakers, it is much more difficult to say with any certainty how many people speak Evenki in northern China—estimates range from 10,000 to 30,000, and to accurately refine the demographic information about such a large population would require amounts of resources generally only available to governments and major institutions.
As a result, some speaker counts are extrapolated from other sources; for example, if a government census reports a cumulative 6,000 residents of all communities where a language is spoken, and a linguist’s experience suggests that roughly half the people in one of these communities speak the language in question, the linguist may estimate 3,000 speakers of the language. Once such a figure is published, it may be repeated for years or decades, even as the true number of speakers changes dramatically. Although the Catalogue makes every effort to incorporate the most accurate possible data on speaker numbers, it is limited by the availability of up-to-date reports and by the accuracy of these reports; as a result, there are numerous endangered languages for which the only available information on number of speakers is a count taken ten, twenty, or forty years ago, which has been reproduced numerous times in numerous sources. For these reasons, among others, estimates of speaker numbers in many cases must be understood as just that: rough estimates. Acknowledgment of this fact is intended neither to undermine the utility of ELCat’s data, nor to place blame on linguists or others for failing to undertake the demanding and difficult work of language demographics and censuses; rather, it is to underline the fact that major lacunae remain in the knowledge about the world’s endangered languages, and to encourage scholars and funding bodies to support work that will help fill in the gaps.

While speaker counts are relatively prevalent in the literature, other types of data used to assess language vitality are far less frequently published. As seen in Table 1.13, data regarding intergenerational transmission is available for only 33% of languages in ELCat—for two-thirds of endangered languages, no information whatsoever is available regarding whether children are currently acquiring these languages. Given that intergenerational transmission is widely considered the most crucial factor in assessing language endangerment, this absence is concerning. While raw speaker numbers may be useful in gauging the immediate threat faced by a language (e.g., a language with a single-digit number of speakers is almost certainly moribund), it is the degree of intergenerational transmission that largely determines its prospects for survival in the coming decades: a language could have only 400 speakers, but if it has had just a few hundred speakers for all of its known history and all children in the speech community are acquiring the language fully, it is far less endangered than a language spoken by 4,000 adults (or even 400,000) if there are no children learning it.

The availability of information on intergenerational transmission varies greatly by region, however. In North America, more than three-quarters of endangered languages have some information about transmission, as do 72% of endangered languages in the Caucasus; this may be reflective of a higher concentration of research programs, better availability of funding, or greater prevalence of in-depth ethnographic and language survey work in these regions. At the other end of the spectrum, a mere 16.8% of endangered languages in Mexico, Central America, and the Caribbean have any information about intergenerational transmission. Africa, Australia, and the Pacific similarly suffer from a lack of information on transmission; in each of these regions, this data is available for fewer than one-quarter of all endangered languages.
Similarly, information on trends in speaker numbers (i.e., whether the speaker population is stable, declining slowly, or declining rapidly) is rarely available. While some small languages may not be particularly endangered if they have always been spoken by small populations, languages that experience steep drops in speaker numbers may be facing extinction at an accelerated pace. The rate of change in speaker numbers is considered a sufficiently important indicator of vitality to be one of the four factors included in a LEI rating; however, data on trends in speaker numbers is available for only 29.8% of the world’s endangered languages. Information on demographic trends is particularly sparse for the languages of Africa, the Pacific, and Mexico, Central America, and the Caribbean; in each of these regions, fewer than 18% of endangered languages have any information about trends in speaker numbers. However, there is good coverage of speaker number trends in North America, with data available for 72.2% of endangered languages, and in the Caucasus and Southeast Asia, for roughly 50% of the endangered languages in these regions. There are also notable gaps in information about the social, cultural, economic, and political contexts of endangered languages. As the majority of the world’s endangered languages are currently threatened by ongoing language shift, rather than “sudden language death” (cf. Campbell and Muntzel 1989) in which speakers are themselves physically endangered, information about domains of use, rates of multilingualism, and targets of language shift can be of great use in assessing endangerment. However, the availability of this information is still quite limited. Below, we outline some of the major gaps in knowledge about the contexts of endangered languages.

The range of domains in which a language is used is one of the four factors that produce a LEI rating, and can shed important light on patterns of language shift. If, for example, a language is still being acquired by children but is used only in the home or specific cultural domains, with languages of wider communication being used in a growing number of public and official domains, it is likely that it is more threatened than a language with a similar number of speakers which occupies a wider range of domains. (It should be noted, though, that restricted domains of use do not necessarily indicate a loss of linguistic vitality; in situations of widespread societal multilingualism, each language in a speaker’s repertoire will by necessity be used in fewer domains than a language spoken in a monolingual society. If language use across different functional domains has remained fairly stable, a language may securely occupy a limited number of domains for an indefinite amount of time.) However, information about domains of use is available for fewer than one-fifth of endangered languages (19.8%). For the vast majority of endangered languages, there is no publicly available data regarding the situations in which the language is used, and no data on whether other languages are encroaching on domains previously occupied by the endangered language. Information of this kind is particularly sparse for the languages of Mexico, Central America, and the Caribbean, where domains of use are known for only 8.1% of endangered languages; similarly, domains of use data is largely missing for the languages of South America (present for 11.8%) and the Pacific (14.8%). There is relatively good coverage of domains of use in East Asia, however, with data available for 40.5% of endangered languages in the region. While information on domains of use is scant, there is, surprisingly, far more data available regarding language attitudes.
For more than half of the languages in the ELCat database (52%), there is at least some information about speakers’ attitudes toward their language. As language attitudes play a pivotal role in language shift and maintenance, the availability of this information can be a boon for language planning and revitalization programs, or it can provide valuable background information for sociolinguists who wish to conduct research in a given community (see Sallabank 2013 for further discussion of language attitudes and language endangerment). Attitude data is least widely available for the languages of Africa and Southeast Asia (in both regions, available for only 36% of languages) but far more prevalent than average for North American and South American languages (90.9% and 80.5%, respectively, of languages in these regions). In addition to attitudes, information about speakers’ multilingual repertoires can be key to assessing patterns of language shift. As noted above, the most common scenario of language endangerment globally is that of gradual shift toward some other language(s); information about which languages are the targets of shift can help refine documentation and maintenance efforts, as well as inform theoretical understanding of language endangerment. Fortunately, this information is fairly prevalent in the literature: more than two-thirds of endangered languages (69.3%) have information available about other languages used by speakers. North America is again the region with the best available information about multilingualism, with 91.8% of North American endangered languages having data about other languages used by speakers (though in nearly all North American cases, it can be assumed that the primary additional language used by speakers is English, or French in

a few cases in eastern Canada). Information on multilingualism is also widely available for the Caucasus region, with data provided for 88.1% of languages in this area. However, information on multilingualism is far less available in Australia (data available for only 50.4% of endangered languages), Mexico, Central America, and the Caribbean (57.5% of languages), and Africa (58.6% of languages).

Given the above data, it is apparent that while the available information on the world’s endangered languages is likely the best it has ever been, and now the most accessible thanks to the Catalogue of Endangered Languages, there remains a great deal which is simply not known about their vitality, contexts, and prospects for survival. The absence of any information about intergenerational transmission for more than two-thirds of the world’s endangered languages complicates any attempts to assess their vitality based on available data. Similarly, a widespread lack of data on domains of use is detrimental to understanding the processes of language endangerment. And while estimates of speaker numbers are widely available, information about chronological trends (whether speaker numbers are increasing or decreasing over time) is available for fewer than a third of the world’s endangered languages; some languages with small speaker populations may be less endangered than they appear if there has been no recent decline in number of speakers, but without such information, any assessment of their prospects for survival is necessarily incomplete.
We would like to reiterate that our intention in drawing attention to these gaps in available knowledge is not to undermine the utility of ELCat’s data, or to chastise linguists or others for failing to collect the missing data—rather, our hope is that identifying the “known unknowns” of language endangerment will assist researchers in targeting the types of data which are most lacking, and most urgently needed. It has become increasingly common for publications about endangered languages to include some amount of information about a language’s sociocultural and geographical context and vitality (e.g., many reference grammars of endangered languages now include some information on the language’s broader sociocultural, political, and geographical context, and a description of endangerment factors affecting the language), and there is a growing interest in supporting sociolinguistically informed language documentation (see Childs, Good, and Mitchell 2014). Given these encouraging developments, we anticipate and hope that the coming years will see a steep increase in the availability of information about the world’s endangered languages.

6. The big picture of language endangerment

The areal overviews above demonstrate how variable language endangerment is from region to region. In some regions, such as South Asia, the degree of language endangerment appears to be comparatively mild, with the majority of endangered languages falling somewhere around “Vulnerable” or “Threatened” on the LEI vitality scale.

Regions such as Australia, on the other hand, are experiencing language endangerment at a catastrophic level, with over two-thirds of endangered languages currently listed as “Critically Endangered” or worse. Zooming out to a big-picture view, what can be said about the language endangerment crisis from a global perspective? Do the languages of the world pattern as a whole more like those of Australia or more like those in South Asia?

Table 1.14 gives a breakdown of all languages listed in the Catalogue with respect to vitality status as calculated by the LEI. The cumulative percentages given in the right-most column of the table are obtained for each row by adding together the number of languages in that row and the number of languages in all rows at more severe levels (including dormant and awakening languages), and dividing the sum by the total number of living languages. For example, the cumulative percentage shown for level 5, “Critically Endangered” (9.2%), is obtained by adding the number of languages that are critically endangered to the number of languages that are listed as awakening and dormant, and dividing by total living languages. Thus, the cumulative percentages for each row in the right-most column may be treated as the percentage of languages worldwide which are at least that endangered.

As Table 1.14 shows, there are currently 3,346 languages listed in the Catalogue. If dormant and awakening languages are excluded (as these have already crossed the point of “no remaining native speakers,” and thus can no longer be considered endangered per se), we arrive at a total of 3,116 languages. Based on Ethnologue’s list of 7,097 total living languages worldwide (Lewis et al. 2016), we may conclude that roughly 47.1% of the world’s living languages are currently endangered to some degree. Of these, 787 (11.0% of all living languages)

Table 1.14 Comprehensive breakdown of languages in the Catalogue of Endangered Languages by LEI rating

| LEI level | Total number of languages | % of all languages worldwide* | Cumulative % of all languages worldwide |
|---|---|---|---|
| 0: At Risk | 86 | 1.2 | 46.4 |
| 1: Vulnerable | 544 | 7.7 | 45.2 |
| 2: Threatened | 901 | 12.7 | 37.5 |
| 3: Endangered | 744 | 10.5 | 24.8 |
| 4: Severely Endangered | 365 | 5.1 | 14.3 |
| 5: Critically Endangered | 422 | 5.9 | 9.2 |
| 6: Dormant | 175 | 2.5 | 2.5 |
| 7: Awakening | 55 | 0.8 | 3.2 |
| Unknown | 54 | 0.7 | N/A |
| TOTAL | 3,346 | 47.1 | N/A |

*Based on Ethnologue’s current estimation of 7,097 living languages (Lewis et al. 2016).
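The cumulative column of Table 1.14 can be reproduced with a short running sum. The sketch below is only an illustration of the arithmetic described in the text; the function and variable names are hypothetical, the counts are copied from the table, and the rows are ordered from most to least severe so that each running total means "at least this endangered".

```python
# Counts per LEI level, copied from Table 1.14 ("Unknown" is left out
# because the table gives no cumulative value for it).
TOTAL_LIVING = 7097  # Ethnologue figure cited in the table note

# Ordered from most to least severe so that a running sum yields
# "languages at least this endangered".
COUNTS = [
    ("6: Dormant", 175),
    ("7: Awakening", 55),
    ("5: Critically Endangered", 422),
    ("4: Severely Endangered", 365),
    ("3: Endangered", 744),
    ("2: Threatened", 901),
    ("1: Vulnerable", 544),
    ("0: At Risk", 86),
]


def cumulative_percentages(rows, total):
    """Return (label, share %, cumulative %) tuples, rounded to one decimal."""
    result = []
    running = 0
    for label, count in rows:
        running += count
        share = round(100 * count / total, 1)
        cumulative = round(100 * running / total, 1)
        result.append((label, share, cumulative))
    return result


table = cumulative_percentages(COUNTS, TOTAL_LIVING)
for label, share, cumulative in table:
    print(f"{label:<26}{share:>6}{cumulative:>7}")

# Share of living languages for all Catalogue entries, and for the
# still-spoken subset that excludes dormant and awakening languages.
all_entries_pct = round(100 * 3346 / TOTAL_LIVING, 1)
still_spoken_pct = round(100 * 3116 / TOTAL_LIVING, 1)
print(all_entries_pct, still_spoken_pct)
```

On these inputs the running sums match the cumulative column of the table. The 47.1% in the TOTAL row corresponds to all 3,346 Catalogue entries; taken on its own, the 3,116 still-spoken subset works out to about 43.9% of 7,097.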

may be considered imminently moribund, with LEI vitality ratings of level 4: “Severely Endangered” or worse. There has been a wide range of estimates for global rates of language endangerment in recent years, from Krauss’s (1992) estimation of 90% endangerment and 50% moribundity[11] at one extreme, to Ethnologue’s (Lewis and Simons 2016) most recent estimation of 34% endangerment and 13% moribundity at the other. The numbers presented here are (thankfully) rather more optimistic than Krauss’s initial estimation.[12] Indeed, one of the major takeaways from this overview is that the data compiled in the Catalogue of Endangered Languages (in our view, the most complete and accurate resource on worldwide language endangerment available at present) add to the mounting evidence that global levels of language endangerment and extinction are closer to the more conservative end of the estimation range than the more dire.

Likewise, the oft-cited statistic that a language dies every two weeks is grimmer than the conclusion we draw from the data available in ELCat. This figure was initially an extrapolation made in the 1990s and early 2000s based on the prediction that at least 50% of the world’s languages would become extinct or doomed within the next 100 years (3,000 of the then-estimated ~6,000+ extant languages to be lost over the 1,200 months in those 100 years = 1 language every 1.5–2 weeks; e.g., Crystal 2000, 19), yet it has come to be treated in the literature not as the vague extrapolation that it was, but rather as a fundamental truth assumed to be based on hard facts. A more realistic approach to calculating current extinction rates is to obtain an estimate of known extinctions over some recent period of time and take the average as a rough proxy for current rate of extinction (though of course this approach is not perfect either).
Again turning to the data from the Catalogue presented in Table 1.14, there are currently 175 languages listed as “Dormant.” The Catalogue lists as dormant those languages known or suspected to have lost their last speaker since 1960, so we may take as our estimation window the fifty-six-year span between 1960 and the time of this writing (2016). Most (though not all) of the 55 languages listed in the Catalogue as “Awakening” also lost their last native speaker in the years since 1960; these languages now have active, ongoing language revival efforts. Adding these 55 awakening languages to the 175 dormant languages results in a total of 230 languages which have grown dormant in the fifty-six years (672 months) since 1960. This works out to an average loss of roughly one language every 2.9 months. Thus, it appears from this data that language obsolescence is proceeding at a rate closer to one language every three months than one language every two weeks (see also Campbell et al. 2013; Campbell and Okura 2018). This is not to say that the rate of language dormancy and death will not increase in the years to come; it very well might, given the large number of still-spoken languages (296) which currently have fewer than 10 speakers and the 648 with fewer than 100 speakers. But to claim that languages are currently dying at a rate of one every two weeks is not supported by the available data.

11. The term “moribundity” here denotes languages on the brink of imminent extinction.

12. Even if we were to treat LEI vitality level 3: “Endangered” as the cutoff for moribundity, this would include 1,531 languages (21.6% of languages worldwide), still a far cry from the more dire estimates of 50%.
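The two rates contrasted above come from the same simple division, applied to different inputs. The following sketch, with hypothetical names and an assumed rough conversion of 52/12 weeks per month, works through both the "two weeks" extrapolation and the observed average since 1960.

```python
WEEKS_PER_MONTH = 52 / 12  # rough conversion; an assumption of this sketch


def months_per_language(languages_lost, years):
    """Average interval, in months, between language losses over a period."""
    return years * 12 / languages_lost


# Extrapolation behind "one language every two weeks":
# 3,000 of roughly 6,000 languages lost over 100 years (1,200 months).
projected_months = months_per_language(3000, 100)
projected_weeks = projected_months * WEEKS_PER_MONTH

# Observed proxy from ELCat: 175 dormant plus 55 awakening languages
# over the fifty-six years from 1960 to 2016 (672 months).
observed_months = months_per_language(175 + 55, 2016 - 1960)

print(f"projected: one language every {projected_weeks:.1f} weeks")
print(f"observed:  one language every {observed_months:.1f} months")
```

The first figure falls inside the 1.5–2 week range quoted for the extrapolation, and the second matches the roughly 2.9 months reported in the text. As the chapter notes, the observed average is only a rough proxy, since it assumes losses were spread evenly over the window.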

7. Concluding remarks: Curbing the circulation of extreme estimates

Finally, it is important to note that, while the figures presented here may appear milder in contrast with the most drastic estimates that have been circulated over the years, this does not make the endangerment crisis faced by the world’s languages any less pressing or severe. The loss of a language every three months is still a frightful rate of extinction, and the fact that nearly half of the languages spoken in the world today are endangered, already exhibiting signs of potential future extinction, is shocking. The fact that this disclaimer is necessary, however, highlights one of the traps that the linguistics community has set for itself over the last two decades. Though more recent endangerment estimates tend to fall on the milder side of the spectrum, it tends to be the most dire predictions that are repeated the loudest. Whether this is because of the shock value that such estimates hold, or simply because they have been in circulation longer and are thus more entrenched, is difficult to say. It is likely that the truth is a little of both. We are not the first to report that the most alarming figures for language endangerment appear to be overinflated, yet the claims that up to 90% of the world’s languages are in imminent danger and that languages are dying off at a rate of one every two weeks remain so ingrained in the discourse surrounding language endangerment that they seem to pop up in nearly every publication, grant proposal, or newspaper article that mentions the subject.
Having so often repeated the most dire of predictions—both to ourselves in academic publications and to the public at large via press releases, general-audience news articles, and splashy websites from influential organizations—we have landed ourselves in a situation where all but the most extreme estimates of endangerment appear to diminish the seriousness of the crisis faced by the world’s endangered languages today. This means we now run the risk that laypeople, funding institutions, and academics in other fields may hear the adjusted estimates and conclude that this language endangerment affair is not as pressing as it had been portrayed to be. This would be an unfortunate unintended outcome, and a major setback to the field and its efforts to document and revitalize these languages. While exuberantly overstated predictions of language death may have served to some degree to support the cause in the past, by raising awareness for the plight of endangered languages and motivating documentation and revitalization efforts, it is now clear that they are not accurate, and not supported by the evidence at hand. It is our belief that continued propagation of these figures, despite evidence to the contrary, may prove harmful to the language conservation movement in the future. We therefore strongly urge academics and others to carefully consider the possible negative consequences

of including such erroneous claims in any future journal articles, grant proposals, or general-interest pieces they will write. The cause of language diversity may be better championed by using the actual data available to us, as well as by continuing to improve the quality and quantity of that data.

Appendix 1

Speakers and Vitality

1. Number of speakers*: Intended to represent only L1 or fully fluent speakers.
2. L2 speakers
3. Semi-speakers
4. Child speakers
5. Young adult speakers: Speakers of child-bearing age.
6. Older adult speakers: Adults past child-bearing age.
7. Elder speakers: Adults of the grandparent generation.
8. Ethnic population: If there is an ethnic group which corresponds to the language.
9. Date of information: Year of speaker count.
10. Level of intergenerational transmission*: Numerical rating assigned by LEI criteria; see Lee and Van Way (Chapter 2, this volume) for details.
11. Trends in speaker numbers*: Numerical rating assigned by LEI criteria; see Lee and Van Way (Chapter 2, this volume) for details.
12. Domains of use*: Numerical rating assigned by LEI criteria; see Lee and Van Way (Chapter 2, this volume) for details.
13. LEI endangerment rating: Calculated based on the information in fields marked with an asterisk; see Lee and Van Way (Chapter 2, this volume) for details.
14. Comments

Locations

15. Geocoordinates: One or more map points representing the location(s) of speaker communities.
16. Countries: A list of all countries where the language is spoken.
17. Description of Location: Additional information regarding the locations of speaker communities.
18. Comments

Context

19. Government support: Description of any governmental, official or national policies regarding the language.
20. Institutional support: Description of any institutions supporting the language, e.g., universities or language centers.
21. Speaker attitudes: Description of overall attitudes of speakers towards the language.
22. Other languages used: List of other languages used by multilingual speakers of the language.
23. Number of speakers of other languages: Description of how many speakers of the endangered language are competent in specific other languages.
24. Domains of use of other languages: Description of which domains are occupied by languages other than the endangered language.
25. Primary orthography: Description of the language’s most widely used writing system, if any.
26. Competing orthographies: Description of any additional orthographies other than the primary orthography.
27. Comments

References

Anderson, Gregory D. S. and G. Murmu. 2010. “Preliminary Notes on Koro: A ‘Hidden’ Language of Arunachal Pradesh.” Indian Linguistics 71: 1–32.

Belew, Anna and Sean Simpson. 2018. “Language Extinction Then and Now.” In Cataloguing the World’s Endangered Languages, edited by Lyle Campbell and Anna Belew. Abingdon, UK: Routledge. https://www.routledge.com/Cataloguing-the-Worlds-Endangered-Languages/Campbell-Belew/p/book/9781138922082

Campbell, Lyle, Nala Huiying Lee, Eve Okura, Sean Simpson, and Kaori Ueki. 2013. “New Knowledge: Findings from the Catalogue of Endangered Languages.” Paper presented at the 3rd International Conference on Language Documentation and Conservation (ICLDC), Honolulu, Hawaiʻi, February 28–March 3.

Campbell, Lyle and Martha Muntzel. 1989. “The Structural Consequences of Language Death.” In Investigating Obsolescence, edited by Nancy Dorian, 181–196. Cambridge: Cambridge University Press.

Campbell, Lyle and Eve Okura. 2018. “New Knowledge Produced by the Catalogue of Endangered Languages.” In Cataloguing the World’s Endangered Languages, edited by Lyle Campbell and Anna Belew. Abingdon, UK: Routledge. https://www.routledge.com/Cataloguing-the-Worlds-Endangered-Languages/Campbell-Belew/p/book/9781138922082

Childs, Tucker, Jeff Good, and Alice Mitchell. 2014. “Beyond the Ancestral Code: Towards a Model for Sociolinguistic Language Documentation.” Language Documentation & Conservation 8: 161–191. doi: http://hdl.handle.net/10125/24601.

Crevels, Mily. 2012. “Language Endangerment in South America: The Clock Is Ticking.” In The Indigenous Languages of South America, edited by Lyle Campbell and Veronica Grondona, 167–234. Berlin: Mouton de Gruyter.

Crystal, David. 2000. Language Death. Cambridge: Cambridge University Press.

Dobrin, Lise, Peter K. Austin, and David Nathan. 2007. “Dying to Be Counted: The Commodification of Endangered Languages in Documentary Linguistics.” In Proceedings of Conference on Language Documentation and Linguistic Theory, edited by Peter K. Austin, Oliver Bond, and David Nathan, 59–68. London: School of Oriental and African Studies.

Fishman, Joshua. 1991. Reversing Language Shift. Clevedon, UK: Multilingual Matters.

Hauk, Bryn and Raina Heaton. 2018. “Triage: Setting Priorities for Endangered Language Research.” In Cataloguing the World’s Endangered Languages, edited by Lyle Campbell and Anna Belew. Abingdon, UK: Routledge. https://www.routledge.com/Cataloguing-the-Worlds-Endangered-Languages/Campbell-Belew/p/book/9781138922082

Heaton, Raina and Sean Simpson. 2018. “How the Catalogue of Endangered Languages Serves Communities Whose Languages Are at Risk.” In Cataloguing the World’s Endangered Languages, edited by Lyle Campbell and Anna Belew. Abingdon, UK: Routledge. https://www.routledge.com/Cataloguing-the-Worlds-Endangered-Languages/Campbell-Belew/p/book/9781138922082

Hill, Jane. 2002. “‘Expert Rhetorics’ in Advocacy for Endangered Languages: Who Is Listening, and What Do They Hear?” Journal of Linguistic Anthropology 12: 119–133.

Krauss, Michael. 1992. “The World’s Languages in Crisis.” Language 68: 4–10.

Lambrecht, Linda, Barbara Earth, and James Woodward. 2013. “History and Documentation of Hawaiʻi Sign Language: First Report.” Paper presented at the 3rd International Conference on Language Documentation and Conservation (ICLDC), Honolulu, Hawaiʻi, February 28–March 3.

Lee, Nala Huiying and John Van Way. 2016. “Assessing Levels of Endangerment in the Catalogue of Endangered Languages (ELCat) Using the Language Endangerment Index (LEI).” Language in Society 42: 271–292.

Lewis, M. Paul and Gary F. Simons. 2010. “Assessing Endangerment: Expanding Fishman’s GIDS.” Revue Roumaine de Linguistique 55: 103–120.

Lewis, M. Paul, Gary F. Simons, and Charles D. Fennig, eds. 2016. Ethnologue: Languages of the World. 19th ed. Dallas, TX: SIL International. Online: http://www.ethnologue.com.

Moore, Robert E., Sari Pietikäinen, and Jan Blommaert. 2010. “Counting the Losses: Numbers as the Language of Language Endangerment.” Sociolinguistic Studies 4: 1–26.

Moseley, Christopher, ed. 2010. Atlas of the World’s Languages in Danger, 3rd ed. Paris: UNESCO Publishing.

O’Grady, William. 2014. “Jejueo: Korea’s Other Language.” Paper presented at 7th World Congress of Korean Studies, Honolulu, Hawaiʻi, November 5–7.

Rybka, Konrad. 2015. “State-of-the-Art in the Development of the Lokono Language.” Language Documentation & Conservation 9: 110–133. doi: http://hdl.handle.net/10125/24635.

Sallabank, Julia. 2013. Attitudes to Endangered Languages: Identities and Policies. Cambridge: Cambridge University Press.

UNESCO Ad Hoc Expert Group on Endangered Languages (Matthias Brenzinger, Arienne M. Dwyer, Tjeerd de Graaf, Collette Grinevald, Michael Krauss, Osahito Miyaoka, Nicholas Ostler, Osamu Sakiyama, María E. Villalón, Akira Y. Yamamoto, Ofelia Zapeda). 2003. “Language Vitality and Endangerment.” Document submitted to the International Expert Meeting on UNESCO Programme Safeguarding of Endangered Languages, Paris, March 10–12. Online: http://www.unesco.org/culture/ich/doc/src/00120-EN.pdf.

Chapter 2

Assessing Degrees of Language Endangerment

Nala H. Lee and John R. Van Way

1. Introduction

The need for a method of assessing degrees of language endangerment or vitality has been recognized since the early 1990s, when Fishman (1991) first developed the Graded Intergenerational Disruption Scale (GIDS). Since then, other methods have been developed, such as UNESCO’s (2003) nine factors for assessing language vitality, Krauss’s (2007) schema for assessing language viability, and the Expanded Graded Intergenerational Disruption Scale (EGIDS) (Lewis & Simons 2010), each addressing a range of varied and overlapping concerns. The current chapter provides an overview of the various methods of assessing language endangerment or vitality, examines the factors that these methods associate with language loss, discusses the advantages and disadvantages of these tools, and expands on the most contemporary tool for assessment to date, the Language Endangerment Index (LEI). Notably, while some methods have been described as tools for assessing language vitality (UNESCO 2003) or viability (Krauss 2007), we describe them as being methods for assessing language endangerment in this chapter, thereby emphasizing the problem of language endangerment and the need to address it. The outcomes of utilizing these various tools are similar in that the language being assessed is positioned on a continuum between the two poles of viability and loss.

As the magnitude of the language endangerment problem and the will to address it grow, as attested in the other chapters in this volume, there is a parallel need to consider how a language’s status can and should be assessed. The importance of these language endangerment measures should be clear, but some of the whys and wherefores are listed here:

(i) Much of the construal and understanding of the magnitude of the language endangerment problem relies on the assessment of languages on these measures of endangerment since they allow diachronic trends to be systematically assessed.

(ii) They provide an uncomplicated way of understanding the scope of language endangerment from a bird’s-eye view, particularly for the lay audience.

(iii) They raise awareness of language endangerment in communities where this is an issue.

(iv) They are tools with which researchers, community members, and language activists assess the urgency of attention that particular languages require.

(v) They allow language endangerment to be compared across different languages in a range of different contexts.

(vi) They enable researchers and funding agencies to compare and assess both the urgency and the likelihood of success for proposals to document and/or conserve endangered languages.

(vii) They also allow a better understanding of whether or not linguistic diversity can be correlated with other forms of diversity such as biodiversity and cultural diversity (Sutherland 2003; Maffi 2005; Gorenflo et al. 2012; Haspelmath 2012; see also Harmon & Loh, Chapter 29; Maffi, Chapter 30; McClatchey, Chapter 31; Dunn, Chapter 32; and MacKenzie & Davis, Chapter 34, this volume).

Given the numerous reasons for which language endangerment assessments are essential, it is valuable to understand the methods with which degrees of language endangerment are assessed, so that researchers are able to recognize the benefits and pitfalls of various methods, apply their own assessments to the languages that they are working on, and better understand the assessments of others.

2. Some measures of degrees of language endangerment

Fishman (1991), in the volume Reversing Language Shift, developed the first tool conceived for the purpose of measuring degrees of endangerment, and it is still often used. The GIDS is based primarily on the notion that a language’s viability will be threatened if intergenerational transmission is disrupted. GIDS identifies different domains of language use (such as education and mass media), and distinguishes between eight levels that represent different stages of disruption to the domains of language use (see Table 2.1). A level 1 language would be the safest, used in most social domains, including education, work, mass media, and government at the national level. At the opposite end of the scale, a level 8 language would have lost most domains of use and would be maintained only among members of the grandparent generation. Fundamentally, language shift is assumed to take place progressively across the different domains as the language advances along the different levels, leading up to the point where the language is lost entirely.

Table 2.1 Graded Intergenerational Disruption Scale (GIDS) (adapted from Fishman 1991)

| Level | Description |
|---|---|
| 1 | The language is used in education, work, mass media, government at the national level. |
| 2 | The language is used for local and regional mass media and governmental services. |
| 3 | The language is used for local and regional work by both insiders and outsiders. |
| 4 | Literacy in the language is transmitted through education. |
| 5 | The language is used orally by all generations and is effectively used in written form throughout the community. |
| 6 | The language is used orally by all generations and is being learned by children as their first language. |
| 7 | The child-bearing generation knows the language well enough to use it with their elders but is not transmitting it to their children. |
| 8 | The only remaining speakers of the language are members of the grandparent generation. |
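Because GIDS is an ordered scale, it maps naturally onto an ordered enumeration. The sketch below is one possible encoding of Table 2.1, not part of Fishman's framework; the member names and the helper `transmission_disrupted` are hypothetical shorthand for the table's descriptions.

```python
from enum import IntEnum


class GIDS(IntEnum):
    """The eight GIDS levels of Table 2.1; higher values mean more disruption.

    Member names are illustrative shorthand, not Fishman's own labels.
    """
    NATIONAL_USE = 1
    REGIONAL_MEDIA_AND_GOVERNMENT = 2
    LOCAL_AND_REGIONAL_WORK = 3
    LITERACY_THROUGH_EDUCATION = 4
    ORAL_ALL_GENERATIONS_AND_WRITTEN = 5
    ORAL_ALL_GENERATIONS_CHILD_L1 = 6
    PARENTS_NOT_TRANSMITTING = 7
    GRANDPARENT_GENERATION_ONLY = 8


def transmission_disrupted(level):
    """True from level 7 onward, where children no longer acquire the language."""
    return level >= GIDS.PARENTS_NOT_TRANSMITTING


for level in GIDS:
    print(level.value, level.name, transmission_disrupted(level))
```

Using an integer-backed enumeration keeps the ordering of the levels explicit, so comparisons such as "level 7 or worse" can be written directly.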

A different approach is taken by UNESCO (2003), whose goal is to compile a list of endangered languages. In its most current form, the list appears in the 2010 edition of the Atlas of the World’s Languages in Danger (Moseley 2010). Using the approach proposed by the UNESCO ad hoc expert group on endangered languages, the degree of language vitality or endangerment is assessed through nine factors (see Table 2.2). Each factor is rated on a scale of 0 (worst-case scenario) to 5 (best possible situation), except for absolute number of speakers, for which an actual number must be provided. All factors are viewed as equally important: the UNESCO working group’s caveat is that no single factor when used on its own is sufficient for the assessment of a language’s vitality and that all factors should be used in conjunction with each other (UNESCO 2003: 7). UNESCO states that together, these nine factors can be applied to different languages for the purpose of assessing the state of the language and the type of support needed for maintenance, revitalization, perpetuation, and documentation. Notably, while UNESCO indicates that all factors should be equally considered, the degree of endangerment assigned to each language in the Atlas of the World’s Languages in Danger (Moseley 2010) is actually based on Factor 1: Intergenerational Transmission. That aside, UNESCO’s nine-factor framework differs from other methods by taking into account the type and quality of documentation (Factor 9). While this factor does not directly contribute to the understanding of language endangerment, it provides valuable information regarding the type of support needed, in particular for documentation.

A third framework for assessing degrees of language endangerment is that of Krauss (2007) (see Table 2.3). Krauss’s (2007) classification involves three broad categories.
These include languages that are “Safe” at one end of the continuum and languages that are “extinct” at the other end of the same continuum. Languages that are “Safe” have at least one million speakers, or are officially supported regional or state languages, such as Icelandic (ISO 639-3: isl) or Faroese (ISO 639-3: fao).

Table 2.2 UNESCO’s nine factors for assessing language vitality (adapted from UNESCO 2003)

Factor number | Factor
1 | Intergenerational language transmission
2 | Absolute number of speakers
3 | Proportion of speakers within the total population
4 | Shifts in domains of language use
5 | Response to new domains and media
6 | Availability of materials for language education and literacy
7 | Governmental and institutional language attitudes and policies, including official status and use
8 | Community members’ attitudes toward their own language
9 | Type and quality of documentation

Table 2.3 Krauss’s framework for classifying language according to viability (adapted from Krauss 2007)

Terminology | Designator | Description
Safe | a+ | One million speakers; officially supported regional or state language
Endangered: Stable | a | All speak, children and up
Endangered: Instable; eroded | a- | Some children speak, all children speak in some places
Endangered: Definitely Endangered | b | Spoken only by parental generation and up
Endangered: Severely Endangered | c | Spoken only by grandparental generation and up
Endangered: Critically Endangered | d | Spoken only by very few, of great-grandparental generation
Extinct | e | No speakers

Conversely, languages that are “extinct” have no speakers. In between the two poles is the broad “endangered” category, which is further elaborated. The distinctions within the broad “endangered” category are determined by how much of each generation speaks the language in question. Interestingly, Krauss (2007) states that 95% of the world’s languages most likely fall within the broad “endangered” category, and that in all probability, no language with fewer than 10,000 speakers could be classified as “Safe,” thus specifying an extremely narrow “Safe” category and a much wider “endangered” category.

A more recent method devised for the assessment of degrees of endangerment is the EGIDS (Lewis and Simons 2010). Developed for the purposes of Ethnologue (Lewis, Simons, and Fennig 2016), EGIDS is meant to provide an overall assessment of development versus endangerment of any language. Languages are assessed on a scale of thirteen levels, as opposed to the eight that GIDS uses. As in GIDS, each level corresponds to a particular description. For example, Rongga (ISO 639-3: ror), an Austronesian language spoken in Indonesia, has been classified as a language at level 6a in Ethnologue, and should correspond with the following description: “[t]he language is used for face-to-face communication by all generations and the situation is sustainable” (Lewis et al. 2016). A key difference between GIDS and EGIDS is that in EGIDS each level and accompanying description are associated with an individual label as well. For example, Ahtna (ISO 639-3: aht), an Athabaskan language spoken in Alaska, is assessed as a language at level 8a, which is associated with the description “[t]he only remaining active users of the language are members of the grandparent generation and older,” and bears the label “moribund.” The labels and descriptions have been revised and updated since the inception of EGIDS, and Table 2.4 lists the labels and descriptions as they appeared on the Ethnologue website (Lewis et al. 2016) at the time of writing.

Table 2.4 Expanded Graded Intergenerational Disruption Scale (EGIDS) (adapted from Lewis and Simons 2010)

Level | Label | Description
0 | International | The language is widely used between nations in trade, knowledge exchange, and international policy.
1 | National | The language is used in education, work, mass media, and government at the national level.
2 | Provincial | The language is used in education, work, mass media, and government within major administrative subdivisions of a nation.
3 | Wider communication | The language is used in work and mass media without official status to transcend language differences across a region.
4 | Educational | The language is in vigorous use, with standardization and literature being sustained through a widespread system of institutionally supported education.
5 | Developing | The language is in vigorous use, with literature in a standardized form being used by some though this is not yet widespread or sustainable.
6a | Vigorous | The language is used for face-to-face communication by all generations and the situation is sustainable.
6b | Threatened | The language is used for face-to-face communication within all generations, but it is losing users.
7 | Shifting | The child-bearing generation can use the language among themselves, but it is not being transmitted to children.
8a | Moribund | The only remaining active users of the language are members of the grandparent generation and older.
8b | Nearly extinct | The only remaining users of the language are members of the grandparent generation or older who have little opportunity to use the language.
9 | Dormant | The language serves as a reminder of heritage identity for an ethnic community, but no one has more than symbolic proficiency.
10 | Extinct | The language is no longer used and no one retains a sense of ethnic identity associated with the language.

In addition to focusing on language shift and assessing its downward trend, Lewis and Simons (2010) also provide a set of alternative labels for situations of revitalization that the labels and descriptions in Table 2.4 do not capture. Table 2.5 lists these alternative labels and descriptions.

Table 2.5 Revitalization EGIDS labels and descriptions (adapted from Lewis & Simons 2010)

Level | Label | Description
6b | Re-established | Some members of a third generation of children are acquiring the language in the home with the result that an unbroken chain of intergenerational transmission has been re-established among all living generations.
7 | Revitalized | A second generation of children are acquiring the language from their parents who also acquired the language in the home. Language transmission takes place in home and community.
8a | Reawakened | Children are acquiring the language in community and some home settings and are increasingly able to use the language orally for some day-to-day communicative needs.
8b | Reintroduced | Adults of the parent generation are reconstructing and reintroducing their language for everyday social interaction.
9 | Rediscovered | Adults are rediscovering their language for symbolic and identificational purposes.

The additional set of labels and descriptions for categories representing levels of revitalization further differentiates EGIDS from the other methods of assessment that have been presented in this section. As compared to these other methods of assessment, the EGIDS is arguably more comprehensive at both safe and endangered ends of the scale.

3. Common advantages and disadvantages of various methods of assessment

GIDS, UNESCO’s nine factors for assessing language vitality, Krauss’s framework for classifying languages according to viability, and EGIDS each have advantages and disadvantages, depending on what the goal of assessment is, and how much and what type of information is available for assessment. Dwyer (2011), Gao (2015), and Lee and Van Way (2016) have presented some of these arguments.

GIDS is laudable as the predecessor of all other measures of endangerment. In developing GIDS, Fishman (1991) pioneered the assessment of degrees of endangerment. In addition to enabling researchers to approach the issue of language vitality in a systematic manner, GIDS associated the loss of vitality with language shift, and illustrated how intergenerational transmission (or the lack of it) could be a key indicator of language shift. The assumption that intergenerational transmission is necessary for the continued viability of any language now underlies all newer methods of measuring endangerment, including UNESCO’s nine factors for assessing language vitality, Krauss’s framework for classifying languages according to viability, and EGIDS. The notion of language shift across various domains, as incorporated in the descriptions of the various GIDS levels, is also utilized in most of the newer measures of language endangerment or vitality. Two of UNESCO’s nine factors deal explicitly with domains of language use: Factor 4 and Factor 5, which are concerned with “shifts in domains of language use” and “response to new domains and media,” respectively. As an elaborated version of GIDS, EGIDS also continues to incorporate the concept of domains of language use in the individual descriptions of the various levels. For example, a level 1 language is one that is “used in education, work, mass media, and government at the national level.”

There are additional advantages to the newer methods. With nine factors, UNESCO’s (2003) framework is clearly much broader.
It covers a more comprehensive range of varied factors, including items such as “proportion of speakers within the total population” and “availability of materials for language education and literacy,” and it allows the individually delineated factors to be considered independently of each other. The benefits of doing so are further explained below. Krauss’s (2007) framework for assessing degrees of endangerment incorporates more endangered than safe categories, the implication being that most languages cannot be considered safe. Hence, for all intents and purposes, Krauss (2007) highlights the immensity of the language endangerment problem at hand. EGIDS, on the other hand, as Dwyer (2011) points out in her comparison of various methods of assessing endangerment, yields fast results as compared to methods that require far more information. This is useful when a quick evaluation is required, and the specific information required for an EGIDS assessment is available.

The various frameworks discussed thus far can all be further enhanced. In their reconceptualization of GIDS, Lewis and Simons (2010) pointed out that GIDS does not adequately describe all possible statuses of a language, and that GIDS is least elaborated at the lowest end of the scale, where disruption is supposedly the greatest. In response, EGIDS was developed. However, EGIDS has limitations that are similar to those of GIDS. Whereas UNESCO’s framework features nine individually delineated factors associated with language endangerment, multiple factors are designated in single categories within the GIDS and EGIDS systems. It is often challenging to put a language into a single category that describes multiple factors.

The assessment of Central Okinawan (ISO 639-3: ryu) on the EGIDS illustrates the problem. The language is rated as level 7 (shifting) in Ethnologue (Lewis et al. 2016), which means that the language should meet the description for a level 7 language: “[t]he child-bearing generation can use the language among themselves, but it is not being transmitted to children.” In many ways, detailed nuances that paint the context of the language’s situation are lost. Read (2011) shows that not only are children generally not learning Central Okinawan, but the child-bearing generation are mostly only passive speakers and do not use the language among themselves; the situation might better fit some level between 7 (shifting) and 8a (moribund). In addition, the use of the language has always been discouraged by government policies, leading to a shrinkage of domains of language use. Despite having the highest number of speakers of any Ryukyuan language, Central Okinawan has experienced a sharp decline in its number of speakers. None of these factors are taken into account in the EGIDS description. To further complicate matters, there have been some attempts at standardization and some resources developed, including a medical dictionary (Inafuku 1992). There are also revitalization attempts under way, such as language classes (Hara 2005). In short, lumping various factors into a single description is not without problems.

Krauss’s model (2007) has the advantage of a wider endangered end of the scale, like EGIDS, but it is less comprehensive with regard to individual factors that are associated with language endangerment. Krauss (2007: 2) acknowledges that “it remains a major study . . . to consider factors detracting from language ‘SAFETY’ ” (emphasis in original). Evans (2010) also indicates that a difficulty in applying Krauss’s model is that speakers mix words from dominant languages into their own. This may appear to be language shift, but it is unclear whether speakers are doing this out of choice or because they do not have sufficient command of their own language.
This observation applies to all the models discussed here, since ongoing language shift is always inherently assumed. Thus, in addition to paying attention to which generations are speaking what language, it helps to have other information, such as in which domains speakers use particular languages.

Conversely, the UNESCO framework is more comprehensive with regard to the individual factors that detract from language safety, has assessment criteria that concern domains of language use, and is designed to treat individual factors separately. However, there is an inherent problem with the last point in particular. The nine-factors approach is not designed in such a way that one can determine the overall degree of endangerment of a language by combining the nine different factors. Again, in UNESCO’s online Atlas of the World’s Languages in Danger (Moseley 2010), the overall outlook of the language appears to be principally determined by “intergenerational language transmission,” even though all factors are supposed to be considered together for an accurate picture of linguistic vitality. Effectively, only the first factor in UNESCO’s list of nine factors appears to be utilized when comparing endangerment across different languages in the online atlas. Also, the UNESCO framework addresses the type and quality of documentation as a factor indicating language vitality. While this information is important because it helps indicate the potential for revitalization and the urgency of further research and development of language material, the relation between type and quality of documentation and the degree of endangerment of a language is simply not clear. A language can be endangered to a greater or lesser extent regardless of whether documentation exists. Latin, as an example, is extremely well documented but, nevertheless, no longer has native speakers.

On a related note, in the ideal state of perfect information, no single factor should be relied on for a complete assessment of a language’s vitality. UNESCO’s caveat can again be reiterated for effect: none of the nine factors, when used on its own, is sufficient for the assessment of a language’s vitality, and all factors should be used in conjunction with each other (UNESCO 2003: 7). This is underscored by the fact that even intergenerational transmission or the lack of it, which is the most important factor in all of the methods of language endangerment assessment, may not always be an accurate indicator of degree of endangerment. In Dwyer’s (2011) assessment of EGIDS, it becomes clear that a language can be endangered even if intergenerational transmission remains relatively strong. Dwyer (2011) examines the assessment of Wutun (ISO 639-3: wuh), a Chinese-Tibetan-Mongolic contact language. The language would be labeled “vigorous” within the EGIDS framework, since it corresponds with Lewis and Simons’ (2010) description of being used orally by all generations and is being learned at home by all children as their first language. Nevertheless, speaker numbers are low, formal support is lacking, and the situation is further complicated by ethnic misclassification. This lends support to the notion that any method of assessment should not be based solely on one factor.

All four frameworks mentioned here share the same inherent disadvantage. Many languages and their social conditions are simply not well documented. Lehmann (1999) suggests that while there is no reliable estimate of the number of languages that have received linguistic description, it is probable that nothing is known about half of the world’s languages aside from their names.
This may be an overstatement, but, by any measure, there is insufficient scholarship regarding a large number of the world’s languages, much less information for the accurate assessment of a language on any of these measures. In other words, one would not be able to confidently apply a GIDS, EGIDS, Krauss, or UNESCO descriptor to a particular language if not much is known about that language. Given the advantages and disadvantages of the various methods of assessing degrees of language endangerment, the key questions facing anyone who wishes to assess a language’s status are what type of information does one have, and to what end? EGIDS is fast to apply but requires particular types of information regarding intergenerational transmission. The UNESCO framework is more comprehensive but also requires more information than the EGIDS, and it is not designed to give an overall score that may be useful for comparing across languages. The LEI is designed to mitigate some of these issues, learning from the successes and shortcomings of its predecessors.

4. A new method: The Language Endangerment Index

The LEI was designed for the needs of the Catalogue of Endangered Languages (ELCat), which is the central feature of the Google-powered Endangered Languages Project (ELP, www.endangeredlanguages.com), a digital platform for sharing information and resources on the world’s endangered languages. While LEI was designed with ELCat in mind, it is an assessment tool that can easily be used by other researchers, community members, and interested parties. LEI has the advantage of relying on multiple factors for scoring, and is usable even if the knowledge of the assessed language and its social situation is not perfect. LEI assesses a language based on four separate factors, and it is possible to attain an overall score for degree of endangerment even in the absence of particular information for certain factors. In addition to an overall level of endangerment, LEI also generates a level of certainty with which one may regard the level of endangerment; the level of certainty is based on the number of factors used to determine the assessment (i.e., the factors for which information is available).

4.1. Factors

In developing LEI, we maintain that understanding a language’s overall degree of endangerment is only possible through the careful examination of individual factors responsible for the language’s vitality. LEI assesses the level of endangerment based on four factors: intergenerational transmission, absolute number of speakers, speaker number trends (whether increasing or decreasing), and domains of use. These four factors are knowable and comparable across languages. While considerations such as attitudes toward one’s language indubitably contribute to language vitality, such information is rather complex and differs depending on particular communities. Also, compared to the other factors used in LEI, information on attitudes is rarely reported in the sources available on the language in question.

Each of the four factors is measured on a scale of 0 to 5, where each number is associated with a particular description. In general, the higher the number, the more endangered the language being assessed. For any given source of information, the assessor determines which factors can be assessed, and chooses the best description for the language from those available on each scale. All descriptions are written in clear and straightforward terms, so that these scales can be used by anyone, with or without training in linguistics. The following subsections provide information regarding the use of these individual scales of intergenerational transmission, absolute number of speakers, speaker number trends, and domains of use.

4.1.1. Intergenerational transmission

Intergenerational transmission is undoubtedly the most critical factor in assessing degrees of language endangerment. This is recognized in all other vitality frameworks and also in LEI, where intergenerational transmission carries twice the weight of each of the other three factors. Doubling the weight of the intergenerational transmission score reflects the importance of this factor. It is almost certain that a language ultimately faces extinction if younger generations have no knowledge of it. In Krauss’s terms (1992: 4), languages that are no longer being learned by children are “beyond mere endangerment” and “doomed to extinction” if the course is not dramatically reversed. Table 2.6 presents the scale of intergenerational transmission.

Table 2.6 Scale of intergenerational transmission

Score | Level of endangerment | Description
5 | Critically endangered | There are only a few elderly speakers.
4 | Severely endangered | Many of the grandparent generation speak the language, but the younger people generally do not.
3 | Endangered | Some adults in the community are speakers, but the language is not spoken by children.
2 | Threatened | Most adults in the community are speakers, but children generally are not.
1 | Vulnerable | Most adults and some children are speakers.
0 | Safe | All members of the community, including children, speak the language.

At the two opposing ends of the scale are two extreme scenarios. For example, Bih,1 an Austronesian Malayo-Chamic language of Vietnam, has very few elderly speakers, and it is not being learned by the younger generation (Nguyen 2013). It is therefore considered to be “Critically Endangered” on the scale of intergenerational transmission. In contrast, a language that is spoken by all members of the community, including children, would be ranked as “Safe,” as, for example, Mandarin (ISO 639-3: cmn) in Beijing, China.

1 Bih does not appear to have its own ISO 639-3 code, and it is possibly conflated with Rade (ISO 639-3: rad) in Ethnologue (Lewis et al. 2016).

4.1.2. Absolute number of speakers

The next factor in LEI is absolute number of native speakers. On this scale, a “Critically Endangered” language would have between one and nine speakers (see Table 2.7). An example of a language that is “Critically Endangered” is Akuntsú (ISO 639-3: aqz), which had only five speakers at the time of writing (Aragon 2014). In comparison, a “Safe” language would have at least 100,000 speakers. A 2001 census of India showed that Hindi is spoken natively by 422 million people in the country (India 2001); it would constitute a “Safe” language on the scale of absolute number of speakers.

Table 2.7 Scale of absolute number of speakers

Score | Level of endangerment | Number of speakers
5 | Critically endangered | 1–9 speakers
4 | Severely endangered | 10–99 speakers
3 | Endangered | 100–999 speakers
2 | Threatened | 1000–9999 speakers
1 | Vulnerable | 10,000–99,999 speakers
0 | Safe | ≥ 100,000 speakers

Table 2.8 Scale of speaker number trends

Score | Level of endangerment | Description
5 | Critically endangered | A small percentage of the community speaks the language, and speaker numbers are decreasing very rapidly.
4 | Severely endangered | Less than half of the community speaks the language, and speaker numbers are decreasing at an accelerated pace.
3 | Endangered | Only about half of community members speak the language. Speaker numbers are decreasing steadily, but not at an accelerated pace.
2 | Threatened | A majority of community members speak the language. Speaker numbers are gradually decreasing.
1 | Vulnerable | Most members of the community speak the language. Speaker numbers may be decreasing, but very slowly.
0 | Safe | Almost all community members speak the language, and speaker numbers are stable or increasing.

Note that while numbers may sometimes appear arbitrary, they are necessary on an index of endangerment: in many cases, an estimate of speaker numbers is the only kind of vitality information available in sources on the language. For example, the information available at the time of writing for Vishavan (ISO 639-3: vis), a Dravidian language spoken in India, includes only a figure for number of speakers (150). Hence an endangerment index that does not account for speaker numbers would be entirely unsuitable for evaluating the degree of endangerment for languages such as Vishavan. We must highlight the fact that the ranges of LEI’s speaker number scale are chosen because they are comparable across levels within this factor, as well as with LEI’s other factors, and are reasonable in their approximate influence on language vitality.
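As a minimal sketch (not part of ELCat’s own tooling), the ranges in Table 2.7 can be expressed as a small Python function. The function name speaker_count_score is hypothetical, and the decision to reject counts below one, a case that Table 2.7 does not cover, is an assumption made for illustration.

```python
def speaker_count_score(speakers: int) -> int:
    """Return the LEI score (0-5) for absolute number of speakers (Table 2.7)."""
    if speakers < 1:
        # Assumption: Table 2.7 starts at one speaker, so zero or negative
        # counts are treated as out of scope rather than given a score.
        raise ValueError("Table 2.7 covers languages with at least one speaker")

    # Each pair is (lower bound of the range, score), from safest to most endangered.
    bands = [
        (100_000, 0),  # Safe
        (10_000, 1),   # Vulnerable
        (1_000, 2),    # Threatened
        (100, 3),      # Endangered
        (10, 4),       # Severely endangered
        (1, 5),        # Critically endangered
    ]
    for lower_bound, score in bands:
        if speakers >= lower_bound:
            return score
    raise AssertionError("unreachable: every count >= 1 falls in some band")


# Figures cited in the text above.
print(speaker_count_score(5))            # Akuntsú
print(speaker_count_score(150))          # Vishavan
print(speaker_count_score(422_000_000))  # Hindi
```

With the figures cited above, Akuntsú (five speakers) falls in the “Critically Endangered” range, Vishavan (150 speakers) in the “Endangered” range, and Hindi in the “Safe” range.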

4.1.3. Speaker number trends

Where appropriate information is available, it is necessary to go beyond the absolute number of speakers. The scale of speaker number trends aims to capture a more dynamic view of language shift by providing information regarding direction and rate of shift (see Table 2.8). A language that is “Critically Endangered” on this scale is one that is spoken by a small percentage of the community, and whose number of speakers is decreasing rapidly. Thao (ISO 639-3: ssf), an Austronesian language spoken in Taiwan, is an example of a “Critically Endangered” language by speaker number trends. There are only fifteen speakers of Thao “out of a considerably larger population which claims ancestry” (Blust 2003: 1), and the shift away from Thao toward Taiwanese is occurring rapidly. In contrast, a “Safe” language on this scale is one spoken by almost all community members, and whose numbers are stable or increasing.

Table 2.9 Scale of domains of use

Score | Level of endangerment | Description
5 | Critically endangered | Used only in a few very specific domains, such as in ceremonies, songs, prayer, proverbs, or certain limited domestic activities
4 | Severely endangered | Used mainly just in the home and/or with family, and may not be the primary language even in these domains for many community members
3 | Endangered | Used mainly just in the home and/or with family, but remains the primary language of these domains for many community members
2 | Threatened | Used in some nonofficial domains along with other languages, and remains the primary language used in the home for many community members
1 | Vulnerable | Used in most domains except for official ones such as government, mass media, education, etc.
0 | Safe | Used in most domains, including official ones such as government, mass media, education, etc.

4.1.4. Domains of use

“Domains of language use” is a term that has been used since as early as the 1930s to refer to the different functions of languages in a bilingual community (see Schmidt-Rohr 1932; Mak 1935; Weinreich 1966). According to Fishman (1965, 1994: 44), domains are “interactions that are rather unambiguously related (topically and situationally) to one or another of the major institutions of society,” and examples of these institutions include family, government, and religion, among others. On the LEI’s scale of domains of use, at the most advanced stage of language shift, a language would only be used in a few very specific domains, such as in ceremonies, songs, prayer, proverbs, or certain limited domestic activities (see Table 2.9). An example of such a language is Ge’ez, which is used only for liturgical purposes.2 On the other end of the scale are languages that are used in most domains, such as Malay in Malaysia (ISO 639-3: zsm). It is often the case that speakers value languages that are used in official domains, and promote education and literacy in these languages, often at the expense of other languages. Note that an implicit cline is observed when language shift occurs. For example, if the language is used in official domains such as higher education, it is usually inferred that the language is also used in informal domains such as the family.

2 Note that the domain of religion is not regarded as an official one, considering that speakers of threatened languages may practice non-dominant religions and traditions. Fishman (1991: 99) states that the remaining few speakers of a language include “reciter,” “blessers,” “cursers,” and “prayers.”


4.2. Calculating level of endangerment and level of certainty

Individual scores on the various LEI scales are aggregated to arrive at an overall level of endangerment. The scores assigned for each factor are summed, with the score for intergenerational transmission weighted double, and the total score is then converted to a percentage of the highest attainable score based on the number of factors used. When all four factors are used, the highest attainable score is 25. If only two factors are used (e.g., absolute number of speakers and speaker number trends), the highest attainable score is 10. However, if two factors are used, and one of these is intergenerational transmission (e.g., intergenerational transmission and absolute number of speakers), the highest attainable score is 15 (10 points possible for intergenerational transmission [five times two], plus 5 points possible for absolute number of speakers). That is, the sum of scores for all factors used is divided by the highest attainable score for those factors and multiplied by 100. The formula for deriving the aggregate score as a percentage is as follows:

Level of endangerment = {[(intergenerational transmission score × 2) + absolute number of speakers score + speaker number trends score + domains of use score] / total possible score based on number of factors used} × 100.
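The aggregation described above can be sketched in Python as follows. This is an illustrative sketch rather than ELCat’s implementation: the factor names, the function names, and the use of None for a factor with no available information are all assumptions. The certainty calculation follows the points-possible scale given in Table 2.10.

```python
from typing import Dict, Optional

MAX_FACTOR_SCORE = 5  # each LEI factor is scored from 0 to 5

# Intergenerational transmission counts double; the other factors count once.
# The key names are illustrative assumptions, not official LEI identifiers.
WEIGHTS = {
    "intergenerational_transmission": 2,
    "absolute_speakers": 1,
    "speaker_trends": 1,
    "domains_of_use": 1,
}

Scores = Dict[str, Optional[int]]


def _known(scores: Scores) -> Dict[str, int]:
    """Keep only the factors that have a score, checking names and ranges."""
    known = {name: s for name, s in scores.items() if s is not None}
    if not known:
        raise ValueError("at least one factor must be scored")
    for name, s in known.items():
        if name not in WEIGHTS:
            raise ValueError(f"unknown factor: {name}")
        if not 0 <= s <= MAX_FACTOR_SCORE:
            raise ValueError(f"score for {name} must be between 0 and 5")
    return known


def highest_attainable(scores: Scores) -> int:
    """Highest attainable total for the factors actually used."""
    return sum(WEIGHTS[name] * MAX_FACTOR_SCORE for name in _known(scores))


def level_of_endangerment(scores: Scores) -> float:
    """Weighted sum of the known scores as a percentage of the attainable total."""
    known = _known(scores)
    total = sum(WEIGHTS[name] * s for name, s in known.items())
    return 100 * total / highest_attainable(known)


def level_of_certainty(scores: Scores) -> float:
    """Points possible for the factors used, as a percentage of all 25 points."""
    all_points = sum(WEIGHTS.values()) * MAX_FACTOR_SCORE
    return 100 * highest_attainable(scores) / all_points
```

For a language scored 2 on intergenerational transmission and 2 on absolute number of speakers, with no information on the other two factors, this sketch gives 6/15 × 100 = 40% for the level of endangerment and 15/25 × 100 = 60% for the level of certainty.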

The percentage generated is then interpreted using the scale on the left in Table 2.10 for the language’s overall level of endangerment. The scale on the right in the same table provides the level of certainty with which one may regard the generated level of endangerment; level of certainty is based on the number of factors known and used in the rating. While the two pieces of information are not combined, they should be considered together.

Table 2.10 Language Endangerment Index (LEI) and levels of certainty

Language Endangerment Index | Level of certainty
100–81% = Critically Endangered | 25 points possible = 100% certain, based on the evidence available
80–61% = Severely Endangered | 20 points possible = 80% certain, based on the evidence available
60–41% = Endangered | 15 points possible = 60% certain, based on the evidence available
40–21% = Threatened | 10 points possible = 40% certain, based on the evidence available
20–1% = Vulnerable | 5 points possible = 20% certain, based on the evidence available
0% = Safe |

To demonstrate more effectively how LEI works, Dupaningan Agta and Sentinelese will be used as examples. Dupaningan Agta (ISO 639-3: duo) is an Austronesian language spoken in the northeastern part of Luzon in the Philippines. Robinson (2008) describes the language and its social conditions in her reference grammar: most adults in the community are speakers, but children generally are not. In five of the thirty-five Dupaningan Agta communities, the language is said to be no longer learned by children. Even in places where children know the language, they often reply in Ilokano when spoken to in Dupaningan Agta. The language is also said to have about 1,400 speakers. Based on Robinson’s (2008) information, we are able to assess the language based on intergenerational transmission information as well as absolute number of speakers. No information regarding speaker number trends and domains of use is available. Dupaningan Agta is thus given 2 points on the scale of intergenerational transmission and 2 points on the scale of absolute number of speakers, since it is “Threatened” on both scales (see Table 2.6 and Table 2.7). The level of endangerment is calculated in this manner: [(2×2) + 2]/[(5×2) + 5] × 100 = 6/15 × 100 = 40% (“Threatened”), where 6 is the total number of points the language has been accorded on the scales, and 15 is the total number of attainable points on those scales, since intergenerational transmission is doubly weighted. Also, since the language is scored on the basis of 15 possible points out of 25, we are 60% certain that our assessment of the language as “Threatened” is accurate (see Table 2.10).3

In stark contrast to Dupaningan Agta, much less is known about Sentinelese (ISO 639-3: std). The language is spoken on North Sentinel Island, located among the Andaman Islands. The Sentinelese are said to be extremely hostile to outsiders, in particular those who have tried to land on their island (Singh 1978). The 2001 Census of India reports that the Sentinelese were surveyed from a distance, and that thirty-nine individuals were recorded (India 2001). Based on this limited information, the language can only be rated on the scale of absolute number of speakers. With fewer than 100 speakers, the language is considered to be “Severely Endangered” on this scale (see Table 2.7). One may postulate that the language is safe and used in all domains, and make various conjectures about the language’s measures on the other scales, but these would be sheer speculation, since little is known about the community. The language is hence “Severely Endangered” on the overall scale of endangerment, based on

3

The other unique labels utilized by ELCat are dormant and awakening. ELCat’s stance is that it is better to declare that a language is dormant instead of “dead” or “extinct” if the language has recently lost its last speaker. The loss of a language is a sensitive subject, and the term “dormant” is used out of respect to community members, and also to encourage revival efforts. Similarly, to also encourage language revitalization efforts, the term “awakening” is used for languages once considered dormant, but where exists within the community some form of targeted revitalization undertaken by a coherent and organized group of interested parties, with the expressed goal of creating new speakers of the language.

Assessing Degrees of Language Endangerment 63 the information provided by the Census of India about number of speakers, and we are 20% certain about this rating since the language is scored on 5 points out of the 25 possible points because only one factor is known. The above demonstration of LEI, which is the latest method for assessing degrees of language endangerment, shows how researchers and interested parties can attain a more accurate and appropriate assessment of endangerment even in the absence of particular types of information, as well as a score for level of certainty. This key feature differentiates LEI from GIDS, EGIDS, or any of its other predecessors.
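The aggregation and certainty calculations described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not ELCat's own implementation: the function names `lei` and `endangerment_label`, the factor keys, and the dictionary layout are hypothetical, and the cut-offs are read directly from Table 2.10. The Sentinelese score of 4 is likewise an assumption, taken to be the value that corresponds to “Severely Endangered” on the 0–5 scale of absolute number of speakers.

```python
# Hypothetical sketch of the LEI calculation described in section 4.2.
# Factor names and function names are illustrative, not taken from ELCat.

MAX_PER_SCALE = 5  # each LEI scale runs from 0 (Safe) to 5 (Critically Endangered)

# Intergenerational transmission is weighted double; the other factors count once.
WEIGHTS = {
    "intergenerational_transmission": 2,
    "absolute_speakers": 1,
    "speaker_number_trends": 1,
    "domains_of_use": 1,
}

# Highest attainable score when all four factors are known (10 + 5 + 5 + 5 = 25).
FULL_SCORE = MAX_PER_SCALE * sum(WEIGHTS.values())


def endangerment_label(percent):
    """Map an LEI percentage onto the left-hand scale of Table 2.10."""
    if percent > 80:
        return "Critically Endangered"
    if percent > 60:
        return "Severely Endangered"
    if percent > 40:
        return "Endangered"
    if percent > 20:
        return "Threatened"
    if percent > 0:
        return "Vulnerable"
    return "Safe"


def lei(scores):
    """Return (endangerment %, label, certainty %) using the known factors only.

    `scores` maps factor names to 0-5 scale scores; unknown factors are omitted.
    """
    if not scores:
        raise ValueError("at least one factor must be scored")
    total = sum(WEIGHTS[factor] * score for factor, score in scores.items())
    possible = sum(WEIGHTS[factor] * MAX_PER_SCALE for factor in scores)
    percent = 100 * total / possible
    certainty = 100 * possible / FULL_SCORE
    return percent, endangerment_label(percent), certainty


# Dupaningan Agta: two known factors, both scored 2 ("Threatened").
print(lei({"intergenerational_transmission": 2, "absolute_speakers": 2}))

# Sentinelese: only absolute number of speakers is known (assumed score: 4).
print(lei({"absolute_speakers": 4}))
```

Run as written, the two calls reproduce the figures in the worked examples: 40% (Threatened) with 60% certainty for Dupaningan Agta, and 80% (Severely Endangered) with 20% certainty for Sentinelese. Omitting unknown factors from the input, rather than scoring them as zero, is what keeps uncertainty from being conflated with endangerment.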

5. Conclusion Given the enormity of the language endangerment problem, the need for accurate and rapid assessments of degrees of language endangerment has become more vital than ever before. For parties who have a vested interest in endangered languages, or in their documentation, conservation, or revitalization, it is important to be aware of the different advantages and pitfalls of the various vitality or endangerment assessment tools that exist, so that an appropriate tool can be selected for their purposes. Drawing on the strengths of previous research, the LEI was developed to assess the state of endangerment quickly and accurately for languages throughout the globe. It produces comparable and quantifiable results, which can be used for a bird’s-eye view of a region, or a closer look at the key factors known and unknown about a particular language. LEI’s advantages are that it provides an overall endangerment rating (while simultaneously treating individual factors separately) and that it does not conflate uncertainty with endangerment, as it generates a level of certainty rating for each assessment. Although it was designed for the purposes of the ELCat, its capability to produce comparable and replicable results will allow other researchers, language activists, and policymakers to assess language vitality for themselves in other contexts.

6. Where to find more information Those interested in learning more can visit the Endangered Languages Project online at http://endangeredlanguages.com. Each language in the Catalogue of Endangered Languages is shown there with its endangerment rating, as computed by LEI. When exploring a language, users can see the assessment in each of the four factors (intergenerational transmission, number of speakers, trends over time, domains of use), as mentioned in each source on the language. Users can also browse the world map and adjust the filters to view only languages with certain levels of endangerment (among other filters). It is hoped that readers will be able to see how LEI is applied by exploring the website.

References

Aragon, Carolina Coelho. 2014. “A Descriptive Grammar of Akuntsú.” PhD diss., University of Hawai‘i at Mānoa.
Blust, Robert. 2003. Thao Dictionary. Taipei: Academia Sinica, Institute of Ethnology.
Dwyer, Arienne M. 2011. “Tools and Techniques for Endangered-Language Assessment and Revitalization.” Paper presented at the Vitality and Viability of Minority Languages conference, New York, October 23–24. http://www.trace.org/events/events_lecture_proceedings.html. Accessed June 26, 2016.
Evans, Nicholas. 2010. Dying Words: Endangered Languages and What They Have to Tell Us. Malden, MA: Wiley-Blackwell.
Fishman, Joshua A. 1965. “Who Speaks What Language to Whom and When?” La Linguistique 1: 67–88.
Fishman, Joshua A. 1991. Reversing Language Shift: Theoretical and Empirical Foundations of Assistance to Threatened Languages. Clevedon, UK: Multilingual Matters.
Gao, Katie B. 2015. “Assessing the Linguistic Vitality of Miqie: An Endangered Ngwi (Loloish) Language of Yunnan, China.” Language Documentation & Conservation 9: 164–191.
Gorenflo, Larry J., Suzanne Romaine, Russell A. Mittermeier, and Kristen Walker-Painemilla. 2012. “Co-occurrence of Linguistic and Biological Diversity in Biodiversity Hotspots and High Biodiversity Wilderness Areas.” Proceedings of the National Academy of Sciences of the United States of America 109: 8032–8037.
Hara, Kiyoshi. 2005. “Regional Dialect and Cultural Development in Japan and Europe.” International Journal of the Sociology of Language 175/176: 193–211.
Haspelmath, Martin. 2012. “Should Linguistic Diversity Be Conserved Like Biodiversity?” Diversity Linguistics Comment, June 23. http://dlc.hypotheses.org/195. Accessed May 30, 2016.
Inafuku, Seiki, ed. 1992. Igaku okinawago jiten [A Medical Dictionary of Okinawan]. Ginowan: Roman Shobō Honten.
India, Office of the Registrar General and Census Commissioner. 2001. Census of India. New Delhi: Government of India, Ministry of Home Affairs.
Krauss, Michael. 1992. “The World’s Languages in Crisis.” Language 68(1): 4–10.
Krauss, Michael. 2007. “Classification and Terminology for Degrees of Language Endangerment.” In Language Diversity Endangered, edited by Matthias Brenzinger, 1–8. Berlin: Mouton de Gruyter.
Lee, Nala H. and John R. Van Way. 2016. “Assessing Levels of Endangerment in the Catalogue of Endangered Languages (ELCat) Using the Language Endangerment Index (LEI).” Language in Society 45: 271–292.
Lehmann, Christian. 1999. “Documentation of Endangered Languages: A Priority Task for Linguistics.” Arbeitspapiere des Seminars für Sprachwissenschaft der Universität Erfurt 1: 1–15. Erfurt: Universität Erfurt. http://www2.uni-erfurt.de/sprachwissenschaft/ASSidUE/ASSidUE01.pdf. Accessed June 26, 2016.
Lewis, M. Paul and Gary F. Simons. 2010. “Assessing Endangerment: Expanding Fishman’s GIDS.” Revue Roumaine de Linguistique 55: 103–120.
Lewis, M. Paul, Gary F. Simons, and Charles D. Fennig. 2016. Ethnologue: Languages of the World. 19th ed. Dallas, TX: SIL International. Online: http://www.ethnologue.com. Accessed May 1, 2016.
Maffi, Luisa. 2005. “Linguistic, Cultural, and Biological Diversity.” Annual Review of Anthropology 34: 599–617.
Mak, Wilhelm. 1935. “Zweisprachigkeit und Mischmundart im Oberschlesien.” Schlesisches Jahrbuch für deutsche Kulturarbeit 7: 41–52.
Moseley, Christopher, ed. 2010. Atlas of the World’s Languages in Danger, 3rd Edition. Paris: The United Nations Educational, Scientific and Cultural Organization. http://www.unesco.org/culture/languages-atlas/en/atlasmap.html. Accessed May 1, 2016.
Nguyen, Tam. 2013. “A Grammar of Bih.” PhD diss., University of Oregon.
Read, Zachary. 2011. “Number of Speakers of Central Okinawan.” A Guide to Okinawan, JLect. http://www.jlect.com/downloads/Number-of-Central-Okinawan-Speakers.pdf. Accessed May 1, 2016.
Robinson, Laura C. 2008. “Dupaningan Agta.” PhD diss., University of Hawai‘i at Mānoa.
Schmidt-Rohr, Georg. 1932. Die Sprache als Bildnerin der Völker. Jena: Diederichs.
Singh, N. Iqbal. 1978. The Andaman Story. New Delhi: Vikas.
Sutherland, William J. 2003. “Parallel Extinction Risk and Global Distribution of Languages and Species.” Nature 423: 276–279.
UNESCO Ad Hoc Expert Group on Endangered Languages (Matthias Brenzinger, Arienne M. Dwyer, Tjeerd de Graaf, Colette Grinevald, Michael Krauss, Osahito Miyaoka, Nicholas Ostler, Osamu Sakiyama, María E. Villalón, Akira Y. Yamamoto, Ofelia Zepeda). 2003. “Language Vitality and Endangerment.” Document submitted to the International Expert Meeting on UNESCO Programme Safeguarding of Endangered Languages, Paris, March 10–12. http://www.unesco.org/culture/ich/doc/src/00120-EN.pdf. Accessed June 26, 2016.
Weinreich, Uriel. 1966. Languages in Contact: Findings and Problems (4th Printing). London: Mouton & Co.

Chapter 3

Language Contact and Language Endangerment

Sarah G. Thomason

1. Introduction Language contact, peaceful and hostile, has been a feature of all human cultures throughout history.1 Language endangerment, though less pervasive than language contact, has probably also occurred throughout human history, but it is especially prominent today. There are currently about 7,000 languages in the world. About 3,400 of these languages are endangered, according to the Catalogue of Endangered Languages (ELCat, www.endangeredlanguages.com). Optimists predict that 50% of the world’s languages will be lost by 2100; pessimists predict that 90% will be gone by then. Language endangerment almost always happens as a result of language contact, so that the two phenomena are closely interrelated. The goal of this chapter is to explore their interrelationships, first sociopolitically and then linguistically. The chapter begins with preliminary comments on the complex relationships between contact and the loss of languages. These comments are followed by the first main section (section 2), a survey of some well-known conditions under which contact has led to endangerment (with mention of one or two less thoroughly studied conditions as well). I will then argue that contact does not predictably cause one (or more) of the languages in a contact situation to decline (section 3). Section 4 compares contact-induced language change with processes of attrition in some (not all) gravely endangered languages. The chapter closes with a brief conclusion (section 5).

1 One section of this chapter derives from my talk on “Safe and unsafe language contact,” which was delivered as the Linguistic Society of America (LSA) presidential address in Baltimore in January 2010 and later at Wayne State University; I am grateful to both audiences for valuable comments. Lyle Campbell’s helpful comments on an early draft of this version led to significant improvements, and I am grateful for his help. Any errors and infelicities that remain are of course my own responsibility.

First, though, an explanation of the hedge “almost always” in the statement above of the linkage between contact and endangerment: in a very few known cases, contact has played no role in the disappearance of languages. (There may of course be additional cases described in the literature, as well as cases that left no traces at all in the historical record.) One example of language death without language contact resulted from a devastating natural disaster: Tamboran, a non-Austronesian language once spoken in present-day Indonesia, is said to have died in 1815, when its speakers were all killed by the eruption of Mt. Tambora (Donohue 2007). Other cases of language death without language contact result from genocide; these cases are less straightforward because genocide might have been preceded by some decline due to contact. One example is Susquehannock, an Iroquoian language that was spoken in eastern Pennsylvania and neighboring areas. A smallpox epidemic—a disease brought to the New World by Europeans—decimated tribal members and then, in 1763, the remnant of the tribe was killed by a lynch mob, taking their language with them into oblivion (Mithun 1981, 2). The epidemic was caused by direct or indirect contact with the European immigrants, not with their languages per se; the genocide was the murder of the last few surviving speakers. Most other known instances of genocide have been less complete. For instance, when the government of El Salvador, in putting down a peasant revolt, attempted to kill all the people identified as Indians, Salvadoran Indians tried to protect themselves by giving up their native languages and shifting abruptly to Spanish. Two of the indigenous languages were especially hard hit by this shift: Chilanga (Salvadoran Lenca) is now extinct (ELCat, www.endangeredlanguages.com), and Cacaopera is also classified as extinct (Campbell & Muntzel 1989).

2. When language contact leads to language endangerment The scholarly literature can give the impression that language contact is a zero-sum game—that, in a two-language contact situation, one of the languages is doomed to disappear because its speakers will shift to the other language. As I argue in the next section, this impression is misleading: many contact situations are stable, in the sense that both (or all) of the languages are maintained. It is true, however, that a great many contact situations feature transitional bilingualism (or multilingualism), in which speakers of nondominant languages first learn the dominant language in their environment and then shift to it, giving up their heritage languages. In these cases only one of the two languages is maintained over the long term. This general situation is the focus of this section. All the causal factors and examples in the section illustrate different aspects of the linkages between language contact and language endangerment, specifically in situations in which language contact leads to endangerment.

Language endangerment very rarely has just one cause. Almost inevitably, the forces that bring a language to the edge of extinction are much more complex. A list of causes of language decline under contact conditions should therefore not be interpreted as a list of mutually exclusive routes to language loss; in the vast majority of cases, more than one cause is operative, and the causes themselves also overlap with each other.

The list of macro-causes of language loss in contact situations is short: conquest, economic pressures, melting pots, language politics, and language attitudes. Standardization, which often leads to the endangerment and demise of nonstandard dialects, may or may not count as a separate cause, because it may be subsumed under political and/or economic causes. All these factors, with the possible exception of language attitudes, have been extensively studied in the context of language loss, so I will characterize each of them very briefly in this section (see, for instance, Thomason 2015, 18–35, for more detailed discussion of the various factors).

Sometimes, when a community is conquered, its language is replaced by the conquerors’ language. The languages of colonial powers in the New World and Australia, for example, have replaced many, perhaps most, of the indigenous languages that were spoken before the colonizers arrived. Similarly, the Roman conquest of, and Roman settlement in, large portions of western Europe resulted in the disappearance of continental Celtic languages and the endangerment of Basque and other minority languages; and Russian expansion throughout northern Eurasia has led to the ongoing replacement of many indigenous languages of Siberia by Russian. Some cases in which conquerors’ languages have triumphed are due to coercion, as when the US, Canadian, and Australian governments implemented policies that actively suppressed the linguistic and cultural heritage of indigenous peoples.
In other cases, however, indigenous people have shifted to a conqueror’s language because its dominance in the region’s political, economic, and social spheres has made the shift expedient. Economic pressures can encourage language shift (and therefore loss of the shifting group’s original language) even without conquest, of course. Immigrant populations in the United States, for instance, have shifted to English rapidly, typically (at least in urban settings) within three generations: knowing English, except in a few Spanish-speaking enclaves in parts of Florida and the southwestern United States, is a prerequisite for economic success. The American varieties of those immigrant languages—American Hungarian, American Polish, and so forth—are all disappearing. And although official government suppression of indigenous languages no longer exists in the United States, Canada, or Australia, the indigenous languages continue to decline, in part because of economic pressure to assimilate to the dominant culture. Similar replacements of local minority languages by the languages of socioeconomically dominant groups can be found in most countries around the world. In the United States the economic motivation overlaps with the peculiar national cultural ideology, the melting pot: the idea that shifting to American culture and the English language is an absolute cultural good, preserving a perceived (though nonexistent) homogeneous national cultural identity, is powerful. Groups that don’t assimilate—merge into the melting pot—are viewed with suspicion by many Americans.

The US melting pot ideology in turn has political consequences. In recent decades the English-only movement, as promoted by organizations such as US English and Official English, has led to the establishment of English as the sole official language of more than two dozen states; one result is that serious threats to free speech have emerged in several of them. This type of legislation echoes the anti-German legislation outlawing the use of German in public, which passed in many states during World War I. Policies of this kind helped cause the endangerment and, in most parts of the country, the loss of American German varieties.

Political considerations also affect language policies and lead to language endangerment in many, possibly most, other countries. A few examples are France’s traditional (former) policy regarding the country’s minority languages, which surely contributed to the endangerment of Breton; the establishment of Slovak as the sole official language of Slovakia, making the precarious state of the minority Rusyn (Ruthenian) language even more precarious; the anti-minority-languages policy of the former Soviet Union during part of its history; and the decline of the minority Finnic language Seto in modern-day Estonia, first as a result of Soviet educational policies (which promoted Russian) and then as a result of Estonian government policies (which promote Estonian) (according to a brochure of the Estonian Bureau for Lesser-Used Languages 2009).

Dialect loss as a result of the standardization of a minority language is increasingly common as endangered-language communities all over the world undertake systematic efforts to revitalize their languages. If they are to be successful, revitalization programs must, at some stage, include standardization so that the endangered language can be taught in schools and other venues.
For languages with several dialects, the standardization process typically begins with choosing one dialect as the standard; all the newly nonstandard dialects then become endangered within the endangered-language community. A successful revitalization effort therefore results in shift from nonstandard dialects to the selected standard dialect—and the loss of all the nonstandard dialects. Endangered dialects, like endangered languages, often form a major part of a group’s identity, but unlike endangered languages, endangered dialects are very unlikely to be the target of any revitalization efforts.

The final cause of endangerment to be discussed in this section is speakers’ attitudes toward their heritage language. (Of course other attitudes are also relevant—the speakers’ attitudes toward the dominant language and the attitudes of dominant-language speakers toward the language that is threatened with endangerment. The focus here is on speakers’ attitudes toward the heritage language because that appears to be the most crucial set of attitudes in predicting endangerment.) The term “attitude” is vague; it has to do with speakers’ view of the status of their current or former language—prestigious, stigmatized, useful, useless, other?—and with the role, if any, that their heritage language plays in their sense of personal and community identity. Speakers’ negative attitudes alone are not a predictor of endangerment, because of all the other factors: economics, politics, societal change, and the rest. But attitudes are more important than has often been recognized. If speakers are content to shift to a language that offers increased socioeconomic opportunities, their heritage language will become endangered. If the speakers don’t care whether their endangered heritage language survives, it won’t. If a community doesn’t want to revitalize its endangered language, revitalization won’t happen. But if, by contrast, a community cares passionately about its heritage language, successful revitalization is much more likely. In the most extreme cases, even revival is a possibility: Hebrew, after many centuries without native speakers or usage in everyday communication, was revived in modern Israel and now has millions of fluent native speakers. The cultural forces that made this most spectacular example of revitalization possible are unlikely to be replicated elsewhere, since they include the establishment of a nation state and the status of Hebrew as the sacred language of a major religion; still, the case of Modern Hebrew illustrates the power of attitudinal factors in the outcome of endangerment and even dormancy. Meanwhile, other revival efforts are under way and showing promising signs of at least partial success; googling “revived languages” produces information about some of these (for instance, the once-dormant Celtic languages Cornish and Manx).

Similarly, if not (yet) so successfully in advancing revitalization, a great many endangered language communities value their heritage languages immensely, viewing them as a vital part of their personal and collective identity. Here is a typical example, inscribed on a sign at the Warradjan Aboriginal Culture Centre in Kakadu National Park (Northern Territory, Australia):

Language is fundamental to cultural identity. This is so for all people everywhere. For Bininj, their unique world is expressed in their language. For this reason, it is important that people keep their own language alive. For Bininj, language and land are linked. . . . Each clan has its own name and territory . . .

Another attitude that favors maintenance of linguistic diversity is prestige. In some parts of the world, multilingualism is valued so highly that it is normal to learn another group’s language, even a conqueror’s language, without giving up a heritage language. This attitude is still common among indigenous peoples in the US Northwest and neighboring parts of Canada. As Spicer noted (1961, 396), in writing about the prevalence of multilingualism in the region from 1750 to 1858:

There was prestige attached to knowing languages and perhaps aesthetic satisfaction; while serving as an army scout, one Wasco learned a certain amount of Delaware from a fellow scout. While this was not a “practical” accomplishment, it provided a basis for pride and prestige.

Unfortunately, however, negative attitudes toward endangered heritage languages seem to be at least as common as positive ones, and they are too often more effective as well. One such attitude that is often mentioned by people whose language has become endangered is shame. This has frequently resulted from official suppression of minority languages, as in the infamous boarding schools in the United States, Canada, and Australia. When teachers considered a minority language worthless, their pupils were likely to adopt that viewpoint. So, for instance, Marie Smith Jones, the last speaker of the Eyak language of Alaska, commented about her schooldays, “When I was in school we were beaten for speaking our language. They wanted to make us ashamed” (Raymond 1998). But explicit government pressures are not needed to cause a minority language to become stigmatized in its speakers’ minds, as these quotations from a speaker of Udmurt (Votyak) indicate: “The Russian language is more beautiful than Udmurt”; “People can express themselves well only in Russian” (Shirobokova 2009). Similarly, a dominant group’s negative evaluation of a minority ethnic group often extends to the minority language. The saying “You are as stupid as a Veps,” reflecting the attitude of Russian speakers in Russian Karelia, has contributed to the endangerment of the Veps language (Romanova 2007). Another example: when some Christian preachers in southeastern Alaska condemned Tlingit as “demonic,” many Tlingit members of their congregations came to believe that God hates Tlingit (Dauenhauer & Dauenhauer 1998, 65). Less drastic, but still potentially devastating for a minority language’s chances of survival, is the view that speaking a minority language just doesn’t make sense, either because of economics (why speak Zapotec if you need Spanish to get a job?) or politics (speaking Yiddish in the former Soviet Union or Kurdish in Turkey could be dangerous to one’s physical well-being).

A related factor is connected with the impact of standardization on the attitudes of speakers toward their speech variety. The belief that one must speak the minority language correctly can actually encourage endangerment.
In Norway, the Finnic language Southern Saami had already become endangered as a result of non-attitudinal factors when programs were established to promote its use in the Saami community. But, although the revitalization effort has met with success in that many community members are learning to speak Saami, progress is slowed by the fact that “incorrect” Southern Saami is stigmatized. People who want to avoid being judged negatively, and who (as second-language learners) cannot speak Standard (“correct”) Southern Saami, prefer to speak Norwegian, which will not draw criticism (Johansen 2009). This too is ultimately a problem arising from language contact—in this case, contact among dialects of the same language—and it plagues many revitalization efforts: who gets to decide what the revitalized language will be like? There is a tension between some elders’ (and many linguists’) view of “the genuine language” and younger community members’ view of what counts as “the language.”

The survey in this section has barely scratched the surface of the complexities of contact situations that foster language endangerment. As noted earlier, all, or almost all, of these complexities have been widely discussed in the language contact literature and in the rapidly expanding literature on endangerment.

Finally, it is important to emphasize that language loss through contact and shift is not always a result of linguistic imperialism. Sometimes, to be sure, language shift is forced, as in the boarding schools mentioned above; these and other examples are typical of the fate of indigenous languages in many European colonies and in many non-European expansion zones: Egyptian (in its late Coptic form) was replaced by Arabic, Tlingit speakers pushed into Eyak territory in Alaska, and so forth. But sometimes language shift is neither encouraged nor particularly desired by the target language speech community. So, for instance, the Yaakus of Kenya shifted to Maasai for a combination of social and economic reasons; the Maasais were not urging the shift (Brenzinger 1992).

Given all the kinds of social, political, and economic contexts in which language endangerment is likely, it is perhaps unsurprising that the prevailing view is that language contact is a zero-sum game. Contact situations that lead directly or indirectly to language endangerment have been the focus of a great many scholarly publications and conferences. The general theme of “contact as conflict” is visible in many titles, for instance Languages in conflict: linguistic acculturation on the Great Plains (Schach 1980); When languages collide: perspectives on language conflict, language competition, and language coexistence (Joseph et al. 2003); “Sprachkontakt als Kulturkonflikt” [“Language contact as culture conflict”] (Nelde 1984); Duelling languages: grammatical structure in codeswitching (Myers-Scotton 1993); Langues en conflit: études sociolinguistiques [Languages in conflict: sociolinguistic studies] (Boyer 1991); and The First World Congress on Language Contact and Conflict (Brussels, June 1979). Violent imagery is also popular in discussions of language contact and language endangerment, for instance in the bloodthirsty terms for the replacement of a minority group’s language by a dominant group’s language. Some years ago I googled several of these terms and got thousands of hits for each of them: language genocide (8,220 hits), language suicide (16,900), language murder (19,000), linguistic genocide (26,500), and linguicide (29,600).
There are even explicit claims in the literature that language contact equals language conflict. Here, for instance, is the first sentence of an article entitled “Language conflict” (Nelde 1997, 285): “Throughout history, ever since the Tower of Babel was left unfinished, contacts between different languages have inevitably resulted in conflicts between speakers of those languages.” In all these contexts, the clear implication is that language contacts inevitably lead to the disappearance of one of the two languages in a contact situation. But this is a misleading view. Much less attention has been devoted to contact situations that do not tend to lead to endangerment, and we will now turn to that topic.

3. When contact does not lead to endangerment Language contact does not lead to language endangerment when the contact involves balanced bilingualism (or multilingualism, but for convenience I’ll refer only to bilingualism in this section). In this context, by “balanced bilingualism” I mean a balanced bilingual community rather than a balanced bilingual individual (though of course balanced bilingual communities are likely to be inhabited by balanced bilingual people). In balanced bilingual contact situations, contact is an integral part of peaceful social interaction, as seen in such cultural practices as linguistic exogamy, social boundary maintenance, and facilitation of intergroup communication. Often, contact is also a stimulus for speakers’ creativity—code-switching, language play, secret languages, new in-group languages for new ethnic groups, and so on. Let’s consider each of these in turn.

Perhaps the most obvious way in which linguistic diversity is embedded in culture is linguistic exogamy, where one must marry outside one’s language group. In the Vaupés basin of Colombia and Brazil, for instance, marrying someone of your own language group counts as incest (Aikhenvald 2002, 17); in a family, therefore, both parents and children are full bilinguals. Multilingualism is very widespread. Similarly, many Aboriginal people in the Cape York peninsula of northern Australia practiced linguistic exogamy, among them speakers of Guugu Yimithirr (Haviland 2006, 476). One case that has been extensively studied is Misión La Paz, Argentina, where linguistic exogamy is practiced among the three languages of the village: Chorote, Nivaclé, and Wichí. There is passive bilingualism between marriage partners; the children of a family choose the preferred language of one parent as their primary language, and everyone engages in multilingual conversations (Campbell & Grondona 2010).

Next, consider social boundary maintenance as a cultural function of bilingualism and multilingualism. The most famous example of linguistic diversity used for purposes of boundary maintenance is the village of Kupwar in Maharashtra, India.
In Kupwar there are four local languages belonging to two different language families: Marathi (Indic; the official state language of Maharashtra), Urdu (Indic), Kannaḍa (Dravidian), and Telugu (Dravidian). As reported by Gumperz and Wilson (1971), there was strict separation in the village between public and private spheres, accompanied by widespread multilingualism and structural convergence in the languages. Kupwar Marathi and Kupwar Kannaḍa have traded four to six features each, and Kupwar Urdu has changed toward Kannaḍa and/or Marathi in twelve features. Gumperz and Wilson claim that structural convergence in the Kupwar varieties of the four languages has been so extreme that there is extensive mutual intertranslatability (but see Kulkarni-Joshi 2016 for a recent reassessment of their arguments), and they observe that “As long as ethnic separateness of home life is valued, then, and language remains associated with ethnic separateness, there is little reason to expect multilingualism to disappear” (154). This striking level of local multilingualism may be connected, at least indirectly, with an observation made by Pandit (1977) about language maintenance and shift in different regions: “A second generation speaker in Europe or America gives up his native language in favour of the dominant language of the region; language shift is the norm and language maintenance an exception. In India language maintenance is the norm and shift an exception” (154).

A different type of situation in which linguistic diversity serves to maintain social boundaries involves pidgin languages—languages that arise in order to facilitate intergroup communication and that are spoken solely as second (or third, or fourth, or . . .) languages. All pidgins serve this general purpose, but only some of them were apparently created specifically to prevent outsiders from learning the full language of the community that provided the bulk of the lexicon. Several examples have been reported. One was the seventeenth-century Delaware (Lenape) Pidgin, according to the Dutch missionary Michaëlius, who commented in 1628 that the Delawares “rather design to conceal their language from us than to properly communicate it, except in things which happen in daily trade; saying that it is sufficient for us to understand them in that . . . ” (quoted in Jameson 1909, 128). Michaëlius was a more acute observer than some missionaries, anthropologists, and linguists have been. During the 1970s, fieldworkers among Hamer speakers in Ethiopia devoted months to learning what they thought was the Hamer language, only to discover that what they had learned was in fact Pidgin Hamer, used only with outsiders and policemen (Lydall 1976). And a missionary named W. G. Lawes, working in the late nineteenth century among Motu speakers in what is now Papua New Guinea, was taught “a simplified form” of the language—a kind of early Pidgin Motu—and only learned his mistake when his five-year-old son Frank, who had learned the full language from his village playmates, spilled the beans. Even after Lawes made this discovery, many villagers “were still opposed to imparting this knowledge to strangers” (Chatterton 1970, 95–96). A final example is Mobilian Jargon, a pidgin once spoken in Louisiana, which was used as “a social and cultural barrier against non-Indian outsiders in particular” (Drechsel 1984).

Other pidgins apparently arose (only?) to facilitate intergroup communication, without the distancing motive evident in the creation of Pidgin Delaware, Pidgin Hamer, Pidgin Motu, and Mobilian Jargon.
Because they are never first languages, the use of such pidgins also instantiates balanced bilingualism/multilingualism: all their speakers have at least one other language (their native language). Many pidgins are of course ephemeral, but they are stable as long as the social circumstances of their emergence are stable, and some pidgins have lasted for over 100 years. One example of a pidgin is Chinook Jargon (US Pacific Northwest and British Columbia), which was a major vehicle for communication with other Native tribes and with whites in at least the nineteenth century and early twentieth century. Chinook Jargon was added to speakers’ already multilingual repertoires as an additional linguistic resource; it did not lead to the endangerment of the speakers’ heritage languages. (All those Native languages are now endangered, but not because of Chinook Jargon.) Another example is Hiri Motu (closely related to Lawes’s Pidgin Motu), which was used in trading expeditions along the New Guinea coast, in interactions between speakers of Motu (Austronesian) and various non-Austronesian languages. These examples could easily be multiplied from contact situations around the world. As long as the socioeconomic circumstances that gave rise to the pidgins persisted, the pidgins comprised a useful addition to—but not a replacement for—the speakers’ native languages. The pidgin merely added one more repertoire to the speakers’ linguistic resources.

Just as monolingual speakers exploit their multiple repertoires, manipulating registers and speech styles for social purposes, so too do bilingual speakers—except that bilinguals can draw on different languages as well as registers and speech styles of one language at a time. Creative uses of bilingual repertoires are a typical feature of balanced bilingual situations (and are not unknown in transitional bilingual contexts). Code-switching, for instance, is often used to create social identities. A famous case in the United States is Spanish-English code-switching, also known as Spanglish (see, e.g., Zentella 1997), which is used among young Puerto Ricans in New York as a marker of Puerto Rican identity. Another example is the code-switching used by Tewas in Arizona, where the switching is between Tewa (a Kiowa-Tanoan language) and Hopi (Uto-Aztecan) (Kroskrity 2000). A related but distinct function of code-switching has been observed in The Netherlands, where trendy young Dutch bilinguals often insert so many English words into their Dutch morphosyntax that the result is ad hoc relexification (Ad Backus, personal communication, 1999); this phenomenon is comparable to the proliferation of novel words in teenage slang in a monolingual US community: both practices are language play of a particular type, marking in-group status. Another type of language play, also probably an in-group indicator, is illustrated by the language and dialect mixing among working-class Afrikaans speakers in Cape Town, South Africa (Deumert 2005, 131, quoting Stone 2003, 391):

I suggest that the dialect constitutes linguistic bricolage. The “ends,” the “standard” dialects from which it is composed, are appropriated and adeptly made to constitute a new “means,” the working-class dialect, under the noses (so to speak) of the sanctimoniously dominant from whom it is taken. The processes of construction are partly serious, rule-bound and consequential, and partly creative, playful, whimsical and unpredictable, and the two processes interweave and oscillate in unstable equilibrium . . .

Like Spanglish, this emerging dialect draws on and is completely dependent on bilingualism—the maintenance of the community’s two languages. It is akin, in a nontrivial way, to the secret languages of the Pig Latin variety that some monolingual Americans invent during their childhood or adolescence. Some secret languages are put to more serious uses, however. Bray (1913, 139–140) reports on a secret language devised by the multilingual Lōṛī people of Baluchistan. The language, called Mōkkī, drew on the languages spoken by community members (primarily Brahui, a Dravidian language, and Balochi, an Iranian language), and it served to prevent unfriendly outsiders from understanding Lōṛīs’ conversations:

There is a certain appropriateness in winding up a survey of the languages of this province with Mōkkī, the cant of the Lōṛīs, for it’s a hotchpotch of the lot. It is an artificial jargon, which the Lōṛīs have mechanically invented on the basis of the language of the people among whom they live, and which they more especially employ when they want to keep their meaning to themselves. And yet so universally and successfully is the jargon used, that it seems doubtful whether its artificiality suffices to debar it from being classed as a language. However artificial its origin and character, it is at any rate acquired naturally and as a matter of course by Lōṛī children; it is no longer, it would seem, simply a secret patter; it is becoming a language for the home-circle. It is all very simple. Take any word from any language, and turn it inside out: chukak “dog” [from] Brahui kuchak; randum “man” [from] Persian mardum. But though this is their chief device for obscuring the meaning of everyday words, there are several others. Sometimes they add a suffix. Prefixes are affected still more. . . . or they resort to sound-changes . . . the thin disguise of isolated words and the rapidity of connected sentences, blurred in the rapidity of speech, [make] both Brahui and Baloch admit freely that Mōkkī is beyond them.

Extensive lexical distortion is known from numerous communities around the world, and is used for a variety of purposes. Sometimes, unlike the Mōkkī case, the distorted lexicon is made up, essentially in a monolingual context. Mōkkī may be the only example of a language with a deliberately disguised lexicon that became (apparently) the main language, or at least one of the main languages, of a speech community. The point, in the present context, is that Mōkkī could only have been created in a context featuring balanced bilingualism (or multilingualism). A final point remains to be discussed in this section. Occasionally balanced bilingualism ends without language shift (which is the outcome of transitional bilingualism) but nevertheless with the disappearance of the less dominant language, through a process that might be called borrowing into oblivion. Most or all known examples involve two languages that are related, often closely related. Here is a characterization of one example of death by borrowing, the fate of Laha, an Austronesian language spoken in Central Maluku, Indonesia, where the dominant language, Ambonese Malay, belongs to the same (large) branch of Austronesian (Collins 1980, 13–14): All speakers of Laha are fluent speakers of Ambonese Malay . . . Laha has maintained its indigenous language in the face of increasing pressure from Ambonese Malay but only at the expense of drastic revision of its grammar. . . . Bit by bit the grammar of Laha has become nearly interchangeable with Ambonese Malay grammar. This adaptability in the Laha language has contributed to its survival.

In other words, although Laha vocabulary is still mainly native, the grammar is not; instead, the entire grammar is borrowed from Ambonese Malay. The full Laha language is therefore not being passed on to new generations of speakers. A similar example, but with more closely related languages, is the case of Votic, a Finnic language, which has been almost entirely absorbed into another Finnic language, Ižora (also known as Ingrian). Both languages are spoken in Russia. According to Ariste (1970), Ižora words and grammatical features made their way into Votic almost imperceptibly, until they achieved preponderance. Then the language of these Vots was no longer Votic, but Ižora.

Aside from this absorption phenomenon, which is (as far as I know) rare, balanced bilingualism does not normally lead to language endangerment. Of course balanced bilingualism can transform into transitional bilingualism. But as long as the two languages of a community remain in general everyday use, possibly but not necessarily in different domains (such as home vs. workplace), and as long as the social circumstances that favor bilingualism remain in place, language contact of this kind is stable, not ephemeral.

4. Contact-induced language change and attrition in endangered languages

Attrition in a receding language—typically in the last stages before the language vanishes—is the loss of linguistic material, both lexicon and structure, without replacement by (for instance) material borrowed from the language to which the minority-language community is shifting. This is in fact one type of contact-induced change, according to my preferred definition of contact-induced change: “any linguistic change that would have been less likely to occur outside a particular contact situation is due at least in part to language contact” (Thomason 2001, 62). In this section, therefore, I will be comparing changes due to attrition with a different—and much more common—type of contact-induced change: the transfer of linguistic features from one language to another.

Attrition is a gradual process that happens to an endangered language that loses speakers over several or many generations, until finally the only remaining speakers are semi-speakers, community members who have learned the language imperfectly as a second language. Not all languages undergo much attrition in these circumstances, but attrition is probably the most common outcome of a gradual decline in language use. In more abrupt processes of language death, like the dramatic break in transmission that occurred in some Native American communities when their children were forced to attend language-suppressing boarding schools, a language may be more likely to disappear without undergoing significant amounts of attrition. These complications will remain unaddressed in this section, however; the focus here is on cases in which attrition does occur.

The main conclusion that arises from the comparison is that attrition does not involve unique kinds of linguistic change. Instead, the same kinds of change that happen during periods of attrition in a dying language also happen in “ordinary” contact-induced change.
(For that matter, similar kinds of change happen in internally-motivated change as well.)

First, consider lexical loss. As a gradually dying language loses domains of usage, it is also likely to lose words that are (or were) specific to the lost domains. So, for instance, speakers of the endangered Barí language (Venezuela) are reported to be losing 40–60% of their traditional plant names from one generation to the next (Romaine 2010, 327, citing Lizarralde 2001), presumably because the old food and medicinal plants are no longer gathered. But this kind of loss is by no means restricted to endangered-language communities; many communities with viable languages have abandoned older ways of life, and they too have lost the specialized vocabulary of those domains. Not all such changes are due to language contact, of course. Most modern Americans could not name more than one or two parts of a horse’s harness, for example, because the great importance of horses in Americans’ daily lives vanished (along with most of the horses) with the rise of the automobile, but language contact was not a factor in this lexical loss. The only thing that sets lexical loss in an endangered language apart from lexical loss in a viable language is the scale of the loss: a much larger percentage of the specialized lexicon, and even of the ordinary lexicon, will be gone by the late stages of language loss.

The same is true of structural erosion in a dying language: much more structure is lost in attrition than in ordinary contact-induced change, but individual simplifying changes do not differ in character. So, for instance, the endangered Australian language Warlpiri and the endangered Argentinian language Vilela have both lost (or are losing) their inclusive/exclusive “we” distinction, apparently due to attrition (as well as to the absence of the feature in the dominant language, English or Spanish, respectively; Golluscio and González 2008, 222ff., Bavin 1989, 282–283); and the non-endangered Dravidian language Brahui (Pakistan) has lost its inclusive/exclusive “we” distinction under the influence of the Iranian language Balochi (Emeneau 1962, 56).

A common prediction is that, under attrition, phonemes and phonological processes are likely to be lost. An example is the ongoing replacement, in the speech of one of the last Vilela speakers, of the voiceless lateral fricative with š (Golluscio & González 2008, 222ff.). This may be compared with the ordinary contact-induced change that led to the merger of two Czech lateral phonemes, one plain and one palatalized, under the influence of German, which lacks such a distinction (Jakobson 1962[1938], 240).
Again, the difference between attrition and other contact-induced change is that attrition leaves the receding language with less linguistic material overall, while ordinary contact-induced change doesn’t. It should also be noted that attrition is not the only process of change typically found in gradually dying languages; the other main kind is borrowing from the dominant language. Borrowing sometimes simplifies structure, but often it does not, even in a dying language; and it does not differ from lexical or structural borrowing into a non-endangered language. In general, there is just one major difference between attrition and other kinds of contact-induced change: attrition is overall loss of a language’s substance. By contrast, although other contact-induced changes may produce overall simplification in structure, they are just as likely to produce overall complication, and even more likely to have neither effect (because simplifying one structural subsystem very often complicates another subsystem).

5. Conclusion

The main goal of this chapter has been to underscore the asymmetry between language contact and language endangerment. Endangerment almost always results from language contact, but the reverse is not true: language contact certainly does not always lead to endangerment. There are, as outlined above, quite a few circumstances that encourage—and in some cases even demand—stable language contact with balanced bilingualism. These contact situations may be characterized as safe language contact: bilingualism is stable as long as the external circumstances remain the same. Unsafe language contact, by contrast, obtains in circumstances that encourage or even require transitional bilingualism, with eventual shift by the speakers of one language to the language of the other group. Only in the latter case is a prediction of endangerment justified. The final main section of the chapter sketches the relationship between attrition in receding languages and contact-induced changes in non-endangered languages. Here the point is that there is no appreciable difference in individual changes; the only significant difference between the two types of process is that attrition leaves the language poorer overall, in both lexicon and structure, while ordinary contact-induced change has no such effect.

References

Aikhenvald, Alexandra Y. 2002. Language Contact in Amazonia. Oxford: Oxford University Press.
Ariste, Paul. 1970. “Die Wege des Aussterbens zweier finnisch-ugrischer Sprachen” [“The Routes to Death of Two Finno-Ugric Languages”]. Monda Lingvo-Problemo 2: 77–82.
Bavin, Edith L. 1989. “Some Lexical and Morphological Changes in Warlpiri.” In Investigating Obsolescence: Studies in Language Contraction and Death, edited by Nancy C. Dorian, 267–286. Cambridge: Cambridge University Press.
Boyer, Henri. 1991. Langues en conflit: études sociolinguistiques. Paris: L’Harmattan.
Bray, Denys de S. 1913. Census of India, 1911, vol. IV: Baluchistan. Calcutta: Superintendent Government Printing, India.
Brenzinger, Matthias. 1992. “Lexical Retention in Language Shift: Yaaku/Mukogodo-Maasai and Elmolo/Elmolo-Samburu.” In Language Death: Factual and Theoretical Explorations with Special Reference to East Africa, edited by Matthias Brenzinger, 213–254. Berlin: Mouton de Gruyter.
Campbell, Lyle and Verónica Grondona. 2010. “Who Speaks What to Whom? Multilingualism and Language Choice in Misión La Paz.” Language in Society 39: 617–648.
Campbell, Lyle and Martha C. Muntzel. 1989. “The Structural Consequences of Language Death.” In Investigating Obsolescence: Studies in Language Contraction and Death, edited by Nancy C. Dorian, 181–196. Cambridge: Cambridge University Press.
Chatterton, Percy. 1970. “The Origin and Development of Police Motu.” Kivung 3: 95–98.
Collins, James T. 1980. “Laha, a Language of the Central Moluccas.” Indonesia Circle 23: 3–19.
Dauenhauer, Nora Marks and Richard Dauenhauer. 1998. “Technical, Emotional, and Ideological Issues in Reversing Language Shift: Examples from Southeast Alaska.” In Endangered Languages: Current Issues and Future Prospects, edited by Lenore A. Grenoble and Lindsay J. Whaley, 57–98. Cambridge: Cambridge University Press.
Deumert, Ana. 2005. “The Unbearable Lightness of Being Bilingual: English-Afrikaans Language Contact in South Africa.” Language Sciences 27: 113–135.
Donohue, Mark. 2007. “The Papuan Language of Tambora.” Oceanic Linguistics 46: 520–537.
Drechsel, Emanuel J. 1984. “Structure and Function in Mobilian Jargon: Indications for the Pre-European Existence of an American Indian Pidgin.” Journal of Historical Linguistics and Philology 1: 141–185.
Emeneau, Murray B. 1962. Brahui and Dravidian Comparative Grammar. Berkeley: University of California Press.
Golluscio, Lucía A. and Hebe González. 2008. “Contact, Attrition and Shift in Two Chaco Languages: The Cases of Tapiete and Vilela.” In Lessons from Documented Endangered Languages, edited by K. David Harrison, David S. Rood, and Arienne Dwyer, 195–242. Amsterdam: John Benjamins.
Gumperz, John J. and Robert Wilson. 1971. “Convergence and Creolization: A Case from the Indo-Aryan/Dravidian Border.” In Pidginization and Creolization of Languages, edited by Dell Hymes, 151–167. Cambridge: Cambridge University Press.
Haviland, John B. 2006. “Guugu Yimithirr.” In Concise Encyclopedia of Languages of the World, edited by Keith Brown and Sarah Ogilvie, 473–476. Oxford: Elsevier.
Jakobson, Roman. 1962[1938]. “Sur la théorie des affinités phonologiques entre des langues.” In Selected Writings, vol. 1, 234–246. The Hague: Mouton de Gruyter.
Jameson, J. F., ed. 1909. Narratives of New Netherland: 1609–1664. New York: Scribner.
Johansen, Inger. 2009. “Changes in a Small Language Community.” Paper presented at the 12th International Conference on Minority Languages, Tartu, Estonia, May 28–30, 2009.
Joseph, Brian D., Johanna DeStefano, Neil G. Jacobs, and Ilse Lehiste, eds. 2003. When Languages Collide: Perspectives on Language Conflict, Language Competition, and Language Coexistence. Columbus: Ohio State University Press.
Kroskrity, Paul V. 2000. “Language Ideologies in the Expression and Representation of Arizona Tewa Identity.” In Regimes of Language: Ideologies, Polities, and Identities, edited by Paul V. Kroskrity, 329–359. Santa Fe, NM: School of American Research Press.
Kulkarni-Joshi, Sonal. 2016. “Forty Years of Language Contact and Change in Kupwar: A Critical Assessment of the Intertranslatability Model.” Journal of South Asian Languages and Linguistics 3: 147–174.
Lizarralde, Manuel. 2001. “Biodiversity and Loss of Indigenous Knowledge in South America.” In On Biocultural Diversity: Linking Language, Knowledge, and the Environment, edited by Luisa Maffi, 265–281. Washington, DC: Smithsonian Institution Press.
Lydall, Jean. 1976. “Hamer.” In The Non-Semitic Languages of Ethiopia, edited by M. Lionel Bender, 393–438. East Lansing, MI: African Studies Center.
Mithun, Marianne. 1981. “Stalking the Susquehannocks.” International Journal of American Linguistics 47: 1–26.
Myers-Scotton, Carol. 1993. Duelling Languages: Grammatical Structure in Codeswitching. Oxford: Oxford University Press.
Nelde, Peter Hans. 1984. “Sprachkontakt als Kulturkonflikt.” In Kultur und Gesellschaft, edited by Wolfgang Kühlwein, 31–40. Tübingen: Narr.
Nelde, Peter Hans. 1997. “Language Conflict.” In The Handbook of Sociolinguistics, edited by Florian Coulmas, 285–300. Oxford: Blackwell.
Pandit, P. B. 1977. Language in A Plural Society: The Case of India. Delhi: Devraj Channa Memorial Committee.
Raymond, Joan. 1998. “Say What? Preserving Endangered Languages.” Newsweek, September 14, 1998.
Romaine, Suzanne. 2010. “Contact and Language Death.” In The Handbook of Language Contact, edited by Raymond Hickey, 320–339. Oxford: Wiley-Blackwell.
Romanova, Evgenia. 2007. “The Process of Revitalization of Culture and Indigenous Ethnic Identity: The Case of the Vepsian People in Karelia.” MA thesis, University of Tromsø, Norway. http://www.ub.uit.no/munin/bitstream/10037/1156/4/thesis.pdf. Accessed September 7, 2009.
Schach, Paul, ed. 1980. Languages in Conflict: Linguistic Acculturation on the Great Plains. Lincoln: University of Nebraska Press.
Shirobokova, Larisa. 2009. “The Role of Sociolinguistics in Progress and Revitalization of Minority Finno-Ugric Languages.” Paper presented at the Twelfth International Conference on Minority Languages, Tartu, Estonia, May 28–30.
Spicer, Edward. 1961. Perspectives in American Indian Culture Change. Chicago: University of Chicago Press.
Stone, G. L. 2003. “The Lexicon and Sociolinguistic Codes of the Working-Class Afrikaans-Speaking Cape Peninsula Coloured Community.” In Language in South Africa, edited by P. Mesthrie, 381–397. Cambridge: Cambridge University Press.
Thomason, Sarah G. 2001. Language Contact: An Introduction. Edinburgh and Washington, DC: Edinburgh University Press and Georgetown University Press.
Thomason, Sarah G. 2015. Endangered Languages: An Introduction. Cambridge: Cambridge University Press.
Zentella, Ana Celia. 1997. Growing up Bilingual: Puerto Rican Children in New York. Oxford: Blackwell.

Chapter 4

Indigenous Language Rights—Miner’s Canary or Mariner’s Tern?

Teresa L. McCarty

Like the miner’s canary, the [American] Indian marks the shift from fresh air to poison gas in our political atmosphere; and our treatment of Indians, even more than our treatment of other minorities, reflects the rise and fall of our democratic faith.—Felix Cohen, scholar of international and American Indian law. (Cohen 1953, 390)

Felix Cohen was a legal scholar and ethicist who served during Franklin D. Roosevelt’s administration (1933–1947) in the United States agency charged with overseeing American Indian affairs. Cohen understood that the place and role of Indigenous peoples far outweigh their population numbers. Pointing to shifts in the US government’s American Indian policy as indicators of the “rise and fall” of democratic ideals, he saw in those policy shifts “exemplars of how US justice has been applied and misapplied” (Lomawaima and McCarty 2006, 6).

In this chapter, I extend Cohen’s metaphor to the field of language planning and policy (LPP), and in particular the LPP subfield concerned with language rights. The central theme I develop is the right of choice, arguably the cornerstone of democracy, but, equally important for the present discussion, of Indigenous self-determination. Self-determination arises from “the most basic right” of Indigenous peoples “to be recognized as peoples” (Magga 1995, 1). As Phillipson, Rannut, and Skutnabb-Kangas (1995, 10) explain in their seminal treatment of linguistic human rights: “The right to self-determination is a basic principle in international law, aimed at recognizing the right of peoples (not only states) to determine their own political, economic and cultural destiny, . . . and hence avoid being assimilated.” As we will see, the interlinked principles of self-determination and peoplehood distinguish an analysis of Indigenous language rights. At the same time, apropos of Cohen’s comment, such an analysis illuminates broader issues of equity and justice in language rights for other minoritized communities striving to reclaim, revitalize, and sustain heritage or ancestral languages.

The remainder of the chapter is divided into four sections. I begin with some background on Indigenous peoples, their distinctive legal-political status, the present state of Indigenous language vitality and endangerment, and the stakes involved in Indigenous language rights. This is followed by an exploration of Indigenous language rights within a larger body of LPP scholarship. I then contextualize that scholarship by exploring Indigenous language rights in one key public domain: education. Although universal rights to education have long been recognized—for example, in Article 26 of the 1948 United Nations Universal Declaration of Human Rights (http://www.un.org/en/universal-declaration-human-rights/index.html)—the right to public education in Indigenous mother tongues has, until recently, been almost universally denied. As many scholars and community-based language revitalizers have noted, this is a leading cause of Indigenous language loss and of profound social and educational inequalities (see, e.g., Skutnabb-Kangas and Dunbar 2010). The chapter concludes with the implications of this work for the revitalization and sustainability of endangered Indigenous languages and their associated cultural and knowledge systems. There, I return to Cohen’s metaphor of the miner’s canary and propose an alternate conceptualization for Indigenous language rights rooted in self-determination and choice.

1. What is at stake in Indigenous language rights?

There are approximately 370 million Indigenous people worldwide—about 5% of the world’s population. Indigenous peoples reside in ninety countries and on every continent on earth. While there is no universally accepted definition of “Indigenous,” the United Nations Permanent Forum on Indigenous Issues (PFII) considers the term to reference people who: (1) self-identify as Indigenous and are accepted as members of the community that so identifies; (2) have historical continuity with pre-invasion or pre-settler colonial societies; (3) have a strong connection to originary territories and natural resources; (4) possess distinct social, economic, or political systems that are non-dominant within the larger society; and (5) are committed to sustaining their originary environments and sociocultural systems as distinctive peoples (United Nations PFII n.d.).

This definition parallels Convention 169 of the International Labour Organization (ILO), the UN agency that, since 1920, has been engaged with Indigenous rights: . . . [Indigenous peoples are] peoples in independent countries who are regarded as indigenous on account of their descent from the populations which inhabited the country, or a geographical region to which the country belongs, at the time of . . . colonization or the establishment of present state boundaries and who, irrespective of their legal status, retain some or all of their own social, economic, cultural and political institutions. (ILO 1996–2016, Article 1.1)

At the time of this writing, the ILO Convention had been ratified by twenty member states, primarily in Latin America, the European Union, and Australasia (ILO 2009). According to Skutnabb-Kangas (2012, 88), the ILO definition “may be the strongest legally.” These international definitions intersect with Indigenous sovereignties, a status that predates colonial encounters but is also recognized constitutionally in many nation-states, and in myriad treaties, legislation, and case law. In the United States and Canada, tribal sovereignty entails a singular, legally and morally binding government-to-government relationship of Indigenous nations to federal, state, provincial, and territorial governments (Wilkins and Lomawaima 2001). In Norway, recognition of inherent Indigenous sovereignty is reflected in the 1987 Sámi Act and establishment of the Saami (Sámi) Parliament; also in Norway, the Finnmark Act recognizes “Saami ownership of most of the land in Finnmark” (Magga 2015, 298). In Aotearoa/New Zealand, the 1840 Te Tiriti o Waitangi (Treaty of Waitangi), signed between Māori chiefs and the British Crown, established a “mutual framework by which colonization could proceed,” while guaranteeing Māori “possession of their lands, their homes and all their treasured possessions (taonga)” (May 2012a, 303). While none of these kinds of agreements have been without tension and strife (and they continue to be violated in many nation-states), they nonetheless provide a legally codified framework for language rights. Indigenous peoples speak about two-thirds of the world’s approximately 7,000 known spoken languages “and control or manage some of the ecosystems richest in biodiversity” (Nettle and Romaine 2000, 13). With some exceptions—Guaraní in Paraguay and Greenlandic in Greenland, for instance—all Indigenous languages are endangered, a situation that is “inextricably linked” to threats to global biodiversity (Maffi 1995, 17). 
Perhaps more importantly, language endangerment reflects and contributes to the loss of distinctive Indigenous knowledge systems, identities, and communal well-being. “We believe that First Nations, Inuit and Métis languages are sacred and are gifts from the Creator,” states the Task Force on (Canadian) Aboriginal Languages and Cultures (2005)—a statement that is representative of Indigenous beliefs and language ideologies throughout the world (for additional examples in North America, see Kroskrity and Field 2009; McCarty 2013). In many cases, the number of speakers is quite small—a few hundred people or a few elders. But relatively “large” languages, such as Quechua, with millions of speakers whose originary territories span six South American countries, are also rapidly being displaced by dominant linguistic regimes.

Indigenous-language endangerment is the legacy of centuries of colonization, genocide, and linguicide, described by Gikuyu scholar Ngũgĩ wa Thiong’o (2009, 17) as “conscious acts of language liquidation.” Central to these experiences have been official and unofficial policies intended to dispossess Indigenous peoples simultaneously of their languages and lands. For example, in sub-Saharan Africa, European missionaries constructed an “artificial multilingualism” whereby closely related African varieties, akin to British and American English, were recorded as distinct tongues. These artificial linguistic boundaries became the template for dividing African peoples territorially and for “de-Africanisation through the exclusive use of colonial languages in high-prestige domains” such as government and schools (Makalela 2005, 153). In North America, from the eighteenth through much of the twentieth century, Native American children were forcibly removed from their families and compelled to attend distant residential schools, which “often included a ban on speaking students’ tribal languages” (Skutnabb-Kangas, Bear Nicholas, and Reyhner 2016, 189). During the same period, Australia’s British colonial government implemented a notorious “White Australia” policy designed to “breed out” Blackness among Indigenous peoples and “produce a homogeneous English-speaking Anglo-Saxon culture” (Romaine 1991, 3). As part of that policy, Aboriginal and Torres Strait Islander children, referred to as the “stolen generation,” were forcibly taken from their homes, “often in the absence of the parent but sometimes even by taking the child from the mother’s arms,” to distant schools where they were prevented from learning their ancestral language and suffered physical and emotional abuse (Commonwealth of Australia 1997, 4). 
In what is now Norway, beginning in the mid-eighteenth century, the “Norwegian state launched a systematic war against Sámi culture and language for 100 years,” while other states that today occupy Sámi territory—Finland, Sweden, and western Russia—“denied the existence of the Sámi as a people” (Magga 1995, 220). In Latin America, “a clear and often publicly conceded intention of eradicating Indigenous ethnocultural differences” underlay centuries of policies designed to construct a uniform “national” society through segregated schooling (López 2008, 43). Contemporary Indigenous language rights movements, then, have been mobilized in tandem with larger decolonizing movements to contest these inequities and to reclaim and sustain distinctive Indigenous identities, lifeways, territories, natural resources, and “valued possessions” (Kāretu 1995, 217; see also Dunbar-Ortiz et al. 2015). The stakes in these movements are high. By definition, Indigenous languages are autochthonous. Thus, unlike speakers of colonial and many immigrant languages, Indigenous communities have no external pool of speakers to turn to as a language (re)acquisition resource. Language recovery efforts are also challenged by the racialized power hierarchies that have led to language loss in the first place. Hence, Indigenous language rights are about much more than language per se; this is not a question of language as an essentialized “object of the right” (Wee 2011, 21). Rather, the core concern is with “an indigenous plan of self-determination that includes language and culture regeneration,” where language and culture are conceived as dynamic, fluid, and ever-changing (Hohepa 2006, 299). The issue, say De Korne and Leonard

(2017, 1), is “whether the power structures that produce language endangerment and displacement are being meaningfully contested, or whether they are merely being reshaped and reproduced along familiar top-down lines.”1

2. What are language rights? Orientations in research and practice

Language rights have largely been framed within a Western juridical tradition. “In a broad sense,” write Hult and Hornberger (2016, 36), “language rights can be understood as what is legally codified about language use, often with special attention to the human and civil rights of minorities to use and maintain their languages.” According to Hamel (1997a, 3), “linguistic rights relate either to subordinate minorities and peoples, or to dominant groups who want to perpetuate their linguistic rule and privileges through legislation.” The construct of language planning orientations, introduced in 1984 by Richard Ruiz, provides a useful heuristic for examining language rights research and practice. “Orientation,” Ruiz states, “refers to a complex of dispositions toward language and its role, and toward languages and their role in society” (1984, 16). He continues: Orientations are basic to language planning in that they delimit the ways we talk about language and language issues, they determine the basic questions we ask, the conclusions we draw from the data, and even the data themselves. Orientations . . . help to delimit the range of acceptable attitudes toward language. . . . In short, orientations determine what is thinkable about language in society. (1984, 16)

Ruiz proposed three overarching orientations toward language: language as problem, right, and resource. A language-as-problem orientation constructs certain languages and their speakers in deficit terms, linking those languages to “poverty, handicap, low educational achievement, little or no social mobility” (Ruiz 1984, 19). This orientation characterizes the history of settler colonialism, particularly in the realm of compulsory education. A language-as-right orientation is concerned with linguistically influenced access to social resources, including legal services, education, health care, and employment. In counterpoint to these two orientations, Ruiz proposed language-as-resource, an orientation that promotes bi/multilingualism for all.2 We can extend the orientation heuristic to examine a continuum of approaches to Indigenous language rights. At one end of the continuum is a “prohibition orientation,” “the goal of which is clearly to force the linguistic minority group to assimilate to the dominant language” (Skutnabb-Kangas and Phillipson 1995, 79). Such an orientation can be characterized as language-restrictionist (Gándara and Hopkins 2010), linguistically repressive, and hostile (Del Valle 2003). Colonial schooling and proscriptive medium-of-instruction policies are prime examples of this orientation. At the other end of the continuum is a promotion orientation that is both individual and collective in scope. The various orientations along the language rights continuum are discussed in the sections that follow.

1 For a helpful brief “history of language rights in the West,” see Skutnabb-Kangas 2000, 505–511.

2 For a summary of the “dispositions” informing each language planning orientation, see Hult and Hornberger 2016, 33.

2.1. Tolerance versus promotion orientations

This orientation reflects a distinction made by the German linguist Heinz Kloss (1977[1998]) in The American Bilingual Tradition. Tolerance-oriented rights protect minoritized speakers’ “right to cultivate their language in a private sphere, namely, in the family and in private organizations” (Kloss 1977[1998], 20). In contrast, promotion-oriented rights “regulate how public institutions may use and cultivate the languages and cultures of the minorities” (Kloss 1977[1998], 20). It is important to point out that Kloss applied this formulation solely to non-Indigenous settlers: Spaniards in what is now the US Southwest, Germans in the Northeast, Midwest, and South, and “late cosettlers” such as French Canadians, Scandinavians, and Mexicans (1977[1998], 16–17). Although subsequent scholars have used Kloss’s distinction to analyze Indigenous language rights (May 2012a, 2014; Skutnabb-Kangas 2000, 2012), Native Americans are excluded from Kloss’s “American bilingual tradition.”

2.2. Norm-and-accommodation versus official-language rights

This approach juxtaposes the normalization of a dominant language as the “language of public communication . . . [unless] some special circumstance arises,” with formalized, government-backed rights (Patten and Kymlicka 2003, 27). An example of norm-and-accommodation rights is the provision of an interpreter when needed for legal proceedings, or of bilingual ballots, where the dominant language, though not “official” (as, for instance, in the United States), nonetheless remains the unquestioned language of power. Promotion-leaning examples of official-language rights are the 1987 designation of Māori as official in Aotearoa/New Zealand, the 2007 Nunavut Official Languages Act recognizing Inuktitut and Inuinnaqtun as co-official with French and English in the northern Canadian territory of Nunavut, and the post-apartheid officialization of nine Indigenous languages (along with English and Afrikaans) in South Africa. A prohibition-leaning example of official-language rights is the movement to make English the sole official language of the United States (English being already normalized as the national language).


2.3. Personality versus territoriality orientations

This approach distinguishes universal from territorially circumscribed rights. The personality principle entails that “language rights follow persons wherever in the state they may choose to live” (Patten and Kymlicka 2003, 29). Canada’s provision of federal government services in English and French throughout the country is an example of the personality principle. Notably, however, speakers of Aboriginal Canadian languages do not enjoy the same right. The territorial orientation “involves an attempt to divide a multilingual state into a series of unilingual regions, in which only the local majority language gets used in a variety of public contexts” (Patten and Kymlicka 2003, 29). Belgium’s use of French and Dutch in Wallonia and Flanders, respectively, and the formalized use of specific languages in the autonomous regions of Spain, are examples of the territorial principle. Historically, these principles have been nonexistent or only partial when applied to Indigenous peoples. In the United States, for example, official language policies developed by some Native nations designate the local Indigenous language as official on tribal lands, including its use in schools (Zepeda 1990). This is an application of the territorial principle—yet English remains dominant. Conversely, the right to use a Native American language outside of tribal lands—an application of the personality principle—exists in theory, but in practice, is not guaranteed or normalized by dominant-language regimes.

2.4. Individual versus collective rights

This final set of orientations can be seen as a parallel to tolerance versus promotion rights. Tolerance of individual rights to use a minoritized language in private domains is “relatively uncontroversial,” says May (2011, 265–266), but whether speakers of minoritized languages “have the right to maintain and use that language in the public or civic realm,” including promotion via bi/multilingual education, remains in question. The UN Universal Declaration of Human Rights, for example, views rights as residing primarily, if not exclusively, in individuals as citizens. Yet, as Skutnabb-Kangas (2000, 483) points out, the collective rights of minoritized speakers are the “essential tools” through which they gain access to the rights that dominant-language speakers “are granted through individual rights.”

2.5. Individual rights plus collective rights

For Indigenous peoples seeking to reclaim, revitalize, and sustain ancestral languages, approaches that combine individual and collective rights are the most powerful and necessary. Two prominent examples are the 1992 UN Declaration on the Rights of Persons Belonging to National or Ethnic or Religious Minorities (UNDRPBNERM; United Nations General Assembly 1992), and the 2007 UN Declaration on the Rights of Indigenous Peoples (UNDRIP). The 1992 Declaration states that minoritized peoples “have the right to enjoy their own culture, . . . and to use their own language, in private and in public” (United Nations 1992, Article 2.1). UNDRIP, one of the most hard-fought and significant international conventions on Indigenous rights, states unequivocally that: Indigenous peoples have the right to revitalize, use, develop and transmit to future generations their histories, languages, oral traditions, philosophies, writing systems and literatures, and to designate and retain their own names for communities, places and persons. . . . Indigenous peoples have the right to establish and control their educational systems and institutions providing education in their own languages, in a manner appropriate to their cultural methods of teaching and learning . . . [and that states] shall, in conjunction with indigenous peoples, take effective measures, in order for indigenous individuals . . . to have access, when possible, to an education in their own culture and provided in their own language. (United Nations General Assembly 2007, Articles 13.1, 14.1, 14.3)

Table 4.1 Language planning orientations (Ruiz 1984) vis-à-vis the continuum of Indigenous and minoritized language rights (Sources: May 2012a; Patten and Kymlicka 2003; Skutnabb-Kangas 2000)
Language planning orientations (Ruiz 1984): Problem; Right; Resource
Continuum of language rights (prohibition to promotion): Prohibition/Restrictionist; Tolerance vs. Promotion; Accommodation vs. Official; Personality vs. Territoriality; Individual vs. Collective

Table 4.1 shows the integration of Ruiz’s (1984) language planning orientations with the continua of language rights discussed above. As can be seen, prohibition/restrictionist/repressive/hostile approaches are, in intent and effect, problem orientations that are anti-Indigenous and anti-minority rights. Tolerance and accommodation provisions may be both problem- and rights-oriented; territoriality provisions are rights-based. Only promotion, personality, and collective rights recognize language as both right and resource. But, while these latter approaches, exemplified in UNDRIP and UNDRPBNERM, are promising, declarations such as these are non-binding; they do not guarantee these rights. For instance, “there is nothing in these articles about the state having to allocate resources” for promotion-oriented rights (Skutnabb-Kangas and May 2017, 132–133).

2.6. Linguistic human rights as part of language rights

As the discussion above suggests, language rights are intimately tied to human rights. Yet, scholarship connecting the two has emerged only in the last few decades. Further, until the mid-1990s, linguistic human rights were “absent from binding international human rights instruments, especially in education” (Skutnabb-Kangas 2000, 482). Seminal publications include Skutnabb-Kangas, Phillipson, and Rannut’s (1995) Linguistic Human Rights and a special issue of the International Journal of the Sociology of Language on “Linguistic Human Rights from a Sociolinguistic Perspective” (Hamel 1997b). This and more recent interdisciplinary scholarship have grown directly out of concerns about worldwide language endangerment (May 2004). To frame this growing area of scholarship and international convention and law, Skutnabb-Kangas uses this formula: “human rights + language rights = linguistic human rights” (2000, 484). While some scholars assert that language rights are “perforce” a part of human rights (Makoni 2012, 2), others argue that language rights, while implicating human rights, do not necessarily bring the two together in principle or practice

(Skutnabb-Kangas 2000, 2012). Moreover, scholars have pointed to the challenges of both conceptualizing and actualizing linguistic human rights. Makoni (2012, 2), for instance, notes the “fuzziness of language boundaries” and “fluidity in language identity” as complicating factors that undermine linguistic human rights (see May 2012b and May 2018 for a critique and call for reinstating minoritized and Indigenous languages into core language domains, particularly education). Perhaps the stiffest challenge to achieving genuine sociolinguistic parity and pluralism is the fact that the already-advantaged status of dominant language regimes means they “will continue to dominate in most if not all language domains” (May 2012c, 137). For this reason, May (2011, 283) refers to language rights—and in particular collective and promotion-oriented rights—as the “Cinderella” right, requiring concerted political will to be invited to the metaphoric human rights “ball.”

2.7. Indigenous perspectives on language rights

Indigenous scholarship helps bring the challenges and possibilities of these various rights orientations into focus. For example, Hopi scholar Sheilah Nicholas (2014) examines how Hopi people living in what is now the southwestern United States conceptualize and enact linguistic and cultural rights. For Hopi, individual and collective rights coexist with responsibilities and reciprocity within an Indigenous ontological and epistemological paradigm of long-life sustenance, health, and happiness. Illustrating that paradigm with public dance performances keyed to the Hopi ceremonial calendar, she writes: These ancestral traditions are ritual practices that convey, through myriad forms of language, . . . the Hopi way of life. . . . Moreover, according to Hopi belief, this is the Hopi birthright, itaamakiwa, inherent with individual and collective responsibilities in the pursuit of life’s fulfillment on the journey toward old age and a spiritual destiny. (cited in McCarty, Nicholas, and Wyman 2015, 234)

Building on Nicholas’s analysis and related ethnographic studies of Yup’ik and Navajo in the United States, McCarty et al. (2015, 229) propose a framework rooted in the “4 Rs”: language rights, resources, responsibilities, and reclamation. From this perspective, “language rights can be understood as both individual and collective, inhering in birthright within a distinctive community or communities and in the political sovereignty of Native American peoples.” Diné-Lakota scholar Tiffany Lee (2016) links linguistic rights to the Diné (Navajo) concept of k’é, a constellation of values centered on kinship, love, compassion, unselfishness, and peacefulness. K’é anchors Indigenous self-determined efforts to promote and protect ancestral languages in both private and public domains, including education. While challenging to implement in practice, Lee offers several contemporary community-based language revitalization initiatives among the Diné and Puebloan peoples of the US Southwest as exemplars of “collaboration, respect, and shared

decision making” guided by the practice of k’é (2016, 113). “For the security and continuity of Indigenous lands, languages, cultures, epistemologies, families, and resources,” k’é represents “a solid and promising foundation for ensuring our future” (Lee 2016, 113). Similarly, in sub-Saharan Africa, Makalela (2015, 2017) and Makoni (2012) tie language rights to an Indigenous African epistemological system based on ubuntu, “realized in the injunction: ‘I am because you are, you are because we are’ ” (Makalela 2017, 521). Ubuntu “valorizes interdependence, fluidity, and flexibility of cultural and linguistic systems,” Makalela explains, breaking down “boundaries between languages” and de-naturalizing unilingualist and problem-oriented language rights approaches (2017, 527). Through ubuntu, language planners can rediscover “a plural vision of interdependence, fluid and overlapping discursive system that matches ways of communicating where the use of one language is incomplete without the other” (2017, 527). In sum, LPP research and practice show a clear need for both individual and collective rights within a promotion-, reclamation-, and sustainability-oriented paradigm. Precisely because of extreme language endangerment, its association with often-violent colonial encounters (Meek 2010), and the ongoing oppression of Indigenous peoples, an individual rights orientation (e.g., Wee 2011) is dangerously inadequate. As Hirvonen (2008, 38) notes for Sámi, the threatened status of the language means that it requires “special support and positive discrimination.” Discussing Indigenous language rights in Latin America, López (2008) argues for promotion-oriented policies that develop plurilingualism and interculturality among all social sectors. 
“The bottom line,” he asserts (2008, 60), is that: Indigenous people are now conscious that their continuity as different, and their possibilities of exercising the right to be so, are in jeopardy if society as a whole does not change its perception and interpretation of what is Indigenous and, hence, of the multiethnic character of the society. In other words, for the Indigenous population . . . to continue resisting the forced assimilation process that the . . . hegemonic sector in control of the nation-state imposed upon them, these same powerful . . . sectors have also got to open up their minds and hearts to the diversity that has always characterized the countries they are part of.

3. Indigenous language rights in education3

3 Parts of this section are modified from McCarty (2018), and used with permission of Oxford University Press.

Education is arguably the key public domain in which the centuries-long fight for Indigenous language rights has been waged. Writing more than forty years ago about American Indian and Alaska Native language education, Kari and Spolsky (1973) pointed out that schools have been the only social institution to both demand exclusive use of English and prohibit the use of Indigenous mother tongues. “Before colonization,” write Brock-Utne and Hopson (2005, 3) of the situation in sub-Saharan Africa, different ethnolinguistic groups “did not have a language of instruction problem. Colonial education,” they continue, “created a social division among Africans based on the mastery of the colonial.” Thus, colonial schooling and medium-of-instruction policies have been (and are) primary instruments of linguistic assimilation and minoritized-language endangerment (Tollefson and Tsui 2004). By requiring education only in the dominant language, language-restrictionist policies seek to “erase and replace” linguistically encoded knowledge systems and cultural identifications with those associated with majoritarian interests (Lomawaima and McCarty 2006, xxii). Such language education policies are associated with myriad educational, economic, and social inequities, including low rates of education attainment by Indigenous children and youth and high rates of poverty, clinical depression, and teen suicide (Brayboy and Maaka 2015; Skutnabb-Kangas and Dunbar 2010). Yet, in recent years, schools have become significant sites for reclaiming and exercising language rights. “The school appears to be seen not only as the place and instrument to conquer the bastions of the hegemonic society,” López (2008, 60) observes, “but also as the context and tool to recreate knowledge and local wisdom, to revitalize or even recover a vulnerable language or one that is at the verge of extinction.” Hornberger (1988) was among the first to offer a systematic, in-depth examination of whether schools and language education policies can, in fact, serve this purpose. In an ethnographic case study of bilingual education policy and practice in the largely rural, Quechua-speaking Department of Puno, Peru, she asked, can language maintenance be planned? And, can schools be effective agents for language maintenance? 
The answers, Hornberger found, are rooted in local language practices and wider language ideologies that position Quechua as the extra-school, ayllu or home-community language, and Spanish as the language of the school and other outside institutions. Hornberger showed how increasing contact with dominant-Spanish speakers, combined with the marginalization of Quechua speakers, militated against micro-level, tolerance-oriented, individual language rights, while problems of implementation and overall government instability undermined macro-level, promotion-oriented, collective rights. Weighing these interlinked factors, she concluded: “If a bilingual education program is to make any contribution to language maintenance, it seems most likely that it should be an enrichment . . . or two-way, bilingual education [reflecting] a valuing of Quechua by society not only for Quechua speakers but also for non-Quechua speakers” (1988, 236). In an edited volume published twenty years later, Hornberger (2008) asked a similar question: “Can schools save Indigenous languages?” The answer, she and the volume contributors indicate, is a qualified, “No, but.” As McCarty (2008, 175) writes, “No, schools alone cannot do the job, but in tandem with other social institutions they can be (and have been) a strategic resource for exerting Indigenous language and education rights.” So it is within these spaces of possibility and constraint—what Hornberger (2006) refers to as “ideological and implementational spaces”—where language rights in education are negotiated. Such negotiations may be as “micro” as the minute-by-minute interactions of students and their teachers. For example, Hornberger’s (1988) ethnographic study documents a classroom activity in which the second-grade teacher asks

her Quechua-speaking students to draw and label, in Spanish, certain vegetable foods. Despite the fact that these were foods with which the children were familiar, they foundered in trying to accomplish the task. However, when instruction was provided in Quechua, the situation was transformed, as mother-tongue instruction enabled the children to access academic content using the linguistic resources they already possessed while acquiring new knowledge and linguistic resources in Spanish. Since Hornberger’s original study, a significant body of scholarship, much of it ethnographic, has explored the interaction of micro-level processes like these with meso-level school and community processes and those at the level of official language policy. The cases of Māori and Hawaiian are instructive. Both are Eastern Polynesian languages, and their revitalization over the past four decades has followed parallel pathways. In both cases, Indigenous peoples experienced “political disenfranchisement, misappropriation of land, population and health decline, educational disadvantage and socioeconomic marginalization” associated with language shift (May 2005, 366). In 1978, an Indigenous grassroots “Hawaiian Renaissance” led to the designation of Hawaiian as co-official with English in the State of Hawai‘i (Wilson 2014). In 1987, a similar Indigenous movement made Māori official in Aotearoa/New Zealand. At the same time, a parent-led movement in both Aotearoa/New Zealand and Hawai‘i led to the establishment of Māori Kōhanga Reo and Hawaiian Pūnana Leo (“language nest”) preschools, where all instruction is provided in the Indigenous language (in almost all cases, children’s second language). This form of instruction, called “revitalization immersion” (McIvor and McCarty 2016) or “education for language revitalization” (López and García 2016), spread horizontally to other communities and vertically by grade level. 
Today there are scores of Māori and Hawaiian partial- and full-immersion pre-K–12 schools, as well as university-based programs dedicated to the promotion of these languages in their national and state contexts, respectively. These reclamation efforts are widely recognized as language rights victories that have spearheaded revernacularization of these Indigenous languages, significantly improved students’ academic attainment, and served as models for Indigenous communities throughout the world. In Latin America, for example, López and García (2016) report on Māori-influenced language nest preschools in Oaxaca, Mexico, and Baré “language niche” preschools in Venezuela. Similar language reclamation efforts are under way in Canadian Aboriginal communities. One recent study compared academic and linguistic outcomes, including postsecondary preparedness, for Mi’kmaq students enrolled in revitalization immersion and nonimmersion programs in New Brunswick. These researchers found that “students in the immersion program not only had stronger Mi’kmaq language skills” compared to students in the nonimmersion program—an outcome we would expect—but “students within both programs ultimately had the same level of English” (Usborne et al. 2011, 200). Similar findings have been reported for Kanienke:ha (Mohawk), Cree, and Secwepemc in Canada (McIvor and McCarty 2016), indicating the potential for revernacularization, bi-/multilingual development, and enhanced academic outcomes through such education programming.

The sociolinguistic context in other parts of the world has given rise to different language education approaches. Within the Indian subcontinent, more than 750 languages are spoken (10% of all languages in the world). Nearly half of those languages are endangered. Mohanty and Panda (2017) link this to a “double divide” between English and major regional and national languages and between those languages and Indigenous languages. This multilayered linguistic hierarchy mirrors the unequal distribution of power and resources, “leading to disadvantage, marginalization, language shift and loss of linguistic diversity” (Mohanty and Panda 2017, 509). Similarly, in Nepal, “English is killing Hindi, which is killing Nepali, which is killing Nepal’s major indigenous languages” (Hough, Magar, and Yonjan-Tamang 2009, 163). At the same time, compulsory education in dominant languages has led to widespread education disparities for Indigenous students (Mohanty and Panda 2017).

Under such conditions, multilingual education has shown great promise. In the Indian states of Andhra Pradesh and Orissa, Mohanty (2010) and Mohanty et al. (2009) report on mother-tongue based multilingual education in approximately 1,500 schools serving students from eighteen Indigenous groups. These programs use Telugu in Andhra Pradesh and Oriya in Orissa as the medium of instruction for all school subjects through grade 4, after which Telugu or Oriya is used alongside English, then Hindi. The Orissa program, Multilingual Education Plus (MLE+), uses a culture-based curriculum to “foster collaborative classroom learning and cultural identity” (Mohanty 2010, 172). According to Mohanty et al. (2009, 295), these programs have “improved the basic competencies of literacy and numeracy among all children, increased their school attendance and . . . resulted in greater parental satisfaction and community involvement.”

In Nepal, over 200 languages are spoken and half the nation’s 23 million people are non-Nepali speakers. The Multilingual Education Project for All Non-Nepali Speaking Students of Primary Schools is a grass-roots program in six Indigenous communities (Hough et al. 2009). In this project, education practitioners and community members have collaborated to develop and implement a critical Indigenous pedagogy centered on local languages and knowledge systems, including traditional healing and agricultural practices, oral histories, collectivism, life cycle rituals, and an Indigenous numerical system. This “is still very much a work in progress,” say Hough et al. (2009, 175), but evidence to date indicates the project is having a positive impact on curricular reforms within local schools and teacher training programs.

In Ethiopia, the second most populous country in sub-Saharan Africa with eighty ethnolinguistic groups, a 1994 education policy extended the use of Ethiopian languages (in addition to Amharic) on a regional basis. The policy specifies eight years of mother-tongue medium schooling, plus Amharic as a subject for students who do not have this as a mother tongue, English as a subject to the end of year 8, and transition to English-medium instruction in year 9. In a review of policy implementation across Ethiopia, Heugh et al. (2010) found that students who had mother-tongue instruction for a full eight years of their primary schooling performed as well as or better than their English medium-of-instruction peers on assessments of English, science, and mathematics. Summing up 100 years of education research on the African subcontinent, Makalela (2005, 164) states that bi/multilingual education “accelerates academic success, . . . provides psychological support necessary to nourish cognitive development [and] enhances an autonomous worldview.” This form of education is also a force for sustaining African languages and challenging prohibition-oriented rights.

Latin America is home to 40 to 50 million Indigenous people who speak as many as 700 languages (López and Sichra 2008). UNESCO identifies more than 100 of these languages as endangered (Maxwell 2016). Latin American republics were founded on an exogenous ideology of monolingual/monocultural national identity as a precursor to citizenship and equality for all (García, López, and Makar 2010). There has been little equality, however, as this citizenship model has excluded women, the poor, and Indigenous peoples. Throughout the region, Indigenous language rights have come relatively recently, with major gaps between official policies and local conditions, including racism and profound education and economic disparities. At the same time, recent decades have also witnessed important changes brought about by a growing “Indigenous conquest” of linguistic and education rights (Rockwell and Gomes 2009). Today, the constitutions of most Latin American countries recognize the rights of Indigenous peoples to retain and sustain their languages and lifeways (Haboud et al. 2016; Maxwell 2016).

Education for language revitalization (ELR) and bilingual/intercultural education (BIE) are major foci of these efforts (López and García 2016). BIE tends to be school-based, while ELR “goes beyond school activities” such that “homes, schools, and communities complement one another” (López and García 2016, 121). One example of the strategic coupling of school- and community-based language planning is the long-standing Indigenous Bilingual Education Teacher Training Program of the Peruvian Amazon Basin (FORMABIAP).
FORMABIAP prepares primary school teachers from fifteen Indigenous groups, some of whom, like language immersion teachers in North America, are learning their heritage/community language as a second language. López and García (2016, 125) examine the “complex and conflicting experience” of Kukama-Kukamirias FORMABIAP participants, whose “external ascription as Indigenous was accompanied by the shame endured for not speaking Kukama.” Elder-led language workshops, dialogue sessions, and community-based activities such as a Kukama literary competition have helped the teachers recover their ancestral language. Overall, say López and García (2016, 126), “Kukamas are now convinced of the need to recuperate their language and culture.”

López and García (2016) also report on a promising Kaqchikel (Mayan)-medium school near Guatemala City that combines community-based cultural learning with bilingual Kaqchikel-Spanish instruction. “The school staff is determined to restore oral Kaqchikel in everyday life,” these researchers state (2016, 127). Maxwell (2016) and Messing and Nava (2016) provide myriad examples of ELR and BIE for Nahuatl, Hñahñu, and Isthmus Zapotec in Mexico; Mayan in Guatemala; Quechua, Shuar, Aymara, and Guarani in the Andes; and Mapuche and Rapa Nui in Chile. (For additional examples, see Coronel-Molina and McCarty 2016; de León 2017; and Haboud and Limerick 2017.)

Where there are relatively smaller numbers of Indigenous-language speakers, as in the United States, Canada, Aotearoa/New Zealand, Australia, and the Nordic countries, Indigenous-language revitalization efforts face the challenges of dwindling numbers of speakers (hence, teachers are often second-language learners); the need to create writing systems and teaching materials “from scratch”; and the legacy of colonial ideologies that learning the Indigenous language “holds children back” academically (Wyman et al. 2010, 38). These challenges are exacerbated by the absence of supportive state-level policies and by official policies that violate inherent Indigenous sovereignties and state-guaranteed language rights. Wilson (2012), for instance, examines a boycott by Native Hawaiian parents of mandated English-language testing for students enrolled in Hawaiian immersion schools. Increasing urbanization adds further challenges by separating learners from Indigenous-language strongholds. One innovative response is a call by the New Zealand Māori Council to establish Māori housing clusters near tribal cultural centers, with preference given “to those committed to speaking te reo [Māori language]” (New Zealand Māori Council 2016, para. 3). Other creative strategies include Indigenous language houses “where people live together committed to using the language with each other,” and language pods: “groups of speakers and advanced learners who get together on a regular basis to converse on various topics” (Hinton 2017, 266).

In regions of the world where there are still significant numbers of speakers of all generations, as in Africa and Asia, bi/multilingual education continues to confront the “ideological paradox” of “constructing a national identity that is also multilingual and multicultural” (Hornberger 2000, 173).
May (2014), citing an early analysis by Bullivant (1981), refers to this as the pluralist dilemma, in which diverse ethnolinguistic group-rights claims must be reconciled with those of the nation-state. In all of these regional contexts, national policies often force an early exit from bi/multilingual schooling (where it exists), limiting children’s acquisition of the Indigenous language and the equality of its social standing (Mohanty and Panda 2017). The root causes of these tensions are fundamental and often racialized structural inequalities, ideologies of linguistic and cultural deficit, and the remedial framing of Indigenous bi/multilingual education. Summing up these issues, López and Sichra (2008, 306) note the urgency “to abandon once and for all the compensatory understanding of [bi/multilingual education] and to regard it as an approach for better educational quality in general.”

4. Moving forward—from miner’s canary to mariner’s tern

This chapter began by quoting legal ethicist Felix Cohen, who likened Native American experiences to the miner’s canary. Native experiences, Cohen reasoned, represent a “test” of democracy for the nation-state as a whole. In their book, The Miner’s Canary, Guinier and Torres (2002, 11) apply Cohen’s metaphor to the treatment of people based on political race: “Those who are racially marginalized are like the miner’s canary: their distress is the first sign of a danger that threatens us all.” I have argued that the miner’s canary is an equally appropriate, if troubling, metaphor for the treatment of people based on language. To rephrase Guinier and Torres’s point:

[Linguicized] communities signal problems with the way we have structured power and privilege. These pathologies are not located in the canary. . . . Such an approach would solve the problems of the mines by outfitting the canary with a tiny gas mask to withstand the toxic atmosphere. (2002, 12)

Focusing on Indigenous language rights forces us to look beyond the metaphoric canary and confront the “air quality in the mines” (Guinier and Torres 2002, 12): the inequalities that marginalize Indigenous communities while privileging dominant social class and raciolinguistic interests. (For comparative analyses, see Alim, Rickford, and Ball 2016; Flores and Rosa 2015.) In the final analysis, Indigenous experiences must symbolize more than the diagnostic canary, whose full utility is realized through self-sacrifice. As Guinier and Torres (2002, 12) point out, the canary alerts us “to both danger and promise,” and as such, signals “the need to rebuild a movement for social change informed by the canary’s critique.” In this spirit, a more forward-looking metaphor is the mariner’s tern. Indigenous peoples in Polynesia have long known that the flight pattern of a certain species of tern provides a systematic and dependable means of sea navigation. These birds nest on land and cannot swim. Each day they fly to sea at dawn in search of food, returning to land at dusk. By carefully observing the terns’ flight habits, Polynesian mariners have for centuries been able to determine both the direction and the approximate distance of land. In a similar manner, Indigenous experiences provide a means of “way-finding”—a path through which we might work around and through linguistic repression and the structural inequalities it reflects and creates, to reclaim the reciprocal, pluricultural language rights vision reflected in the Diné concept of k’é and the African notion of ubuntu. Contemporary Indigenous language movements are part of this larger project, the object of which is the right to choose linguistic allegiances and practices, based on the principles of peoplehood and self-determination.

References

Alim, H. Samy, John R. Rickford, and Arnetha F. Ball, eds. 2016. Raciolinguistics: How Language Shapes Our Ideas About Race. New York: Oxford University Press.
Brayboy, Bryan M. J., and Margaret J. Maaka. 2015. “K–12 Achievement for Indigenous Students.” Journal of American Indian Education 54(1): 63–98.
Bullivant, Brian. 1981. The Pluralist Dilemma in Education: Six Case Studies. Sydney: Allen and Unwin.
Cohen, Felix S. 1953. “The Erosion of Indian Rights, 1950–1953: A Case Study in Bureaucracy.” Yale Law Review 62: 349–390.

Commonwealth of Australia. 1997. Bringing Them Home: Report of the National Inquiry into the Separation of Aboriginal and Torres Strait Islander Children from Their Families. Sydney: Human Rights and Equal Opportunity Commission.
Coronel-Molina, Serafín M., and Teresa L. McCarty, eds. 2016. Indigenous Language Revitalization in the Americas. New York: Routledge.
de León, Lourdes. 2017. “Language Policy and Indigenous Education in Mexico.” In Encyclopedia of Language and Education, Vol. 1: Language Policy and Political Issues in Education, 3rd ed., edited by Teresa L. McCarty and Stephen May, 415–433. Cham, Switzerland: Springer International.
Del Valle, Sandra. 2003. Language Rights and the Law in the United States: Finding Our Voices. Clevedon, UK: Multilingual Matters.
Dunbar-Ortiz, Roxanne, Dalee Sambo Dorough, Gudmundur Alfredsson, Lee Swepston, and Petter Wille. 2015. Indigenous Peoples’ Rights in International Law: Emergence and Application. Kautokeino and Copenhagen: Gáldu Resource Centre for the Rights of Indigenous Peoples and International Work Group for Indigenous Affairs.
Flores, Nelson, and Jonathan Rosa. 2015. “Undoing Appropriateness: Raciolinguistic Ideologies and Language Diversity in Education.” Harvard Educational Review 85(2): 149–171.
García, Ofelia, Dina López, and Carmina Makar. 2010. “Latin America.” In Handbook of Language and Ethnic Identity, Vol. 1: Disciplinary and Regional Perspectives, 2nd ed., edited by Joshua A. Fishman and Ofelia García, 353–373. Oxford: Oxford University Press.
Guinier, Lani, and Gerald Torres. 2002. The Miner’s Canary: Enlisting Race, Resisting Power, Transforming Democracy. Cambridge, MA: Harvard University Press.
Haboud, Marleen, Rosalee Howard, Josep Cru, and Jane Freeland. 2016. “Linguistic Human Rights and Language Revitalization in Latin America and the Caribbean.” In Indigenous Language Revitalization in the Americas, edited by Serafín M. Coronel-Molina and Teresa L. McCarty, 201–223. New York: Routledge.
Haboud, Marleen, and Nicholas Limerick. 2017. “Language and Education in Bolivia, Ecuador and Peru.” In Encyclopedia of Language and Education, Vol. 1: Language Policy and Politics in Education, 3rd ed., edited by Teresa L. McCarty and Stephen May, 435–447. Cham, Switzerland: Springer International.
Hamel, Rainer Enrique. 1997a. “Introduction: Linguistic Human Rights in a Sociolinguistic Perspective.” International Journal of the Sociology of Language 127: 1–24.
Hamel, Rainer Enrique, guest ed. 1997b. “Linguistic Human Rights from a Sociolinguistic Perspective” (Theme issue). International Journal of the Sociology of Language 127: entire.
Heugh, Kathleen, Carol Benson, Mekonnen Alemu Gebre Yohannes, and Berhanu Bogale. 2010. “Multilingual Education in Ethiopia: What Assessment Shows About What Works and What Doesn’t.” In Multilingual Education Works: From the Periphery to the Centre, edited by Kathleen Heugh and Tove Skutnabb-Kangas, 287–315. New Delhi: Orient BlackSwan.
Hinton, Leanne. 2017. “Language Endangerment and Revitalization.” In Encyclopedia of Language and Education, Vol. 1: Language Policy and Politics in Education, 3rd ed., edited by Teresa L. McCarty and Stephen May, 257–272. Cham, Switzerland: Springer International.
Hirvonen, Vuokko. 2008. “‘Out on the Fells, I Feel Like a Sámi’: Is There Linguistic and Cultural Equality in the Sámi School?” In Can Schools Save Indigenous Languages? Policy and Practice on Four Continents, edited by Nancy H. Hornberger, 15–41. New York: Palgrave Macmillan.

Hohepa, Margie Kahukura. 2006. “Biliterate Practices in the Home: Supporting Indigenous Language Regeneration.” Journal of Language, Identity, and Education 5: 293–315.
Hornberger, Nancy H. 1988. Bilingual Education and Language Maintenance: A Southern Peruvian Quechua Case. Dordrecht, Holland: Foris.
Hornberger, Nancy H. 2000. “Bilingual Education Policy and Practice in the Andes: Ideological Paradox and Intercultural Possibility.” Anthropology and Education Quarterly 31: 173–201.
Hornberger, Nancy H. 2006. “Nichols to NCLB: Local and Global Perspectives on US Language Education Policy.” In Imagining Multilingual Schools: Languages in Education and Glocalization, edited by Ofelia García, Tove Skutnabb-Kangas, and María Torres-Guzmán, 223–237. Clevedon, UK: Multilingual Matters.
Hornberger, Nancy H., ed. 2008. Can Schools Save Indigenous Languages? Policy and Practice on Four Continents. New York: Palgrave Macmillan.
Hough, David A., Ram Bahadur Thapa Magar, and Amrit Yonjan-Tamang. 2009. “Privileging Indigenous Knowledges: Empowering Multilingual Education in Nepal.” In Social Justice Through Multilingual Education, edited by Tove Skutnabb-Kangas, Robert Phillipson, Ajit K. Mohanty, and Minati Panda, 159–176. Bristol, UK: Multilingual Matters.
Hult, Francis M., and Nancy H. Hornberger. 2016. “Revisiting Orientations in Language Planning: Problem, Right, and Resource as an Analytic Heuristic.” The Bilingual Review 33: 30–49.
International Labour Organization (ILO). 1996–2016. Indigenous and Tribal Peoples Convention, 1989 (No. 169). www.ilo.org/dyn/normlex/en/f?p=NORMLEXPUB:12100:0::NO::P12100_ILO_CODE:C169. Accessed January 1, 2017.
International Labour Organization (ILO). 2009. Indigenous and Tribal Peoples’ Rights in Practice: A Guide to ILO Convention No. 169. Geneva: ILO.
Kāretu, Tīmoti S. 1995. “Māori Language Rights in New Zealand.” In Linguistic Human Rights: Overcoming Linguistic Discrimination, edited by Tove Skutnabb-Kangas, Robert Phillipson, and Mart Rannut, 209–218. Berlin: Mouton de Gruyter.
Kari, James, and Bernard Spolsky. 1973. Trends in the Study of Athapaskan Language Maintenance and Bilingualism (Navajo Reading Study Progress Report No. 21). Albuquerque, NM: Navajo Reading Study.
Kloss, H. 1977[1998]. The American Bilingual Tradition (reprint edition with a new introduction by Reynaldo F. Macías and Terrence G. Wiley). Washington, DC and McHenry, IL: Center for Applied Linguistics and Delta Systems Co.
Kroskrity, Paul V., and Margaret Field, eds. 2009. Native American Language Ideologies: Beliefs, Practices, and Struggles in Indian Country. Tucson: University of Arizona Press.
Lee, Tiffany S. 2016. “The Home-School-Community Interface in Language Revitalization in the USA and Canada.” In Indigenous Language Revitalization in the Americas, edited by Serafín M. Coronel-Molina and Teresa L. McCarty, 99–115. New York: Routledge.
Lomawaima, K. Tsianina, and Teresa L. McCarty. 2006. “To Remain an Indian”: Lessons in Democracy from a Century of Native American Education. New York: Teachers College Press.
López, Luis Enrique. 2008. “Top-Down and Bottom-Up: Counterpoised Visions of Bilingual Intercultural Education in Latin America.” In Can Schools Save Indigenous Languages? Policy and Practice on Four Continents, edited by Nancy H. Hornberger, 42–65. New York: Palgrave Macmillan.
López, Luis Enrique, and Fernando García. 2016. “The Home-School-Community Interface in Language Revitalization in Latin America and the Caribbean.” In Indigenous Language Revitalization in the Americas, edited by Serafín M. Coronel-Molina and Teresa L. McCarty, 116–135. New York: Routledge.
López, Luis Enrique, and Inge Sichra. 2008. “Intercultural Bilingual Education Among Indigenous Peoples in Latin America.” In Encyclopedia of Language and Education, Vol. 5: Bilingual Education, 2nd ed., edited by Jim Cummins and Nancy H. Hornberger, 295–309. New York: Springer.
Maffi, Luisa. 1995. “Linguistic and Biological Diversity: The Inextricable Link.” In Rights to Language: Equity, Power, and Education, edited by Robert Phillipson, 17–22. Mahwah, NJ: Lawrence Erlbaum.
Magga, Ole Henrik. 1995. “The Sámi Language Act.” In Linguistic Human Rights: Overcoming Linguistic Discrimination, edited by Tove Skutnabb-Kangas, Robert Phillipson, and Mart Rannut, 219–233. Berlin: Mouton de Gruyter.
Magga, Ole Henrik. 2015. “Indigenous Peoples’ Rights in Norway and the International Indigenous Movement.” In Indigenous Peoples’ Rights in International Law: Emergence and Application, edited by Roxanne Dunbar-Ortiz, Dalee Sambo Dorough, Gudmundur Alfredsson, Lee Swepston, and Petter Wille, 196–303. Kautokeino and Copenhagen: Gáldu Resource Centre for the Rights of Indigenous Peoples and International Work Group for Indigenous Affairs.
Makalela, Leketi. 2005. “‘We Speak Eleven Tongues’: Reconstructing Multilingualism in South Africa.” In Languages of Instruction for African Emancipation: Focus on Postcolonial Contexts and Considerations, edited by Birgit Brock-Utne and Rodney Kofi Hopson, 147–173. Cape Town and Dar es Salaam: Centre for Advanced Studies of African Society (CASAS) and Mkuki na Nyota Publishers.
Makalela, Leketi. 2015. “A Panoramic View of Bilingual Education in Sub-Saharan Africa.” In The Handbook of Bilingual and Multilingual Education, edited by Wayne E. Wright, Sovicheth Boun, and Ofelia García, 566–577. Malden, MA: Wiley Blackwell.
Makalela, Leketi. 2017. “Language Policy in Southern Africa.” In Encyclopedia of Language and Education, Vol. 1: Language Policy and Political Issues in Education, 3rd ed., edited by Teresa L. McCarty and Stephen May, 519–529. Cham, Switzerland: Springer International.
Makoni, Sinfree B. 2012. “Language and Human Rights Discourses in Africa: Lessons from the African Experience.” Journal of Multicultural Discourses 7: 1–20.
Maxwell, Judith. 2016. “Revitalization Programs and Impacts in Latin America and the Caribbean.” In Indigenous Language Revitalization in the Americas, edited by Serafín M. Coronel-Molina and Teresa L. McCarty, 247–265. New York: Routledge.
May, Stephen. 2004. “Rethinking Linguistic Human Rights: Answering Questions of Identity, Essentialism and Mobility.” In Language Rights and Language “Survival”: A Sociolinguistic Exploration, edited by Donna Patrick and Jane Freeland, 35–53. Manchester, UK: St. Jerome Publishing.
May, Stephen. 2005. “Introduction. Bilingual/Immersion Education in Aotearoa/New Zealand: Setting the Context.” International Journal of Bilingual Education and Bilingualism 8: 365–376.
May, Stephen. 2011. “Language Rights: The ‘Cinderella’ Human Right.” Journal of Human Rights 10: 265–289.
May, Stephen. 2012a. Language and Minority Rights: Ethnicity, Nationalism and the Politics of Language, 2nd ed. New York: Routledge.
May, Stephen. 2012b. “Contesting Hegemonic and Monolithic Constructions of Language Rights ‘Discourse.’” Journal of Multicultural Discourses 7: 21–27.

May, Stephen. 2012c. “Language Rights: Promoting Civic Multilingualism.” In The Routledge Handbook of Multilingualism, edited by Marilyn Martin-Jones, Adrian Blackledge, and Angela Creese, 131–142. London: Routledge.
May, Stephen. 2014. “Justifying Educational Language Rights.” Review of Research in Education 38: 215–241.
May, Stephen. 2018. “Unanswered Questions: Addressing the Inequalities of Majoritarian Language Policies.” In The Multilingual Citizen: Towards a Politics of Language for Agency and Change, edited by Lisa Lim, Christopher Stroud, and Lionel Wee, 65–72. Bristol, UK: Multilingual Matters.
McCarty, Teresa L. 2008. “Schools as Strategic Tools for Indigenous Language Revitalization: Lessons from Native America.” In Can Schools Save Indigenous Languages? Policy and Practice on Four Continents, edited by Nancy H. Hornberger, 161–179. New York: Palgrave Macmillan.
McCarty, Teresa L. 2013. Language Planning and Policy in Native America: History, Theory, Praxis. Bristol, UK: Multilingual Matters.
McCarty, Teresa L. 2018. “Revitalizing and Sustaining Endangered Languages.” In The Oxford Handbook of Language Policy, edited by James W. Tollefson and Miguel Pérez-Milans, 355–378. New York: Oxford University Press.
McCarty, Teresa L., Sheilah E. Nicholas, and Leisy T. Wyman. 2015. “50(0) Years Out and Counting: Native American Language Education and the Four Rs.” International Multilingual Research Journal 9: 227–252.
McIvor, Onowa, and Teresa L. McCarty. 2016. “Indigenous Bilingual and Revitalization-Immersion Education in Canada and the USA.” In Encyclopedia of Language and Education, Vol. 5: Bilingual and Multilingual Education, 3rd ed., edited by Ofelia García and Angel Lin. Cham, Switzerland: Springer International. doi:10.1007/978-3-319-02324-3_34-1.
Meek, Barbra A. 2010. We Are Our Language: An Ethnography of Language Revitalization in a Northern Athabaskan Community. Tucson: University of Arizona Press.
Messing, Jacqueline, and Refugio Nava Nava. 2016. “Language Acquisition, Shift, and Revitalization Processes in Latin America and the Caribbean.” In Indigenous Language Revitalization in the Americas, edited by Serafín M. Coronel-Molina and Teresa L. McCarty, 76–96. New York: Routledge.
Mohanty, Ajit. 2010. “Languages, Inequality and Marginalization: Implications of the Double Divide in Indian Multilingualism.” International Journal of the Sociology of Language 205: 131–154.
Mohanty, Ajit K., Mahendra Kumar Mishra, N. Upender Reddy, and Gumidyala Ramesh. 2009. “Overcoming the Language Barrier for Tribal Children: Multilingual Education in Andhra Pradesh and Orissa, India.” In Social Justice Through Multilingual Education, edited by Tove Skutnabb-Kangas, Robert Phillipson, Ajit K. Mohanty, and Minati Panda, 283–297. Bristol, UK: Multilingual Matters.
Mohanty, Ajit K., and Minati Panda. 2017. “Language Policy and Language Education in the Indian Subcontinent.” In Encyclopedia of Language and Education, Vol. 1: Language Policy and Political Issues in Education, edited by Teresa L. McCarty and Stephen May, 507–518. New York: Springer.
Nettle, Daniel, and Suzanne Romaine. 2000. Vanishing Voices: The Extinction of the World’s Languages. Oxford: Oxford University Press.
New Zealand Māori Council. 2016. “Media Statement on Māori Language Policy.” Press Release, July 18. http://www.scoop.co.nz/stories/PO1607/S00199/maori-language-policy.htm. Accessed August 4, 2016.

Nicholas, Sheilah E. 2014. “‘Being’ Hopi by ‘Living’ Hopi: Redefining and Reasserting Cultural and Linguistic Identity—Emergent Hopi Youth Ideologies.” In Indigenous Youth and Multilingualism: Language Identity, Ideology, and Practice in Dynamic Cultural Worlds, edited by Leisy T. Wyman, Teresa L. McCarty, and Sheilah E. Nicholas, 70–89. New York: Routledge.
Patten, Alan, and Will Kymlicka. 2003. “Language Rights and Political Theory: Context, Issues, and Approaches.” In Language Rights and Political Theory, edited by Will Kymlicka and Alan Patten, 1–51. Oxford: Oxford University Press.
Phillipson, Robert, Mart Rannut, and Tove Skutnabb-Kangas. 1995. Introduction to Linguistic Human Rights: Overcoming Linguistic Discrimination, edited by Tove Skutnabb-Kangas, Robert Phillipson, and Mart Rannut, 1–22. Berlin: Mouton de Gruyter.
Rockwell, Elsie, and Ana Maria R. Gomes, guest eds. 2009. “Introduction to the Special Issue: Rethinking Indigenous Education from a Latin American Perspective.” Anthropology and Education Quarterly 40: 97–109.
Romaine, Suzanne. 1991. Introduction to Language in Australia, edited by Suzanne Romaine, 1–24. Cambridge, UK: Cambridge University Press.
Ruiz, Richard. 1984. “Orientations in Language Planning.” NABE Journal 8(2): 15–34.
Skutnabb-Kangas, Tove. 2000. Linguistic Genocide in Education—Or Worldwide Diversity and Human Rights? Mahwah, NJ: Lawrence Erlbaum.
Skutnabb-Kangas, Tove. 2012. “Indigenousness, Human Rights, Ethnicity, Language and Power.” International Journal of the Sociology of Language 213: 87–104.
Skutnabb-Kangas, Tove, and Robert Dunbar. 2010. “Indigenous Children’s Education as Linguistic Genocide and a Crime Against Humanity? A Global View.” Gáldu Čála: Journal of Indigenous Peoples Rights 1: entire.
Skutnabb-Kangas, Tove, and Stephen May. 2017. “Human Rights and Language Policy in Education.” In Encyclopedia of Language and Education, Vol. 1: Language Policy and Political Issues in Education, 3rd ed., edited by Teresa L. McCarty and Stephen May, 125–141. Cham, Switzerland: Springer International.
Skutnabb-Kangas, Tove, Andrea Bear Nicholas, and Jon Reyhner. 2016. “Linguistic Human Rights and Language Revitalization in the USA and Canada.” In Indigenous Language Revitalization in the Americas, edited by Serafín M. Coronel-Molina and Teresa L. McCarty, 181–200. New York: Routledge.
Skutnabb-Kangas, Tove, and Robert Phillipson. 1995. “Linguistic Human Rights, Past and Present.” In Linguistic Human Rights: Overcoming Linguistic Discrimination, edited by Tove Skutnabb-Kangas, Robert Phillipson, and Mart Rannut, 71–110. Berlin: Mouton de Gruyter.
Skutnabb-Kangas, Tove, Robert Phillipson, and Mart Rannut, eds. 1995. Linguistic Human Rights: Overcoming Linguistic Discrimination. Berlin: Mouton de Gruyter.
Task Force on Aboriginal Languages and Cultures. 2005. Towards a New Beginning: A Foundational Report for a Strategy to Revitalize First Nation, Inuit and Métis Languages and Cultures. Ottawa: Aboriginal Languages Directorate, Department of Canadian Heritage.
Tollefson, James W., and Amy B. M. Tsui, eds. 2004. Medium of Instruction Policies: Which Agenda? Whose Agenda? Mahwah, NJ: Lawrence Erlbaum.
United Nations General Assembly. 1992. Declaration on the Rights of Persons Belonging to National or Ethnic, Religious and Linguistic Minorities. http://www.un.org/documents/ga/res/47/a47r135.htm. Accessed January 21, 2017.
United Nations General Assembly. 2007. Declaration on the Rights of Indigenous Peoples. New York: United Nations General Assembly. http://www.un.org/esa/socdev/unpfii/documents/DRIPS_en.pdf/. Accessed January 1, 2017.

United Nations Permanent Forum on Indigenous Issues (UN PFII). n.d. Who are Indigenous Peoples? http://www.un.org/esa/socdev/unpfii/documents/5session_factsheet1.pdf/. Accessed January 1, 2017.
Usborne, Esther, Josephine Peck, Donna-Lee Smith, and Donald M. Taylor. 2011. “Learning Through an Aboriginal Language: The Impact on Students’ English and Aboriginal Language Skills.” Canadian Journal of Education 34: 200–215.
wa Thiong’o, Ngugi. 2009. Something Torn and New: An African Renaissance. New York: BasicCivitas Books.
Wee, Lionel. 2011. Language Without Rights. Oxford: Oxford University Press.
Wilkins, David E., and K. Tsianina Lomawaima. 2001. Uneven Ground: American Indian Sovereignty and Federal Law. Norman: University of Oklahoma Press.
Wilson, William H. 2012. “USDE Violations of NALA and the Testing Boycott at Nāwahīokalani’ōpu’u School.” Journal of American Indian Education 51(3): 30–45.
Wilson, William H. 2014. “Hawaiian: A Native American Language Official for a State.” In Handbook of Heritage, Community, and Native American Languages in the United States, edited by Terrence G. Wiley, Joy K. Peyton, Donna Christian, Sarah Catherine K. Moore, and Na Liu, 219–228. New York and Washington, DC: Routledge and Center for Applied Linguistics.
Wyman, Leisy, Patrick Marlow, Fannie Cikuyaq Andrew, Gayle Sheppard Miller, Rachel Cikigaq Nicholai, and Nita Yurrliq Reardon. 2010. “Focusing on Long-term Language Goals in Challenging Times: A Yup’ik Example.” Journal of American Indian Education 49(1–2): 28–39.
Zepeda, Ofelia. 1990. “American Indian Language Policy.” In Perspectives on Official English: The Campaign for English as the Official Language of the USA, edited by Karen L. Adams and Daniel S. Brink, 247–256. Berlin: Mouton de Gruyter.

Part II

LANGUAGE DOCUMENTATION

Chapter 5

The Goals of Language Documentation

Richard A. Rhodes and Lyle Campbell

1. Introduction

There is a broad consensus in favor of the general idea that the goal of language documentation is to provide an accurate and adequate record of a language for posterity. That general formulation becomes a more complicated matter when it is broken down into its parts, of which there are three principal ones. What constitutes language documentation in the first place? How do we evaluate the accuracy of a documentary record? And what counts as having an adequate record? The goals of this chapter are to address these questions and to attempt to provide useful perspectives on what must go into answering them.1

The first question, the nature of language documentation, has been discussed extensively in recent literature on the topic (for example, Campbell 2016; Himmelmann 1998, 2006, 2012; Rehg 2007; Woodbury 2010; among others). We do not rehash these discussions here. Instead, with those debates as background, we offer our own position, which reflects Woodbury’s view (2010, 159): “Language documentation is the creation, annotation, preservation, and dissemination of transparent records of a language.” Below we return to this question to reinterpret, in particular, the distinctions that Himmelmann (especially 1998) has made.

We do not address the second question, about the accuracy of the record being developed in current language documentation projects, directly here. Recent developments in language documentation and technology for recording and archiving have made recording faster and better and facilitate checking the accuracy of the data, but questions of accuracy of description and analysis remain, the accuracy varying according to the skill and experience of those who engage

1

We thank Kenneth Rehg for very helpful comments on an earlier version of this chapter.

in the documentation of the particular languages. That leaves the matter of adequacy as the question that warrants fuller discussion here.

The question of adequacy is a complicated one. There are two primary audiences for the documentary record of a language, and these audiences’ interests may overlap, but they may also diverge. The two principal audiences are the scientific (academic) community and the heritage language community.2 The academic community values knowledge for its own sake, openness (Woodbury’s transparency), and accuracy and fullness of the record. Heritage communities vary widely in what they value, particularly with regard to coverage. Heritage communities typically share a deep sense of language as an emblem of their culture, and flowing from that, a sense of proprietary interest, but their connection to the language is not, and should not be, bounded by the concerns of the scientific community. (For more on this, see below.)

From both scholarly and utilitarian perspectives, there are rigorous expectations of what the adequate documentation of a language requires. These expectations not unreasonably include:

(1) A text corpus of appropriate size and cultural breadth, representing a wide range of genres;
(2) The description of the grammar (or development of extensive grammatical materials), based on an adequate analysis of the language, and a lexicon (dictionary);
(3) Materials in a form that can reasonably be expected to be accessible to future generations of the language community as well as to those with academic and other interests.

We could, like much of the literature on the matter, come at the question in the abstract. However, if we consider the substantial amount of currently extant language documentation covering hundreds of languages to varying degrees, we may get a clearer sense of what adequacy in language documentation looks like. In our approach to this we go back and forth between conditions of adequacy and works describing various languages as a way to evaluate how the proposed level of coverage comports with our ideals of adequate coverage.

2

In linguistics, both “heritage” or “heritage language” and “community” have been used with varying senses. By “heritage language community” we mean the community (also sometimes called language group, language population) that includes native speakers of the language (if any remain), second-language learners and semispeakers of the language, and non-speakers who believe themselves to be stakeholders because of their ethnic identity or group membership with the people who traditionally spoke the language in question. There are, of course, also other audiences with interest in documentary materials on particular languages, e.g., the curious public, educators, school children, media personnel, people interested in local geography and culture, those involved in administering and delivering social programs, friends of community members, etc.


2. Background and context

Humans have been documenting languages since the advent of writing. Most of this documentation is in the form of written records produced for particular communicative purposes, though perhaps most of it did not serve the speakers of the languages involved as writing serves the users of modern languages. Also, writers who left us this documentation did not set out with the goals of modern language documentation in mind. A telling early example of particular relevance to the history of documentation arises out of the needs of translation and scribal expertise that led to the creation of the bilingual Ebla tablets and lexicons (c.2500 bce), representing both the Eblaite and Sumerian languages. This record is perhaps the first example of an important aspect of documentation, the fact that bilingual documents incidentally include metalinguistic information (see Archi 2003). Conscious documentation, for the sake of language preservation, goes back at least as far as Yaska’s grammar of Sanskrit, sixth century bce, preceding Pāṇini by two centuries (Matilal 1990). There are likewise numerous Hellenistic grammarians (from the third century bce on) and Latin grammarians (from Stilo, 154–74 bce on) (Forbes 1933; Sandys 1903), who wanted to teach proper language use.

A number of descriptive linguistic traditions arose in antiquity as responses to language change and the need to continue to be able to read important texts. In the case of the Old-Babylonian tradition, when the first descriptive linguistic texts were composed, Sumerian, which was the language of religious and legal texts, was being replaced by Akkadian. This grammatical tradition emerged by about 1900 bce and lasted 2,500 years, so that Sumerian could be learned and these texts could continue to be read (Gragg 1995; Hovdhaugen 1982).

Language change and the need for accessibility also stimulated the Hindu grammatical tradition. The Vedas, the oldest of the Sanskrit memorized religious texts, date from c.1200 bce. Sanskrit was changing but ritual required exact verbal performance of the sacred texts. Rules of grammar were devised for learning and understanding the archaic language. Pāṇini’s famous description (c.500 bce, which also contains rules formulated by his predecessors in a tradition from the tenth to the seventh centuries bce) originated in comparisons between versions called padapāṭha (word-for-word recitation) and saṃhitāpāṭha (continuous recitation, of divine origin, unalterable) of the same Vedic texts. The grammatical rules were devised for this comparison and for checking textual accuracy, and technical methods of grammatical description were developed in connection with the formulation of these rules (Staal 1974).

The Greek grammatical tradition also began in response to language change; it was developed by schoolmasters, though it came to be known only through later writings of philosophers. Homer’s works (c.850 bce) were basic in early Greek education, but the Greek of the fifth to the third centuries bce had changed so much that explanations of Homer’s language were necessary for the schools. Observations taken from earlier school grammar are found in works of Plato, Aristotle, and the Stoics (Hovdhaugen 1982, 46). (See Campbell 2000, 81–82.)

These grammatical traditions notwithstanding, monumental value in the record of Sanskrit, Greek, and Latin resides in the sheer volume of what was written in each, a point we will return to below. This documentation, these written records, have allowed scholars at great remove from the last native speaker to do work of contemporary linguistic significance. Witness works like Deshpande’s on Sanskrit, Sociolinguistic Attitudes in India (1979), to mention but a single example. The value of the classical record is even clearer given the existence of scholarly communities entirely devoted to the contemporary study of those resources, for example, the International Colloquium on Latin Linguistics (see http://web.philo.ulg.ac.be/cill/en/ [accessed March 1, 2017]), the Journal of Greek Linguistics (http://www.brill.com/journal-greek-linguistics [accessed March 1, 2017]), several dedicated to research on Sanskrit, and many others on Latin and Greek.

European exploration in the Renaissance led explorers, particularly clerics, to provide documentation of the languages that they encountered. Nowhere is this truer than in Mesoamerica, the most outstanding example being Bernardino de Sahagún, whose record of Classical Nahuatl language and Aztec culture gives us an unparalleled window into this pre-conquest Mesoamerican culture (Dibble and Anderson 1950–1982; León-Portilla 2002).

While these examples demonstrate the great value of having an extensive corpus, few would look to these classical languages for models of best practice for documentation of today’s endangered languages, little-known languages, languages with no significant written tradition. These documents from Greek and Latin and others typically involve written texts (often highly massaged critical editions) and some degree of standardization. This is a far cry from most modern language documentation, where the researchers usually start with nothing, having to figure out the phonology, morphology, syntax, lexicon, and everything else from exposure to spoken language without standardization, and with much variation and all the other complications attendant to linguistic fieldwork. The corpus of many texts by many writers (often over generations) for official purposes differs radically from what could be expected from a typical fieldwork language documentation project today.

Even in the case of Sahagún (mentioned above), there is a big difference between his “documentation,” where the extensive Aztec texts he directed to be written were not aimed at the language itself, and the work of other early language documenters, for example, Carochi (1645), whose Nahuatl grammar really was aimed at the language, aimed at explaining the language and making it learnable. Thus, there is an important distinction to be clarified between purposeful documentation of a language and texts that unintentionally provide documentation.

With these classical examples as background, this would be a good place to re-examine Himmelmann’s (1998) contrast between description and documentation. These examples relate to Himmelmann’s view that documentation “aims at the record of the linguistic practices and traditions of a speech community” (Himmelmann 1998, 9–10), with his pronouncement that “language documentation may be characterized as radically expanded text collection” (Himmelmann 1998, 2). We can be fairly confident that

Himmelmann was not thinking that the gold standard for documentary adequacy is the classical text collections of multi-million words in size. Still, it is hard to look at the array of documentary materials in these early cases and not come to the conclusion that there is great value in quantity. We, of course, do not suggest that we should settle for any less than the best quality possible in this day and age of high technology. Imagine if we had an audio recording of even a single Catilinarian speech by Cicero, or a contemporary recording of Socrates. What we are saying is that we should not disparage volume of material as highly useful in adequate language documentation.

But here is where we think Himmelmann went wrong. Even with his clarification in his more recent work (Himmelmann 2012), we never should have gotten into sharply contrasting description and documentation, and many linguists never accepted this. We understand that the twentieth-century grammar-lexicon-text collection standard, started by Franz Boas, as it was practiced was not ideal, but the record shows us that unless documentation includes description (analysis), the record will be inadequate. There are things we would not know about Latin and Greek had there not been contemporary grammatical commentary. For example, there is nothing in the millions of words of the Latin text corpus to tell us about the missing forms in defective noun inflection (see, for example, Bennett 1908, 54ff.); we would know little about these, were it not for the comments of various Roman grammarians. This suggests, and we will propose, that the adequate documentation of a language must include linguistic analysis as well. Let us be clear. We are claiming that documentary linguistics must include descriptive linguistics, language analysis. The traditional trilogy of grammar, dictionary, and texts is not wrong. It is, in fact, exactly right.

If we compare the twentieth-century attempts at language documentation of indigenous languages with these cases involving classical languages just mentioned, many that we hold in the highest regard seem to pale. Robert M. Laughlin’s truly impressive Great Tzotzil Dictionary of San Lorenzo Zinacantan (1975) contains remarkable amounts of ethnographic information on top of a very extensive lexicon with a grammar and examples. But from what is presented in this dictionary alone we cannot know enough about the phonology without audio recordings; without hundreds of thousands of words of texts in a range of genres we cannot fully appreciate how to talk about the world, or what the frequency effects might be. Bloomfield’s Menomini Lexicon (1975), The Menomini Language (1962), the grammar, and Menomini Texts (1928) are considered to constitute one of the best examples of the threefold approach to language documentation that dominated the twentieth century. But the textual documentation of Old English is far grander, and the Old English record pales in comparison to the major classical languages.

The point we are making here is that there are shortcomings to some of the common current emphases in language documentation, whether they be a lexicographically centered approach, like that used in El Proyecto para la Documentación de las Lenguas de Mesoamérica (see http://www.albany.edu/pdlma/ for description of this project [accessed March 1, 2017]), or technology-centered approaches, like those encouraged by the Hans Rausing Endangered Languages Project. The Endangered Languages Project website says that language documentation

“emphasises data collection methodologies, in two ways: first, in encouraging researchers to collect and record a wide range of linguistic phenomena in genuine communicative situations; and secondly, in its use of high quality sound and video recording to make sure that the results are the best possible record of the language” (http://www.hrelp.org/documentation/ [accessed October 26, 2014]). In spite of the emphasis on texts in this approach, the benefit of sheer volume has not necessarily been sufficiently recognized in contemporary documentation, and certainly no very large corpora have been produced such as those available for the classical languages we have mentioned. The historical record gives us no better evidence of the lasting value of our documentation than that there be a lot of it, coupled with adequate analysis.

It is important, in this context, however, to point out the difference between the corpora available for these classical languages or for English (and other large modern “languages of civilization”) and the little-studied languages of contemporary language documentation projects. There is a big difference between the mostly accidental accumulation in those languages of texts created for a variety of reasons not including language documentation per se and the textual corpus a language documentation project may be able to produce for a language with no written tradition. For most languages in most modern language documentation projects, when it comes to texts, in fact anything more than zero is a significant gain, though it is to be recognized that the more texts the merrier. More to our main point, we must not lose sight of the fact that there is more to language documentation than a corpus of texts. A distinction should be made between the accidental accumulation of texts (as documentation) and modern targeted language documentation of previously mostly unstudied languages. Among purposeful language documentation projects, projects such as Laughlin’s dictionary and Bloomfield’s Menomini work (mentioned above) stand out as remarkable language documentation achievements.

What is more, even when it comes to texts, it is our considered opinion that current practice is often blinkered. The prejudice toward collection of the sorts of texts humanities scholars would consider literature and of traditional tales (oral traditions) must be overcome. We need to document how to talk to babies and dogs as much as we need the creation myth cycle. Users of the documentation will want to know what is appropriate for greetings, leave-taking, scolding, swearing, and the many other things that tend not to show up in traditional texts or in products of many language documentation projects.

3. Standards of description

Let us now move on to what the standards should be for the descriptive part of the documentation in the approach we advocate.3

3

Portions of this section reflect conclusions of the 2007 CELP report (Rhodes et al. 2007).

Let us first discuss the more self-contained areas of phonology and morphology. It is, in principle, entirely possible to fully document both, with the proviso that we must not be limited to what linguists may currently find theoretically tractable or interesting.

3.1. Phonology

The documentation of a phonological system is complete at the point where all extant phonological entities and patterns are fully exemplified in natural text and fully argued for to the limit of current theory and practice. This will include a full range of phonetic and phonological phenomena at all levels: phonetic properties and patterns, phonemic contrasts and distribution, and the full range of morphophonemic phenomena, including their behavior in casual speech and occurrence in discourse and intonation. It can be considered done when nothing new is coming up in non-elicited material, and when any apparent lacunae in the phonological system can be shown to reflect genuine absence from the language, not just non-occurrence in the corpus. This must, however, also include material that might seem intractable by being hard to transcribe, such as phonetic and phonological properties of interjections, ideophones, and discourse performance.
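The stopping criterion just described, that nothing new is coming up in non-elicited material, can be given a rough operational form. The following is a minimal sketch in Python, not a procedure the chapter prescribes. It assumes that each non-elicited text has already been annotated with a set of labels for the phonological units and patterns found in it; the function names, the labels, and the three-text window are all hypothetical choices made for illustration. It addresses only the first half of the criterion: showing that an apparent gap is truly absent from the language still requires targeted elicitation and argument.

```python
from typing import Iterable, List, Set


def new_items_per_text(texts: Iterable[Set[str]]) -> List[int]:
    """For each text, in order, count the labels that no earlier text contained."""
    seen: Set[str] = set()
    counts: List[int] = []
    for labels in texts:
        fresh = labels - seen
        counts.append(len(fresh))
        seen |= fresh
    return counts


def looks_saturated(counts: List[int], window: int = 3) -> bool:
    """True when each of the last `window` texts contributed nothing new."""
    return len(counts) >= window and all(c == 0 for c in counts[-window:])


# Invented annotations: each set holds the labels observed in one text.
texts = [
    {"p", "t", "k", "a", "i"},
    {"t", "k", "u", "vowel-harmony"},
    {"p", "a", "u"},
    {"k", "i", "vowel-harmony"},
    {"t", "a", "i"},
]
counts = new_items_per_text(texts)
print(counts)
print(looks_saturated(counts))
```

A run of zeros at the end of the counts is only suggestive; a new genre or speaker can always reopen the inventory, which is one reason a wide range of genres matters.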

3.2. Morphology

The morphology is complete when all inflectional and derivational possibilities are catalogued, exemplified in natural text, and fully analyzed. It is done when nothing new is coming up in non-elicited material, and when any apparent lacunae in the morphological system are proved to be absent from the language and not just non-occurring in the corpus. It should be noted that there is often irregularity in inflectional forms. Enough forms need to be recorded to satisfy the question of whether such irregularities exist. Derivational morphology can be more challenging to document exhaustively; it is harder to elicit and harder to test for what may have been missed. At some point derivational morphology in some languages may begin to overlap with lexicography.

Let us look at some good examples of extant descriptive work to see how well they measure up with respect to morphology. For a very long time grammars were considered complete if they covered the morphology well, particularly the inflectional morphology, as for example Bennett’s New Latin Grammar, first published in 1895, does. There are even early minority language grammars that cover morphology well, e.g., Baraga’s (1850) Theoretical and Practical Grammar of the Otchipwe Language. Bloomfield’s (1962) The Menomini Language is remarkably complete in morphology and morphophonology, in a language with much complexity in those areas. Lewis’s (1967/2001) Turkish Grammar is a good example of a high-quality grammar, covering many complexities of the morphophonology, giving an exhaustive picture of the extensive morphology. The introduction

to Young and Morgan’s (1987) revision of The Navajo Language has excellent coverage of the morphology, but only cursory treatment of the phonology (see Kari and Leer 1984 for a review). (Navajo has a particularly complex morphophonology.) Rice’s (1989) Grammar of Slave comes close to a model of adequacy by these standards (see Kari 1989 for a review). However, all these works underdescribe low-level phonology, especially casual/fast speech phenomena. Spectrograms, though very useful, alone are not a sufficient record.

For syntax and lexicon, the problem is very different from the relatively closed systems of phonology and morphology. Let us start with syntax.

3.3. Syntax

Documentary adequacy means having catalogued all the basic syntactic constructions, with examples. Earlier, in speaking of the extent of the text collection, we talked in terms of number of words, but syntax (and lexicon) give more clarity to what the corpus should look like to do the job well. We need a sufficiently wide variety of text genres in large enough measure to have a reasonable certainty that we have seen both all the general constructions and a good sampling of highly specialized constructions. (For example, number constructions in English such as those needed to account for twentysomething, thirtysomething, but *tensomething, *twentytwosomething, which must refer to age, but twenty some odd, thirty some odd, *ten some odd, *twenty-two some odd, which cannot refer to a person of a certain approximate age and preferentially refer to numbers of “visible” entities.) Criticisms of some very good descriptive grammars reveal that they fall down on syntactic facts of this sort. Lewis’s (1967/2001) Turkish Grammar was criticized by both Ménage (1969) and Lees (1970) for leaving some significant constructions out. Even the compendious A Grammar of Contemporary English by Quirk et al. (1975) had some lacunae, as noted in Langendoen’s (1975) review.

The very act of describing the syntax of a language has its dangers, too. We must be aware of the need to approach our analyses in a framework that is relatively free of theory (Haspelmath 2010). That is to say that adequacy of coverage is not determined by or limited to what is amenable to current analytical thought or theoretical fads. Numerous scholars (from Boas onward) have talked in terms like “letting a language speak for itself.” This is a key factor in documenting a language. Part of this is the recognition that every language has one or more “core issues” (phonological patterns, morphological constructions, syntactic constructions) that even a brief grammatical sketch would have to address head-on. Those are things that a documenter’s grammar (in the general sense) must not neglect.

For example, much of the syntax of Indo-European languages centers on the verb phrase (VP). But beyond that, collocations, idioms, and metaphors make extensive use of that VP syntax. This stands in stark contrast to, say, the syntax of Algonquian languages, where examples of obviation, inversion, and the oblique-like relative root complement construction are littered by the dozens across every page, but syntactic patterns supporting the existence of a VP constituent are nowhere in sight (Rhodes 1990a, 1990b,

1993, 1998, 2006, 2010a, 2010b). The important point is that the framework in which the analyses are presented must be first of all capable of expressing the documenter’s pre-theoretical insights into the language and accessible to the various target audiences. Trying to force a language which does not have a clear VP constituent into a model which insists on VPs does no good. One can be theoretically eclectic as long as one lets the structure of the language drive both consistency of analysis and the agenda for what must be described. If we do our job of documenting well, future linguists will be able to use our data and analyses to garner insights into issues regarding the nature of the human speech capacity and human languages generally that, though unanticipated today, will be of concern in the future.

Finally, the kind of analysis we are proposing should not shy away from simple bookkeeping as we reach the end of our analytic capabilities: we need to list those constructions, phonological, morphological, and syntactic, which appear in the corpus but for which we may have nothing further to say. Examples in syntax include highly specialized constructions associated with particular meanings or genres. For example, we do not currently have an understanding of the syntax of headlines in English (SERIAL MURDERER CLAIMS FIFTH VICTIM), instructions on bottles and boxes (Shake well before using. This side up), and the like (Sadock 1975).

Now let us turn to consider the lexicon.

3.4. Lexicon

An adequate lexicon (or dictionary) should contain all the basic vocabulary and all terms in important areas of special expertise in the culture, and will provide at the very least glosses for all words/morphemes in the corpus. In the abstract it is difficult to define in more specific terms what an adequate lexicon will look like. Estimates of the average size of the vocabulary of ordinary English-speaking people vary significantly. The average 16-year-old has been estimated to know 10,000–12,000 words, with estimates for the average college graduate varying from around 20,000 words to 50,000–75,000 (see, for example, World Wide Words: Investigating the English Language across the Globe, http://www.worldwidewords.org/articles/howmany.htm [accessed March 19, 2017]). But what would the corresponding number of entries be in the lexicon of a language with large numbers of bound lexical morphemes, like Inuktitut, where phonological words correspond to whole clauses in less agglutinative languages?

Fuller documentation will require that the lexicon include significant amounts of encyclopedic knowledge, both about the culture and about the technologies and local environment, and a full range of names, both personal and toponymic. It should also contain important quotes, standard allusions, and common sayings. It must not shy away from including apparently compositional formations if they have any hint of lexicalization, unitary existence, or formulaic sequences. In the dictionary, generalization and data are not at odds. In current practice, lexicography may appear to involve decontextualization. In documentation, redundancy and abundant examples can be valuable in the lexicon.
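The minimum requirement stated above, glosses for all words/morphemes in the corpus, lends itself to a simple mechanical check. The sketch below, in Python, is offered only as an illustration under stated assumptions: that the corpus has already been segmented into forms and that the lexicon is available as a mapping from form to gloss. The function name and the sample forms and glosses are invented and belong to no real language or tool.

```python
from typing import Dict, Iterable, List


def unglossed_forms(corpus_forms: Iterable[str], lexicon: Dict[str, str]) -> List[str]:
    """Return the corpus forms that lack a lexicon entry, in first-seen order."""
    missing: List[str] = []
    already_reported = set()
    for form in corpus_forms:
        if form not in lexicon and form not in already_reported:
            missing.append(form)
            already_reported.add(form)
    return missing


# Invented data: a tiny segmented corpus and a partial lexicon.
corpus_forms = ["ka", "mi", "ka", "tu", "sa", "tu"]
lexicon = {"ka": "water", "mi": "1SG"}

print(unglossed_forms(corpus_forms, lexicon))
```

An empty result shows only that the floor has been reached; it says nothing about the encyclopedic, onomastic, and phraseological coverage that fuller documentation calls for.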

For lexicons, there are many notable examples. The classical languages, which have (effectively) closed corpora, have nevertheless yielded examples of what a comprehensive lexicon might look like, e.g., the Oxford Latin Dictionary (2012) or the massive ongoing Thesaurus Linguae Latinae (see, for example, http://www.npr.org/sections/parallels/2016/05/14/476873307/the-ultimate-latin-dictionary-after-122-years-still-at-work-on-the-letter-n [accessed March 13, 2017]), or the Liddell et al. (1925) dictionary of Greek.

Modern lexicons that give extensive coverage of languages, both minority and majority languages, are numerous. Laughlin’s (1975) Great Tzotzil Dictionary of San Lorenzo Zinacantan, mentioned above, is an excellent example that approaches our conditions; likewise, Hill et al.’s (1998) 30,000-entry Hopi Dictionary (see Dakin’s 2000 review). So too are dictionaries like Nguyễn Văn Khôn’s (1966) Việt-Anh Tự Điển, which has 65,000 entries with significant coverage of technical areas (although little information about personal names or toponymy). The Spanish dictionary of the Real Academia Española has 2,349 pages of entries (plus an appendix of model verb conjugations); the edition currently in preparation is expected to include 43,300 entries, with approximately 200,000 meanings (see, for example, http://www.rae.es/noticias/la-academia-entrega-la-23a-edicion-del-drae-que-se-publicara-en-octubre [accessed March 13, 2017]). The Finnish Nykysuomen sanakirja [Modern Finnish Dictionary] has over 200,000 entries (Sadeniemi 2002[1951–1961]). Interestingly, even the most extensive monolingual dictionaries, such as Webster’s Third (Gove 1961) with 450,000 entries, have little in the way of onomastic information.

The German Duden series has twelve volumes covering various aspects of grammar, lexicon, and usage (Dudenredaktion 2005, 2010a, 2010b, 2012, 2013a, 2016a, 2016b). The topics generally covered in a single dictionary get treated in separate volumes. Spelling is volume 1 (Dudenredaktion 2013a), pronunciation is volume 6 (Dudenredaktion 2006), etymology is volume 7 (Dudenredaktion 2013b), the thesaurus is volume 8 (Dudenredaktion 2008). But of particular interest here are volumes 2, 11, and 12 (Dudenredaktion 2010a, 2012, and 2008). Volume 2 is a quite complete compilation of collocations, a largely overlooked area, but a very important part of German usage. Volume 11 is idioms and figures of speech, again an area generally overlooked. Volume 12 is quotes and sayings (Dudenredaktion 2008–2016).

The demands of minority languages of necessity will likely be simpler where there is no previous written record, but there are still sayings, figures of speech, epithets, idioms, and so on, to be included in the lexicon. A particularly good example of a minority language dictionary that captures many sayings, figures of speech, epithets, and idioms is Fisher et al.’s (2007) Cheyenne Dictionary.

4. Language documentation and language revitalization

While adequate language documentation does not require direct efforts aimed at language revitalization, the relationship between documentation and revitalization is often misunderstood and requires some comment. To jump to our conclusion first, the two

The Goals of Language Documentation 117 are not in opposition but are interrelated, and revitalization efforts can contribute to language documentation, as obviously language documentation can serve the interests of language revitalization. Too often the two are posed as distinct and even antagonistic to one another. Some believe language documentation serves only the interests of academics and neglects the language communities that want to revitalize their languages. Some tribal leaders have said that the time of their few remaining elders who know their language should not, in their view, be squandered on documenting the language but is needed for teaching their young. Their view is understandable, given that their role in dealing with the language is one of crisis management. On the other extreme, Mufwene (2017) believes that scholars have focused on language revitalization to the determent of broader understanding of the phenomenon of language vitality. However, these un fortunate positions do not reflect the real relationship between documentation and revitalization. In reality, modern language documentation projects almost never lack a language revitalization component. Granting agencies require evidence of community approval and of the documentation being made available to communities, and communities typ ically require things connected with language revitalization as a condition of their ap proval. Most scholars involved in language documentation are committed to making sure that their documentation efforts also serve the language community whose lan guage is involved. Unlike extinct biological species, even a “dead” language can continue to exist after the last speaker is gone, but to continue, documentation is crucial—an undocumented extinct language is irredeemably lost forever. 
When it comes to the moral obligations of linguists, as has been said various times, if we do not document the languages now while it is still possible to do so, we deserve the contempt of later generations that will have no access to their language. Without language documentation, what is needed for preparing teaching and learning materials for the language is simply unavailable. Language revitalization fatefully depends on the availability of language documentation. (See Campbell and Rehg, Introduction, this volume, for more discussion and examples of this.)

A number of the chapters in this volume address various aspects of language revitalization; they make it clear that documenting a language can also be morally fulfilling, can support and foster language revitalization, and is absolutely necessary for communities concerned with the survival of their languages. However, outsider researchers with a commitment to revitalization are well advised not to think in terms of “I’m here to help you” (me leading) but rather “how can we work together?” (with the community in charge).

In short, if we do not document the languages now while it is still possible to do so, later generations whose heritage it is may with justification censure us for not providing them with the means to learn about their linguistic heritage and learn their language. Not even in Hollywood’s wildest imagination can an utterly dead language with no documentation be cloned and brought back to life. It’s document now or forever deserve rebuke for our negligence. In some senses, helping to provide adequate documentation is the greatest service linguists and other scholars can render communities whose language is at risk.

5. Conclusions

Adequate language documentation is not a matter of “yes” or “no” but of degree. Any advance in documentation beyond what we previously had is good, and in the case of relatively undocumented languages, any documentation at all is a welcome gain. We can talk about ideal adequate documentation; we can talk about best procedures to follow to progress toward more ideal documentation; and we can talk about the relative merits of volumes of texts and comprehensiveness of analysis. However, a message we do not want to leave is that if the documentation we have or are able to obtain does not reach an ultimate goal of full adequacy, then there is something wrong and it has no value. Just the opposite: we hope our discussion of adequate documentation has clarified the definition of language documentation, what must not be left out, and what is needed to make for more adequate language documentation. We agree with and reiterate Himmelmann’s (2012, 201) more recent advice, in what he labels “pragmatic resolution of conflicts of interest in language documentation”: “Do what is pragmatically feasible in terms of the wishes and needs of the speech community and in terms of your own specific skills, needs, and interests.” Do as much as possible as well.

References

Archi, Alfonso. 2003. “Archival Record-Keeping at Ebla 2400–2350 BC.” Ancient Archives and Archival Traditions: Concepts of Record-Keeping in the Ancient World, edited by Maria Brosius. Oxford: Oxford University Press.
Baraga, Frederic. 1850. A Theoretical and Practical Grammar of the Otchipwe Language: The Language Spoken by the Chippewa Indians Which Is Also Spoken by the Algonquin, Otawa and Potawatami Indians With Little Differences. For the Use of Missionaries and Other Persons Living Among the Indians of the Above Named Tribes. Detroit, MI: Jabez Fox.
Bennett, Charles E. 1895. New Latin Grammar. 1st ed. (2nd ed. 1908; 3rd ed. 1918). Boston: Allyn & Bacon.
Bloomfield, Leonard. 1928. Menomini Texts, Vol. 12. New York: Publications of the American Ethnological Society.
Bloomfield, Leonard. 1962. The Menomini Language (edited by Charles F. Hockett). New Haven, CT: Yale University Press.
Bloomfield, Leonard. 1975. Menomini Lexicon (edited by Charles F. Hockett). Milwaukee, WI: Milwaukee Public Museum Press.
Campbell, Lyle. 2000. “The History of Linguistics.” In Handbook of Linguistics, edited by Mark Aronoff and Janie Rees-Miller, 81–104. Oxford: Blackwell.

Campbell, Lyle. 2016. “Language Documentation and Historical Linguistics.” In Language Contact and Change in the Americas: Studies in Honor of Prof. Marianne Mithun, edited by Andrea L. Berez, Diane M. Hintz, and Carmen Jany, 249–271. Amsterdam: John Benjamins.
Carochi, Horacio. 1645. Arte de la lengua Mexicana con la declaracion de los adverbios della. Mexico. [Reprinted several times, e.g., in a 1981 facsimile edition (Mexico: Editorial Innovación), and in a 1983 facsimile edition with an introduction by Miguel León-Portilla (Mexico: Universidad Nacional Autónoma de México, Instituto de Investigaciones Históricas).]
Dakin, Karen. 2000. “Review of Hopi Dictionary/Hopìikwa Lavàytutuveni: A Hopi Dictionary of the Third Mesa Dialect with an English-Hopi Finder List and a Sketch of Hopi Grammar, by Kenneth C. Hill et al.” International Journal of American Linguistics 66: 398–402.
Deshpande, Madhav. 1979. Sociolinguistic Attitudes in India: A Historical Reconstruction. Ann Arbor, MI: Karoma Publishers.
Dibble, Charles E., and Arthur J. O. Anderson (translation and introduction). 1950–1982. Bernardino de Sahagún, Florentine Codex: General History of the Things of New Spain, Translation of and Introduction to Historia General de Las Cosas de La Nueva España (12 Volumes in 13 Books). Salt Lake City: University of Utah Press. (Digital facsimile edition 2009: 16 DVDs, Tempe, Arizona: Bilingual Press, 2009. Reproduced with permission from Arizona State University Hispanic Research Center.)
Dudenredaktion Hrsg. 2005. Bildwörterbuch. Band 3. Dudenverlag, Mannheim/Leipzig/Wien/Zürich.
Dudenredaktion Hrsg. 2006. Aussprachewörterbuch. Band 6. Dudenverlag, Mannheim/Leipzig/Wien/Zürich.
Dudenredaktion Hrsg. 2008. Zitate und Aussprüche. Band 12. Dudenverlag, Mannheim/Leipzig/Wien/Zürich.
Dudenredaktion Hrsg. 2010. Stilwörterbuch. Band 2. Dudenverlag, Mannheim/Leipzig/Wien/Zürich.
Dudenredaktion Hrsg. 2010a. Fremdwörterbuch. Band 5. Dudenverlag, Mannheim/Leipzig/Wien/Zürich.
Dudenredaktion Hrsg. 2010b. Bedeutungswörterbuch. Band 10. Dudenverlag, Mannheim/Leipzig/Wien/Zürich.
Dudenredaktion Hrsg. 2012. Redewendungen. Band 11. Dudenverlag, Mannheim/Leipzig/Wien/Zürich.
Dudenredaktion Hrsg. 2013a. Rechtschreibung. Band 1. Dudenverlag, Mannheim/Leipzig/Wien/Zürich.
Dudenredaktion Hrsg. 2013b. Herkunftswörterbuch. Band 7. Dudenverlag, Mannheim/Leipzig/Wien/Zürich.
Dudenredaktion Hrsg. 2014. Synonymwörterbuch. Band 8. Dudenverlag, Mannheim/Leipzig/Wien/Zürich.
Dudenredaktion Hrsg. 2016a. Grammatik. Band 4. Dudenverlag, Mannheim/Leipzig/Wien/Zürich.
Dudenredaktion Hrsg. 2016b. Wörterbuch der sprachlichen Zweifelsfälle. Band 9. Dudenverlag, Mannheim/Leipzig/Wien/Zürich.
Fisher, Louise, Wayne Leman, Leroy Pine, Sr., and Marie Sanchez. 2007. Cheyenne Dictionary. Lame Deer, MT: Chief Dull Knife College. Online: http://www.cdkc.edu/cheyennedictionary/index.html.
Forbes, P. B. R. 1933. “Greek Pioneers in Philology and Grammar.” The Classical Review 47: 105.

Gove, Philip B. 1961. “Preface.” Webster’s Third New International Dictionary. Springfield, MA: G. & C. Merriam.
Gragg, Gene B. 1995. “Babylonian Grammatical Texts.” Concise History of the Language Sciences: From the Sumerians to the Cognitivists, edited by E. F. K. Koerner and R. E. Asher, 19–21. Oxford: Pergamon.
Haspelmath, Martin. 2010. “Framework-Free Grammatical Theory.” The Oxford Handbook of Linguistic Analysis, edited by Bernd Heine and Heiko Narrog, 341–366. Oxford: Oxford University Press.
Hill, Kenneth C., Ekkehart Malotki, Mary E. Black, Emory Sekaquaptewa, and Michael Lomatuway’ma. 1998. Hopi Dictionary/Hopìikwa Lavàytutuveni: A Hopi-English Dictionary of the Third Mesa Dialect. Tucson: University of Arizona Press.
Himmelmann, Nikolaus. 1998. “Documentary and Descriptive Linguistics.” Linguistics 36: 161–195.
Himmelmann, Nikolaus. 2006. “Language Documentation: What Is It and What Is It Good for?” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus P. Himmelmann, and Ulrike Mosel, 1–30. Berlin: Mouton de Gruyter.
Himmelmann, Nikolaus. 2012. “Linguistic Data Types and the Interface Between Language Documentation and Description.” Language Documentation & Conservation 6: 187–207. http://nflrc.hawaii.edu/ldc/.
Hovdhaugen, Even. 1982. Foundations of Western Linguistics: From the Beginning to the End of the First Millennium A.D. Oslo: Scandinavian University Press.
Kari, James. 1989. “Review of A Grammar of Slave by Keren Rice.” Anthropological Linguistics 31(3/4): 288–291.
Kari, James and Jeff Leer. 1984. “Review of The Navajo Language: A Grammar and Colloquial Dictionary by Robert W. Young, William Morgan.” International Journal of American Linguistics 50: 124–130.
Langendoen, D. Terence. 1975. “Review of A Grammar of Contemporary English by Randolph Quirk, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik.” Journal of Linguistics 11: 277–280.
Lees, Robert B. 1970. “Review of Turkish Grammar by G. L. Lewis.” Foundations of Language 6: 122–137.
León-Portilla, M. 2002. Bernardino de Sahagún: The First Anthropologist. Norman: University of Oklahoma Press.
Lewis, Geoffrey L. 1967. Turkish Grammar. Oxford: Clarendon Press.
Liddell, Henry George, Robert Scott, Henry Stuart Jones, and Roderick McKenzie. 1925. Greek-English Lexicon. 9th ed. Oxford: Oxford University Press.
Matilal, Bimal Krishna. 1990. The Word and the World: India’s Contribution to the Study of Language. Oxford: Oxford University Press.
Ménage, V. L. 1969. “Review of Turkish Grammar by G. L. Lewis.” Bulletin of the School of Oriental and African Studies 32: 167–169.
Mufwene, Salikoko. 2017. “Language Vitality: The Weak Theoretical Underpinnings of What Can Be an Exciting Research Area. [Perspectives].” Language 93(4): e202–e223.
Nguyễn, Văn Khôn. 1966. Việt-Anh Tự Điển. Saigon: Nhà Sách Khaì Trí.
Oxford Dictionaries. 2012. The Oxford Latin Dictionary. Oxford: Oxford University Press.

Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik. 1975. A Grammar of Contemporary English. London: Longman.
Rehg, Kenneth L. 2007. “The Language Documentation and Conservation Initiative at the University of Hawai‘i at Mānoa.” In Documenting and Revitalizing Austronesian Languages, edited by D. Victoria Rau and Margaret Florey, 13–24 (Language Documentation & Conservation Special Publication No. 1). Honolulu: University of Hawai‘i Press. https://scholarspace.manoa.hawaii.edu/bitstream/10125/1350/1/02rehg.pdf
Rhodes, Richard A. 1990a. “Ojibwa Secondary Objects.” In Grammatical Relations: A Cross Theoretical Perspective, edited by K. Dziwirek, P. Farrell, and E. Mejías-Bikandi, 401–414. Palo Alto, CA: CSLI Stanford.
Rhodes, Richard A. 1990b. “Obviation, Inversion, and Topic Rank in Ojibwa.” Berkeley Linguistic Society Parasession to the 16th Annual Meeting, edited by David Costa, 101–115. Berkeley, CA: Berkeley Linguistic Society.
Rhodes, Richard A. 1993. “Syntax vs. Morphology a Chicken and Egg Problem.” In Berkeley Linguistic Society 19th Annual Meeting, Special Session, edited by David Peterson, 139–147. Berkeley, CA: Berkeley Linguistic Society.
Rhodes, Richard A. 1998. “The Syntax and Pragmatics of Ojibwe mii.” Papers of the Twenty-ninth Algonquian Conference, edited by H. C. Wolfart, 286–294. Winnipeg, Canada: University of Manitoba.
Rhodes, Richard A. 2006. Clause Structure, Core Arguments, and the Algonquian Relative Root Construction (the 1998 Belcourt Lecture, University of Manitoba). Winnipeg, Canada: Voices of Rupert’s Land.
Rhodes, Richard A. 2010a. “Missing Obliques: Some Anomalies in Ojibwe Syntax.” In Hypothesis A/Hypothesis B: Linguistic Explorations in Honor of David M. Perlmutter, edited by Donna B. Gerdts, John C. Moore, and Maria Polinsky, 427–45. (Current Studies in Linguistics 49). Cambridge, MA: MIT Press.
Rhodes, Richard A. 2010b. “Relative Root Complement: A Unique Grammatical Relation in Algonquian Syntax.” In Rara & Rarissima: Documenting the Fringes of Linguistic Diversity (Empirical Approaches to Language Typology), edited by Michael Cysouw, Orin Gensler, and Jan Wohlgemuth, 305–324. Berlin: Mouton de Gruyter.
Rhodes, Richard, Lenore A. Grenoble, Anna Berge, and Paula Radetzky. 2007. Adequacy of Documentation (a preliminary report to the CELP). Washington, DC: Linguistic Society of America Committee on Endangered Languages and Their Preservation.
Rice, Keren. 1989. A Grammar of Slave. Berlin and New York: Mouton de Gruyter.
Sadeniemi, Matti, ed. 2002 [1951–1961]. Nykysuomen sanakirja, 6 volumes. 15th ed. Valtion toimeksiannosta teettänyt (original edition 1951–1961, Suomalaisen Kirjallisuuden Seura/Finnish Literature Society). Helsinki: WSOY.
Sadok, Jerry. 1975. “Read at Your Own Risk: Syntactic and Semantic Horrors You Can Find in Your Medicine Chest.” Paper presented at the Tenth Regional Meeting of the Chicago Linguistic Society, 599–607. Chicago.
Sandys, John E. 1903. History of Classical Scholarship. Cambridge: Cambridge University Press.
Staal, J. F. 1974. “The Origin and Development of Linguistics in India.” In Studies in the History of Linguistics: Traditions and Paradigms, edited by Dell Hymes, 63–74. Bloomington: Indiana University Press.

Woodbury, Anthony. 2010. “Language Documentation.” In The Cambridge Handbook of Endangered Languages, edited by Peter Austin and Julia Sallabank, 159–186. Cambridge: Cambridge University Press.
World Wide Words: Investigating the English Language across the Globe. Online: http://www.worldwidewords.org/articles/howmany.htm.
Young, Robert W. and William Morgan. 1987. The Navajo Language: A Grammar and Colloquial Dictionary. 2nd ed. Albuquerque: University of New Mexico Press.

Chapter 6

Documentation, Linguistic Typology, and Formal Grammar
Keren Rice

The recognition of the seriousness of language endangerment has brought about many changes in linguistics in the past twenty-five or so years, broadening and energizing the field and helping it to be more ethically accountable. In this chapter, I examine the impact of language documentation on two areas of linguistics, linguistic typology and formal grammar, and the impact of typology and formal grammar on documentation. Other areas of linguistics have also changed in light of the recognition of language endangerment; there is probably no area that has remained unaffected. I do not consider these other areas, but see note 1 for a few references.1

I use the term “linguistic theory” as a cover term for both linguistic typology and formal grammar. I adopt the term “formal grammar” used by Polinsky (2010) rather than formal theory or the like, recognizing the theory in both typology and formal grammar. An important goal of linguistic theory in this broad sense is to gain an understanding of how languages can be both so similar and so diverse at the same time. Given this, the materials of language documentation and linguistic theory clearly must communicate: documentation has long provided impetus for theoretical developments, and, in turn, theoretical developments lead to new questions about languages that would have been difficult to pose otherwise.

In order to understand the relationship between documentation and the two theoretical areas covered in this article, it is valuable to examine the goals of documentation, typology, and formal grammar, what counts as data, and what types of data are preferred in each theory. In sections 1 and 2 I define language documentation and survey the types of data that are important to it. I then examine typology, asking about questions of concern, kinds of data, and contributions of documentation to typology and vice versa. Following that, I do the same with formal grammar.

Thank you to Lyle Campbell for helpful comments and to Virgilio Partida Peñalva for editing. This work is supported by the Canada Research Chair in Linguistics and Aboriginal Studies.

1 For instance, see Nagy (2009), Meyerhoff (2015), and articles in Stanford and Preston (2009) for research in sociolinguistics, as well as a symposium on documenting variation in endangered languages at the 2016 Linguistic Society of America meeting, now published as Hildebrandt, Jany, and Silva (2017). In psycholinguistics, see, among others, the 2012 LSA symposium on psycholinguistic research on lesser-studied languages, some articles from which appear in a special issue of Language, Cognition, and Neuroscience (2015), edited by Norcliffe, Harris, and Jaeger; see also recent work by Clemens et al. (2015). For interdisciplinary work in language documentation, see, for instance, articles in Thieberger (2012) on topics including ethnobiology, ethnomathematics, geography, and toponymy. There is work in computational linguistics (e.g., Bird 2009; Bender et al. 2013). Whalen and McDonough (2015) survey laboratory research done in the field. This is but a small sample of the kind of research that has resulted from the recognition of language endangerment.

1. Defining language documentation

Language documentation, as addressed in several articles in this volume, is formally defined in the literature by Woodbury (2011, 159) as follows: “Language documentation is the creation, annotation, preservation, and dissemination of transparent records of a language.”2 Documentation has other components, such as interdisciplinarity and involvement of the speech community, but I do not focus on these (see, for instance, Himmelmann (1998, 2006) and Woodbury (2011), among others, for discussion). A good documentary corpus has several qualities: it is diverse; large; ongoing, distributed, and opportunistic; transparent; preservable and portable; and ethical (see Woodbury (2003; 2011, 181)).

I use the term “documentation” in a broad sense, to include the collection, annotation, and preservation of a corpus together with linguistic analysis and the production of a grammar and a dictionary as well as the rich corpus of recordings. I do not discuss the parts of documentation that involve archiving and curation, but focus on the kinds of theoretical work that can arise from a documentary corpus. I concentrate in particular on what is regarded as critical data in a theory, and I thus next provide an overview of the types of data that are valued in documentation.

2 See Campbell and Hauk, with Hallamaa (2015), for various definitions of documentation.

2. Types of data in language documentation

A review of the types of data that are considered in language documentation is important in understanding different ways of interpreting the contributions of documentation. Lüpke (2009) provides a good overview of data types and their strengths and weaknesses. See also Mosel (2011) on methods in language documentation. Lüpke (2009, 61), following Himmelmann (1998, 2006), divides data, which she terms communicative events, into three major types: observed communicative events, staged communicative events, and elicitation. I briefly consider each type, beginning with what is often considered the sine qua non of documentation, observed communicative events.

The observed communicative events category encompasses a range of types of data, including, among others, exclamatives, directives, conversations, monologues, and ritual speech (Himmelmann 1998, 180). Material from observed communicative events has several distinct advantages: it has a high degree of ecological validity, it represents natural speech, it highlights high-frequency forms, and it can give insight into culture. Major drawbacks are that it is not very controlled, it does not offer negative evidence, and it may not reveal things that are of low frequency in the language or in the type of materials recorded.

Staged communicative events involve activities such as descriptions of visual stimuli, matching games, and the like. Lüpke (2009, 66) notes that, since these are staged, they do not have immediate ecological validity, as they are motivated by the interests of the researcher and do not come from the language being used in its natural setting. At the same time, they do not have the direct linguistic influence from another language that elicitation does. (See, for instance, Majid (2012) for an overview of non-linguistic stimuli in semantics, and discussion in Whalen and McDonough (2015, 401–402) on materials for fieldwork more generally.)

At the other extreme from observed communicative events in terms of types of data is elicitation. This includes subtypes such as grammaticality judgments and translation equivalents. Pure elicitation has several advantages. It is easy to use when starting work on an unknown language, it is controlled, it offers ways to obtain infrequent words and constructions, and, perhaps most important, it offers negative evidence. On the other hand, it can lead to structures that are odd, and it is considered to be of low ecological validity, being limited in terms of the insight that it can give into culture.

This review of data types is of value because different theoretical perspectives do not accord the same weight to different data types, depending on the questions that are central to the theory. I now briefly discuss one of the major motivators of documentary linguistics in recent times, linguistic diversity and the threat posed to it by language endangerment.

3. Why documentation?

Because diversity was a major concern for some linguists as the impact of language endangerment became obvious (e.g., Hale 1992), I focus on research on diversity in typology and formal grammar. For instance, Mithun (2001, 34) writes as follows:

“With the accelerating loss of linguistic diversity in our world, it is a time for serious thought about how to record as much as possible of the richness still around us. In many cases what we choose to document may be the principal record of an entire linguistic tradition, both for the descendants of the speakers and for others seeking to understand the possibilities of the human mind. It is a time to consider not only how to fill recognizable gaps in current knowledge, but also how to provide the basis for answers to questions we do not yet know enough to ask. In most cases, these goals can best be met by a mix of styles of collaboration between speakers and linguists. The product of fieldwork will ultimately be shaped not only by the nature of the language, but also by the methods chosen, by the roles assumed by the speakers, and by the preparation and sensitivity of the linguist.”

Davis, Gillon, and Matthewson (2014, 180), like Mithun (2001), are concerned with linguistic diversity, writing:

“At least half of the world’s nearly 7,000 languages will no longer be spoken by the end of this century. . . . This imminent large-scale language extinction has alarming consequences for the investigation of linguistic diversity. If we wish to understand the scope and limits of crosslinguistic variation, it is imperative that in the near future we gather as much information about endangered languages as we can, in a form that allows systematic and accurate crosslinguistic comparison. The need for such work is accepted by linguists of all persuasions; more controversial is how we should go about it. The difficulty of the task is compounded by the fact that nearly all endangered languages are spoken by small and aging populations, and many have already fallen into disuse even among those who speak them fluently. These circumstances pose unique challenges for fieldworkers, who must find the most effective way to probe for linguistic diversity in the limited time frame still available.”

Mithun, writing from a typological perspective, and Davis et al., writing from a formal grammar perspective, recognize that many languages in the world are not being transmitted, and that this brings responsibility to linguists. In this, they reflect earlier work by linguists such as Hale (1992, 35), who speaks of the importance of linguistic diversity to the linguistic profession and more broadly, and Krauss (1992, 9), who talks of the necessity of linguists documenting languages (as well as working educationally, culturally, and politically to increase the chance of languages surviving).

Linguists of all stripes generally agree about the intellectual value of linguistic diversity. Both typologists and formal linguists working with speakers of small languages also value the languages from a community perspective, and often engage in work with a community on language conservation and sustainability. Where we see disagreement, as will become clear in the discussion of the relationship between documentation and these theories, is in the primary questions that motivate the theory and in the kinds of data that are privileged. Thus, while all three major methods introduced in section 2 are used in both typology and formal grammar, in general, typology gives more weight to, as Mithun (2001, 53) puts it, data that arises when “speakers are allowed to speak for themselves,” while formal grammar stresses a hypothesis-driven methodology with extensive data coming from carefully controlled elicitation (Davis et al. 2014, e219).

4. Documentation and linguistic typology

In this section I examine the relationship between documentation and linguistic typology, beginning with a discussion of goals and methods in typology. I then consider data types of greatest value in typological work, and outline a small number of the contributions that documentation has made to the study of typology, and vice versa.

4.1. Goals of linguistic typology

There are recent works that address language documentation and linguistic typology, and I draw on those in this section. Palosaari and Campbell (2011, 100) observe that “typology is the classification of languages according to linguistic traits and the comparison or classification of linguistic features across languages.” Epps (2010, 635) remarks that the goal of linguistic typology is “to define the limits, patterns, and explanations that characterize cross-linguistic variation,” noting its close connection with language universals. Epps and Arkhipov (2009, 1) note that linguistic typology is concerned with classifying human languages according to their various properties, defining universals and strong tendencies, and identifying fine distinctions in our understanding of linguistic categories and units. Typology values both cross-linguistic generalizations and the discovery and explanation of exceptions to those generalizations. Polinsky (2010, 652), in an article comparing typology and formal grammar, remarks, in a statement that she admits is an oversimplification verging on a caricature, that the primary research question in linguistic typology is what makes natural languages so different from each other.3

Issues of diversity are clearly important for typology, with its focus on cross-linguistic comparison. As Bond (2010, 241) notes, “typology is concerned with identifying cross-linguistic patterns and correlations between these patterns.” It is, he writes, further concerned with “communicative and diachronic processes that result in the geographical and genealogical distributions of features across languages.” Recent work in typology stresses the importance of fine-grained variables (see, for instance, Bickel (2007, 245), as well as Bond (2010, 258) and Epps (2010, 639)).

3 Typologists are more likely to view typology as a study of the classification of structural types cross-linguistically, of cross-linguistic generalizations concerning patterns among linguistic traits, and as an attempt to explain patterns and classification through appeal to language function in cross-linguistic comparison, with function driving form.

4.2. Data in linguistic typology

It goes without saying that data from a broad survey of languages is critical for work in typology, given its goals, and thus rich language documentation greatly enhances typological research. There are various ways of gathering data on which typological work is based. Grammars of languages are generally a starting point, and these are often coupled with questionnaires and stimulus kits. (See, for instance, the material available from the Max Planck Institute for Evolutionary Anthropology at https://www.eva.mpg.de/lingua/tools-at-lingboard/tools.php; http://fieldmanuals.mpi.nl/. See also Mithun (2001) for discussion of the complementary roles of elicitation and texts and Mithun (2012) for discussion of methods in syntactic analysis, focusing largely on texts.)

Grammars can be based on a variety of kinds of data: at the extremes, the data might come largely from elicitation, or it might be drawn almost entirely from texts. Linguistic typology prioritizes texts. I focus on examples from grammars written with attention to this type of data that are explicit about the framework being functional-typological.

Von Prince (2012) is explicit about the kinds of data that she employs in her thesis, a grammar of the Austronesian language Daakaka. Her primary data source is a text collection corpus consisting of conversations, essays, explanations, reports, and stories. The text collection was supplemented with elicitation of lexical items, phrases, and sentences, as well as permutations of sentences (e.g., shifting word order, paradigmatically related items), and function-based elicitations (e.g., description of a situation). In her thesis, more than two-thirds of the example sentences come directly from the corpus.

Konnerth (2014) clearly lays out her methodology and framework in her thesis on the Tibeto-Burman language Karbi. Konnerth used data based on texts that she recorded as her primary data type. She also collected context-free elicitation data on phonological and grammatical topics, although she notes that she examines grammatical topics largely through textual examples and elicitation based on those. The text genres include dialogue (interview/conversation) and monologue (folk tale, procedural text, personal narrative, pear story); in the monologues, the focus was on folk tales because of their importance to other fields.

Zariquiey Biondi (2011), in his thesis on Kashibo-Kakataibo (Panoan), also discusses his data. He relies on both texts and elicited data, using text examples whenever possible. He notes the importance of elicitation, commenting on its necessity due to some forms being rare in natural speech, complications introduced through complexities in actual speech that are not relevant to the point at hand, and performance mistakes (see Mithun 2001 on whether speakers make performance errors). He remarks that he found some morphemes through elicitation that were not in the texts. His focus was on monologues (traditional tales and myths, jokes, historical narratives, life stories, narratives of various sorts), and he also used conversation.

These dissertations are but a small sample of functionally/typologically based grammars that have appeared in recent years. Other authors make similar statements about the types of data that they rely on, and most recent grammars are explicit about data sources, with the bulk coming from observed communicative events.

4.3. Contributions of documentation to linguistic typology Given the goals of typology, clear descriptions of language based on texts greatly en rich the field. Bond and Epps both point to some of the specific contributions of doc umentation to linguistic typology. As Epps (2010, 637) puts it, “There is no shortage of examples that illustrate the role of data from particular languages in stretching the limits of what was thought to be possible.” She notes in particular object-initial word order in a number of Amazonian languages, unusual word-initial consonant clusters in some Austronesian languages, and multiple case-marking on a single noun phrase in Kayardild (these contribute information about syntactic role, as expected, but also about tense and mood).4 Another example is the study of insubordination, the use of subordinate clauses as main clauses (e.g., Evans 2007). Palosaari and Campbell (2011) outline numerous contributions from documentation to linguistic theory (and typology in particular) in all linguistic domains. Epps (2010, 640) identifies cases where a focus on texts can give insights that are dif ficult to obtain with an elicited corpus. She cites evidentials, remarking that speakers often omit them in elicitation. (My experience suggests that this is the case with many particles that are difficult to translate—they are omitted in elicitation unless the elicita tion is very carefully done [see the discussion of semantic elicitation in section 5], while they are richly used in at least certain genres of speech, and naturally occurring speech of an appropriate genre is often the only way to begin study of such elements. Once such items are identified, elicitation in controlled conditions is invaluable to sort out the complexities.) Seifart’s (2005, 2009) research on nominal classification in Miraña (Boran lan guage, Amazonia) is a good example of the contribution of documentation to typolog ical theory. Epps (2010, 640) also cites this work. 
Usual analyses of classifier systems (e.g., Grinevald 2000, 2002; Aikhenvald 2000) present a typology built on two main parameters that define two basic types of systems, one of them with subtypes (Seifart 2009, 366–367). One major parameter involves the presence or absence of agreement. Noun class systems show agreement; classifier systems do not. Noun class systems show additional characteristics—nouns generally belong to only one class and the systems are closed, while in classifier systems nouns may be associated with more than one classifier, the set of classifiers may be large and open, and the use of classifiers may be variable. Classifier systems are further subdivided by whether they involve numeral classifiers, verbal classifiers, or noun classifiers, with each type having its own semantic characteristics. Seifart (2005, 2009) argues that Miraña is a challenge for this classification—“The major difficulty for the typological characterization of the Miraña system thus lies in the fact that an internally heterogeneous set of classifying morphemes occurs in a variety of different morphosyntactic contexts where these forms fulfill a variety of different functions, e.g., derivation and unitization in nouns, formation of demonstratives and numerals, and cross-reference on verbs” (Seifart 2005, 12). Simply put, it is difficult to characterize the system in terms of the usual types. Seifart argues that using a larger number of detailed characteristics, each corresponding to a parameter in a multidimensional typology, is a fruitful way to pursue the type of challenge raised by the Miraña class markers. Seifart’s documentary work on Miraña challenged the prevalent view of nominal classification systems at the time, enriching typology and providing support for more sophisticated ways of viewing hierarchies.

4 Polinsky (2010, 656–657) comments on Evans’s (2003) analysis, noting, following Corbett (2006), that it leads to a skewed typology, and suggests that further investigation is needed.

As another example, Bond (2010, 253–255) discusses Schultze-Berndt’s research on Jaminjung, a non-Pama-Nyungan language of Australia. Bond mentions in particular types of verbs, labeled by Schultze-Berndt (2000, 2003) as Inflecting Verbs and Uninflecting Verbs.
Schultze-Berndt provides detailed criteria for distinguishing these classes of verbs from each other, as well as distinguishing these classes from nouns, including how each patterns with respect to Tense, Aspect, and Modality (TAM)/person marking and case marking. The major point is that it is important to use fine-grained variables in the description in order to make meaningful comparisons between languages. Detailed documentation allows for the refinement of more superficial variables.

Referential hierarchies (e.g., Silverstein 1976) have been a topic of considerable discussion in the typology literature, and there have been arguments for revisiting them based on documentation. Arguments are of different sorts. For example, Epps (2009), based on extensive documentation of Hup (a Nadahup language of the Amazonian Vaupés region), argues that an unusual situation arises in the language, with the interaction between object marking and number marking giving a typologically marked result. She suggests (2009, 99) that this can be accounted for by considering competing motivations, a common theme in the functional literature (see, for instance, the recent collection edited by MacWhinney, Malchukov, and Moravcsik 2014). Epps thus points to documentation enriching the understanding of both typological patterns and why diversity among languages might exist.

Schnell (2012) suggests that extensive documentation provides deeper insight into referential hierarchies by allowing for a study of language-internal variation. He examines variation in the use of pronouns to realize patient arguments in Vera’a (Oceanic, North Vanuatu). He finds that while human patient arguments generally occur as a pronoun and non-humans as zero, this is only a strong tendency. He further notes that the marking of patient arguments is different depending on the type of text. These observations are only possible, he argues, based on large databases, a point reinforced by Simpson (2012), also in work on referential hierarchies. Simpson is particularly interested in Arrernte (Pama-Nyungan, Australia), which has been claimed to be a counterexample to the referential hierarchy involving the linking of subject/agent and object/patient with animacy. Simpson traces a possible historical pathway to how such a counterexample could develop, but notes the difficulty of finding the appropriate examples to test whether the pattern that she proposes is appropriate. Simpson (2012, 81) calls for large corpora with audio-visual recordings to study variation and to understand information structure.

It would be easy to multiply examples of the types of contributions that documentation, in the sense of working from a large corpus of natural speech, supplemented with elicitation and testing with stimulus kits and the like, has made to linguistic typology, deepening the knowledge of what patterns can be found in languages. As Epps (2010, 635) points out, “Linguistic typology and language documentation are closely aligned, even symbiotic endeavours.”

Documentary linguistics is also helping to reshape typology in other ways. Epps (2010, 641) notes in particular a question posed by Bickel (2007, 239): “What’s where why?,” focusing on the role of the traditional universal preferences studied by typologists, but also on geographical and genealogical distributions, diachronic change, and the interaction between language and social, cognitive, and cultural factors. This leads, Epps notes, to a more holistic view of language, sometimes called a return to a Boasian perspective (e.g., Woodbury 2011).
Woodbury (2011, 163) summarizes Boas’s perspective on the study of languages, suggesting that Boas did not see a dichotomy between language use and linguistic knowledge. Woodbury further writes that Boas acknowledged “a universal core of grammatical concepts, structures and categories, alongside an openness to areas where these may vary, and in the areas where they vary, an openness to both genetic inheritance and contact-based diffusion.” This Boasian approach has become increasingly important in the field of typology, to a large degree led by the focus on documentation of small languages.

4.4. Contributions of linguistic typology to language documentation

Not only has documentation contributed to typology, but typology has also contributed to documentation, as Bond (2010), Epps (2010), and others remark. Bond (2010, 249–250) notes that typology informs documenters, teaching about types of cross-linguistic variation found and providing concepts and terminology that are useful in documentation. Epps (2010, 644) stresses the importance of representation to documentation—typology provides a working vocabulary that is helpful. An awareness of diversity helps one remain open-minded about what one might encounter in a new language.

Other trends in typology have led those doing documentation to think about the kinds of data that are important. For example, quantitative work has become important in linguistics as a discipline over recent years, and documentation is no exception to this. See Bickel (2007) for discussion of quantitative typology, an area that can be fruitfully employed in documentation, for instance in thinking about what an appropriate corpus might be.

4.5. Summary

Epps (2010, 646) concludes that linguistic typology and documentary linguistics “share much the same architecture: a common theoretical framework, an awareness of cross-linguistic similarity and variability, and a goal of forming and representing generalizations over diverse realizations,” with both documentation informing typology and typology informing documentation. Both are the richer for this.

5. Documentation and formal grammar

In this section I address the relationship between documentation and formal grammar, beginning with discussion of the goals and methods of formal grammar and then outlining some of the contributions that documentation has made to the study of diversity through the lens of formal grammar, and vice versa.

5.1. Goals of formal grammar

Documentation and linguistic theory is the topic of Sells (2010). Polinsky and Kluender (2007) and Polinsky (2010) and references therein are useful sources in understanding the goals of formal grammar in contrast with those of typology. Polinsky and Kluender (2007) and Polinsky (2010) define the primary question of formal grammar as follows, from Polinsky (2010, 652) (as with her discussion of the goal of typology, Polinsky is careful to note the oversimplification in the question):

What makes natural languages so similar to each other?

Polinsky writes that the goal of formal grammar is the construction of a theory of language, not of languages, distinguishing the goal from that of typology. Relevant data in formal grammar is that which allows the analyst to test his/her theory (Polinsky 2010, 653).

Sells (2010, 211) remarks that, with respect to language documentation, and, in particular, grammar writing, what is important about formal grammar (Sells uses the term “linguistic theory”) is that it provide a flexible representational system to account for properties of hierarchies, relationships, systematicities, and shared inheritances. Since constructing a theory requires testing hypotheses based on data, formal theory requires that one go beyond the data of texts in order to understand what the speaker knows about the language, turning to language that is not necessarily easily found without constructing appropriate situations.

5.2. Data in formal grammar

In order to study diversity, data from a range of languages is critical. Formal grammar privileges a different type of data than typology does. Grammars can be helpful, and toolkits are of increased use (see, for instance, Tonhauser et al. 2013 and Totem Field Storyboards 2010–2018 (http://totemfieldstoryboards.org/)). However, elicitation is the most highly valued data type. Much of the recent discussion on data in formal grammar focuses on syntax and, especially, semantics, and I focus on these areas in the following discussion.

The type of data that is important for semantic fieldwork is laid out clearly in Matthewson (2004), in the articles in Bochnak and Matthewson (2015), and in Tonhauser and Matthewson (2015). Matthewson (2004) presents a set of methodological principles for conducting fieldwork in semantics within formal grammar. Some of these are commonly adopted in syntactic fieldwork as well. Here I summarize Matthewson’s major contribution, followed by discussion of data in the formal grammar literature.

Matthewson begins with the assumption that the semantic fieldworker has a good working knowledge of the phonology, morphology, and syntax of the language of study (2004, 370). She recognizes the value of texts as part of documentation, noting the important point made by Mithun (2001) and others that spontaneous speech exposes the researcher “to phenomena that are outside the boundaries of his/her prior knowledge or imagination” (2004, 376). But, she notes, “A texts-only approach relies on the assumption that we are capable of extracting all relevant information merely from a set of texts, even though the amount of data we can gather by this method is a fraction of the amount a child hears while acquiring a language” (2004, 376). She further notes that the translations in texts often do not give adequate information for semantic analysis (2004, 377).
In response to these challenges, Matthewson (2004, 389) argues for the value of elicitation in language documentation in addition to the usual text collection. She identifies two types of elicitation, translations and judgments. She provides detailed exemplification of semantic methodologies, noting that translations “should always be treated as a clue rather than a result” as they do not provide direct evidence about the truth conditions and felicity conditions that are important in semantics. In order to address the challenges of elicitation, Matthewson proposes that grammaticality judgments, truth value judgments, and felicity judgments are of value, presented with an appropriate discourse context.

Having briefly reviewed some of the methods used by many working in a formal grammar framework, focusing on semantics, I now examine some formal grammar dissertations, summarizing the researchers’ comments about data.

Murray (2010, 10), in research on evidentials focused on Cheyenne (Algonquian), writes that she used a grammar and dictionary of Cheyenne as well as texts in her work. Most of her examples, she notes, are from elicitation, with some from texts and some from observations of natural speech. With respect to elicitation, she checked judgments with several people. She used acceptability judgments, including cases with constructed examples in constructed contexts, constructed dialects and texts, naturally occurring examples in naturally occurring contexts, textual examples in their original contexts or with slight modifications of context, translation in a contextual context, and questions about what would be appropriate in a given context.

Bogal-Allbritten (2016) also is explicit about how she studied meaning in Navajo (Na-Dene). Bogal-Allbritten draws a parallel between elicitation based on one-on-one interviews and experimental work, both employing materials that are carefully designed to provide insight into the research questions. She presented consultants with a context in English followed by a sentence in Navajo, and asked if that sentence sounded good, or like something they or another fluent Navajo speaker might use in that situation. She tested the judgments in different contexts, and paid attention to the remarks about the data from speakers. Bogal-Allbritten used translation of English sentences into Navajo to a limited degree, and in detailed contexts. She points out that she used corpora when possible, particularly to get an idea of the kinds of patterns that exist in the language.
However, she notes, corpora do not give negative data, the positive data is likely to be incomplete, and there can be lack of contextual clarity.5

5 Bogal-Allbritten (2016) points to an interesting example of a difference between what is found in texts and what can be gleaned from elicitation. Citing work on comparative constructions, she notes that in an hour-long oral history interview conducted in Navajo, no examples of adjectival comparative constructions occurred (2016, 24), but that speakers formed comparatives with no reluctance in elicitation sessions. Lovick and Tuttle (2016) highlight how little they found comparatives used in stories in Koyukon and Lower Tanana, Na-Dene languages of Alaska; they suggest that comparisons might be interpreted as bragging. Bogal-Allbritten is interested in language knowledge; Lovick and Tuttle are interested in language use. Both data types are important in reaching as deep an understanding of the language as possible.

While the examples above are from work on semantics, similar points are made in syntax dissertations that address the nature of the data. Coon (2010a), writing of her work on complementation and ergativity in Chol (Mayan), notes that she used spontaneous speech and narratives, interviews with speakers, transcribed spontaneous speech, the scholarly work of Chol-speaking linguists, and her own field notes including formal elicitation and casual speech overheard and discussed. In elicitation, she constructed Chol sentences, gave a context, and asked whether the sentence was acceptable or not. She also asked for translations from Spanish into Chol and vice versa.

Cable (2007), in his thesis on the syntax and semantics of wh-questions in Tlingit (Na-Dene), clearly describes the data types that he used. There is a large collection of Tlingit texts available, and Cable (2007, 58) writes that most of his examples are drawn from these texts. He notes that the texts provide positive data, but also implicit negative data in that there is a sizable number of texts with material of interest for his study. Cable analyzed several core texts in depth and also worked with five speakers. In this, he used translation of English sentences, comparison of this material with other sentences that had been determined to be grammatical, and general comments from speakers. Cable worked with groups of speakers rather than with individuals whenever possible. His work thus depends critically on texts, gathered by others, with other methods used in a supplementary way.

While the sine qua non of data in typology is texts, in formal grammar it is more likely to be elicited material, elicited in careful, controlled, and systematic ways with rich context provided. As Matthewson (2004) points out, such work is not possible without a good knowledge of the language. The contributions to formal grammar recognize the value of texts, but rely heavily on elicitation to obtain material that cannot be found in the texts and judgments that it is not always possible to discern from texts alone.

5.3. Contributions of documentation to formal grammar

In section 5.2 I discussed how documentation has contributed to formal grammar by increasing attention to methodology with a focus on the value of carefully structured elicitation in syntax and semantics. In this section I examine contributions to the understanding of linguistic diversity.

Perhaps the most obvious effect of documentation is the increased attention being paid to understudied and small languages in formal grammar research. While Ken Hale was long a proponent of formal research on small languages, such research has increased in recent years—see, for example, research by Coon (e.g., 2010a, 2010b) on Chol, Bliss (e.g., 2013) on Blackfoot, Polinsky on a variety of languages, Saxon (e.g., 2000) on Tɬįchǫ, Wilhelm (2008, 2014) on Dëne Su̜ɬiné, and Johns and Kučerova (2017) on Inuit languages, for a very small sample. All of these researchers have several aims in mind, one to understand the language in a deeper way and another to contribute to the understanding of linguistic theory and of variation in language.6

Technological developments have allowed language documentation to develop and thrive in its current form, and have had a strong effect on formal grammar. There is increasing attention to topics such as information structure, where phonological, syntactic, and pragmatic factors often intersect, requiring careful work on topics such as intonation and pausing (see, for instance, Clopper and Tonhauser (2013) on focus constructions in Paraguayan Guaraní and Verhoeven and Skopeteas (2015) on focus constructions in Yucatec Maya).

6 While there have been many grammars produced by people working in typology, this has generally not been the immediate goal for those working in formal grammar.

As Epps (2010) noted for the study of linguistic typology, documentation provides data that in turn leads to theoretical questions. Epps cites questions such as word order (e.g., object-initial word order) and evidentiality. Polinsky (2010, 654–655) refers to consistency in headedness and the Accessibility Hierarchy and their influence on formal grammar. These generalizations grow out of typological work, and could not have been reached without extensive documentation.

I focus now on some contributions of documentation in formal grammar to understanding linguistic diversity, a topic of key interest in formal grammar. Formal grammar asks questions about the limits of diversity, how much languages can vary, and how diversity is limited.

Sells (2010) presents several examples of the kind of work that is of interest in formal grammar. I begin with an example that illustrates the importance of controlled data and of negative data, a case study of Barai (Papuan). Drawing on data from Foley and Olsen (1985), Sells shows that two verbs can occur in a sentence but in two different constructions (Sells 2010, 222). (1a) illustrates what Sells calls a control verb and (1b) what he calls a V-V complex predicate.

(1) a. Control verb (“sit down to V”)
       fu   fi   fase    isoe
       3sg  sit  letter  write
       “He sat down to write a letter.”
    b. V-V complex predicate
       fu   fase    fi   isoe
       3sg  letter  sit  write
       “He sat writing a letter.”

Support for different constructions comes from examining the different meanings when an adverb is added to the sentence, as in (2), with the adverb isema “wrongly.”

(2) a. fu   isema    fi   fase    isoe
       3sg  wrongly  sit  letter  write
       “He sat wrongly and wrote a letter.”
    b. fu   fi   fase    isema    isoe
       3sg  sit  letter  wrongly  write
       “He sat down and wrote a letter wrongly.”
    c. fu   fase    isema    fi   isoe
       3sg  letter  wrongly  sit  write
       “He wrongly sat writing a letter.”
    d. *fu  fase    fi   isema    isoe
       3sg  letter  sit  wrongly  write

Sells, again following Foley and Olsen, remarks that the adverb precedes the verb that it modifies. The impossibility of placing the adverb between the parts of what he calls a complex verb suggests that the words form a unit of some sort that cannot be interrupted. Sells goes on to raise a number of questions about analysis, but the relevant point here is twofold. First is the use of negative data—what does not occur is very important. Second, it is unlikely that these data would have been found in natural speech—elicitation was likely an important method here.

Davis et al. (2014) focus on how to study diversity, supporting a hypothesis-driven research approach. They point out that materials generally associated with documentation—grammars, texts, dictionaries—are vital parts of documentation, but that the kinds of material that result from formal linguistic analysis are also part of documentation. The reason for this has been mentioned—in work based entirely on a text corpus, there are things that will not appear, and there is no negative evidence.

Davis et al. (2014) present a number of cases where hypothesis-driven elicitation was critical to contributing to the understanding of diversity.7 One example involves lexical categories, and a claim in the literature that Salish and Wakashan languages of the Pacific Northwest in North America involve category neutrality, without a distinction between nouns and verbs. The foundation for this claim rests in the ability of items to function as both predicates and arguments, as in the following forms from St’át’imcets (also called Lillooet; Salish).8

(3) St’át’imcets categories (Davis et al. 2014, e196)
    a. šmúɬač  ta=kwúkwpiʔ=a     (nominal predicate, nominal argument)
       woman   det=chief=exis
       “The chief is a woman.”
    b. kwúkwpiʔ  ta=šmúɬač=a     (nominal predicate, nominal argument)
       chief     det=woman=exis
       “The woman is a chief.”
    c. tɬ’iq   ta=kwúkwpiʔ=a     (intransitive verbal predicate, nominal argument)
       arrive  det=chief=exis
       “The chief arrived.”
    d. kwúkwpiʔ  ta=tɬ’iq=a      (nominal predicate, intransitive verbal argument)
       chief     det=arrive=exis
       “The one who arrived is a chief.”

This fluidity has led to the claim that there is not a formal distinction between categories of noun and verb in the language. However, in controlled elicitation, differences between nouns and verbs emerge. In particular, only nouns can head relative clauses and complex predicates. Relative clauses are illustrated below (Davis et al. 2014, e198)—(4a), with a nominal head, is acceptable while (4b, c), with predicate heads, are not.

7 See the papers in response to Davis et al.’s article for a critique.

8 Abbreviations follow the Leipzig glossing rules, with the following additions: dir: directive transitivizer, exis: assertion of existence.

(4) St’át’imcets categories
    a. ʔác’χ-ən=ɬkan    [na=šáq’w=a           špzúzaʔ]
       see-dir=1sg.sbj  [absent.det=fly=exis  bird]
       “I saw a flying bird.”
    b. *ʔác’χ-ən=ɬkan   [na=šáq’w=a           kwikwš]
       see-dir=1sg.sbj  [absent.det=fly=exis  small]
       “I saw a flying small (thing).”
    c. *ʔác’χ-ən=ɬkan   [na=kwikwš=a            šáq’w]
       see-dir=1sg.sbj  [absent.det=small=exis  fly]
       “I saw a small flying (thing).”

The authors conclude that there is a difference between nouns and verbs in St’át’imcets and other Northwest Coast languages. As they point out, the data that led to the claim of category neutrality came from the absence of evidence. The preferred documentary method of texts may well not reveal such data, but such data is important in having as full an understanding of the language as possible. In trying to understand the diversity that exists cross-linguistically, valid analyses of the individual languages are critical, and those may best be achieved through a study of both texts and language knowledge.

Another example from Davis et al., this one from Nuu-chah-nulth (Wakashan), involves what are called Condition C effects in formal syntax. In (5a), the noun “Lucy” is the subject of the main verb wawa:ma “say,” while in (5b) it is the subject of the complement verb “make bread.”

(5) Nuu-chah-nulth (Davis et al. 2014, e192)
    a. wawa:ma  Lucy  [ʔanič  p’ap’ac’aqtɬi:ɬw’it’as  ʔam’i:tɬik]
       say      Lucy  comp    bread-make              tomorrow
       “Lucy_i said that she_i will make bread tomorrow.”
    b. wawa:ma  [ʔamič  Lucy  p’ap’ac’aqtɬi:ɬw’it’as  ʔam’i:tɬik]
       say      comp    Lucy  bread-make              tomorrow
       “Lucy_i said that she_i will make bread tomorrow.”

Davis et al. (2014, e194) argue that both sentences are grammatical with the reading that “Lucy” and “she” are coreferential, different from English, where they can be interpreted as coreferential only when “Lucy” is the grammatical subject of the matrix verb. The authors ask:

Could searching corpora have uncovered the Nuu-chah-nulth condition C facts? Unlikely. In addition to the condition C-defying cases, Nuu-chah-nulth speakers more commonly produce standard condition C-obeying configurations . . ., probably because of a general preference for anaphoric over cataphoric dependencies. And even if condition C-defying cases did turn up in texts, they might not be recognized as such: to properly probe for condition C effects, one needs to present test sentences in a controlled context, to eliminate irrelevant covaluations. Moreover, since condition C rules out certain covaluations, the relevant stimuli necessarily involve the possibility of negative data, and as such, even million-sentence corpora . . . could only very indirectly establish the presence of condition C effects.

I round out this section with a discussion of Davis et al.’s work on determiner semantics, focusing on definiteness. They ask whether Salish languages have definite determiners. They define specific criteria for definiteness, that definite determiners can only be used in familiar contexts and that definite determiners place a uniqueness requirement on the referent matching the NP description within a context. Through elicitation providing negative as well as positive evidence, they argue that the Salish languages that they examine lack definite determiners. In these languages, determiners can be used in novel as well as familiar contexts and do not obligatorily refer to a unique or maximal referent in the context. The sentences in (6) are from a story recorded by van Eijk and Williams (1981, 19).

(6) St’át’imcets (Davis, Gillon, and Matthewson 2014, e201)
    húy’=ɬkan         ptakwɬ,     ptákwɬ-min       lčʔa  ti=šmə́m’ɬač=a
    going.to=1sg.sbj  tell.story  tell.story-appl  here  det=girl=exis
    “I am going to tell a legend, a legend about a girl . . .”
    wáʔ=kwuʔ  ʔílal  látiʔ  ti=šmə́m’ɬač=a
    ipfv=rep  cry    there  det=girl=exis
    “The girl was crying there.”

The authors note that the determiner ti . . . a occurs in a novel context in the first example, and a familiar one in the second, and thus is not a definite determiner. In this case, textual data provided the context, but such contexts do not always arise.

Documentation is critical for studies aimed at understanding diversity. In formal grammar, documentation favors controlled data, motivated by a hypothesis-driven theory with predictions about what one might find. Through this, more is learned about the language of study, and because the data are controlled, cross-linguistic studies can pinpoint clearly the ways in which languages differ.

5.4. Contributions of formal grammar to documentation

Formal grammar has also made contributions to language documentation. Just as Bond (2010) and Epps (2010) note that typology provides representations to documentation, so does formal grammar, with developments in formal grammar providing vocabulary for documenters and identifying topics of interest to probe.

The focus on valid methodologies discussed above also is important to documentation. Formal grammar is interested in probing for language knowledge that can be difficult to access through a corpus alone. For one additional example, it has been claimed that evidentials encode information source and modals encode speaker certainty (e.g., de Haan 1999; Aikhenvald 2004, 7), and thus are discrete categories. Based on research on St’át’imcets, Matthewson, Davis, and Rullmann (2007) argue that items classified as evidentials in St’át’imcets encode information source and not certainty but also show some properties associated with modals, and that, based on the criteria in the literature, evidentials and modals do not form fully distinct categories.9 Careful use of methods that allow for testing of hypotheses leads to deeper knowledge of the language of study, and of typology as well. The recent attention to elicitation and other methods enriches documentation.

6. Summary

I have considered the contributions of documentation to typology and to formal grammar, and the contributions of typology and formal grammar to documentation. While this chapter is not designed to be a comparison of typology and formal grammar, it is difficult to avoid this topic altogether. In the end, with respect to diversity, the theories have a similar goal, to provide insight into particular languages through documentation, and to use that insight to better understand what language is about, and to understand diversity.

The theories also differ. Typologists are concerned with how languages can vary, while formal grammarians are more concerned with the limits on diversity. Methods privileged by the theories contrast. Typologists are particularly interested in function, and privilege textual data; formal linguists are interested in language knowledge, and privilege controlled data (given an overall understanding of the language). The data of formal grammar has been particularly open to questions of adequacy. See, for instance, Mithun (2001, 33) on the value of real language data for future generations:

If speakers are allowed to speak for themselves, creating a record of spontaneous speech in natural communicative settings, we have a better chance of providing the kind of record that will be useful for future generations.

Mithun (2001, 2014, and elsewhere) also discusses the problems of elicited data, as not representing what people actually say, and how this can be problematic analytically as well as for future generations of speakers. In the elicitation methods as developed in recent years, there is a concerted attempt to move beyond the kinds of examples that Mithun is critical of, through carefully contextualized elicitation. Understanding the source of data is important, as is choosing the appropriate type of data for the question under consideration, with attention to quality of data, no matter what the method.

9 The authors do not present a general survey of evidential systems in the typology literature. Chafe and Nichols (1986, vii) define evidentials as grammatical “devices used by speakers to mark the source of and the reliability of their knowledge,” suggesting that the basic observation of Matthewson, Davis, and Rullmann (2007) is a contribution of typology to documentation. In any case, formal grammar has deepened the documentation, even if the observation is not new.

Earlier I cited Woodbury (2011, 163) on Boasian models. Here I consider an expanded quote (Woodbury 2011, 163)10:

From a modern point of view, Boas’s conception of language was both broad and interestingly free of dichotomization: there is no strong theoretical division between language use versus linguistic knowledge. There is an acknowledgement of a universal core of grammatical concepts, structures and categories, alongside an openness to areas where these may vary, and in the areas where they vary, an openness to both genetic inheritance and contact-based diffusion.

Thus, while the goals of the two theories under consideration are very different, in the end, both function and linguistic knowledge are important to understanding particular languages and language in general.11 Polinsky (2010, 664–665) writes that one must take care to ensure adequate data:

The bad news is that continuing to base our linguistic inquiry on partial data sets (derived from introspection, observation of limited though naturally occurring data, incomplete elicitation of minimal pairs, etc.) is more likely than not an exercise in inevitable obsolescence, planned or unplanned. Otherwise, we seem doomed to continue along the path of scholastic disputes over insufficient, albeit preferred, data. Nonetheless, the good news is that many of the components are already in place: formalists are good at deducing principles of structure building, while typologists are good at recognizing their functional properties.

Language documentation has been important in leading to better data, both much fuller naturally occurring language and better methods for elicitation. In the end, one goal of language documentation is to understand what Sapir (1921, 127) called the genius of language:

For it must be obvious to any one who has thought about the question at all or who has felt something of the spirit of a foreign language that there is such a thing as a basic plan, a certain cut, to each language. This type or plan or structural “genius” of the language is something much more fundamental, much more pervasive, than any single feature of it that we can mention, nor can we gain an adequate idea of its nature by a mere recital of the sundry facts that make up the grammar of the language.

10 Lyle Campbell (personal communication, October 2016) notes that this is not the standard view of Boas’s program, with most seeing him as anti-generalizing, anti-universal concepts, and anti-theorizing, with an interest in the urgency of describing languages and cultures while it was still possible, approaching each on its own terms.

11 See Polinsky (2010, 654) for an interesting analogy with physiology: “An analogy could be drawn from physiology: one can observe a number of people in natural running environments, or test a set of subjects on a treadmill in a lab. In those two conditions, the generalizations are different: natural observations would yield generalizations about preferred patterns; treadmill studies would show what a human body can do when pushed to its limits. Physiologists seldom argue whether one method of observation is better than the other: they have long learned how to combine the data from both.”

Careful attention to methods and insight from different theoretical perspectives all help in understanding the complexities of languages and of language. Linguistics broadly is benefiting from the contributions of language documentation, and, in turn, language documentation is enhanced by the different lenses through which language can be understood.

References

Aikhenvald, Alexandra. 2000. Classifiers: A Typology of Noun Categorization Devices. Oxford: Oxford University Press.

Aikhenvald, Alexandra. 2004. Evidentiality. Oxford: Oxford University Press.

Bender, Emily M., Michael Wayne Goodman, Joshua Crowgey, and Fei Xia. 2013. “Towards Creating Precision Grammars from Interlinear Glossed Text: Inferring Large-Scale Typological Properties.” In Proceedings of the 7th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities, 74–83, Sofia, Bulgaria, August 8. Association for Computational Linguistics.

Bickel, Balthasar. 2007. “Typology in the 21st Century: Major Current Developments.” Linguistic Typology 11: 239–251.

Bird, Steven. 2009. “Natural Language Processing and Linguistic Fieldwork.” Computational Linguistics 35: 469–474.

Bliss, Heather. 2013. “The Blackfoot Configurationality Conspiracy: Parallels and Differences in Clausal and Nominal Structures.” PhD diss., University of British Columbia.

Bochnak, M. Ryan and Lisa Matthewson, eds. 2015. Methodologies in Semantic Fieldwork. Oxford: Oxford University Press.

Bogal-Allbritten, Elizabeth. 2016. “Building Meaning in Navajo.” PhD diss., University of Massachusetts, Amherst.

Bond, Oliver. 2010. “Language Documentation and Language Typology.” In Language Documentation and Description, vol. 7, edited by Peter K. Austin, 231–261. London: School of Oriental and African Studies.

Cable, Seth. 2007. “The Grammar of Q: Q-Particles and the Nature of Wh-Fronting, as Revealed by the wh-Questions of Tlingit.” PhD diss., MIT, Cambridge. (Published as The Grammar of Q: Q-Particles, Wh-Movement, and Pied-Piping. Oxford: Oxford University Press. 2010.)

Campbell, Lyle and Bryn Hauk, with Panu Hallamaa. 2015. “Language Endangerment and Endangered Uralic Languages.” In Congressus Duodecimus Internationalis Fenno-Ugristarum, Oulu: Plenary Papers (XII International Congress for Finno-Ugric Studies, Oulu), edited by Harri Mantila, Kaisa Leinonen, Sisko Brunni, Santeri Palviainen, and Jari Sivonen, 7–33.

Chafe, Wallace and Johanna Nichols. 1986. Evidentiality: The Linguistic Coding of Epistemology. Norwood, NJ: Ablex Publishing Corporation.

Clemens, Lauren Eby, Jessica Coon, Pedro Mateo Pedro, Adam Milton Morgan, Maria Polinsky, Gabrielle Tandet, and Matthew Wagers. 2015. “Ergativity and the Complexity of Extraction: A View from Mayan.” Natural Language & Linguistic Theory 33(2): 417–467.

Clopper, Cynthia and Judith Tonhauser. 2013. “The Prosody of Focus in Paraguayan Guarani.” International Journal of American Linguistics 79(2): 219–251.

Coon, Jessica. 2010a. “Complementation in Chol (Mayan): A Theory of Split Ergativity.” PhD diss., MIT, Cambridge.

Coon, Jessica. 2010b. “Rethinking Split Ergativity in Chol.” International Journal of American Linguistics 76(2): 207–253.

Corbett, Greville. 2006. Agreement. Cambridge: Cambridge University Press.

Davis, Henry, Carrie Gillon, and Lisa Matthewson. 2014. “How to Investigate Linguistic Diversity in the Pacific Northwest.” Language 90(4): e180–e226.

De Haan, Ferdinand. 1999. “Evidentiality and Epistemic Modality: Setting the Boundaries.” Southwest Journal of Linguistics 18: 83–101.

Epps, Patience. 2009. “Where Differential Object Marking and Split Plurality Intersect: Evidence from Hup.” In New Challenges in Typology: Transcending the Borders and Refining the Distinctions, edited by Patience Epps and Alexandre Arkhipov, 85–104. Berlin: Mouton de Gruyter.

Epps, Patience. 2010. “Linguistic Typology and Language Documentation.” In The Oxford Handbook of Linguistic Typology, edited by Jae Jung Song. Oxford: Oxford University Press.

Epps, Patience and Alexandre Arkhipov, eds. 2009. New Challenges in Typology: Transcending the Borders and Refining the Distinctions. Berlin: Mouton de Gruyter.

Evans, Nicholas. 2003. “Typologies of Agreement: Some Problems from Kayardild.” Transactions of the Philological Society 101: 203–234.

Evans, Nicholas. 2007. “Insubordination and its uses.” In Finiteness: Theoretical and Empirical Foundations, edited by Irina Nikolaeva, 366–431. Oxford: Oxford University Press.

Foley, William and Mike Olsen. 1985. “Clausehood and Verb Serialization.” In Grammar Inside and Outside the Clause: Some Approaches to Theory from the Field, edited by Johanna Nichols and Anthony C. Woodbury, 17–60. Cambridge: Cambridge University Press.

Grinevald, Colette. 2000. “A Morphosyntactic Typology of Classifiers.” In Systems of Nominal Classification, edited by Gunter Senft, 50–92. Cambridge: Cambridge University Press.

Grinevald, Colette. 2002. “Making Sense of Nominal Classification Systems: Noun Classifiers and the Grammaticalization Variable.” In New Reflections on Grammaticalization, edited by Ilse Wischer and Gabriele Diewald, 259–275. Amsterdam: John Benjamins.

Hale, Kenneth. 1992. “Language Endangerment and the Human Value of Linguistic Diversity.” Language 68(1): 35–42.

Hildebrandt, Kristine A., Carmen Jany, and Wilson Silva (editors). 2017. Documenting Variation in Endangered Languages. Language Documentation & Conservation Special Publication No. 13. Honolulu: University of Hawai‘i Press.

Himmelmann, Nikolaus. 1998. “Documentary and Descriptive Linguistics.” Linguistics 36: 161–195.

Himmelmann, Nikolaus. 2006. “Language Documentation: What Is It and What Is It Good For?” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus P. Himmelmann, and Ulrike Mosel, 1–30. Berlin: Mouton de Gruyter.

Johns, Alana and Ivona Kučerová. 2017. “On the Morphosyntactic Reflexes of the Information Structure in the Ergative Patterning of the Inuit Language.” In The Oxford Handbook of Ergativity, edited by Jessica Coon, Diane Massam, and Lisa Travis. Oxford: Oxford University Press.

Konnerth, Linda. 2014. “A Grammar of Karbi.” PhD diss., University of Oregon.

Krauss, Michael. 1992. “The World’s Languages in Crisis.” Language 68(1): 4–10.

Lovick, Olga and Siri Tuttle. 2016. “You Shouldn’t Say That: Cultural Norms and Grammatical Description in Alaskan Dene Languages.” Talk presented at the Dene Languages Conference, Yellowknife, Northwest Territories, Canada, June 6.

Lüpke, Friederike. 2009. “Data Collection Methods for Field-Based Language Documentation.” In Language Documentation and Description, vol. 6, edited by Peter K. Austin, 53–100. London: School of Oriental and African Studies.

Majid, Asifa. 2012. “A Guide to Stimulus-Based Elicitation for Semantic Categories.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 54–71. Oxford: Oxford University Press.

Matthewson, Lisa. 2004. “On the Methodology of Semantic Fieldwork.” International Journal of American Linguistics 70(4): 369–415.

Matthewson, Lisa, Henry Davis, and Hotze Rullman. 2007. “Evidentials as Epistemic Modals: Evidence from St’at’imcets.” The Linguistic Variation Yearbook 7: 201–254.

MacWhinney, Brian, Andrej Malchukov, and Edith Moravcsik, eds. 2014. Competing Motivations in Grammar and Usage. Oxford: Oxford University Press.

Meyerhoff, Miriam. 2015. “Turning Variation on Its Head. Analysing Subject Prefixes in Nkep (Vanuatu) for Language Documentation.” Asia-Pacific Language Variation 1(1): 78–108.

Mithun, Marianne. 2001. “Who Shapes the Record: The Speaker and the Linguist.” In Linguistic Fieldwork: Essays on the Practice Empirical Linguistic Research, edited by Paul Newman and Martha Ratliff, 34–54. Cambridge: Cambridge University Press.

Mithun, Marianne. 2012. “Field Methods in Syntactic Research.” In Continuum Companion to Syntax and Syntactic Theory, edited by Silvia Luraghi and Claudia Parodi. London and New York: Continuum.

Mithun, Marianne. 2014. “The Data and the Examples: Comprehensiveness, Accuracy, and Sensitivity.” In The Art and Practice of Grammar Writing, edited by Toshihide Nakayama and Keren Rice, 25–52 (Language Documentation & Conservation Special Issue No. 8). Honolulu: University of Hawai‘i Press.

Mosel, Ulrike. 2011. “Morphosyntactic Analysis in the Field: A Guide to the Guides.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 77–89. Oxford: Oxford University Press.

Murray, Sarah E. 2010. “Evidentiality and the Structure of Speech Acts.” PhD diss., New Brunswick, NJ: Rutgers University. (Published as The Semantics of Evidentials. Oxford: Oxford University Press 2017.)

Nagy, Naomi. 2009. “The Challenges of Less Commonly Studied Languages: Writing a Sociogrammar of Faetar.” In Variation in Indigenous Minority Languages, edited by James Stanford and Dennis Preston, 397–417 (Impact series vol. 25). Philadelphia: John Benjamins.

Norcliffe, Elizabeth, Alice C. Harris, and T. Florian Jaeger, eds. 2015. “Laboratory in the Field: Advances in Cross-Linguistic Psycholinguistics.” Language, Cognition and Neuroscience 30(9) (Special Issue).

Palosaari, Naomi and Lyle Campbell. 2011. “Structural Aspects of Language Endangerment.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 100–119. Cambridge: Cambridge University Press.

Polinsky, Maria. 2010. “Linguistic Typology and Formal Grammar.” In The Oxford Handbook of Linguistic Typology, edited by Jae Jung Song. Oxford: Oxford University Press.

Polinsky, Maria and Robert Kluender. 2007. “Linguistic Typology and Theory Construction: Common Challenges Ahead.” Linguistic Typology 11: 167–179.

Sapir, Edward. 1921. “Types of Linguistic Structure” (Introduction to Chapter 6). In Language: An Introduction to the Study of Speech. New York: Harcourt, Brace. http://www.bartleby.com/186/6.html.

Saxon, Leslie. 2000. “Head-Internal Relative Clauses in Dogrib (Athapaskan).” In Papers in Honor of Ken Hale, edited by Andrew Carnie, Eloise Jelinek, and Mary Willie. MIT Working Papers on Endangered and Less Familiar Languages, vol. 1.

Schultze-Berndt, Eva. 2000. “Simple and Complex Verbs in Jaminjung. A Study of Event Categorisation in an Australian Language.” In MPI Series in Psycholinguistics 14. Nijmegen, The Netherlands: University of Nijmegen.

Schultze-Berndt, Eva. 2003. “Preverbs as an Open Word Class in Northern Australian Languages: Synchronic and Diachronic Correlates.” In Yearbook of Morphology, edited by Geert Booij and Jaap van Marle, 145–177. Dordrecht, Holland: Kluwer.

Schnell, Stefan. 2012. “Data from Language Documentation in Research on Referential Hierarchies.” In Potentials of Language Documentation: Methods, Analyses, and Utilization, edited by Frank Seifart, Geoffrey Haig, Nikolaus P. Himmelmann, Dagmar Jung, Anna Margetts, and Paul Trilsbeek, 64–72 (Language Documentation & Conservation Special Publication No. 3). Honolulu: University of Hawai‘i Press.

Seifart, Frank. 2005. “The Structure and Use of Shape-Based Noun Classes in Mirana (North West Amazon).” PhD diss., Nijmegen, Netherlands: Radboud University.

Seifart, Frank. 2009. “Multidimensional Typology and Miraña Class Markers.” In New Challenges in Typology: Transcending the Borders and Refining the Distinctions, edited by Patience Epps and Alexandre Arkhipov, 365–387. Berlin: Mouton de Gruyter.

Sells, Peter. 2010. “Language Documentation and Linguistic Theory.” In Language Documentation and Description, vol. 7, edited by Peter K. Austin, 209–237. London: School of Oriental and African Studies.

Silverstein, Michael. 1976. “Hierarchy of Features and Ergativity.” In Grammatical Categories in Australian Languages, edited by Robert M.W. Dixon, 112–171. Canberra: Australian Institute of Aboriginal Studies.

Simpson, Jane. 2012. “Information Structure, Variation and the Referential Hierarchy.” In Potentials of Language Documentation: Methods, Analyses, and Utilization, edited by Frank Seifart, Geoffrey Haig, Nikolaus P. Himmelmann, Dagmar Jung, Anna Margetts, and Paul Trilsbeek, 74–82 (Language Documentation & Conservation Special Issue No. 3). Honolulu: University of Hawai‘i Press.

Stanford, James and Dennis Preston, eds. 2009. Variation in Indigenous Minority Languages. Amsterdam/Philadelphia: John Benjamins.

Thieberger, Nicholas, ed. 2012. The Oxford Handbook of Linguistic Fieldwork. Oxford: Oxford University Press.

Tonhauser, Judith, David Beaver, Craige Roberts, and Mandy Simons. 2013. “Towards a Taxonomy of Projective Content.” Language 89(1): 66–109.

Tonhauser, Judith and Lisa Matthewson. 2015. “Empirical Evidence in Research on Meaning.” Manuscript. Ohio State University/University of British Columbia.

Totem Field Storyboards. 2010–2018. http://totemfieldstoryboards.org/. Accessed August 15, 2016.

Van Eijk, Jan and Lorna Williams. 1981. Cuystwi Math Ucwalmicwts: Lillooet Legends and Stories. Mount Currie, BC: Ts’zil.

Verhoeven, Elisabeth and Stavros Skopeteas. 2015. “Licensing Focus Constructions in Yucatec Maya.” International Journal of American Linguistics 81(1): 1–40.

Von Prince, Kilu. 2012. “A Grammar of Daakaka.” PhD diss., Berlin: Humboldt University.

Whalen, Douglas H. and Joyce McDonough. 2015. “Taking the Laboratory into the Field.” Annual Review of Linguistics 1: 395–415.

Wilhelm, Andrea. 2008. “Bare Nouns and Number in Dëne Sųłiné.” Natural Language Semantics 16: 39–68.

Wilhelm, Andrea. 2014. “Nominalization Instead of Modification.” In Cross-Linguistic Investigations of Nominalization Patterns, edited by Ileana Paul, 51–81. Amsterdam: John Benjamins.

Woodbury, Anthony. 2003. “Defining Documentary Linguistics.” In Language Documentation and Description, vol. 1, edited by Peter Austin, 35–51. London: Hans Rausing Endangered Languages Project.

Woodbury, Anthony. 2011. “Language Documentation.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 159–186. Cambridge: Cambridge University Press.

Zariquey Biondi, Roberto. 2011. “A Grammar of Kashibo-Kakataibo.” PhD diss., Melbourne, Australia: La Trobe University.

Chapter 7

The Design and Implementation of Documentation Projects for Spoken Languages

Shobhana Chelliah

1. Introduction

The purpose of this chapter is to set up a blueprint on how to design and carry out a language documentation project for an endangered spoken language. In reality, no one blueprint will suffice since there are many documenter profiles and a variety of conditions for language endangerment. The documenter may be a leading figure in a community’s culture and literature committee, a non-community linguist collaborating with a community to initiate a documentation project, or a community or non-community scholar gathering data to complete a dissertation or answer research questions. Also, the language to be documented could score well on a vitality scale or be in danger of going silent in the immediate future.

In this chapter, I will assume that the documenters are community members and researchers working together with shared goals and methods. I will further assume that the language to be documented has average vitality as calculated by current sources such as the Catalogue of Endangered Languages. That is, a majority of the population is bilingual or multilingual, and while older speakers use the language for many purposes, younger speakers use the language in limited domains. Furthermore, while there may be a writing system, there is not much by way of written literature or grammatical description. Although government educational policies do not support maintenance of the language, the community is interested in creating documentation aimed at revitalization.1 It should be noted that documentation projects with highly endangered languages require additional considerations such as the lack of speakers, the average age of speakers, health concerns and weaker memories, and attrition and change resulting from the language being used infrequently. It may be possible to collect word lists but not conversations. See, for example, Michael Krauss’s recollections of his work with Eyak (Krauss 2006).

With this scenario in mind, I review the following factors which could impact the planning and implementation of a language documentation project: immediate and long-term goals of the community and the researcher(s); the composition and effective management of the team undertaking the documentation; funding to support the required activities; technological proficiency and linguistic expertise of the community and documenters; preparation for onsite and offsite data collection; data management; personnel management; and results dissemination.

The audience for this chapter is primarily students and younger scholars with no previous experience in language documentation. It is also addressed to experienced language documenters working in an older paradigm, that is, documenters working on their own with select members of a language community rather than in research teams and with larger community organizations.

1 http://www.endangeredlanguages.com.

2. Goals

State-of-the-art language documentation projects are expected to create a record of a language that is broadly representative of how a language is used in a community. At the core of that record is a corpus of annotated language data with annotation being detailed enough to make recovery of linguistic structure and meaning possible. In most cases, annotation means interlinear glossing with or without part-of-speech tags. The goal is to create a corpus for multiple users. Communities would access corpus material for understanding culturally relevant language exchanges and for creating novel linguistic utterances. The annotations would facilitate investigations in other disciplines such as typological or historical linguistics, anthropology, psychology, folklore, and computational linguistics. To make all these things possible for today and for posterity, the resulting record should be accessible and searchable (Woodbury 2011, 159).

The field of language documentation pitches a big tent and welcomes participants of many talents and interests to create the record. As a result, language documentation projects are not homogenous in the planning stages or in deliverables. Practical considerations (funding and time, for example), gaps in training of one kind versus expertise in another (expertise in videography but none in linguistics, for example), or just differences in interest (a community interested in genealogical terms versus a botanist interested in indigenous plant names) will favor some goals over others. In the remainder of this section, I review some specific goals and what preparation and planning is required to reach those goals.
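To make the idea of an annotated, searchable corpus more concrete, here is a minimal sketch in Python of how one interlinear-glossed utterance might be stored and queried. The class and field names (`GlossedUtterance`, `morphemes`, `glosses`, `pos`) and the sample forms are invented placeholders for illustration; they are not data from any language and do not reflect the format of any particular documentation tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GlossedUtterance:
    """One annotated utterance; all field names are illustrative assumptions."""
    utterance_id: str
    transcription: str                 # line in a practical orthography
    morphemes: List[str]               # morpheme-by-morpheme segmentation
    glosses: List[str]                 # one gloss per morpheme
    translation: str                   # free translation
    pos: Optional[List[str]] = None    # optional part-of-speech tags

    def is_aligned(self) -> bool:
        """Every morpheme needs a gloss (and a POS tag, if tags are supplied)."""
        if len(self.morphemes) != len(self.glosses):
            return False
        return self.pos is None or len(self.pos) == len(self.morphemes)


def search_by_gloss(corpus, gloss):
    """Return the ids of utterances containing the given gloss label."""
    return [u.utterance_id for u in corpus if gloss in u.glosses]


# Placeholder forms only, not real language data.
corpus = [
    GlossedUtterance("u1", "mama-ka tala", ["mama", "-ka", "tala"],
                     ["mother", "ERG", "see"], "Mother saw (it)."),
    GlossedUtterance("u2", "tala-ni", ["tala", "-ni"],
                     ["see", "NEG"], "(She) did not see (it).",
                     pos=["V", "SUFF"]),
]
```

Keeping the part-of-speech tier optional mirrors the point that glossing may come with or without tags, and the alignment check is the kind of consistency test that keeps a corpus usable by people other than the original annotator.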


2.1. Traditional narratives and conversations

If the documenter is interested in creating a collection of traditional narratives, preparation will include practice in the use of audio and video recording devices. The community would also want to discuss which narratives are to be documented and which speakers are to be recorded performing those narratives. This becomes especially important if some narratives can be told only “in season” or by specific people. For the Western Apache, for example, Coyote stories are told in the winter when coyotes will not overhear them (Brandt, Lavender-Lewis, and Greenfeld 1994, xi). So, it would be necessary to have at least a conceptual schedule of which speakers are available, and at which time of year (e.g., agricultural communities will be busy during harvests; in some colder climates such as the Arctic, fishing communities tend to be busy in the summer but available for indoor work in the winter).

Once a recording is made, it must be transcribed and translated to be maximally useful, but as is well-known, transcription is a significant bottleneck to translation. Where possible, it is helpful to work with literate speakers who can provide a first-pass transcription in a practical orthography which can form the basis for broad phonemic transcription. That first-pass transcription can also be used in analyzing and revising spelling conventions. For the Lamkang Online Lexical Database project,2 community members were eager to contribute recordings of narratives to the project and have these analyzed so that they could be used for language teaching and culture transmission. They recorded and transcribed a number of recordings of traditional narratives. They shared their transcriptions via email and gave me copies of the video recordings on DVDs. Their transcriptions formed the basis of spelling activities during the first of several orthography workshops hosted by the project and also served to launch grammatical analysis. We found that working on standardizing the practical orthography with the community early on in our documentation process was well worth the investment of time. Working with speakers on spelling issues, especially determining what constituted an orthographic word, speeded up our access to the data as speakers were able to transcribe recordings more proficiently. For our project, community-driven data collection, especially the recording of natural interactions and culturally situated performances, increased the quantity and quality of language data many times over what a single documenter could accomplish.

There are other methods for recording narratives and conversations which have been used with some success. One of these methods is the Basic Oral Language Documentation (BOLD) method by which speakers are trained to repeat previously narrated stories at slow speed so that the recordings can be transcribed at a later date (Reiman 2009). The Aikuma app is another powerful tool for community data collection and oral transcription. In a recent version of this app, speakers use an Android phone to audio and/or video record a speech event. They then play back that event and interject over the sound signal to provide simultaneous interpretation in a contact language (Bird et al. 2014). So, learning how to provide training as well as budgeting time for that training would be a useful planning activity.

2 https://www.nsf.gov/awardsearch/showAward?AWD_ID=1160640&HistoricalAwards=false.

2.2. Documenting traditional ecological knowledge

Documenting specialized vocabularies and linguistic routines connected to ecological and cultural knowledge requires special preparation. A non-methodical way of documenting traditional ecological knowledge (TEK) can be valuable as well. For example, a documentary linguist who is primarily interested in grammar might jot down plant names and their uses as provided by a speaker when on a walk. When traditional ecological knowledge is the focus, however, a more methodical approach is necessary, including background research in the area of interest and gaining familiarity with how other researchers have approached documentation of those areas. Thieberger (2012) includes articles discussing documentation of geography, botany, mathematics, and astronomy, and can be used as a starting point for background reading. Additional resources are the master classes offered at the International Conference on Language Documentation and Conservation, University of Hawai‘i. Geographer David Marks spoke there on documenting categories of landscape features, and botanist Will McClatchey spoke on folk taxonomy and the collection of biological/environmental terms.3 Case studies of other focused documentation projects can be found in the journal Language Documentation & Conservation. Training on specific software, such as software to create and manipulate maps, is available at language documentation training institutes such as the Institute for Collaborative Language Research.

For methodical TEK documentation, a discipline-specific scientist should be a partner on the project; reviewers of grant applications look for this kind of expertise. An example of such a partnership is the US National Science Foundation funded project directed jointly by linguist Jonathan Amith of Gettysburg College and botanist John Kress of the Smithsonian Institution. The project, called Advances in Linguistic, Ethnobotanical, and Botanical Sciences through Documentation of Traditional Ecological Knowledge, looks at how communities have historically interacted with their environments and how that interaction is chronicled in words and phrases. At the same time, the team also created an inventory of floristic specimens with DNA bar-coding of flora in the Sierra Nororiental de Puebla, a biodiverse region of central Mexico.4

Another example is the work of folklorist and cultural anthropologist Craig Mishler from the Alaska Native Language Center at the University of Alaska Fairbanks. Mishler is working with the Gwich’in (Dene) people of northeast Alaska and northwest Canada, specifically Crystal Frank, Kenneth Frank, Caroline Tritt-Frank, and Allan Hayton, to document Gwich’in traditional knowledge about caribou. This project shows that knowledge documentation leads to language documentation and vice versa. Information on the caribou’s anatomy yields lexical and semantic information on Gwich’in names for body parts and names for tissue, skin, and bones. Furthermore, there are cultural artifacts (tools, toys, clothing, naming practices); verbal art and oral history (stories, songs, ceremonies); and butchering and cooking procedures all associated with caribou. Knowledge documentation also requires the documenter to curate language records from different genres and of varied lengths, complexity, and register levels and to make information on social acceptability or taboo part of the metadata.

3 http://hdl.handle.net/10125/26196 and http://hdl.handle.net/10125/26184.

4 https://www.nsf.gov/awardsearch/showAward?AWD_ID=1401178.
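As one way of picturing what it means to make genre, length, register, and social acceptability part of the metadata, the following Python sketch stores those properties alongside each recording. The class name, field names, and sample values are hypothetical and chosen only for illustration; an actual project would follow the metadata schema of its archive and the wishes of the community.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RecordingMetadata:
    """Descriptive metadata for one recording; field names are illustrative."""
    recording_id: str
    genre: str                      # e.g. "story", "song", "procedure"
    duration_seconds: int           # length of the recording
    register: str                   # e.g. "everyday", "ceremonial"
    speakers: List[str] = field(default_factory=list)
    restrictions: List[str] = field(default_factory=list)  # taboo or access notes

    def is_openly_shareable(self) -> bool:
        """Treat a record as shareable only when no restriction is noted."""
        return not self.restrictions


def by_genre(records, genre):
    """Return the ids of recordings that belong to the given genre."""
    return [r.recording_id for r in records if r.genre == genre]


# Invented sample records, not drawn from any real project.
records = [
    RecordingMetadata("rec001", "procedure", 540, "everyday", ["Speaker A"]),
    RecordingMetadata("rec002", "story", 1260, "ceremonial", ["Speaker B"],
                      restrictions=["told only in winter"]),
]
```

Recording restrictions as explicit entries, rather than leaving them in field notes, means that later users of the collection can see them before anything is played or shared.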

2.3. Sociolinguistic documentation, dialectology, and language vitality To conduct sociolinguist documentation as well, the documenter must acquire pro ficiency in specific data collection techniques and methods. Sociolinguistic studies encourage the collection of natural interactions which provide information about morphosyntax and fast speech phenomena which are often uniquely attested in con versation (Mithun 2001; Chelliah 2001). Sociolinguistic considerations are also rele vant to lexicography because when putting together a dictionary one must decide on which variety to represent as the standard or most common variety. Documenting nat ural interactions along with ethnographic detail is important for revitalization activi ties as speakers want to know how to use the language in day-to-day communication. Practice in creating ethnographic records along with audio and video recording and in transcribing and representing conversations would therefore be a useful proficiency.5 Communities are often interested in carrying out language vitality studies as a means to motivate better documentation and more vigorous revitalization efforts. In fact, for Administration for Native Americans grants, community assessments are a prerequisite so such studies are carried out routinely. The documentary linguist can assist in creating surveys that document language choice, code-switching, language contact, and networks and domains of language use. The surveyed factors, along with the traditional sociolinguistic variables of age, sex, status, ethnicity, are particularly relevant to multilingual situations (see Childs, Good, and Mitchell 2014). Additional preparatory reading for language vitality surveys would be in language planning and language policy, including information on the history of the language and the political environment in which the language is spoken. 
Ultimately, each language situation will require a unique survey design, but reading about possible designs and case studies of how to work with community members to implement surveys would be useful. A documentary linguist could encourage participants to develop survey questions based on a sample survey. The community would then develop questions appropriate to their particular language situation, determine who will administer those questions, and decide what the ultimate purpose of the survey is (see Indigenous Language Institute n.d.). Often the purpose of the survey is to provide communities with a wakeup call about the fragility of the language and this in turn leads to strengthened documentation activities.

We know that social variables of age, gender, occupation and the like are relevant to determining who speaks which language and when and where different varieties are used. Also, variation and attitudes toward particular varieties can reveal much about the variety’s status, vitality, and patterns of attrition. Providing training to speakers who conduct surveys to understand these variables and attitudes will be important (Nagy 2016). Variationist sociolinguistic methods may be less readily applied in endangered-language situations in that if only a few speakers of the endangered language remain it would be impossible to gather sufficient tokens for statistically significant results (Nagy 2016). Also, available data will not be homogeneous in that some speech samples will come from semi-speakers or second-language learners and there may be a great deal of variation due to language attrition.

5 See transcription and representation guidelines for the Santa Barbara Speech Corpus, for example. Available online at: http://www.linguistics.ucsb.edu/projects/transcription/representing.

2.4. Documenting for language description

Language documentation goes hand in hand with language description, since description can tell the researcher where there are gaps in the documentation and what documentation is needed to fill those gaps. For example, a description of the sound system is required to help motivate orthographic choices. In turn, literate community members can use that knowledge to provide first-pass transcriptions and thus speed up documentation. To describe and document the sounds of a language it is necessary to find speakers who are good at working in a quasi-experimental way, e.g., repeating words out of context. To help with acoustic analysis and to collect sound files that can be used to illustrate entries in digital dictionaries, it will be necessary to make high-quality recordings. For this, the documenter will need state-of-the-art recording equipment including perhaps a portable soundproof booth. It would be useful to know how to use additional analytic tools for phonetic analysis and tools for acoustic analysis (Ladefoged 2003; Baart 2009). It would be useful to know how to construct a word list for making recordings for use in acoustic analysis, for example lists of minimal pairs for tone and vowel length studies (Chelliah and de Reuse 2011, 251–278). Connected natural speech reveals fast speech phenomena, for which one needs conversations and natural speech samples. Needless to say, transcription practice in the International Phonetic Alphabet (IPA) before embarking on data collection is a must.

There is already a substantial literature on many planning aspects of data collection for language description in the area of morphology, syntax, and semantics (Chelliah and de Reuse 2011; Chelliah 2014; Bochnak and Matthewson 2015). It is impossible to be ready for every challenging aspect of language description but some preparation before embarking on data collection will speed up analysis and description. This preparation could include reading about closely related languages and prevalent typological patterns, practice in solving and creating morphology and syntax puzzles, practice using and creating tools for data collection, and familiarity with the literature on the geography, environment, and culture of the region where the language is spoken.

This section supports the case for language documentation being carried out with partners: the community language expert and the linguist who has the expertise in language description. The more training in linguistics a language expert has, the more effective the partnership. A case in point is our work with Lamkang (lmk, Tibeto-Burman, Manipur State, India) speaker Sumshot Khular, who has an MA in Linguistics from Manipur University. She participated in the data analysis and collection at a more advanced level than a non-trained speaker because of this training. For example, to collect information on verb inflection, we created a verb paradigm template and collectively worked to fill that template via online file sharing and editing.

2.5. Documentation for language teaching

Documentary materials can be made to facilitate the creation of language teaching materials. And, in turn, language teaching materials can contribute to the documentary record. Training in language pedagogy is necessary, as the methods to teach various aspects of language are anything but obvious. For example, how does one create graded language lessons? It would also be important to ascertain which interactions the community wants to maintain or revive. Verbal art (Woodbury 2016) and music (Barwick 2012) are useful in language teaching and are often items the community wants to document and preserve. Preparation for documentation of these arts would include finding a digital repository for music and text that is easily accessible by speakers. Then some consideration must be made for dissemination: in some cases, YouTube or iTunes (Turpin and Henderson 2015) would be the appropriate means to disseminate the content. Where internet availability is unreliable, a portable device with a USB memory stick might be more appropriate.

Linguists usually need training in language-teaching pedagogy but additional training for native speakers in documentary methods and in language teaching methods has proven beneficial (Penfield 2015; Ajo et al. 2010). The Conference on the Indigenous Languages of Latin America (CILLA), University of Texas, Austin, is an inspirational example of how native speakers motivated to create pedagogical materials can also become leading documenters in their area. Under the direction of Nora England, students in Oxlajuuj Keej Maya’ Ajtz’iib’ (OKMA) conducted dialect surveys and prepared pedagogical grammars toward Mayan language standardization and description of language variation. Several students with this background then continued documentation projects as graduate students at the University of Texas at Austin (Woodbury and England 2004; Maxwell 2010).


2.6. Revitalizing and documenting

This chapter limits its focus primarily to documentation involving languages with “average vitality.” Documentation of languages with low vitality would be approached differently. In communities with revitalization programs such as Master-Apprentice programs or immersion language classrooms, one possibility would be to record all the interactions between Master and Apprentice or between the teacher and students in the immersion classroom. For the Sauk Master-Apprentice program, hundreds of hours of interaction are available but yet to be incorporated in a documentation project (Jacob Manatowa-Bailey, personal communication, 2016). Legacy materials may be important data sources, and learning from and discussing these sources can be a significant source of new language information. Given the age of speakers and their health and comfort levels, this method of extended data collection is appropriate. Elders are powerful channels for language transmission and can provide natural and useful language for day-to-day communication.

3. Partners and teams

Since many areas of expertise are needed to implement a documentation project, it is also necessary to partner with other experts to avoid intellectual isolation and insufficient technological expertise. However, academics tend to maroon themselves on research islands. This may be out of worry that they will be scooped on potentially brilliant findings. Insecurity about others noticing that their transcription is not perfect or their ability to manipulate equipment is less than adroit might be another reason. Community members might isolate themselves because of a historical distrust of outsiders, a fear of researchers taking away data, stealing the limelight, or usurping the goals and methods of a project. These are valid concerns. See Rice (2010) for discussion. But a documenter who can get off the island and use trusted networks to overcome these concerns will reap many benefits. The advantages of collaboration include varied perspectives resulting in increased confidence in findings and interpretation of data, faster analysis, more careful metadata creation and database management, and increased transparency in record-keeping so that the team functions effectively.

However, there are challenges to managing a team and a bit of forethought about project management would be useful. A project manager would need to know how to set reasonable expectations for each team member and to provide the team with the knowledge, instructions, and access to hardware and software that they need to get the task done. It would also be necessary to monitor each team member’s progress, keeping in mind project goals and skills acquisition.

With the possible exception of psycholinguists who run lab experiments, linguists traditionally have very little experience with human resource training such as recruiting workers, hiring and firing guidelines, tracking work and approving pay, encouraging positive group dynamics, making each team member responsible for overall progress, and the rights and responsibilities of data use. It may be because of this lack of training that linguists shy away from managing larger teams. However, it is increasingly the case that language documentation takes place through teams working in a joint virtual and/or physical space. This is the case with Craig Mishler’s team, who reflect that “One key to the continued success of the project has been its methodology. A great deal of our work is done online with the help of Skype and Google Drive. Although we live in different locations, we are still able to work about two hours a day in real time to edit and translate oral texts.”6 See Mishler (2014) for detail on methods of long-distance teamwork toward documentation. Funding mechanisms encourage the team model since funders provide incentives to train students in research through practical experience in documentation and related activities.

Like other research labs, a language documentation lab will have one or more lead investigators with research assistants and perhaps a post-doctoral fellow. Language experts from the community would ideally be regular contributors to the group. Each researcher might have a different task: one working on acoustic analysis using software like PRAAT, one on generating interlinear analysis of narrative data using software like FLEx, another on dissemination to the community via website and printed materials, and another working on academic publications. At the same time, the lead documenter could be involved in continued data collection and guiding transcription and translation by community and researchers.
In handling these multiple participants, two issues arise: First, without careful data management one could very quickly be faced with mountains of information with no metadata and stored with no structure so that it is hard to search. It becomes necessary to train participants on metadata and backup processes and to hold regular group meetings where each member can provide updates on progress. Second, the rights and responsibilities of each participant with respect to data dissemination and publication must be clearly stated at the outset. I created an agreement on ethics and publication for my team members which looks as follows:

• Statement on long-term goals of the project.
• Recap of individual team-member tasks.
• Agreement on the use of data: none of the data you have access to as a result of your or your teammates’ work on this project (a) can be used for commercial purposes, and (b) none of it can be published or disseminated without first vetting the submission with me. My name and the project’s reputation are attached to this data. Therefore, I reserve the right to help you improve the quality of your submission through suggested revisions and peer review.
• Agreement on publication attribution: as we get ready to publish our findings and to produce other derivatives of our work, I want to make clear this project’s policies on publication attribution. These policies are based on the Committee on Publication Ethics, which you can read here (http://publicationethics.org/resources/guidelines):
  ◦ Each member of the team who contributes to a publication will be an author of that publication. By “contributes” I mean one or more of the following: data collection; data collation; data analysis; conceptualizing and formulating the outline for a presentation that leads to the write up; engaging in the writing of the paper; and contributing tables, charts, graphs, figures, and photographs.
  ◦ The order of names for publication attribution will be as follows: (1) primary writer, (2) assistant writer, (3) data analysis partner, and (4) graphics and design assistant. Where applicable and as often as possible, community participants will be included in the list of authors with the main writers.
  ◦ Each of you should aim to be the first author on a paper between now and when our grant period ends. Please talk to me about a topic that is of interest to you and we will start working on it.

6 https://www.arcus.org/witness-the-arctic/2014/3/article/22797.

When working with a team, ethical issues related to language documentation should be discussed and protocols for data sharing, use, and archiving with respect to the project and the community should be reviewed.7

7 See Thieberger and Musgrave (2007) for discussion.

4. Training

There are many conferences and training institutes one can attend to find expert collaborators and to create an informal advisory board. In the United States, there is the American Indian Language Development Institute (AILDI) and the Institute for Collaborative Language Research (CoLang). Training institutes and conferences are an ideal place to meet others just starting out or those in the middle of a documentation project or with existing experience in language documentation. Sharing one’s hopes and aspirations for a project with these documenters will yield useful advice and contacts. It would be helpful to keep a journal of this practical advice and names of possible advisors. The types of questions that could be asked are:

• Which academics have been involved with linguistic work on related languages or varieties and could provide guidance on points of interest or challenge in documentation?
• Which academics might have students willing to partner with a community and which community might be looking for academics to help with documentation projects?
• What materials exist on the language to be documented at major repositories such as the US National Anthropological Archives?
• What materials exist in private holding and community repositories and what level of access is there for those materials?
• What has been published on the language in academic journals or in popular and scholarly books, and what is present in language archives?
• How do you identify challenges and solve specific language problems with respect to language analysis software? For example, how does one create a dictionary for an overwhelmingly prefixing language?
• What archiving resources are available at the local, national, and international repositories?
• How do I navigate Institutional Review Board clearance as an independent scholar?
• How do I find models of successful partnerships between communities and research institutions?
• How long does it take to complete aspects of a project and what are realistic goals given time constraints?
• How do I best sequence activities for different documentation goals?

When no training institute is accessible or affordable, then one could pursue collaboration with a local university or museum to request funds to create a training module or apply for funding to hold workshops for specific documentation needs. There is more often goodwill and willingness to lend a helping hand toward language documentation than not, especially when requested contributions toward that goal are practical and not financial. Institutions that are not interested in supporting language documentation may need reminding that with the right publicity, an institution supporting language documentation efforts could win public approbation for what is clearly a humanitarian effort.

5. Securing funding

A language documentation project can be undertaken without funding. An inspiring example of such a project involves the efforts of an elder from the Lamkang community of Manipur, India, Mr. Beshot Khullar. Mr. Khullar was determined to create a lasting record of the songs, stories, and proverbs of his people and toward this end began writing them down, and even though there is no standardized spelling system of Lamkang, he published anthologies of limited edition copies of this material (Khullar 2006, 2013). He also recorded songs and stories on analog tapes that he later had transferred to DVD and USB flash drives which he distributed to the community with the hopes of keeping his language and traditions alive. He has done this with his own funding, not because of deep pockets but because of deep conviction. And there are other examples in the same community of self-funded documentation efforts. Reverend Daniel Tholung created videos and photographs of traditional dances and stories. To record traditional narratives from elders, encourage writing, and to get younger speakers involved in language documentation, he instituted at his church a competition which provided a prize for the best written documentation by a young congregant of a narrative from an elder.

External funding, however, makes long-term documentation efforts possible. Funding from a private institution or government organization is often linked to infrastructure such as a university or think tank, and such infrastructure can support activities from the recording stage to analysis, archiving, and dissemination. All aspects of language documentation are time consuming and it would require time off from a regular job to give it the effort it needs to be done correctly. Also, the process of writing a proposal will in itself help with the planning of the project. A successfully funded researcher or community member becomes part of a cohort of funded documentary linguists and that membership provides further networking possibilities.

Writing a proposal forces the documenter to make decisions on practical issues such as the appropriate hardware and software to use for documentation, and in fact the funder might even have guidelines for this. It also motivates the documenter to be mindful about how long it takes to complete tasks. Keeping the reviewer in mind when deciding on workflow will help keep the schedule of activities focused and encourage selection of appropriate team members with the required expertise. For instance, is there someone on the team who can assist with language analysis, someone on the team for the technical aspects of data gathering and then of database management and data archiving? Urgent documentation projects may not get funding because the projects require key personnel who have not been mentioned in the proposal.
To create a reasonable budget, the aspiring documenter should practice writing out a budget with experienced investigators who have received funding. Funding requests should be tied to justifiable expenses (e.g., per diem based on stated allowable amounts). Funders may be open to discussing the budget ahead of the submission deadline. If not, students and faculty at academic institutions should seek assistance in putting together a budget and this assistance could come from the research office, senior faculty, and fellow students or colleagues, but preferably it will come from all three groups. For documenters who do not live near the community whose language is to be documented, funding will be necessary for travel, lodging, food, special clothing, funds to compensate speakers for their time and efforts, and expenses associated with archiving.

There is a substantial amount of information available on how to apply for funding (Penfield and Zepeda 2008; Bowern 2008, 2011; Chelliah and de Reuse 2011; Sapien and Nash 2015). It does bear emphasizing that each funder of language documentation has a unique mission so it is necessary to identify the documentation activity that supports that mission. Proposals are frequently declined because they are not appropriate for the funding agency to which they have been submitted. In the United States, the Documenting Endangered Languages Program (DEL) at the National Science Foundation (NSF) funds language description and documentation; infrastructure creation (e.g., annotated corpora, archives, databases); computational methods (e.g., domain adaptation, statistical machine learning); typological, historical, or theoretical linguistics; and training and infrastructure creation at US Tribal Colleges and Universities. While language revitalization can be part of broader impacts of an NSF project, NSF does not fund language revitalization projects per se. There is no point, therefore, in submitting a proposal to the NSF to develop teaching materials from existing collections or to train teachers or create language teaching modules for classroom use. NSF does not fund these activities. However, other US federal agencies, such as the Administration for Native Americans (ANA), are interested in supporting language revitalization projects related to kindergarten to twelfth grade (K–12) classroom instruction. Similarly, the US Institute of Museum and Library Services might fund a proposal to create a digital language archive, but NSF and ANA would not fund repository creation as a sole proposed activity.

There will be some small percentage of data that a speech community may want to record, annotate, and digitally preserve but not make publicly available. However, the expectation of most funding agencies is that a majority of output of a funded project should be available to the speaker community and the scientific community through a publicly accessible digital archive. For some funding agencies such as the US National Endowment for the Humanities, dissemination to the public via traveling displays, web site dissemination, radio plays, and the like are also acceptable. If the results of previous funding are found to be inaccessible by a funding agency, that is a warning sign that more funding by this agency may also result in inaccessible and perhaps nonexistent data.

6. Managing funds

Another aspect of implementing a documentation project that is not discussed much is the allocation, tracking, and reporting on the use of funds. Clear and consistent bookkeeping is key. Institutions upgrade and change their software regularly and the lag time between expenditures and updating of the financial systems can be long. Therefore, an institution’s financial tracking system cannot substitute for records kept by the researcher.

7. Hardware and software

Central to the success of a documentation project is the efficient use of digital recording equipment. Each digital audio and video recorder on the market is slightly different and so in preparation for recording the documenter could read the manual or call a support line to review equipment settings. A short session training community members on the use of equipment is of limited use; time should be allocated for review and practice for all team members doing recording, including community members. Common errors include low gain settings, lack of power to the microphone, weak batteries in the recorders, poor sound quality of video recordings and no external recorder, less than optimal placement of a microphone or video recorder, and lack of cue and context for the video recording such as contextualizing opening shots. Preparing an equipment and procedure checklist for recording is useful.

8. Workflow

The well-managed sequencing of research and practical tasks will speed up data analysis and dissemination. This is true for a documenter working on her own or with a team of documenters with assigned tasks all working toward the same goals.

8.1. Research tasks and workflow

For data gathering and analysis, the following workflow is typical: recording language interactions or performances > collecting metadata > transcribing > translating > annotating > archiving > disseminating. In implementation, each of these steps is ongoing and linked to the other. For instance, as the annotated corpus grows, new questions for grammatical analysis arise and old ones are answered. Gaps in morphological analysis may make it necessary for the researcher to create specific elicitation schedules to record information to help with those problematic spots. Thus, even though the bulk of recording, translating, and transcribing might occur in the early part of the project, new recordings with transcriptions and translations will be needed throughout the life of the project. Discussion of research methodology and sequencing of these steps in documentation can be found in Chelliah (2001, 2016), Gippert, Himmelmann, and Mosel (2006), and Chelliah and de Reuse (2011), among others.
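A team that wants to see at a glance which recordings still await which step could keep a small status table. The following is only a minimal sketch in Python, assuming the stage names listed in this section; the class name, function name, and recording labels are hypothetical and do not come from any documentation tool. Stages are stored as a set rather than as a single position in a sequence, because the steps are ongoing and need not be completed in strict order.

```python
from dataclasses import dataclass, field

# Stage labels mirror the typical workflow described in this section.
# The identifiers themselves are illustrative assumptions.
STAGES = [
    "recorded",
    "metadata",
    "transcribed",
    "translated",
    "annotated",
    "archived",
    "disseminated",
]


@dataclass
class Recording:
    """One recorded interaction or performance and the stages completed for it."""
    name: str
    done: set = field(default_factory=set)

    def mark(self, stage: str) -> None:
        """Record that a stage has been completed for this recording."""
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        self.done.add(stage)

    def pending(self) -> list:
        """Return the unfinished stages in workflow order."""
        return [s for s in STAGES if s not in self.done]


def next_tasks(recordings) -> dict:
    """Map each recording name to its earliest unfinished stage (None if complete)."""
    result = {}
    for rec in recordings:
        remaining = rec.pending()
        result[rec.name] = remaining[0] if remaining else None
    return result


if __name__ == "__main__":
    story = Recording("story_01")  # hypothetical recording label
    story.mark("recorded")
    story.mark("transcribed")      # transcription began before metadata was entered
    print(next_tasks([story]))
```

A table like this can be reviewed at regular group meetings so that gaps, such as a transcribed recording that still lacks metadata, are noticed early.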

8.2. Data management and workflow

A time-consuming but necessary effort in language documentation is data management. At the time of writing, we, that is social scientists and especially linguists, have woefully little training in this area. However, language documenters, by dint of the enormous amounts of data each project produces, are at the forefront of developing methods on how to catalog, store, and prepare digital language data for easy access and searchability. Language archives are now providing more guidance on required or preferable formats for project deposits. One example is the guide to depositors at the Endangered Languages Archive (ELAR), the digital language archive funded by the Arcadia Fund and hosted on the servers of the School of Oriental and African Studies (SOAS), University of London. In order to pursue its mission to preserve and disseminate endangered-language data, ELAR provides depositing procedures specifying what can be deposited; what format the deposit should take; and how the files should be named.

A useful step in planning for the documentation project would be to use the recommended structure of the deposit from the archive to guide naming conventions and file structure. This will allow the documenter to get materials to the archive in a timely fashion. Thought of in this way, daily, weekly, monthly, and yearly data management tasks take on a creative aspect because each task (e.g., creating backup files or file naming) has a clear goal, that of creating a digital “collection” for public, academic, or speaker-community consumption.

Each day will have to include time budgeted for data backup. It has been stated often and it is worth stating again that multiple digital copies of all materials should be stored in multiple secure locations. Typically, the digital recordings from a day’s recording sessions are copied from a memory card (less stable) to a hard drive (more stable) and the files renamed (see the guidelines to file naming on the Pacific and Regional Archive for Digital Sources in Endangered Cultures (PARADISEC) website: http://www.paradisec.org.au/). Then the contents of the hard drive would be uploaded to a secure server such as a university’s research server. This would be a weekly task. An example of a monthly activity is scanning and cataloging field notes and any printed material shared by the community.

Ultimately, all language data and discussion of that data need to be collected in a metadata database which will help the researcher keep track of the data and also facilitate transfer to an archive for long-term preservation and access.
One example of software that can be used for this purpose is SayMore from the Summer Institute of Linguistics, which allows the documenter to link together all the files that are related to a particular “session” (i.e., a recording of a narrative or scan of published materials).8 Then all files derived from that source such as commentary or annotation will be associated within the same session folder, making it possible for the documenter to quickly find related files. SayMore also has pre-assigned categories for editorial, analytic, descriptive, and administrative metadata, and even though these will not always satisfy the needs of every documentation project, they are a good starting point.

Getting accurate metadata is more challenging when the data is coming into the project from a variety of sources. When several researchers and community members are working on a project, the speed and volume of data collection can easily add an hour of housekeeping work per day. For example, on the Lamkang project, speaker-linguist Sumshot Khular created time-aligned translations in SayMore. These were then imported into FLEx and annotated by team members. Further informal “after hours” discussions between team members and Sumshot Khular were recorded in text files or notebooks which were scanned to add to the language record. The relationship between these “derivative” files was made clear through uniform file naming conventions.

8 http://saymore.palaso.org/.
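The daily routine described in this section (copy from the memory card, rename uniformly, keep derivative files together by session) can be partly automated. The following is a minimal sketch in Python using only the standard library. The naming pattern `<session>_<role>.<ext>`, the `manifest.json` file, and the function names are assumptions made for illustration; they are not the conventions of ELAR, PARADISEC, or SayMore, and a real project should follow the guidelines of its chosen archive.

```python
import hashlib
import json
import shutil
from pathlib import Path


def sha256(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def ingest(source: Path, archive_root: Path, session_id: str, role: str) -> Path:
    """Copy one file into its session folder under a uniform name and log it.

    The naming pattern and manifest layout are hypothetical.
    """
    session_dir = archive_root / session_id
    session_dir.mkdir(parents=True, exist_ok=True)

    target = session_dir / f"{session_id}_{role}{source.suffix.lower()}"
    if target.exists():
        raise FileExistsError(f"{target} already exists; refusing to overwrite")

    shutil.copy2(source, target)

    # Verify the copy before trusting it as a backup.
    checksum = sha256(source)
    if sha256(target) != checksum:
        target.unlink()
        raise IOError(f"checksum mismatch while copying {source}")

    # Keep a per-session manifest so derivative files stay linked to their source.
    manifest_path = session_dir / "manifest.json"
    if manifest_path.exists():
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    else:
        manifest = {"session": session_id, "files": []}
    manifest["files"].append(
        {
            "name": target.name,
            "role": role,
            "original_name": source.name,
            "sha256": checksum,
        }
    )
    manifest_path.write_text(
        json.dumps(manifest, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return target
```

A call such as `ingest(Path("card/REC0001.WAV"), Path("project"), "2016-03-01_narrative01", "audio")` (all names hypothetical) would place the copy in the session folder and add an entry to that session's manifest. Later transcriptions or scanned notes for the same session can be ingested with a different `role`, so that related files sit side by side. The stored checksums also make it possible to confirm that weekly server uploads match the local copies.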

Language documentation is a long-term effort so that over the lifetime of a project, the language documenter engages with many people in varied situations. It can be difficult to keep track of the faces, places, and related outcomes of each meeting over many years especially if the participants are both in a traditional community setting and diaspora. A simple journal entry with a date, names, contact information, situation of meeting the participant, and what was discussed will be useful in telling the story of the documentation and in filling in gaps in the metadata. A journal recording this information about people, places, and faces will be useful and will become part of the documentation itself. As soon as one is able, information about participants should make it into data management software so that the contributions of these participants can be systematically noted. See Conathan (2011) for a checklist of information on participants.

As is common in many other disciplines, journaling can also be an effective tool for introspection. Writing experiences down can help organize thoughts, reflect on successes, and evaluate challenges. Teacher-training programs, for example, encourage students to record day-to-day experiences for lesson planning and execution (e.g., Farrell 2007). Anthropologists and field linguists typically keep journals to record and reflect on experiences (Vivanco 2017). In language documentation, journaling can similarly be useful for recordkeeping, planning, and reflection. Journaling can be therapeutic as well, especially if a researcher has no one to talk to but “Dear Diary.” Consider Macaulay’s (2004) very personal account of her first trip to Chalcatongo, Mexico, to build on ongoing research on Mixtec where she records feelings of culture shock, boredom, and alienation. Macaulay’s journal also includes reflection and planning for next steps in her linguistic investigation.
Other common reflections are on eureka moments at certain discoveries and methods of elicitation that work well or don’t work at all. The modern-day journal, the blog, can be used in similar ways and is especially useful if documentation is a coordinated effort between several researchers and the commu nity. Blogs are being used in field methods classes for students to update their classmates on discoveries and procedures used in one-on-one sessions with consultants (Jessica Coon, personal communication, 2016) and one could similarly use blogging in a team documentation project. The fieldworker might also record the documentation progresses itself since the results of the documentation project will be better understood if they are part of a larger story. Video documentation and interviews of how the project came about, who the major contributors are, their motivations for contributing, and who is using the docu mentation could all be captured in high-quality video and audio for later packaging into an explanation of the project. Twenty years ago, fieldworkers only shared a portion of the language data they col lected and this was usually in the form of grammars, dictionaries, and text collections. Today, a documentary corpus is typically archived with most materials being publicly accessible. The documenter can keep this in mind when planning what is collected, how the data is managed, and how it is finally packaged for public viewing. Woodbury (2014,

Documentation Projects for Spoken Languages 163 33) suggests a guide to the corpus which would include both the intellectual and per sonal story of its shaping and about its intended use and audience.
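The journal entry recommended earlier in this section (a date, names, contact information, the situation of the meeting, and what was discussed) could be kept as a simple structured record so that it can later be moved into data management software. The sketch below is one possible layout in Python; the class name, the field names, and the helper function are assumptions made for illustration and do not correspond to any particular tool or to the checklist in Conathan (2011).

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List


@dataclass
class JournalEntry:
    """One fieldwork journal entry; all field names are hypothetical."""
    entry_date: date
    participants: List[str]
    situation: str                      # where and how the meeting took place
    discussed: str                      # what was talked about or recorded
    contact_info: Dict[str, str] = field(default_factory=dict)


def entries_for(journal: List[JournalEntry], name: str) -> List[JournalEntry]:
    """Return every entry that mentions a participant, oldest first."""
    matching = [entry for entry in journal if name in entry.participants]
    return sorted(matching, key=lambda entry: entry.entry_date)
```

Sorting by date makes it easy to reconstruct the history of contact with one participant when filling gaps in the metadata years later.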

9. Dissemination

Another important consideration is when and how to provide copies of recordings to the community. Such dissemination may well start informally. In many cases, copies of the recording are given right away to the person providing what is recorded, along with a copy for the community. Data collected at almost every stage can be connected to revitalization efforts, so sharing data with the community along with suggested uses, even at the early stages of analysis, can be useful. Close communication and regular visits, even if only for a few days, old-fashioned note cards and letters with pictures, and language-related gifts will help keep the community apprised of what is going on with the project. If data is shared years after collection, it may be too late to use that data for language revitalization; timely dissemination is critical for this purpose.

More formal types of dissemination include papers in academic journals about language documentation methodology, language documentation software, pedagogy, grammatical description, or contributions to linguistic theory and typology. Most linguists will produce many of these, so having a general idea of the sequence in which to pursue them will be helpful.

10. Evaluation

I gave a talk on linguistic fieldwork at a Berkeley Linguistics Society meeting in 2015. During the question period, a student commented that students conducting fieldwork on endangered languages for their PhD degrees did not have sufficient time to create a corpus of naturalistic data for the language they were studying, because they had to focus their data collection and analysis on a particular aspect of grammar. According to evaluation criteria, would they have succeeded or failed in documenting the language by the end of their projects? Then, at the Fourth International Conference for Language Documentation and Conservation, another PhD student shared concerns about how much was required of the language documenter at the PhD stage, since he/she is expected to create language teaching materials in addition to focused data analysis and a corpus with interlinear glossing.

These interactions highlight the fact that there are no standards for post-project evaluation of a language documentation project. As stated at the outset of this discussion, there are many goals and many profiles for documenters, so evaluating whether or not a project has met its goals will need to be done on a case-by-case basis. However, it should be possible to state when a documentation project has not met the goals of the language documentation effort writ large. If a PhD dissertation has enabled a student to record and accurately transcribe word lists, then this is a contribution, however small, to the overall documentation of the language. If the research results in a collection of clauses translated from a contact language but devoid of cultural relevance, then the contribution to the effort is meager. That data contributes to a different effort, that of understanding the principles behind language structure, but it does not contribute to creating a lasting record of the language in question.

For PhD students working with a time constraint and working towards jobs as academic linguists, a feasible goal would be the collection of a small corpus of natural texts that native speakers could use as a model for further text collection. Language produced without laboratory-type stimuli can provide unique insights into grammar (Mithun 2001; Chelliah 2001, 2016). So, for the PhD student asking about corpus creation, the answer is that if no connected text has been collected yet, then, yes, it is the responsibility of the student to document at least some examples of connected text. This student’s engagement with the community may be the last opportunity to accomplish this task.

It is safe to say that a documentary project can never be thought of as completed. There are, however, some common goals. Whether or not these goals have been reached can be used as criteria to evaluate the success of a project:

• Has the project resulted in a collection of transcribed and translated texts?
• Is there interlinear glossing for these texts, and is the annotation consistent and easily understood?
• Is there a sketch grammar to aid in understanding the interlinear glossing?
• Is there a word list, vocabulary, or dictionary, and are these linked to multimedia resources to facilitate revitalization and analysis?
• Is there appropriate and adequate audio and video documentation?
• Are these materials archived, and is the archive usable by both academics and community members? Are there deposits in more than one repository?
• Is anyone in the community interested in using the materials, and have they been provided with a guide on how to do so?
• Have results been disseminated in widely available publications, via a dedicated website, and through social media?
• Have the concerns of the community been allayed and community needs been met?

Success would be measured against whether or how well the project met its own goals. Therefore, when planning a documentation project it should be standard practice to set goals and to think about how and when one could evaluate whether or not those goals have been met.

Linguists sometimes reflect that they unwittingly spend too much time on one activity while unintentionally ignoring another. Preparing items for archiving is one activity that gets ignored; another is providing the community with regular updates on progress. Because recording, transcribing, and translating running text is often excluded from linguistic field methods courses, students don’t feel comfortable with text collection and analysis and leave this step for last. It is important to make advances on each of the goals at a steady pace rather than waiting to finish one to get to the next. Experience teaches us that this is more effective: there will be less catch-up housekeeping at a later stage and more community involvement and benefits throughout the process.

11. Conclusion

Training, networking, and planning are all needed to move a documentation project from concept to reality. Underlying all of this is the shared conviction of documenters and team members that language documentation products will generate positive outcomes for preserving and providing access to language information necessary for ongoing and future language revitalization. The significance of the outcomes of each documentation project is a strong motivator in supporting documenters as they seek training, support networks of similarly motivated documenters, learn how to sequence activities so that goals are met, and disseminate documentation while keeping access and privacy in mind.

References

Ajo, Frances, Valérie Guérin, Ryoko Hattori, and Laura C. Robinson. 2010. “Native Speakers as Documenters. A Student Initiative at the University of Hawai‘i at Mānoa.” In Language Documentation. Practice and Values, edited by Lenore Grenoble and Louanna Furbee, 275–285. Amsterdam and Philadelphia: John Benjamins.
Baart, Joan. 2009. A Field Manual of Acoustic Phonetics. Dallas, TX: SIL International.
Barwick, Linda. 2012. “Including Music and the Temporal Arts in Language Documentation.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 166–182. Oxford and New York: Oxford University Press.
Bird, Stephen, Florian R. Hanke, Oliver Adams, and Haejoong Lee. 2014. “Aikuma: A Mobile App for Collaborative Language Documentation.” Proceedings of the 2014 Workshop on the Use of Computational Methods in the Study of Endangered Languages, 1–5, Baltimore, MD, Association for Computational Linguistics.
Bochnak, M. Ryan and Lisa Matthewson, eds. 2015. Methodologies in Semantic Fieldwork. Oxford: Oxford University Press.
Bowern, Claire. 2008. Linguistic Fieldwork: A Practical Guide. London: Palgrave Macmillan.
Bowern, Claire. 2011. “Planning a Language Documentation Project.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 459–481. Cambridge: Cambridge University Press.
Brandt, Elizabeth, Bonnie Lavender-Lewis and Philip Greenfeld. 1994. “Foreword.” In Myths and Tales of the White Mountain Apache, edited by Grenville Goodwin, ix–xvi. Tucson: University of Arizona Press.
Chelliah, Shobhana. 2001. “The Role of Text Collection and Elicitation in Linguistic Fieldwork.” In Linguistic Fieldwork, edited by Paul Newman and Martha Ratliff, 152–166. Cambridge: Cambridge University Press.

Chelliah, Shobhana. 2014. “Fieldwork for Language Description.” In Research Methods in Linguistics, edited by Robert J. Podesva and Devyani Sharma, 51–73. Cambridge: Cambridge University Press.
Chelliah, Shobhana. 2016. “Responsive Methodology: Perspectives on Data Gathering and Language Documentation in India.” Journal of South Asian Languages 3: 175–195.
Chelliah, Shobhana and Willem J. de Reuse. 2011. Handbook of Descriptive Linguistic Fieldwork. Dordrecht, Holland: Springer.
Childs, Tucker, Jeffrey Good, and A. Mitchell. 2014. “Beyond the Ancestral Code: Towards a Model for Sociolinguistic Language Documentation.” Language Documentation & Conservation 8: 168–191.
Conathan, Lisa. 2011. “Archiving and Language Documentation.” In The Cambridge Handbook of Endangered Languages, edited by Peter Austin and Julia Sallabank, 235–254. Cambridge: Cambridge University Press.
Farrell, Thomas S. C. 2007. Reflective Language Teaching from Research to Practice. London and New York: Continuum Press.
Gippert, Jost, Nikolaus P. Himmelmann, and Ulrike Mosel. 2006. Essentials of Language Documentation. Berlin: Mouton de Gruyter.
Indigenous Language Institute. n.d. Handbook 3: Conducting a Language Survey (Awakening Our Languages ILI Handbook Series). Santa Fe, NM: The Indigenous Language Institute.
Khullar, Beshot. 2006. Lamkaang Pauril Ungta Pau Kurlou [The 1st Lamkaang Book on Old Lamkang Proverbs, Sayings and Riddles]. Kakching, Manipur: D. Edenrei.
Khullar, Beshot. 2013. Nee Laa [Folk Songs of Lamkaang]. Kakching, Manipur: Self-published.
Krauss, Michael E. 2006. “A History of Eyak Language Documentation and Study: Fredericæ de Laguna in Memoriam.” Arctic Anthropology 43: 172–217.
Ladefoged, Peter. 2003. Phonetic Data Analysis. Oxford: Blackwell.
Macaulay, Monica. 2004. “Training Linguists for the Realities of Fieldwork.” Anthropological Linguistics 46: 184–209.
Maxwell, Judith. 2010. “Training Graduate Students and Community Members for Language Documentation.” In Language Documentation. Practice and Values, edited by Lenore Grenoble and Louanna Furbee, 255–274. Amsterdam and Philadelphia: John Benjamins.
Mishler, Craig. 2014. “Linguistic Team Studies Caribou Anatomy.” Arctic Research Consortium of the United States (ARCUS). https://www.arcus.org/witness-the-arctic/2014/3/article/22797. Accessed August 15, 2016.
Mithun, Marianne. 2001. “Who Shapes the Record: The Speaker and the Linguist.” In Linguistic Fieldwork, edited by Paul Newman and Martha Ratliff, 34–54. Cambridge: Cambridge University Press.
Nagy, Naomi. 2016. “Studying More and Less Endangered Heritage Varieties.” Paper presented at the LSA/CELP Symposium: Documenting Variation in Endangered Languages. Washington, DC, January 7.
Nash, Carlos, and Raquel Sapien, eds. 2015. Documenting Endangered Languages Videos. “DEL Outreach Video Series.” Online video clips. YouTube, August 31. https://www.youtube.com/playlist?list=PLx12labZqbzGbA0rQU0xg5cMzz9rp_dqY.
Penfield, Susan. 2015. “Innovative Training Opportunities: The NSF/AILDI Collaboration for Indigenous Language Documentation.” In 30 Year Tradition of Speaking from Our Heart, edited by Teresa L. McCarty, Lucille J. Watahomigie, Akira Y. Yamamoto, and Ofelia Zepeda, 15–19. Tucson: University of Arizona Press.

Penfield, Susan and Ofelia Zepeda. 2008. “Grant Writing for Indigenous Languages.” http://aildi.arizona.edu/sites/default/files/grantwriting_manual.pdf.
Reiman, Will. 2009. “Basic Oral Language Documentation.” http://hdl.handle.net/10125/5071.
Rice, Keren. 2010. “Must There Be Two Solitudes? Language Activists and Linguists Working Together.” In Indigenous Language Revitalization: Encouragement, Guidance & Lessons Learned, edited by Jon Reyhner and Louise Lockard, 37–59. Lincoln and London: University of Nebraska Press.
Thieberger, Nicholas, ed. 2012. The Oxford Handbook of Linguistic Fieldwork. Oxford and New York: Oxford University Press.
Thieberger, Nicholas and Simon Musgrave. 2007. “Documentary Linguistics and Ethical Issues.” In Language Documentation and Description, vol. 4, edited by Peter K. Austin, 26–37. London: Hans Rausing Endangered Languages Project, School of Oriental and African Studies.
Turpin, Myfany and Lana Henderson. 2015. “Tools for Analyzing Verbal Art in the Field.” Language Documentation & Conservation 9: 89–109. http://nflrc.hawaii.edu/ldc. http://hdl.handle.net/10125/24632.
Vivanco, Luis A. 2017. Field Notes: A Guided Journal for Doing Anthropology. Oxford and New York: Oxford University Press.
Woodbury, Anthony C. 2011. “Language Documentation.” In The Cambridge Handbook of Endangered Languages, edited by Peter Austin and Julia Sallabank, 159–186. Cambridge: Cambridge University Press.
Woodbury, Anthony C. 2014. “Archives and Audiences: Toward Making Endangered Language Documentations People Can Read, Use, Understand, and Admire.” Language Documentation and Description, vol. 12, edited by David Nathan and Peter K. Austin, 19–36. London: School of Oriental and African Studies.
Woodbury, Anthony C. 2016. “Verbal Artistry: The Missing Link Among Language Documentation, Grammatical Theory, and Linguistic Pedagogy.” Plenary address at the 4th International Conference of Language Documentation and Conservation. http://hdl.handle.net/10125/25388.
Woodbury, Anthony C. and Nora C. England. 2004. “Training Speakers of Indigenous Languages of Latin America at a US University.” In Language Documentation and Description, vol. 2, edited by Peter K. Austin, 122–139. London: School of Oriental and African Studies.

Chapter 8

Endangered Sign Languages: An Introduction

James Woodward

1. Introduction

Sign Languages (SLs) can develop wherever there is a deaf person or deaf people.* Rennell Island SL (Kuschel 1974) developed through the interaction of one deaf individual with many of the roughly 1,000 inhabitants of Rennell Island. Providence Island SL (Woodward 1987) developed out of the interaction of fewer than 25 deaf people with each other and with many of the estimated 3,000 hearing people living on Providence Island. Original Chiang Mai SL (Woodward and Wongchai 2015) developed through the interaction of fewer than 100 deaf people with each other in Chiang Mai, Thailand, long before the establishment of schools for deaf people in Thailand.

Sign languages can exist without the formation of a Deaf Community or a Deaf Culture, as demonstrated in the paragraph above. Because of generally positive attitudes among hearing people in places like Rennell Island and Providence Island, deaf people are fairly well integrated into the majority society, and most of the time deaf people have no desire or intention to form their own community. However, within many societies where deaf people are not well integrated, it is common to find the formation of a Deaf Community, with its own social institutions, belief system, and values. Culturally Deaf people in Deaf communities generally think of themselves as part of a linguistic minority that communicates differently from people in the majority society. While many hearing people think a deaf person is a person who can’t hear, many Culturally Deaf people think of themselves primarily as users of a sign language (and as members of a linguistic minority). Linguists who want to work with sign languages would do well to adopt a Culturally Deaf perspective on deafness.

Culturally Deaf people have had to face the same linguistic and cultural pressures that users of other linguistic minority languages have: being viewed as inferior by members of the majority culture, schooling (where available¹) in an institutional setting, almost exclusive instruction under majority group teachers, forced instruction in the majority language, prohibition of the minority language in the school system, etc. However, Culturally Deaf people have had three additional pressures that members of other linguistic minorities have not had.

(1) Culturally Deaf people have a more difficult time overcoming inferiority stereotyping by the majority culture than members of other linguistic minorities, since deaf people are often viewed as having a medical pathology. There is an overwhelming tendency in many nations to “diagnose” deafness rather than to identify deaf people, to “prescribe treatment” based on the “diagnosis” rather than encouraging or sometimes even allowing Culturally Deaf adults to interact with deaf children, to advocate “early intervention” rather than early education, etc. It should be noted that the World Health Organization classifies deafness as a non-communicable disease, certainly something that the World Federation of the Deaf is not particularly happy with.

(2) Very small percentages of deaf children have a deaf parent or deaf parents. Thus the great majority of Culturally Deaf people belong to a different cultural/linguistic group from their parents and must be enculturated into the minority group through means other than their parents. It is often estimated that around 10% of deaf people in the United States have deaf parents, yet this is one of the highest percentages of deaf children with deaf parents in any nation. In Cambodia, Thailand, and Viet Nam the percentage of deaf children with deaf parents is less than 2% (Woodward 2012).

(3) The primary language of Culturally Deaf people differs not only in code structure but also in channel structure from the majority language. This has often resulted in the promotion of speech over signing. When signing is allowed in educational systems, there have often been attempts to change the grammatical order of the natural sign language into the grammatical order of the majority spoken language. Because of this, language oppression has often been considerably more severe for Culturally Deaf people than for hearing people. Sometimes there is a conscious effort to prohibit signing and even to punish deaf people for it, and sometimes the oppression is an unrecognized, even unconscious, habit that can reinforce stereotypes about sign language, such as when people ask about the number of “speakers” of a given sign language instead of the number of users.

* Research on which this chapter was based was supported in part by the following grants from the Nippon Foundation, Tokyo, Japan: The Thai World Deaf Leadership Project; Opening (High School and) University Education to Deaf People in Viet Nam Through Sign Language Analysis, Teaching, and Interpretation; Asia-Pacific Sign Language Research and Training Program. The author would like to thank the following Culturally Deaf signers who worked directly with the author on sign language research and sign language documentation in the above-named projects that are reported on in this chapter: Adhi Kusumo Bharoto, Chea Sokchea, Heang Samath, Ho Thu Van, Peoungpaka Janyawong, Le Thi Thu Huong, Long Lodine, Luu Ngoc Tu, Mat Seila, Nguyen Dinh Mong Giang, Nguyen Hoang Lam, Nguyen Tuan Linh, Nguyen Minh Nhut, Nguyen Tran Thuy Tien, Pham Van Hai, Iwan Satryawan, Kampol Suwanarat, Laura Lesmana Wijaya, and the late Thanu Wongchai.

1 The World Federation of the Deaf estimates that 80% of deaf children around the world have no access to formal education.

2. Intended audience

This chapter, of necessity, is rather limited in length, and different audiences require different information. Linguists who know little about sign languages but who may have an interest in the documentation of endangered sign languages are the intended audience for this chapter. Thus, this chapter provides a great deal of background information on the structure of some of the endangered sign languages I have worked on since 2003 as Regional Manager of the Asia-Pacific Sign Linguistics (APSL) Research and Training Program at The Centre for Sign Linguistics and Deaf Studies (CSLDS) at The Chinese University of Hong Kong (CUHK).

The APSL program, funded by The Nippon Foundation in Tokyo, trains² Culturally Deaf people from Asia and the Pacific in Sign Linguistics and provides them with continued support to document their own sign languages³ by producing dictionaries, grammatical descriptions, and teaching materials. Usually the Culturally Deaf people chosen for such training are users of endangered sign languages and come from countries where it is rare to find Culturally Deaf sign language users who have graduated from high school.⁴

2 Training for Thailand was done through the Thai Deaf Leadership Program before APSL began. Training in Viet Nam involved two certificate programs: one involving 225 hours of instruction in Sign Linguistics and the other 225 hours in Sign Language Teaching. Training in Cambodia involved 225 hours of instruction in Sign Linguistics. Training in Indonesia involved several certificate programs and a Higher Diploma (Associate of Arts) degree in Sign Linguistics and Sign Language Teaching.

3 Three handbooks totaling thirty lessons for teaching Ho Chi Minh City Sign Language and three companion dictionaries have been produced by Culturally Deaf APSL trainees, have been published, and are being used to promote the use of Ho Chi Minh City Sign Language. Culturally Deaf trainees from Ha Noi are working on similar materials. Five handbooks totaling twenty-five lessons for teaching Cambodian Sign Language and a dictionary have been produced by Culturally Deaf APSL trainees, have been published, and are being used to promote the use of Cambodian Sign Language. Two handbooks totaling twenty lessons for teaching Yogyakarta Sign Language and two companion dictionaries have been produced by Culturally Deaf APSL trainees so far, have been published, and are being used to promote the use of Yogyakarta Sign Language. One handbook of ten lessons for teaching Jakarta Sign Language and one companion dictionary of Jakarta Sign Language have been produced so far by Culturally Deaf APSL trainees, have been published, and are being used to promote the use of Jakarta Sign Language.

3. Some background information on the sign languages discussed in this chapter

Many areas around the world are rich in sign languages, but many of these sign languages are endangered. Southeast Asia is a particular case in point. Of the ten distinct sign languages in Thailand, Viet Nam, Cambodia, and Indonesia that are the primary focus of this chapter, nine are classified as endangered or dying.⁵

3.1. Thailand

Four distinct sign languages have been identified so far in Thailand: Ban Khor SL (Nonaka 2004), Original Bangkok SL (Woodward and Suwanarat 2015), Original Chiang Mai SL (Woodward and Wongchai 2015), and Modern Thai SL (Woodward et al. 2015). These four sign languages have been classified into three different language families based on lexicostatistical analysis of core basic vocabulary (Woodward 2000).
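Lexicostatistical analysis, in general terms, compares the signs that two varieties use for the same core vocabulary items and reports the share judged to be cognate. The Python sketch below shows only that arithmetic. The function name, the glosses, and the cognacy judgments are invented for illustration; they are not data from Woodward (2000), and the sketch says nothing about how cognacy itself is judged or which percentages separate languages from families.

```python
from typing import Dict


def shared_cognate_percentage(judgments: Dict[str, bool]) -> float:
    """Percentage of compared glosses whose signs were judged cognate.

    `judgments` maps each core-vocabulary gloss to True (cognate) or
    False (not cognate) for one pair of sign language varieties.
    """
    if not judgments:
        raise ValueError("no glosses were compared")
    cognate_count = sum(1 for is_cognate in judgments.values() if is_cognate)
    return 100.0 * cognate_count / len(judgments)


# Invented toy data for two hypothetical varieties.
toy_judgments = {"MOTHER": True, "WATER": True, "BLACK": False, "BIRD": True}
print(shared_cognate_percentage(toy_judgments))
```

In practice the comparison would run over a full core vocabulary list rather than four items, and each judgment would rest on a comparison of the signs' formational parameters.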

3.2. Viet Nam

Three distinct sign languages have been identified so far in Viet Nam: Hai Phong SL (Woodward 2015a), Ha Noi SL (The HNSL Production Team in preparation), and Ho Chi Minh City SL (The HCMCSL Production Team 2007). These three sign languages have been classified into the same language family based on lexicostatistical analysis of core basic vocabulary (Woodward 2000).

4 Adult Culturally Deaf trainees from Viet Nam and Cambodia had only finished fifth grade when beginning the APSL training, the highest level of training provided to deaf students at that time. Adult trainees from Indonesia had finished ninth- or tenth-grade training when they started the APSL program.

5 There are other endangered sign languages in Southeast Asia, such as Desa Kolok in Bali (Marsala 2008), that are not included in this chapter.


3.3. Cambodia

One sign language variety has been identified so far in Cambodia: Cambodian SL (The Cambodian Sign Language Production Team 2007) as it is used in Phnom Penh.

3.4. Indonesia (Java)

Two distinct sign languages have been identified so far on the island of Java in Indonesia: Yogyakarta SL and Jakarta SL. These two sign languages have been classified into the same language family based on lexicostatistical analysis of core basic vocabulary (The Jakarta Sign Language Production Team 2013; The Yogyakarta Sign Language Production Team 2013).

With this sociolinguistic background information in mind, we can look at some of the formational (phonological), morphological, syntactic, and lexical information that was important to consider in the documentation of these sign languages.

4. Sign language formational structure (sign phonology)

Sign languages have a level of sublexical structure analogous to but not dependent on the phonology of spoken languages. Sign language formational structure (sign phonology) has five parameters: (1) location, (2) handshape, (3) orientation, (4) movement, and (5) non-manual expression. Various transcription systems are used; it is beyond the scope of this chapter to deal with these systems adequately, but the chapter will show how bilingual dictionaries can be created without using a formal transcription system.

4.1. Duality of patterning

Sign languages, like spoken languages, have duality of patterning and thus can create an infinite amount of vocabulary from recurring phonological units. Minimal pairs can be found for each parameter, indicating phonological as well as phonetic structure, as shown in the following examples from Ho Chi Minh City SL.

The minimal pair in Figure 8.1 shows the distinctiveness between the locations of forehead and chin in Ho Chi Minh City SL. The minimal pair in Figure 8.2 shows the distinctiveness between the handshapes of index and index + thumb in Ho Chi Minh City SL. The minimal pair in Figure 8.3 shows the distinctiveness between palm-down, fingers-outward orientation and palm-contralateral, fingers-outward orientation in Ho Chi Minh City SL.


Figure 8.1. Example of a minimal pair for location in Ho Chi Minh City SL

Figure 8.2. Example of a minimal pair for handshape in Ho Chi Minh City SL

Figure 8.3. Example of a minimal pair for orientation in Ho Chi Minh City SL

Figure 8.4. Example of a minimal pair for movement in Ho Chi Minh City SL


Figure 8.5. Example of a minimal pair for non-manual expression in Ho Chi Minh City SL

The minimal pair in Figure 8.4 shows the distinctiveness between long outward (non-repeated) movement and short outward repeated movement in Ho Chi Minh City SL. The minimal pair in Figure 8.5 shows the distinctiveness between no non-manual expression and pursed lips in Ho Chi Minh City SL.
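The notion of a minimal pair used in Figures 8.1 to 8.5 can be stated compactly: two signs form a minimal pair when they differ in exactly one of the five parameters. The following Python sketch encodes that definition. The class and function names are assumptions, and the two example signs are placeholders whose parameter values, apart from the forehead and chin locations mentioned for Figure 8.1, are invented rather than taken from Ho Chi Minh City SL.

```python
from dataclasses import dataclass
from typing import Optional

PARAMETERS = ("location", "handshape", "orientation", "movement", "non_manual")


@dataclass(frozen=True)
class Sign:
    """A sign described by the five formational parameters."""
    gloss: str
    location: str
    handshape: str
    orientation: str
    movement: str
    non_manual: str


def contrasting_parameter(a: Sign, b: Sign) -> Optional[str]:
    """Return the single parameter that distinguishes two signs.

    Returns None when the signs differ in zero parameters or in more
    than one, i.e. when they do not form a minimal pair.
    """
    differing = [p for p in PARAMETERS if getattr(a, p) != getattr(b, p)]
    return differing[0] if len(differing) == 1 else None


# Placeholder signs: only the location values echo Figure 8.1.
sign_a = Sign("SIGN-A", "forehead", "index", "palm down, fingers out",
              "outward", "none")
sign_b = Sign("SIGN-B", "chin", "index", "palm down, fingers out",
              "outward", "none")
print(contrasting_parameter(sign_a, sign_b))
```

Running such a check over a lexical database is one way to collect candidate minimal pairs for each parameter, which a signer would then need to confirm.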

4.2. Typical distinctive units in sign phonological inventories

Sign languages vary in what is distinctive in their phonological inventories, but general observations can be made about what may be distinctive phonologically.

4.2.1. Locations that are possibly distinctive

Listed are some locations that have been found to be phonologically distinctive in a number of sign languages.

(1) neutral space in front of the body
(2) top of the head
(3) upper face
(4) whole face
(5) eye
(6) nose
(7) cheek
(8) ear
(9) mouth/lips
(10) chin
(11) under chin
(12) throat/neck
(13) trunk*
(14) upper arm
(15) elbow
(16) lower arm
(17) inside of wrist
(18) back of wrist
(19) hip
(20) leg

* There are many places on the trunk phonetically, but there are few, if any, phonological distinctions that are made in the space of the trunk.

4.2.2. Handshapes that are possibly distinctive

A number of factors are important in describing handshapes.

(1) How many fingers (not including the thumb) are extended: 0, 1, 2, 3, or 4?
(2) Are the extended fingers straight or bent?
(3) Is the thumb closed, partially extended, or fully extended?
(4) Are the fingers and thumb rounded or not rounded?
(5) Are the fingers tapered or not tapered?
(6) Does the thumb have contact with the finger(s) or not?
(7) During contact, is the thumb partially or completely inserted between two fingers?
(8) Are the extended fingers on the radial side or ulnar side of the hand?
(9) Are the extended fingers spread or not spread?
(10) Are the fingers crossed or uncrossed?

Using the above ten factors, Figure 8.6 shows a chart of distinctive handshapes in Cambodian SL.

Figure 8.6. Chart of distinctive handshapes in Cambodian SL


4.2.3. Orientations that are possibly distinctive

In order to describe orientations, it is necessary to describe the orientation of the palm and the orientation of the fingers. The palm orientation and the finger orientation should be specified as if all the fingers were extended and non-bent. There are twenty-four possible combinations of palm and finger orientations in one-handed signs, eight of which do not occur in Ho Chi Minh City SL, as illustrated in Figure 8.7.
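The count of twenty-four can be reproduced by enumeration. The Python sketch below assumes six directions (in, out, up, down, contra, ipsi) grouped into three axes, and assumes that extended fingers cannot point along the same axis that the palm faces. That geometric constraint is an interpretation offered here to account for the total; it is not stated in the chapter, although it yields exactly the palm and finger labels that appear in Figure 8.7.

```python
from itertools import product
from typing import List, Tuple

# Each direction belongs to one of three axes (an assumed grouping).
AXIS_OF = {
    "in": "depth", "out": "depth",
    "up": "vertical", "down": "vertical",
    "contra": "lateral", "ipsi": "lateral",
}


def orientation_combinations() -> List[Tuple[str, str]]:
    """List (palm, fingers) pairs whose directions lie on different axes."""
    return [
        (palm, fingers)
        for palm, fingers in product(AXIS_OF, repeat=2)
        if AXIS_OF[palm] != AXIS_OF[fingers]
    ]


combinations = orientation_combinations()
print(len(combinations))
for palm, fingers in combinations[:4]:
    print(f"palm {palm}, fingers {fingers}")
```

Each of the six palm directions pairs with the four finger directions on the other two axes, which gives the twenty-four combinations; which eight are unattested in Ho Chi Minh City SL is an empirical matter that the enumeration cannot supply.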

Figure 8.7. Examples of distinctive orientations for one-handed signs in Ho Chi Minh City SL. Orientations shown: palm in (fingers up, contra, down, ipsi); palm contra (fingers up, out, down, in); palm up (fingers in, contra, out, ipsi); palm out (fingers up, contra, down, ipsi); palm down (fingers in, contra, out, ipsi); palm ipsi (fingers in, up, out, down). Eight of the twenty-four combinations are marked “does not occur in HCMCSL.”

4.2.4. Movements that are possibly distinctive

Movement is extremely difficult to describe and transcribe since many movements can occur in an individual sign, and some movements may be sequential while others are simultaneous. However, it should be noted that there is a relatively easy way to alphabetize movements, which will be discussed in the section related to producing sign language dictionaries. Listed below are some movements that have been found to be phonologically distinctive in a number of sign languages.

(1) upward
(2) downward
(3) upward and downward
(4) inward
(5) outward
(6) inward and outward
(7) leftward
(8) rightward
(9) side to side
(10) join or link
(11) enter
(12) cross
(13) arc
(14) circle
(15) touch
(16) twist at wrist
(17) bend at wrist
(18) bend at palm knuckles
(19) bend at finger knuckles
(20) thumb rubs fingers
(21) fingers wiggle
(22) thumb rubs fingers
(23) fingers in handshape open
(24) fingers in handshape close
(25) hands alternate in movement
(26) repeated movement

4.2.5. Non-manual expressions that are possibly distinctive

Non-manual expressions are also quite difficult to describe and transcribe. However, it should be noted that there is a relatively easy way to alphabetize non-manual expressions which will be discussed in the following section. Listed below are some non-manual expressions that have been found to be phonologically distinctive in a number of sign languages.

(1) eyebrows are raised (2) eyebrows are furrowed (3) eyes are opened (4) eyes are closed (5) eyes squint (6) mouth is opened wide (7) lips are rounded (8) lips are stretched (9) cheeks are puffed (10) tongue is extended (11) head shakes (12) head nods (13) head tilts from side to side (14) head tilts forward (15) head tilts backward (16) shoulders push forward (17) shoulders backward (18) shoulders turn to one side

4.3. How to use sign phonological parameters to “alphabetize” signs in a dictionary

The following method has been taught through the Asia-Pacific Sign Linguistics Research and Training Program at The CSLDS at CUHK to Culturally Deaf adults with a minimum fifth-grade education and has been used successfully by them to create bilingual sign language dictionaries of their own sign languages.

Figure 8.8. Ho Chi Minh City Sign Language signs to be alphabetized: ship-goes-forward, V; ballpoint-pen, N; bus, N; most, AV; eraser, N; paternal-uncle, N; two-wheeled-vehicle-inches-forward, V; eat-star-apple, V; two-wheeled-vehicle-goes-forward, V; wrong, AJ; who, QW; backpack, N; smell, V; ship, N; book-bag, N; warm, AJ; white, AJ; Da-Nang, N

Figure 8.8 shows some signs from Ho Chi Minh City to be “alphabetized” using five parameters.

Step 1. List the one-handed signs.
Step 2. Use the handshape chart like the one in Figure 8.6 to organize signs. Begin with column 1, row 1 and move to column 2, row 1, etc.
Step 3. For signs with the same handshape, organize signs following the order of orientations as listed in Figure 8.7.
Step 4. For signs with the same handshape and orientation, organize by location from higher locations to lower locations.
Step 5. For signs with the same handshape, orientation, and location, organize by movement from simple to complex.
Step 6. For signs with the same handshape, orientation, location, and movement, organize by non-manual expression, from no non-manual expression to non-manual expression.
Step 7. List the two-handed signs.
Step 8. Use the handshape chart like the one in Figure 8.6 to organize two-handed signs by the dominant handshape (right hand for right-handed signers). Begin with column 1, row 1 and move to column 2, row 1, etc.
Step 9. Use the handshape chart like the one in Figure 8.6 to organize two-handed signs by the non-dominant handshape (left hand for right-handed signers). Begin with column 1, row 1 and move to column 2, row 1, etc.
Step 10. For two-handed signs with the same handshapes, organize signs following the order of orientations for the dominant handshape (right hand for right-handed signers) as listed in Figure 8.7.
Step 11. For two-handed signs with the same handshapes, organize signs following the order of orientations for the non-dominant handshape (left hand for right-handed signers) as listed in Figure 8.7.
Step 12. For two-handed signs with the same handshapes and orientations, organize by location from higher locations to lower locations (following the same procedures as with one-handed signs).
Step 13. For two-handed signs with the same handshapes, orientations, and location, organize by movement from simple to complex.
Step 14. For two-handed signs with the same handshapes, orientations, location, and movement, organize by non-manual expression, from no non-manual expression to non-manual expression.

The eighteen signs discussed above can then be put into a more dictionary-like format as shown in Figure 8.9.
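The fourteen steps amount to a multi-key sort, which can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used by the dictionary teams: every parameter is assumed to have already been coded as an integer rank (position in the handshape chart, position in the orientation chart, height of location, complexity of movement), and the class, function names, and sample entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Sign:
    gloss: str
    handshape: int     # rank of the dominant handshape in the handshape chart
    orientation: int   # rank of the dominant orientation in the orientation chart
    location: int      # 0 = highest location, larger = lower
    movement: int      # 0 = simplest movement, larger = more complex
    nonmanual: bool    # False (no non-manual expression) sorts before True
    # (handshape rank, orientation rank) of the non-dominant hand; None if one-handed
    nondominant: Optional[Tuple[int, int]] = None


def chart_rank(column: int, row: int, n_columns: int) -> int:
    """Rank of a chart cell when reading column 1, row 1, then column 2, row 1, etc."""
    return (row - 1) * n_columns + (column - 1)


def sort_key(sign: Sign) -> tuple:
    if sign.nondominant is None:
        # Steps 1-6: one-handed signs are listed first.
        return (0, sign.handshape, sign.orientation,
                sign.location, sign.movement, sign.nonmanual)
    nd_handshape, nd_orientation = sign.nondominant
    # Steps 7-14: both handshapes are compared before either orientation.
    return (1, sign.handshape, nd_handshape, sign.orientation, nd_orientation,
            sign.location, sign.movement, sign.nonmanual)


def alphabetize(signs):
    return sorted(signs, key=sort_key)


# Placeholder entries with invented codes, not data from Figure 8.8.
signs = [
    Sign("sign-D", 1, 1, 0, 0, False, nondominant=(1, 1)),
    Sign("sign-C", 2, 1, 0, 0, False),
    Sign("sign-B", 1, 3, 2, 1, True),
    Sign("sign-A", 1, 3, 2, 1, False),
]
ordered = [s.gloss for s in alphabetize(signs)]
print(ordered)
```

The leading 0 or 1 in each key keeps one-handed and two-handed signs in separate blocks, so keys of different lengths are never compared beyond their first element.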

4.4. Phonological processes

Sign languages exhibit phonological processes and change found in the world’s spoken languages: assimilation, coalescence, deletion, epenthesis, metathesis, etc. Oldest and newest signs for months of the year in Ho Chi Minh City SL, such as February shown in Figure 8.10, illustrate these changes. Assimilation of V handshape from second sign to first sign. Coalescence occurs because two signs become one sign. Deletion of circular movement. Epenthesis of outward movement.

Figure 8.9. Ho Chi Minh City SL signs after alphabetization (panels a-c): eat star apple, V; wrong, AJ; book bag (shoulder carry), N; paternal uncle, N; most, AV; ballpoint pen, N; who, QW; two wheeled vehicle goes forward, V; two wheeled vehicle inches forward, V; smell, V; white, AJ; warm, AJ; eraser, N; backpack, N; Da Nang, N; ship goes forward, V; bus, N; ship, N. Entries are annotated in the figure with orientation labels such as “contra, up,” “contra, out,” “down, in,” and “down, contra.”

Figure 8.10. Oldest and newest signs for February in Ho Chi Minh City SL

Figure 8.11. “Stage 1” Modern Thai SL variant for China

It is sometimes possible to observe several stages in sign phonological change through sociolinguistic variation in a community of sign language users. Figure 8.11 through Figure 8.15 illustrate such phonological change and sociolinguistic variation in the community of Modern Thai SL users.

“Stage 1” to “Stage 2”: Metathesis of handshapes in the second sign in the compound.
“Stage 2” to “Stage 3”: Regressive assimilation of orientation of handshape in the first sign conditioned by the orientation of the dominant handshape in the second sign of the compound.
“Stage 3” to “Stage 4”: Deletion of non-dominant hand. Coalescence (two signs merge into one sign). Epenthesis of short outward movement.
“Stage 4” to “Stage 5”: Deletion of outward movement. Compensatory lengthening (repeated movement). Fusion of “1” and “C” handshapes into second handshape.


Figure 8.12. “Stage 2” Modern Thai SL variant for China

Figure 8.13. “Stage 3” Modern Thai SL variant for China

Figure 8.14. “Stage 4” Modern Thai SL variant for China

Figure 8.15. “Stage 5” Modern Thai SL variant for China


5. Sign language morphology

All of the endangered sign languages discussed in this chapter have more morphological processes than the spoken/written majority languages that occur in the same nation. One example is that all of the sign languages can inflect for person for both Subject and Object, while none of the relevant majority spoken/written languages do. Sometimes this inflection involves phonological change in the orientation of the handshape and sometimes it does not, as Figure 8.16 shows.

5.1. Noun-verb derivations

In addition to inflections, Ho Chi Minh City SL has derivational morphology that regularly distinguishes classes of noun-verb pairs by longer, non-repeated movement on the verb (“fly”) and shorter, non-repeated movement on the noun (“airplane”) and/or with additional non-manual expression on the verb (“ship-go,” “eat-watermelon”) and with no additional non-manual expression on the noun (“ship,” “watermelon”) as shown in Figure 8.17.

Figure 8.16. Some examples of inflection for subject and object in Original Bangkok SL: I-give-you; you-give-me; person-on-the-right-gives-person-on-the-left; I-tell-you; you-tell-me; person-on-the-right-tells-person-on-the-left

Figure 8.17. Some examples of derivations in Ho Chi Minh City SL

5.2. Classifier constructions

Classifiers in relevant spoken/written majority languages in Southeast Asia are nominal classifiers; that is, they occur with nouns. Original Bangkok SL and Original Chiang Mai SL have no classifiers. Classifiers in Modern Thai SL, Ho Chi Minh City SL, Ha Noi SL, Cambodian SL, Yogyakarta SL, and Jakarta SL are nominal and verbal; that is, they can occur with nouns and with verbs. Verbal classifiers in the sign languages discussed in this chapter are more frequent and more important than nominal classifiers. Modern Thai SL has the greatest number of classifiers of the sign languages discussed in this chapter (as illustrated in Figure 8.18) and some of the most complex classifier constructions (as shown in Figure 8.19).

Figure 8.18. Some examples of semantic classifiers in Modern Thai SL

Figure 8.19. Some examples of complex classifier constructions in Modern Thai SL

5.3. Postpositions

Normally, in Modern Thai Sign Language, Ho Chi Minh City SL, Ha Noi SL, Cambodian SL, Yogyakarta SL, and Jakarta SL, the postpositions are expressed as bound morphemes in two-handed classifier constructions. The orientation, position, and contact of the hands used in the two-handed classifier constructions carry the postpositional meaning⁶ as demonstrated in Figure 8.20. When separate signs for postpositions are used in these sign languages, they normally occur after the noun.

Figure 8.20. Some examples of postpositional classifier constructions in Ho Chi Minh City SL

5.4. A sentence inside a single sign

Some constructions in Ho Chi Minh City SL allow whole sentences to be expressed with a single sign. In the example on the left in Figure 8.21, “An-airplane-takes-off-from-the-South-of-Viet-Nam-and-flies-North-and-lands-in-the-North-of-Viet-Nam” can be expressed by one sign in Ho Chi Minh City SL. Similarly, in the example on the right in Figure 8.21, “An-airplane-takes-off-from-the-North-of-Viet-Nam-and-flies-South-and-lands-in-the-South-of-Viet-Nam” can be expressed by one sign in Ho Chi Minh City SL.⁷

Finally, it should be noted that classifiers (and compounding of existing signs) are a very productive way of creating new “frozen” lexical items. For example, the sign for “school” in Ho Chi Minh City SL as shown in Figure 8.22 comes from classifiers and compounding. Originally the modern Ho Chi Minh City sign for school developed out of the signs for CL-BANNER (ON) CL-HOUSE. (The repeated movement is a noun derivation.)

Figure 8.21. Examples of whole sentences inside a single sign in Ho Chi Minh City SL

Figure 8.22. An example of a frozen lexical item constructed from compounding of classifiers in Ho Chi Minh City SL

⁶ In these two examples, the dominant hand (here right hand) is the morpheme “two (people),” the non-dominant hand (here left hand) is the morpheme “CL-motorcycle.” The movement is the morpheme “sit.” The relationship of the dominant hand above the non-dominant hand is an allomorph of “on.”

⁷ In both these signs, the handshape of the dominant hand (here right hand) is the morpheme “airplane.” The orientation of the fingers of the dominant hand is the morpheme for “northward” or “southward.” The non-dominant arm (here left arm) is the morpheme “Viet Nam.” The lower point of contact on the non-dominant arm is the morpheme “the-South-of-Viet-Nam.” The upper point of contact on the non-dominant arm is the morpheme “the-North-of-Viet-Nam.” The first point of contact is the morpheme “from.” The second (last) point of contact is the morpheme “to.” Different parts of the movement include the morphemes “take-off,” “fly,” and “land.”

6. Sign language lexicons

When hearing people start working with sign languages, they may start with vocabulary and/or perspectives from their own language. However, it is important to remember that vision and visual differences are extremely important to people in various Deaf communities and visual information that is considered important to members of a Deaf Community tends to be represented in the lexicon of a sign language. For example, there are numerous different lexical items for eating in Modern Thai SL, Ho Chi Minh City SL, and Cambodian SL that do not occur in spoken/written Thai, spoken/written Vietnamese, and spoken/written Khmer and that also do not occur in other sign languages in the region. Figure 8.23, Figure 8.24, and Figure 8.25 show lexical differences for eating in these three sign languages.

Figure 8.23. Some examples of signs for people eating different kinds of food in Modern Thai SL: eat-cucumber; eat-ice-cream; eat-sandwich; eat-lead-tree; eat-noodles; eat-rice-porridge

Figure 8.24. Some examples of signs for people eating different fruits in Ho Chi Minh City SL

Figure 8.25. Some examples of signs for people eating different fruits in Cambodian SL

6.1. Number incorporation

As mentioned earlier in the section on phonology, the sign for February in Ho Chi Minh City SL was originally two signs which through phonological processes became one sign by incorporating the number “two” into the sign for month. Actually, Ho Chi Minh City SL frequently incorporates numbers into nouns. Examples of this occur in days of the week (2–7), months of the year (January–September), and number of months (1–9), among others, as illustrated in Figure 8.26.

A good empirical question is how many of the signs with number incorporation should occur as separate lexical items in the dictionary. Days of the week, all of which involve number incorporation in Ho Chi Minh City SL, and months of the year, most of which involve number incorporation in Ho Chi Minh City SL, should clearly be included, especially for a bilingual dictionary of Ho Chi Minh City SL and English. But what about signs like 1 month, 2 months, 3 months, etc., and 1-story-house, 2-story-house, 3-story-house? The difference between “February” and “two-months” is syntactic in spoken/written Vietnamese but lexical in Ho Chi Minh City SL, and most users of Ho Chi Minh City SL see “February” and “two months” as similar signs with different meanings and think both signs should be included in any dictionary involving Ho Chi Minh City SL. “Two-story-house” shows many parallels to “school” in Ho Chi Minh City SL, and most users of Ho Chi Minh City SL believe both signs should be included in any dictionary involving Ho Chi Minh City SL.

Figure 8.26. Some examples of signs with number incorporation in Ho Chi Minh City SL

7. Sign language syntax

Since sign languages are minority group languages, it should come as no surprise when the syntax of sign languages is different from that of associated majority language(s).⁸

7.1. Basic word order in declarative sentences

Original Bangkok SL (Figure 8.27), Original Chiang Mai SL, Modern Thai SL, Ha Noi SL, Ho Chi Minh City SL (Figure 8.28), and Cambodian SL (Figure 8.29) all have basic SOV word order in declarative sentences, while spoken/written Thai, spoken/written Vietnamese, and spoken/written Khmer have basic SVO word order in declarative sentences. While spoken/written Indonesian has SVO word order, Yogyakarta SL has SOV word order where subject and object are non-reversible (Figure 8.30) and SVO word order where subject and object are potentially reversible (Figure 8.31). Jakarta SL predominantly has basic SVO word order in declarative sentences (Figure 8.32), as does spoken/written Indonesian.

Figure 8.27. An example of SOV word order in Original Bangkok SL

Figure 8.28. An example of SOV word order in Ho Chi Minh City SL

Figure 8.29. An example of SOV word order in Cambodian SL

Figure 8.30. An example of SOV word order in Yogyakarta SL

Figure 8.31. An example of SVO word order in Yogyakarta SL

Figure 8.32. An example of SVO word order in Jakarta SL

7.2. Word order in content questions (content questions as subjects)

In addition to dramatic differences in word order in declarative sentences between sign languages and spoken languages in the same area, dramatic differences in word order in content questions are equally likely. Original Bangkok SL, Original Chiang Mai SL (Figure 8.33), Modern Thai SL, Ha Noi SL, Ho Chi Minh City SL (Figure 8.34), and Cambodian SL (Figure 8.35) all place content question words at the end of sentences, while spoken/written Thai, spoken/written Vietnamese, and spoken/written Khmer keep the content question word in situ. When the content question word is the subject of a content question, Original Bangkok SL, Original Chiang Mai SL, Modern Thai SL, Ha Noi SL, and Ho Chi Minh City SL all have OVS word order, while spoken/written Thai, spoken/written Vietnamese, and spoken/written Khmer have SVO word order.

Yogyakarta SL also places content question words at the end of sentences, while spoken/written Indonesian keeps the content question word in situ. When the content question word is the subject of a content question, Yogyakarta SL has OVS word order where subject and object are non-reversible (Figure 8.36) and VOS word order where subject and object are potentially reversible (Figure 8.37), while spoken/written Indonesian has SVO word order.

Jakarta SL also places content question words at the end of sentences, while spoken/written Indonesian keeps the content question word in situ. When the content question word is the subject of a content question, Jakarta SL has VOS word order (Figure 8.38), while spoken/written Indonesian has SVO word order.

Figure 8.33. OVS word order in Wh-questions in Original Chiang Mai SL

Figure 8.34. OVS word order in Wh-questions in Ho Chi Minh City SL

Figure 8.35. OVS word order in Wh-questions in Cambodian SL

Figure 8.36. OVS word order in Wh-questions in Yogyakarta SL

Figure 8.37. VOS word order in Wh-questions in Yogyakarta SL

Figure 8.38. VOS word order in Wh-questions in Jakarta SL

⁸ Even though the syntax of the sign languages discussed in this chapter is very similar, it should be noted that none of these sign languages are mutually intelligible. An examination of the vocabulary in the example indicates why this is so. The same situation applies to basic sentential word order in spoken/written Thai, Vietnamese, Khmer, and Indonesian. While these four spoken/written languages have SVO word order and keep content question words in situ, none of the four spoken/written languages are mutually intelligible.

7.3. Word order in content questions (content questions as objects)

Sometimes, comparisons of surface structures between sign languages and spoken languages can be potentially misleading, unless underlying structures are considered. This is clearly the case with comparing content questions in which the content word is the object of a sentence. All of the sign languages have SVO word order in such sentences, and all the spoken languages have SVO word order in such sentences. However, the SVO structure is a derived structure in the sign languages and a reflection of basic structure in spoken/written Thai, spoken/written Vietnamese, spoken/written Khmer, and spoken/written Indonesian.

7.4. Some other examples of word order variation

There are some interesting additional variations in sentential word order that have been observed in some of the Southeast Asian sign languages discussed in this chapter. These variations occur in sentences that have an object that consists of a phrase and not a single word.

7.4.1. Phonological constraints on word order with a phrasal object

In Modern Thai SL (Figure 8.39), Ho Chi Minh City SL (Figure 8.40), and Cambodian SL (Figure 8.41), if an object has the same handshape as a verb, the object must occur next to the verb and all modifiers for the object must be placed after the verb.

Figure 8.39. Modern Thai SL Head/Modifier separation due to phonological constraints


Figure 8.40. Ho Chi Minh City SL Head/Modifier separation due to phonological constraints

Figure 8.41. Cambodian SL Head/Modifier separation due to phonological constraints

Figure 8.42. Modern Thai SL Object Head before Verb and Question Word at the end of the sentence

7.4.2. Basic content questions with a phrasal object

In Modern Thai SL (Figure 8.42), Ho Chi Minh City SL (Figure 8.43), and Cambodian SL (Figure 8.44), if an object phrase contains a content question word, the object head occurs before the verb and the content question word occurs at the end of the sentence.


Figure 8.43. Ho Chi Minh City SL Object Head before Verb and Question Word at the end of the sentence

Figure 8.44. Cambodian SL Object Head before Verb and Question Word at the end of the sentence

8. Summary and conclusion

This chapter has provided an introduction to endangered sign languages specifically designed for linguists who know little about sign languages but who may have an interest in the documentation of endangered sign languages. Focusing on ten Southeast Asian sign languages, nine of which are endangered or dying and six of which are being documented by fluent Culturally Deaf users trained through the Asian-Pacific Sign Linguistics Program in The Centre for Sign Linguistics and Deaf Studies at The Chinese University of Hong Kong, this chapter has provided information about the historical relationships of these sign languages, sign language formational structure, “alphabetization” of signs by formational parameters, sign language morphology, sign language syntax, and sign language lexicons.

At this point, some remarks about the possible future of the documentation, conservation, and possible revitalization of endangered sign languages are appropriate.


9. Toward the future documentation of endangered sign languages

Ideally, some readers will check internet sites to find out how many sign languages there are in the world and how many are endangered. Since Ethnologue lists 142 sign languages, one might assume there is not that much work to be done. However, the 142 sign languages listed are by no means an exhaustive list of the world’s sign languages. No sign language is mentioned for Myanmar, yet there are at least two sign languages in Myanmar, Yangon SL and Mandalay SL, which are not mutually intelligible (The Yangon Sign Language Production Team in preparation). There are anecdotal reports of other sign languages outside these cities in Myanmar. Ethnologue (and other sites) list Indonesian SL but not Jakarta SL and not Yogyakarta SL, even though published research indicates they are separate languages (The Jakarta Sign Language Production Team 2013a; The Yogyakarta Sign Language Production Team 2013a). It should be noted that preliminary research on sign language variation at The University of Indonesia begun in 2015 suggests that there may be as many as six distinct but historically related sign languages on the island of Java alone.

In addition, while the Ethnologue site does list four sign languages for Thailand (Ban Khor SL, Chiang Mai SL, Bangkok SL, and Modern Thai SL) and three for Viet Nam (Ha Noi SL, Haiphong SL, and Ho Chi Minh City SL), it lists only one sign language for Costa Rica, Costa Rican Sign Language, while published research (Woodward 1991, 1992a, 1992b) has used lexicostatistical information to document the existence of at least four distinct sign languages from three language families in Costa Rica (Original Costa Rican SL, Modern Costa Rican SL, Bribri SL, and Brunca SL).

Furthermore, sign languages in the Pacific are particularly understudied. Rennell Island SL and Hawai‘i SL are included in Ethnologue, but recently discovered Creolized Hawai‘i SL (a mixture of Hawai‘i SL and American SL) is not yet listed. A recent unpublished study (Woodward 2015b) reports the existence of an endangered sign language on Majuro in the Marshall Islands. Anecdotal evidence also strongly suggests the existence of more than one sign language in island nations. Reports from Culturally Deaf people in the Marshall Islands suggest that there may be another sign language on Ebeye. Reports from Culturally Deaf people in Fiji indicate that Fiji has more than one sign language, and some signers have reported that they themselves grew up with a different sign language than the one they use now. There are similar reports from Tonga, Vanuatu, etc.

While it is clear that there are more than 142 distinct sign languages, no one can say with certainty how many sign languages there are. Clearly it is normal for many nations to have more than one sign language and clearly it is common for most sign languages to be endangered. Based on forty-seven years of work in sign linguistics, twenty-five years of which have been in Asia, if I had to make an educated guess at this point, my estimate is that there are certainly more than 500 distinct sign languages in the world, probably 1,000, and possibly more. (I strongly believe that Pacific Island nations are going to prove to be a treasure trove for sign languages.)

Finally, what percentage of the sign languages of the world is going to prove to be endangered? Clearly at least half, probably two-thirds (judging from the situation involving the Southeast Asian sign languages discussed in this chapter), and possibly more.

In closing, there is still a tremendous amount of work to be done on endangered sign languages. How many people will ultimately need to be involved in this work and how many people are willing to be involved in this work remains to be seen.

References

Kuschel, R. 1974. A Lexicon of Signs from a Polynesian Outlier Island. Copenhagen: Psychological Laboratory, Copenhagen University.
Marsala, I. Gede. 2008. Desa Kolok: A Deaf Village and Its Sign Language in Bali, Indonesia. Nijmegen, The Netherlands: Ishara Press.
Nonaka, Angela. 2004. “The Forgotten Sign Languages: Lessons on the Importance of Remembering from Thailand’s Ban Khor Sign Language.” Language in Society 33: 737–767.
The Cambodian Sign Language Production Team. 2007. Cambodian Sign Language: Student Handbook 1: Level 1, Lessons 1–5 (English international ed.). Phnom Penh: Maryknoll Deaf Development Programme.
The HCMCSL Production Team. 2007. Ho Chi Minh City Sign Language: Student Handbook 1 (English international ed.). Bien Hoa City, Dong Nai: Project on Opening University Education to Deaf People in Viet Nam Through Sign Language Analysis, Teaching, and Interpretation, Dong Nai University.
The HNSL Production Team. In preparation. Ha Noi Sign Language: Student Handbook 1 (English international ed.). Bien Hoa City, Dong Nai: Project on Opening University Education to Deaf People in Viet Nam Through Sign Language Analysis, Teaching, and Interpretation, Dong Nai University.
The Jakarta Sign Language Production Team. 2013. Jakarta Sign Language: Student Handbook 1 (Hong Kong ed.). Hong Kong: Centre for Sign Linguistics and Studies, The Chinese University of Hong Kong.
The Yogyakarta Sign Language Production Team. 2013. Yogyakarta Sign Language: Student Handbook 1 (Hong Kong ed.). Hong Kong: Centre for Sign Linguistics and Studies, The Chinese University of Hong Kong.
Woodward, James. 1987. “Providence Island Sign Language.” In Gallaudet Encyclopedia of Deaf People and Deafness, vol. 3, edited by John Van Cleve, 103–104. New York: McGraw-Hill.
Woodward, James. 1991. “Sign Language Varieties in Costa Rica.” Sign Language Studies 73: 329–346.
Woodward, James. 1992a. “Historical Bases of New Costa Rican Sign Language.” Revista de Filología y Lingüística de la Universidad de Costa Rica 18(1): 127–132.
Woodward, James. 1992b. “A Preliminary Examination of Brunca Sign Language in Costa Rica.” Estudios de Lingüística Chibcha 11: 1–7.
Woodward, James. 2000. “Sign Languages and Sign Language Families in Thailand and Viet Nam.” In The Signs of Language Revisited: An Anthology in Honor of Ursula Bellugi and Edward Klima, edited by Karen Emmorey and Harlan Lane, 23–47. Mahwah, NJ: Lawrence Erlbaum.
Woodward, James. 2012. “Endangered Sign Languages and the Importance of Preserving Them.” Invited presentation, Deaf Awareness Week, Kapi‘olani Community College, Honolulu, September.
Woodward, James. 2015a. “Hai Phong Sign Language.” In The World’s Sign Languages: A Comparative Handbook, edited by Julie Baker Hansen, Goedele De Clerck, Sam Lutalo-Klingi, and William McGregor, 351–360. Berlin: Mouton de Gruyter.
Woodward, James. 2015b. “Report on the Linguistic Status of Sign Language Varieties Used in Majuro in Relation to American Sign Language.” Unpublished manuscript, Department of Linguistics, University of Hawai‘i at Mānoa.
Woodward, James, Suksiri Danthanavanich, and Peoungpaka Janyawong. 2015. “Modern Thai Sign Language.” In The World’s Sign Languages: A Comparative Handbook, edited by Julie Baker Hansen, Goedele De Clerck, Sam Lutalo-Klingi, and William McGregor, 629–648. Berlin: Mouton de Gruyter.
Woodward, James and Kampol Suwanarat. 2015. “Original Bangkok Sign Language.” In The World’s Sign Languages: A Comparative Handbook, edited by Julie Baker Hansen, Goedele De Clerck, Sam Lutalo-Klingi, and William McGregor, 677–686. Berlin: Mouton de Gruyter.
Woodward, James and Thanu Wongchai. 2015. “Original Chiangmai Sign Language.” In The World’s Sign Languages: A Comparative Handbook, edited by Julie Baker Hansen, Goedele De Clerck, Sam Lutalo-Klingi, and William McGregor, 687–700. Berlin: Mouton de Gruyter.

Chapter 9

Design and Implementation of Collaborative Language Documentation Projects

Racquel-María Sapién

1. Introduction Since the publication of Hale et al. (1992) and its authors’ call for more community- responsive approaches to research with endangered languages, language documentation as an evolving subfield of linguistics has grown significantly. There is a movement among researchers to better take speech community needs into account, and “collaboration” has become a buzzword. An emerging literature seeks to articulate more community- inclusive methods (Wilkins 1992; Grinevald 1998; Stebbins 2003; Florey 2004; Dwyer 2006; Rice 2006; Yamada 2007, 20101; Penfield et al. 2008; and Czaykowska-Higgins 2009; among many others). This shift is not exclusive to research with endangered lan guages but rather reflects a greater movement among social scientists who conduct research with and among members of indigenous communities to recognize commu nity members’ autonomy (Deloria 1969, 1997; Biolsi and Zimmerman 1997; Smith 1999; Denzin, Lincoln, and Smith 2008; Mihesuah 2008;).2 Cameron et al. (1992) provide an initial framework for a more community- responsive approach to sociolinguistic research, and is widely cited in the literature on 1

I previously published as Racquel-María Yamada. I owe a debt of gratitude to members of the Konomerume, Suriname community with whom it has been my privilege to work for over a decade. I am grateful to the community of collaboratively oriented academic and speech community linguists from whom I draw inspiration, support, and guidance. All omissions, misinterpretations, and errors are my own. 2

204 Racquel-María Sapién endangered-languages research. Cameron et al. (1992, 1997) delineate models based on speech community member involvement. By their definition, an ethical model is re search that is done on a particular community, while an advocacy model is research for a community. Their empowerment model represents research conducted with members of speech communities. Recasting the ethical model as linguist-centered, Czaykowska-Higgins (2009) takes the empowerment model a step further, promoting language research by speech com munity members with community-based language research (CBLR). Expanding on this notion, the Community Partnerships Model (CPM) (Yamada 2010)3 outlines an approach to collaborative research that draws from models of sustainable community development. Both models engage with critical indigenous methodologies and seek to articulate a methodological approach to field research with endangered languages that takes speech community members’ stated needs into account. What various collaborative approaches have in common is a goal of meeting the needs of all stakeholders in a language project, including outsider academics, speech commu nity activists, language teachers, learners, and others. Community-inclusive language research, and especially language documentation, recognizes the varied expertise contributed by all members of a research project. More than a moral imperative, collab orative language documentation seeks to address the varied needs of all stakeholders in a way that is responsible, reciprocal, and respectful (Rice 2006), ultimately blurring the line between “researcher” and “subject.” My own experience with collaborative language documentation began when I realized that a linguist-centered model was untenable in my particular fieldwork situation. 
In 2005, I began a documentation project with members of the Konomerume, Suriname community on the non-prestige Aretyry dialect of Kari’nja (Cariban).4 Although our language work began in 2005, my relationships with community members date to time spent living in the community as a Peace Corps Volunteer from 1995 to 1998. Our work at that time did not have language as its focus, but we had established partnerships based on a community development model that values all stakeholders’ voices (Peace Corps 2003). As such, when we later began to work on Kari’nja, community members had the expectation of collaboration rather than a researcher-consultant arrangement.

Existing literature on endangered languages provides case study examples of community collaborative projects (Wilkins 1992; Yamada 2007, 2014; Yamada, Mandé, and Jubithana 2008; Vallejos 2014), discusses roles for speech community and academic linguists (Mithun 2001; Grinevald 2003; Dobrin 2008; Gerdts 2010; Guérin and Lacrampe 2010; Stebbins 2012), illustrates challenges to successful collaboration (Franchetto 2010; Whaley 2011; Stenzel 2014), and theorizes more effective collaborative methodologies (Furbee and Stanley 2002; Czaykowska-Higgins 2009; Leonard and Haynes 2010; Yamada 2010). Since prior works examine underlying theoretical and methodological assumptions in collaborative language research, my focus for this chapter is on practical application.

3 Much of the content for this chapter appeared in an earlier version as part of my dissertation (Yamada 2010).
4 The language is known variously as Carib, Carib of Suriname, Galibí, and Kaliña, among other terms. The dialect is more commonly known as Murato, a term speakers consider pejorative. I use speakers’ autodesignations throughout.

Collaborative Language Documentation Projects 205

This chapter outlines phases in the planning and implementation of a community-collaborative language documentation project drawing from models of sustainable community development. Section 1 provides an introduction and defines collaborative documentation. In section 2, I discuss phases of a collaborative project beginning with community entry and progressing through to implementation. This is followed by an examination of common obstacles in section 3. Finally, section 4 provides brief conclusions.

Since it is informed by my own experience, this chapter is aimed primarily at an audience of academic linguists working with members of a speech community that is not their own. However, linguists and language activists working within their own communities on their own heritage languages may also find the description of phases of project development and implementation useful.

1.1. Why collaborate?

One of many reasons to conduct language documentation projects collaboratively is speech community members’ own expectations. In my case, collaboration was expected because community members and I had collaborated previously on non-language work. The expectation of collaboration is becoming commonplace, and more and more members of communities whose languages are endangered are unwilling to work with researchers whose work is of little immediate benefit to them (Bowern and Warner 2015).

Thoughtful, well-planned documentation can support community goals, such as formal teaching, almost immediately5 and can build capacity for community members’ future independent work. Additionally, well-trained teams can accomplish significantly more than a single researcher can alone, and documentation does not depend on h/her6 presence in the community. Furthermore, when community members do the actual recording themselves, the resulting documentation is richer in two ways. Elder speakers are more comfortable interacting with members of their own community and, as such, are more likely to speak freely while being recorded. Also, the things they say are more complex, grammatically, when talking with another speaker than with an outsider. Finally, if a goal of documentation is to record language that is meaningful and culturally embedded, those most qualified to determine what is important to document are speakers themselves.

5 See Yamada (2008, 2011) for examples of documentation in support of formal teaching.
6 I am opting to use nominative s/he and accusative h/her with the acknowledgement that each of the options for non-gendered pronominal forms for a hypothetical or anonymous third person is clumsy in its own way. This option is no exception.


1.2. What constitutes effective collaborative language documentation?

Collaborative language research, which includes documentation of endangered languages, may involve different permutations of research relationships across institutions, disciplines, researchers, speech community members, policymakers, and others. Collaborative language documentation is the creation of a “lasting, multipurpose record of a language” (Himmelmann 2006, 1) in cooperation with one or more stakeholders. The focus of this chapter is collaboration between “insider” speech community members and “outsider” academic linguists.7 In the more typical case, an academic linguist works with members of a community that is not h/her own to record aspects of the language in situ. In this scenario, those who either consider themselves or are considered by their communities to be competent speakers are recorded speaking the language in natural or naturalistic settings. The location is often a small, isolated community, but this is not necessarily the norm. Nor is it necessarily the case that the academic linguist is an outsider.

Collaborative language documentation blurs the line between “researcher” and “subject,” necessitating the use of new terms for different members of a partnership (see Rice 2011 for further discussion).8 For example, Yamada (2010) uses the term “speech community linguist” (SCL) to refer to language workers who are working within their own communities on their own native or heritage languages, and “academic linguist” (AL) to refer to outsiders to the speech community who may have institutional affiliation at a university. However, these terms come with the caveat that they impose an artificial binary distinction in that one may be both an AL and an SCL simultaneously.
In this chapter, I use the terms “outsider academic,” “academic linguist,” and “researcher” somewhat interchangeably to refer to ALs, “speech community member” and “community linguist” to refer to SCLs, and “participants” to refer to both.

At the heart of any collaborative project is relationship-building. Effective collaboration recognizes that those who might be considered “subjects” in other types of research situations are instead partners in a shared undertaking. Social interactions between outsider academics and speech community members are rich and complex, and depend on clear and effective communication. Trust must be built early in a collaborative project and maintained by fostering an atmosphere of shared responsibility wherein all stakeholders have an equal voice in the research agenda. The underlying assumption that participants must communicate and negotiate with each other at all phases of a project is woven through each of the key phases described in section 2.

In addition to clear communication, effective collaboration depends on careful planning and accountability among participants. Ownership and control of both the outcomes and products of a documentation project are shared, which assumes both intellectual and physical accessibility of products for all interested stakeholders. Where necessary, training is sought in order to support project goals and accessibility of products. Training for community members is commonly cited as essential to effective collaboration, but fewer descriptions discuss training for outsider academics. Balance is essential to effective collaboration, and this includes balance in who needs what sorts of training. Collaborative documentation assumes there is no single expert, but rather all members of a partnership bring varied experiences and expertise.

7 See Glenn (2009) for an overview of cross-disciplinary collaboration.
8 Actual terms of address should be negotiated early in a partnership and revisited as a project evolves, including both how people address each other and how they are referred to in print.

The approach advocated here necessitates a long-term commitment from all participants, which may place additional burdens on the outsider researcher who has to travel to reach the speech community. However, a well-planned collaborative project does not depend on the outsider researcher’s constant presence. As speech community members gain facility with techniques and technology for documentation, work can proceed in the outsider researcher’s absence.

Much of the recent literature on community collaborative research assumes some form of social activism on the part of the outsider researcher (Rice 2006; Denzin and Lincoln 2008; Czaykowska-Higgins 2009; Yamada 2010). Although this chapter has practical application as its focus, there is an underlying assumption that collaborative projects are activist in that they promote autonomy and self-determination for members of under-represented communities.9 Collaborative documentation may benefit the speech community by redistributing the power that defines the research agenda (Swadener and Mutua 2008, 38), and empowering community members to have a voice in research that concerns them. Community members are equal partners with equal power to suggest or reject potential projects.
Cooperative determination of a research agenda is assumed at the outset of any collaboration, and projects are only undertaken if they are of balanced mutual benefit. It is sometimes the case that individual aspects of a particular project may focus more on one member’s needs, but the overall partnership is balanced. The needs and goals of one group are not subjugated in favor of those of another. The researcher relinquishes “the power and authority that has traditionally rested unquestionably on the researcher and the institutions that the researcher represents” (Swadener and Mutua 2008, 41). Many ALs are accustomed to entertaining community needs only after their own goals are met, but collaborative models advocate establishing goals at the outset that meet the shared needs of all partners.

Collaborative language documentation depends on partnerships between people who are both willing and able to engage in the work at hand. Although different members of a collaborative project may require additional training of various sorts (and, in fact, training is one essential component of effective collaboration), the desire and ability to collaborate must be present for a project to be effective.

According to Nathan and Fang, “documentation as it is currently practiced mainly serves the purposes of descriptive and typological linguists” (2008, 177). Collaboration with speech community members is one way to better meet the needs of a larger audience. By the approach advanced here, an effective collaborative documentation project involves all stakeholders at the outset working to define mutually determined and mutually beneficial goals. Speech community members take increased responsibility for work more traditionally done by an outsider, thereby bridging the gap between ALs and SCLs.

9 Not all endangered languages are minority or minoritized languages, but the nature of endangerment is such that they nonetheless tend to be surrounded by more widely spoken languages.

1.3. What is collaborative language documentation not?

Much of the criticism of community-collaborative language documentation is based on a different conceptualization of collaboration than that employed here. Some have noted that outsider linguists may have to encourage or convince community members of the value and importance of language documentation. This represents a linguist-centered approach rather than true collaboration. Collaborative documentation is not linguist-driven. Rather, collaborative projects depend on equal buy-in from all stakeholders, both outsider academics and insider community members.

Relatedly, collaborative documentation projects do not foreground the needs of one set of stakeholders in favor of those of another. Although every component of a collaborative project need not apply to every stakeholder equally, collaborative projects seek balance in meeting stakeholders’ needs. For example, a recording of an elder describing an important cultural process may be most relevant to community members as the basis for teaching materials, and elicitation based on texts from that recording might illuminate linguistic structures that are primarily of interest to the outsider academic. On balance, the single recording can meet the needs of different stakeholders in different ways. Again, balance is the goal, not absolute equity in all undertakings.

Finally, collaborative language documentation is not “giving back” something to members of a speech community under study. This misconception is perpetuated even in works that otherwise share a collaborative approach. For example, Dwyer notes that “The most common examples of ‘giving back’ include preparing pedagogical and cultural materials useful to the community. . . .” (2006, 39). Others discuss mobilization of language documentation as “fieldwork delivered to a language community” (Nathan 2006, 364).
Rather than conceiving of the products of documentation that are more community-oriented as “giving back,” true collaboration seeks to “work together” to set goals and undertake projects that are of balanced mutual benefit and depend on contributions from all stakeholders.

2. Phases of design and implementation

Recommendations for designing collaborative projects draw heavily on asset-based models of sustainable community development. In particular, I have drawn on the Participatory Analysis for Community Action (PACA) model (Peace Corps 2007), Urban Habitat’s Participatory Planning for Sustainable Community Development (PPSCD) approach (Seitz 2001), and the Methodology of Collaborative Cultural Mapping developed by the Amazon Conservation Team (2008). Most of the phases of implementation come from Peace Corps (2003), and are adapted to collaborative language documentation.

Successful community development depends on a number of factors contributing to project sustainability. These same factors contribute to the realization of community-collaborative language documentation goals. Characteristics of successful community development projects that could equally apply to language documentation include the following (adapted from Peace Corps 2003):

• Involve all stakeholders in all aspects of project planning, including setting goals and developing a research agenda,
• Set realistic goals, objectives, time frame, and budget,
• Clearly define project tasks and responsibilities,
• Identify and address training needs,
• Assign partners to specific roles,
• Monitor project progress,
• Inform and involve larger community,
• Evaluate and reflect on each project phase, and
• Instigate changes as necessary throughout project implementation.

In a collaborative language documentation project, community members are involved in each phase of the process. By this approach, teams of community members are taught how to conduct each project phase and the researcher serves as a facilitator rather than a leader. The process of identifying and developing projects cooperatively builds community capacity for developing projects of their own design. Community members learn how to assess needs, recognize assets, set goals, assign roles and responsibilities, initiate training, implement and assess projects, and reflect on the various phases of planning, implementation, and evaluation.
Should they later decide to seek outside funding for independent projects, they will be equipped to negotiate outsider protocols for developing and submitting proposals.

In the sections that follow, I describe the following phases: community entry, needs and assets assessment, prioritizing, project design, training and team building, and implementation and reflection. For the planning phases, including assessment, prioritizing, design, and training, I recommend a workshop format wherein teams of community members, together with the researcher, work to determine and prioritize goals, identify resources, and meet training needs.

The early planning process provides an important opportunity to set the tone of a collaborative project. It is essential that early planning be democratized such that all participants’ ideas are sought out and valued equally. Brainstorming is followed by prioritization and paring down based on available resources. This will inevitably involve rejection of some ideas. However, if all participants feel like valued contributors from the outset, conflicts may be avoided when some goals are chosen in favor of others. This process takes time and care must be taken to provide opportunities for input from everyone. Depending on cultural appropriateness, the following suggestions may help ensure everyone has a voice:

• Rotate facilitation of meetings such that no one person facilitates every session.
• Determine meeting protocols cooperatively.
• Assign meeting roles, including recording of minutes and timekeeping, on a rotating basis.
• Limit the amount of time any one person is allowed to speak.
• Discuss and brainstorm in smaller groups before whole group discussion.

Reflection is an important element throughout the planning and implementation process. All partners must be given an opportunity to reflect on their involvement with projects and share their impressions as new projects are developed. All phases in the following sections are adapted from Peace Corps (2003) and are described as they relate to the practice of collaborative language documentation.

2.1. Community entry

Although an outsider academic may have been invited to collaborate with community members on a language documentation project, s/he may only be acquainted with the small subset of community leadership who extended the invitation. Furthermore, there are aspects of community entry that may be required each time the outsider visits the community regardless of how long s/he has been working with community members.

Initial community entry is an observation phase. The outsider researcher takes time to observe and learn about community practices and protocols. This important phase will influence the researcher’s integration in the community, h/her ability to identify partners, and the strength of future partnerships. Note that observation is not unidirectional during this phase. Community members, too, are learning about the outsider researcher and are observing whether s/he follows formal community protocols, interacts appropriately with community members, and complies with local conventions. Ideally, the outsider researcher will have a primary partner in the community, identified by community leaders, who will guide h/her through the early stages of community entry. Researchers should expect to spend much of this time explaining why they are in the community and what they and the community can expect from their partnership. Formal community protocols for this phase may include:

• Formal introduction to community leadership,
• Individual introductions to elders,
• Community-wide meetings facilitated by community leaders to introduce the researcher to the community (the researcher may or may not speak at these meetings), and
• Additional meetings with leaders to reflect on issues brought up at community-wide meetings.

The community entry phase is also a learning phase. The outsider academic is learning how to function in the community, and community members are learning how to interact with the researcher. During this phase, the researcher can also begin observing language in use. S/he may be introduced to speakers at various levels of fluency. The researcher can learn who speaks the language and in what contexts. In addition to meeting with community leaders and elders, the researcher may interact with potential research partners who are younger and less fluent (or non-speakers, in many cases).

The community entry phase is also a time to learn whom the researcher will be working with. Depending on community protocols, leaders may appoint partners or ask for volunteers. It is important that the researcher observe local community protocols for identifying partners rather than approaching potential collaborators individually without input from community leaders.

In my own experience, although I have been working with community members for over a decade, I still have both formal and informal community entry requirements each time I visit Konomerume. Formally, I am required to ask for separate meetings with the village chief, the village council, and the community at large. During this time, I bring gifts to the community and explain the goals of a particular trip. Informally, I have to visit various members of the community. During this time, I reconnect with community members I have worked with in the past and pay my respects to elders. I recently had a conversation with a colleague about this informal community re-entry and how difficult it can be.
He said, “Yeah, it’s like you have to recalibrate your relationships every time.” I found this to be a very apt description of the informal community re-entry process.

2.2. Needs, assets, and projects assessment

The needs and assets assessment phase is an information-gathering phase. During this time, the researcher and community members work together to identify resources and needs in the community. In addition, participants work together to determine community member and researcher assets in terms of talents, training, and interests. It is important to focus on the positive during this phase, identifying strengths in the community that will contribute to documentation outcomes. Since this is an information-gathering phase, the researcher should resist the temptation to make recommendations or pass judgment.

Although it is useful to begin planning by listing all of the possible desired outcomes, an essential part of the process is then organizing and prioritizing based on shared needs and available resources. Needs are assessed based on documentation goals. For example, if one goal is to record elders engaged in everyday conversation, needs include available fluent-speaking elders, recording devices, and technology training. Resources include technological, practical, and human resources. For example, if one goal is to record elder first-language speakers in conversation, the elders and recording devices may be available, but community members may need training in equipment use. For the annotation of resultant recordings, both academic and speech community linguists may need training. The AL may need to better understand the language in use, and the SCL(s) may need training in linguistic analysis.

This phase may be implemented in several ways including workshops, interviews, and formal meetings. Assets may include individual skills and capacities among the researcher, community partners, and community members at large; local organizations including schools, women’s groups, and sports organizations; and larger institutions such as indigenous rights groups, NGOs (nongovernmental organizations), and government bodies. Needs may include language-specific issues such as support for revitalization or greater community infrastructure issues such as rebuilding community structures.

During this phase, the researcher and community members list all needs and assets they can think of, without regard for whether or not they will be “doable.” Community members need to have a voice in the process and be heard, so it is important that all ideas be included. During the next phase, participants will work together to identify projects that they will have the resources to embark upon. All phases are conducted in partnership, but it is imperative that community members be engaged with this particular phase. It is important that community members define their needs and assets for themselves so that they are empowered to have ownership of the process and be engaged in the work ahead.
In addition to establishing needs and assets, participants begin to list potential projects. As projects are listed, the researcher and community partners also list the skills or knowledge needed to complete a particular project and any issues or problems that may hinder project completion.

During this phase, the researcher and community partners can begin to match identified assets with potential projects. For example, if one of the projects is the documentation of a particular cultural practice (for example, weaving a specific type of basket) and the language that accompanies it (for example, procedural discourse describing the necessary tools and their use), community members with skill in the practice and the ability to describe it in the language may be identified. It may be the case that a younger, less fluent community member has skill in the practice but is unable to describe it in the language. This situation provides an excellent opportunity to involve multiple community members in that one person might be recorded doing the weaving while another more fluent person describes the process.

For language documentation projects, this element tends to be product rather than process oriented. As such, it is useful for this phase if a variety of completed products from other documentation or description projects are available. Sample audio and video recordings, dictionaries, grammars, and pedagogical materials will help community members to focus on what types of products would meet their needs and what they hope to achieve through the partnership. Participants might discuss timelines and resources necessary to complete each type of project. It may also be useful to demonstrate different types of each product. For example, the researcher may have samples of small, illustrated thematic dictionaries and other samples of larger academic volumes. The researcher and community members can discuss the relative merits of each type of product and their usefulness to particular audiences.

2.3. Prioritizing

When I was in the Peace Corps, we called this phase “voting with leaves,” because that was a strategy used to facilitate prioritizing among non-literate counterparts. During this phase, the researcher and community members work to analyze and prioritize projects identified during the previous phase. Analysis includes determining what is needed to complete a particular project, and prioritizing includes deciding which projects are most important to the community and the researcher. All participants have a vote, and the researcher is not in a position to override community priorities in favor of h/her own. This phase is best facilitated through workshops attended by community partners and other stakeholders including community leaders.

During this phase, the researcher and community members analyze each potential project listed in the previous phase. Interested stakeholders examine each project to establish its level of urgency, gauge community interest, determine whether they have the resources necessary to complete a project and what additional resources are required, discuss probable timelines, and ascertain whether some projects might overlap. Once each potential project has been analyzed this way, participants can begin prioritizing. Determining the “most important” potential projects involves too vague a criterion. More specific criteria include:

• Greatest benefit to most people (e.g., documenting a specific cultural practice may serve both researcher and greater community needs, while documenting an isolated word list of names for things that may not exist in the community would serve a smaller audience),
• Biggest impact on community as a whole (e.g., involving the community in creating signage for community locations provides visibility in a way that one-on-one elicitation of place names does not),
• Potential for completion in available time frame (e.g., a comprehensive, multi-language, encyclopedic dictionary may take decades to complete while a series of small, thematic dictionaries may be equally useful and less time consuming),
• Fewest outside resources required (e.g., creation of an interactive web-based dictionary assumes access to the internet, which requires infrastructure that may not be available in the community, while a paper dictionary requires only access to a printer that may be purchased as part of the project).

After analyzing and prioritizing potential projects, participants can eliminate projects that would be impossible to accomplish with available resources. Most projects will require training and should not be eliminated on that basis. For example, a documentation project may require training in equipment operation that can be provided by the researcher. Other projects, for example, community mapping, may require tools and training, such as GPS systems, that are not immediately available but may be obtained by partnering with other organizations. Some project ideas may be desirable, for example, an internet-ready multimedia dictionary, but may not ultimately be obtainable or usable because of a lack of technical resources in the community. Finally, some projects, for example, rebuilding a village school, may be only tangentially related to the goals of the partnership and can be eliminated early on. It is nonetheless useful for the community to have listed them, as they may be revisited in partnership with other researchers or organizations.

In determining which specific projects to undertake, to the extent possible, the researcher and community members use traditional local decision-making methods. In some communities, this is done through democratic processes; in others, consensus is imperative.

2.4. Project planning and design

This phase, too, can be conducted in a workshop format. It may involve only those community members who will be conducting the actual work of the partnership. As such, workshops conducted during this phase may be smaller than those for previous phases. However, depending on local protocols, community leaders may choose to provide oversight of any phases involving workshops. Community leaders may also need to be involved in the process of identifying and approaching potential team members for different aspects of projects.

The researcher and community members should seek to undertake projects whose goals are “SMART: Specific, Measurable, Attainable, Realistic, and Time-bound” (Peace Corps 2003, 78). Prior to embarking on any project, the following questions need to be addressed:

• Who are the relevant stakeholders?
• Who will participate and in what way?
• What teams are needed to complete the project(s)?
• What are the training needs?
• How will training needs be met?
• How does project implementation interact with the local seasonal calendar?
• What is the timeline?
• How will project progress and success be assessed?

Although most questions are relatively well examined, an often neglected element of project planning is the seasonal calendar. Depending on the community, people may be unavailable during important planting times when they have to travel long distances to their fields. In other communities, there may be restrictions on storytelling at different times of the year. This could be disastrous for a poorly timed project that aims to record traditional stories. The researcher and community members need to plan projects carefully to avoid unforeseen obstacles caused by seasonal activities.

I once made the mistake of assuming that the rainy season in Konomerume would be an ideal time to record elder speakers in their homes. Families usually have finished their annual planting, and the rains make it difficult to undertake long journeys to the fields. They are often involved in quieter homebound activities that, I thought, would provide an ideal environment for recording. What I failed to plan for, however, was the effect of the rains that were keeping people at home. There has been a shift in traditional architecture in Konomerume, and many homes now have corrugated tin roofs instead of the older leaf roofs. The deluge brought by a tropical rainy season combined with an uninsulated tin roof provides a less than ideal recording environment. In addition, since the technology team and I had agreed to record elder speakers in their own homes, we had to carry the equipment through the village on foot. The winds that accompany a typical tropical monsoon made it nearly impossible to transport the equipment without some of it getting wet, despite our use of purportedly waterproof cases.

When planning a project timeline, in addition to seasonal considerations, participants set mini-goals along the way. For example, if the project aims to produce a video recording and texts of elders narrating as they engage in traditional cultural practices, there will be intermediate steps to creating the final product.
They may include: iden tification and training of a technology team; video-related issues including recording and editing; identification and training of a text team; transcribing, translating, and analyzing of texts; production of the final product(s). Each task is identified and sequenced on the timeline along with the tools and training needed to complete each one. Tasks are organized in a spiral, with each task building a scaffold that supports the next. In addition, a reflection period is included for each intermediate step when team members can assess their progress toward the eventual goal. During the reflection periods, the researcher and community members evaluate their progress to date and en sure they have measures in place to embark upon the next project phase.

2.5. Training and team building

For each project phase, the researcher and community members ask, “What are the tools and training necessary to complete this task?” They then work together to design training workshops to provide partners with the skills necessary to complete each task. In my experience, workshops are more effective if they are hands-on, encourage participants to learn inductively, and foster interaction among participants and the facilitator. Participants “learn by doing,” and workshops are organized such that new skills are practiced as they are introduced.

Since the focus here is on collaborative language documentation, presumably, the outsider researcher will have more training and experience with techniques, technologies, and methods for documentation. The temptation may be for the outsider researcher to assume the role of leader and trainer during training phases. However, effective collaboration depends on recognition that there is no single expert in the room. Therefore, careful evaluation of available resources includes identifying all participants’ strengths and expertise and assigning roles accordingly. In deciding what sort of training is needed, it is useful to avoid assuming the outsider researcher will be providing all training. Depending on which roles s/he is assigned, it is likely that the outsider academic will also require additional training. The underlying assumption is that all participants have some knowledge and expertise to impart, and training is reciprocal.

Training may involve technical training in equipment use and maintenance, literacy and orthography development, best practices in documentation, lexicography, or training in individual topics in linguistics. Depending on what skills are needed for each project, different project participants can design workshops for particular teams. Local resources or knowledge will play a role, and individual team members may develop training workshops to share their own knowledge or skills. Additionally, gaps in training may be identified in which neither the outsider researcher nor speech community members have the relevant expertise. In these cases, experts may need to be identified outside the particular community.

My own need for training was brought to light in a somewhat humorous way. After I had spent some time in Konomerume and had developed enough facility with Sranan Tongo (the national lingua franca) to be comfortable in social situations, I would often sit with the older adult women at parties. In one such case, we were joined by someone who had moved to town several years earlier and had only recently returned to the community for a visit. My friend was relating how competent I had become with local customs and said, “When Racquel first came to Konomerume, she didn’t know anything. She was like a little baby who needed help to do absolutely everything. But look at her now!” I bristled at the characterization (I knew some things!), but she was not wrong. Fortunately, she and other community members were patient trainers, and through their tutelage, I became (somewhat) competent at tasks that they consider essential to day-to-day life. As a result, my language work is much richer in that it is contextualized and culturally embedded.

Once projects have been identified, analyzed, and planned, the researcher and community members work together to organize teams that will take responsibility for each project phase. Teams can work together to share roles and responsibilities. In addition, teams may choose to elect a leader who is responsible for ensuring that tasks are carried out. In some cases, one large team, composed of leaders, elders, young adults, and the outsider researcher, will work on all aspects of a project. In others, training smaller teams is more appropriate, each of which is responsible for a different project segment. Community teams take responsibility for their own research and are involved in each phase. In addition, they are building capacity for future independent projects of their own design. Teams maximize available resources and efficiency in that no single person has to “do it all.”

Depending on local protocols, building teams may involve community leadership. Team members are chosen based on capacity, motivation, ability to complete tasks, and long-term commitment. Where possible, teams should include members of different ages and genders, and with different roles in the community. However, local politics and familial relations can make the identification of potential team members a delicate process. Whenever practical, it is wise to involve as much of the larger community as possible in determining who would be best suited to particular tasks. There may be historical reasons why the community may choose one person over another, and the outsider researcher must exercise caution to avoid being drawn into disputes. It is imperative that the outsider researcher recognize speech community members as experts when building teams. Although the outsider researcher may have h/her own ideas about who is best suited to particular tasks, there may be other local cultural issues, such as familial ties and/or gender roles, to which the outsider is not privy.

2.6. Implementation and reflection

Implementation may not be a separate phase but rather can be conceived of as part of an overlapping process. For example, part of training a technology team to conduct documentation involves actually recording speakers using the language. Resulting recordings can then be used in training a documentation team to transcribe and translate recorded data. As the documentation team is being trained, the technology team can be making additional recordings that will eventually become a part of the greater documentary corpus. By this model, teams are creating concrete products that are the goal of the partnership concurrently with being trained.

As teams begin working independently on particular project aspects, team leaders monitor progress. This is a part of the reflection phase. Materials and equipment need to be examined for appropriateness and adequacy. Teams need to determine whether they have received adequate training to perform particular tasks, or whether additional or more advanced training is necessary. Interest and motivation can begin to lag as teams become more engaged, and the outsider researcher and team leaders need to ensure that responsibilities are being met according to the previously outlined timeline. If time goals are not being met, participants work to determine why, and then teams work to correct any discrepancies. The role of administrator can fall to the outsider researcher or to another motivated community member. As projects progress, the researcher is engaging with teams and fulfilling h/her own responsibilities as determined during the planning phase.

A large-scale documentation, description, and preservation project involves several interacting phases that overlap with one another. The outsider researcher wears many hats in the process. However, an eventual goal is for the researcher to pass those hats along to community members. Planned obsolescence for an outsider researcher need not mean the researcher will never return to work with a particular community. Rather, as community members take on roles and responsibilities once held by the outsider researcher, s/he may begin working on other projects and/or assuming other roles in the community.

During the reflection phase, the outsider researcher and community teams work together to evaluate their progress, reflect on lessons learned, and plan for future collaboration. This step is important at all phases, but reflecting at the conclusion of a particular project may be more formal than the ongoing reflection that occurs during intermediate steps. This is also a time to present results and products to community leaders and the community at large.

3. Overcoming obstacles

My original title for this section was “Reasons Why a Partnership Might Fail.” As I read through my list of “failures” I had experienced in my own work, I was embarrassed to discover that over 80% of my reasons represented supposed failures on the community’s part. I realized that I had been participating in the time-honored tradition by outsider researchers of blaming indigenous people for project failures. When social scientists have attempted to engage community members in the academic endeavor, project failures are often blamed on community infighting, mistrust, misappropriation, or factionalism.[10]

[10] See Morrill (2008) for a thoughtful explication of Klamath termination and the lasting legacy of outsiders having unfairly blamed factionalism.

In blaming members of the speech community, we fail in two ways. We neglect to examine our own role as outsider researchers and fail to recognize that no community is homogenous. We seem to assume that there should be no dissent in a small indigenous community. Our assumptions get in the way of true collaboration, and, when a project fails, it is a lot easier to assign blame than to look for real reasons for failure and try to learn from them. No group is homogenous, and factions, disagreements, and obstacles affect any group, regardless of size or origins of members. All participants in a truly balanced partnership play a role in a project’s successes and failures. More effective than assigning blame is to examine potential obstacles and discuss how they might be overcome.

One of the most daunting obstacles to successful collaboration is time. Traditionally funded research projects rarely allow for adequate time spent in the community on the part of the researcher. Furthermore, some researchers view time spent training community members as time wasted. A prevailing opinion is that all time in the community needs to be spent on the “more important” business of conducting research and developing products for an academic audience. This mind-set interferes with building effective community partnerships. A well-trained team of community members can conduct independent research in the absence of an outsider academic. In a well-planned partnership, the outsider researcher does not have to be present in the community in order for tasks to be completed. Rather than viewing time spent in training as time wasted, we might view it as an opportunity to maximize available resources as well as build capacity for future productivity.

A complaint I have heard from other researchers is that they do not have time while in a community to develop training workshops cooperatively. I once attended a conference presentation where the presenter had developed a workshop on language revitalization independent of community input but nonetheless described the work as collaborative. S/he spent only three to five days in each of several communities (most of which s/he was visiting for the first time) during which s/he delivered the workshop in addition to engaging in other research activities. Although community members were engaged and interested, to call this sort of project collaborative is a misnomer. A more community-inclusive approach would engage in a needs and assets assessment with community leaders rather than delivering a prepackaged workshop. In this particular case, scaling back on the number of communities visited might have allowed time to determine what community members had identified as their own needs.

A reality of working with speakers of endangered indigenous languages is that many of the people with whom we work are elderly. The tragedy of losing an elder can have resonating effects beyond the practical effects on a documentation project. The loss of an elder has a profound emotional effect on both community members and outsider researchers. It takes time to recover from such a loss, and we need to give ourselves permission to grieve.[11] A project might need to be put on hold, or reevaluated after a sufficient grieving period. In my own case, I elected to postpone a project for three years after the sudden death of one of the principal participants. Eventually, other community members and I decided that the person would have wanted the project to continue, so we chose to see it through. The emotional cost was not insignificant, though, and I still have trouble discussing that particular project without becoming tearful.

[11] See Sapién and Thornes (2017) for a discussion of grief in language work.

Another obstacle is related to mind-set. The traditional, linguist-centered model is pervasive and both outsider researchers and speech community members come to projects with the expectation that the outsider is the expert. Community members often have trouble taking ownership of projects, and outsider academics are often reluctant to relinquish control of the research process. This mind-set is reflected in the notions of “giving back” versus “working together.” As long as community-focused products are conceived of as “giving back” something to the community, real collaboration is unlikely. Shifting toward a more “working together”-oriented model takes effort on the part of all members of a documentation project but is necessary to foster the kinds of relationships necessary for effective collaboration to be possible.

Finally, differing notions of success can hinder successful collaboration. In planning, team members must decide what they need to accomplish in order to be successful. In situations of language endangerment, it is rarely the case that an endangered language will be revitalized to the extent that it is once again used by all community members in all contexts. Often, this is what elders expect from a language project, and anything less is considered a failure. Involving the greater community in project planning can contribute to greater understanding of what is possible. The more informed the population, the greater the likelihood that community members will know what they can expect to accomplish and what are unrealistic goals (regarding differing notions of success, see Leonard 2008; Bowern and James 2010).

4. Conclusions

A possible criticism of collaborative language documentation is the fact that, as a researcher and as a linguist, the outsider, by default, does not arrive in a community completely agenda-free. The researcher is there to do a research project, in linguistics, with speakers of an endangered language. However, a change in mind-set on the part of both the researcher and community members is required. Arriving in a community uninvited with a fully fledged project plan and research agenda is not collaborative. Nor is pushing an agenda on a community that has expressed no prior interest in language-related work. The researcher sets out to work in communities that share an interest in working in some way with their languages. They may not have a specific project idea in mind, but by working in partnership, community members and the outsider researcher develop an agenda together.

How one finds a community in which to work depends on networking. Once a researcher has established a collaborative project with a particular community, it is not uncommon for neighboring communities to take notice. For example, the boat ride from Paramaribo to Konomerume involves passing two other Kari’nja communities. After seeing us pass by several times and hearing about our work in Konomerume, leaders in both of these communities began to ask when we would begin working together. These sorts of contacts can bring new researchers and communities together. I have heard colleagues say, “The community near where I work really wants to work with a linguist. I wish I knew of someone to recommend to them!” Motivated communities exist and can be found by networking with other researchers with already established projects in the region of interest.

There are several models of research with indigenous languages that call for collaboration with community members (Cameron et al. 1997; Czaykowska-Higgins 2009). The discussion here is grounded in models of sustainable community development in order to lend a practical aspect to the evolving discussion of collaborative research. Community development models have been extensively piloted, tested, and refined. Applying them to the practice of collaborative language documentation is a logical next step.

References

Amazon Conservation Team. 2008. Methodology of Collaborative Cultural Mapping. Brasilia: Amazon Conservation Team Editions.
Biolsi, Thomas and Larry J. Zimmerman. 1997. Indians and Anthropologists: Vine Deloria, Jr., and the Critique of Anthropology. Tucson: University of Arizona Press.
Bowern, Claire and Bentley James. 2010. “Yan-nhangu Revitalization: Aims and Accomplishments.” In Re-awakening Languages: Theory and Practice in the Revitalization of Australia’s Indigenous Languages, edited by John Hobson, Kevin Lowe, Susan Poetsch, and Michael Walsh, 361–371. Sydney: Sydney University Press. https://ses.library.usyd.edu.au/bitstream/2123/6926/1/RAL-chapter-30.pdf.
Bowern, Claire and Natasha Warner. 2015. “‘Lone Wolves’ and Collaboration: A Reply to Crippen & Robinson (2013).” Language Documentation & Conservation 9: 59–85. http://hdl.handle.net/10125/24634.
Cameron, Deborah, Elizabeth Frazer, Penelope Harvey, M. B. H. Rampton, and Kay Richardson. 1992. Researching Language: Issues of Power and Method. London: Routledge.
Cameron, Deborah, Elizabeth Frazer, Penelope Harvey, M. B. H. Rampton, and Kay Richardson. 1997. “Ethics, Advocacy and Empowerment in Researching Language.” In Sociolinguistics: A Reader and Coursebook, edited by Nikolas Coupland and Adam Jaworski, 145–162. Hampshire and London: Macmillan Press Ltd.
Czaykowska-Higgins, Ewa. 2009. “Research Models, Community Engagement, and Linguistic Fieldwork: Reflections on Working Within Canadian Indigenous Communities.” Language Documentation & Conservation 3(1): 15–50. http://hdl.handle.net/10125/4423.
Deloria, Vine Jr. 1969. Custer Died for Your Sins. New York: Macmillan.
Deloria, Vine Jr. 1997. “Conclusion: Anthros, Indians, and Planetary Reality.” In Indians and Anthropologists: Vine Deloria, Jr., and the Critique of Anthropology, edited by Thomas Biolsi and Larry J. Zimmerman, 209–222. Tucson: University of Arizona Press.
Denzin, Norman K. and Yvonna S. Lincoln. 2008. “Introduction: Critical Methodologies and Indigenous Inquiry.” In Handbook of Critical and Indigenous Methodologies, edited by Norman K. Denzin, Yvonna S. Lincoln, and Linda Tuhiwai Smith, 1–20. Los Angeles: Sage Publications.
Denzin, Norman K., Yvonna S. Lincoln, and Linda Tuhiwai Smith. 2008. Handbook of Critical and Indigenous Methodologies. Los Angeles: Sage Publications.
Dobrin, Lise. 2008. “From Linguistic Elicitation to Eliciting the Linguist: Lessons in Community Empowerment from Melanesia.” Language 84(2): 300–324.
Dwyer, Arienne. 2006. “Ethics and Practicalities of Cooperative Fieldwork and Analysis.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus P. Himmelmann, and Ulrike Mosel, 31–66. Berlin and New York: Mouton de Gruyter.
Florey, Margaret. 2004. “Countering Purism: Confronting the Emergence of New Varieties in a Training Programme for Community Language Workers.” In Language Documentation and Description, vol. 2, edited by Peter K. Austin, 9–27. London: Hans Rausing Endangered Languages Project.
Franchetto, Bruna. 2010. “Bridging Linguistic Research and Linguistic Documentation: The Kuikuro Experience.” In New Perspectives on Endangered Languages: Bridging Gaps Between Sociolinguistics, Documentation and Language Revitalization, edited by José Antonio Flores Farfán and Fernando F. Ramallo, 49–64. Amsterdam: John Benjamins.
Furbee, N. Louanna and Lori A. Stanley. 2002. “A Collaborative Model for Preparing Indigenous Curators of a Heritage Language.” International Journal of the Sociology of Language 154: 113–128.
Gerdts, Donna B. 2010. “Beyond Expertise: The Role of the Linguist in Language Revitalization Programs.” In Language Documentation: Practice and Values, edited by Lenore A. Grenoble and N. Louanna Furbee, 173–192. Amsterdam: John Benjamins.
Glenn, Akiemi. 2009. “Five Dimensions of Collaboration: Toward a Critical Theory of Coordination and Interoperability in Language Documentation.” Language Documentation & Conservation 3(2): 149–160. http://hdl.handle.net/10125/4437.
Grinevald, Colette. 1998. “Language Endangerment in South America: A Programmatic Approach.” In Endangered Languages: Language Loss and Community Response, edited by Lenore Grenoble and Lindsey Whaley, 124–159. Cambridge: Cambridge University Press.
Grinevald, Colette. 2003. “Speakers and Documentation of Endangered Languages.” In Language Documentation and Description, vol. 1, edited by Peter K. Austin, 52–72. London: Hans Rausing Endangered Languages Project.
Guérin, Valerie and Sebastien Lacrampe. 2010. “Trust Me, I Am a Linguist! Building Partnership in the Field.” Language Documentation & Conservation 4: 22–33. http://hdl.handle.net/10125/4465.
Hale, Ken, Michael Krauss, Lucille J. Watahomigie, Akira Y. Yamamoto, Colette Craig, LaVerne Masayesva Jeanne, and Nora C. England. 1992. “Endangered Languages.” Language 68(1): 1–42.
Himmelmann, Nikolaus P. 2006. “Language Documentation: What Is It and What Is It Good For?” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus P. Himmelmann, and Ulrike Mosel, 1–30. Berlin and New York: Mouton de Gruyter.
Leonard, Wesley. 2008. “When Is an ‘Extinct Language’ Not Extinct?” In Sustaining Linguistic Diversity: Endangered and Minority Languages and Language Varieties, edited by Kendall A. King, Natalie Schilling-Estes, Lyn Fogle, Jia Jackie Lou, and Barbara Soukup, 23–33 (Georgetown University round table on languages and linguistics series). Washington, DC: Georgetown University Press.
Leonard, Wesley Y. and Erin Haynes. 2010. “Making ‘Collaboration’ Collaborative: An Examination of Perspectives That Frame Linguistic Field Research.” Language Documentation & Conservation 4: 268–293. http://hdl.handle.net/10125/4482.
Mihesuah, Devon A., ed. 2008. Natives and Academics: Researching and Writing About American Indians. Lincoln: University of Nebraska Press.
Mithun, Marianne. 2001. “Who Shapes the Record: The Speaker and the Linguist.” In Linguistic Fieldwork, edited by Paul Newman and Martha Ratliff. Cambridge: Cambridge University Press.
Morrill, Angela. 2008. “Decolonizing Klamath Termination: Critiquing Factionalism in Klamath Termination Discourse.” MA thesis, University of California, San Diego.
Nathan, David. 2006. “Thick Interfaces: Mobilizing Language Documentation with Multimedia.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus P. Himmelmann, and Ulrike Mosel, 363–379. Berlin and New York: Mouton de Gruyter.
Nathan, David and Meili Fang. 2008. “Language Documentation and Pedagogy: Seeking Outcomes and Accountability.” In Endangered Languages and Language Learning: Proceedings of FEL XXII, 24–27 September 2008, edited by Tjeerd de Graaf, Nicholas Ostler, and Reinier Salverda, 177–184. Bath, UK: Foundation for Endangered Languages.
Peace Corps. 2003. The New Project Design and Management Workshop Training Manual. Peace Corps Information Collection and Exchange Publication No. T0107. http://files.peacecorps.gov/multimedia/pdf/library/T0107_projectdesign.pdf.
Peace Corps. 2007. Participatory Analysis for Community Action (PACA) Training Manual. Peace Corps Information Collection and Exchange Publication No. M0053. http://files.peacecorps.gov/multimedia/pdf/library/PACA-2007.pdf.
Penfield, Susan D., Angelina Serratos, Benjamin V. Tucker, Amelia Flores, Gilford Harper, Johnny Hill, and Nora Vasquez. 2008. “Community Collaborations: Best Practices for North American Indigenous Language Documentation.” International Journal of the Sociology of Language 191: 187–202.
Rice, Keren. 2006. “Ethical Issues in Linguistic Fieldwork: An Overview.” Journal of Academic Ethics 4: 123–155.
Rice, Keren. 2011. “Documentary Linguistics and Community Relations.” Language Documentation & Conservation 5: 187–207. http://hdl.handle.net/10125/4498.
Sapién, Racquel-María and Tim Thornes. 2017. “Losing a Vital Voice: Grief and Language Work.” Language Documentation & Conservation 11: 256–274. http://hdl.handle.net/10125/24735.
Seitz, Virginia. 2001. “A New Model: Participatory Planning for Sustainable Community Development.” Race, Poverty, & the Environment 8(1): 8–11, 38. http://www.urbanhabitat.org/node/920.
Smith, Linda Tuhiwai. 1999. Decolonizing Methodologies: Research and Indigenous Peoples. London and New York: Zed Books Ltd.
Stebbins, Tonya. 2003. Fighting Language Endangerment: Community Directed Research on Sm’algyax (Coast Tsimshian). Kyoto: Nakanishi Printing Co., Ltd.
Stebbins, Tonya. 2012. “On Being a Linguist and Doing Linguistics: Negotiating Ideology Through Performativity.” Language Documentation & Conservation 6: 292–317. http://hdl.handle.net/10125/4501.
Stenzel, Kristine. 2014. “The Pleasures and Pitfalls of a ‘Participatory’ Documentation Project: An Experience in Northwestern Amazonia.” Language Documentation & Conservation 8: 287–306. http://hdl.handle.net/10125/24608.
Swadener, Beth Blue and Kagendo Mutua. 2008. “Decolonizing Performances: Deconstructing the Global Postcolonial.” In Handbook of Critical and Indigenous Methodologies, edited by Norman K. Denzin, Yvonna S. Lincoln, and Linda Tuhiwai Smith, 31–43. Los Angeles: Sage Publications.
Vallejos, Rosa. 2014. “Integrating Language Documentation, Language Preservation, and Linguistic Research: Working with the Kokamas from the Amazon.” Language Documentation & Conservation 8: 38–65. http://hdl.handle.net/10125/4618.
Whaley, Lindsay J. 2011. “Some Ways to Endanger an Endangered Language Project.” Language and Education 25(4): 339–348.
Wilkins, David. 1992. “Linguistic Research Under Aboriginal Control: A Personal Account of Fieldwork in Central Australia.” Australian Journal of Linguistics 12: 171–200.
Yamada, Racquel-María. 2007. “Collaborative Linguistic Fieldwork: Practical Application of the Empowerment Model.” Language Documentation & Conservation 1(2): 257–282. http://hdl.handle.net/10125/24611.
Yamada, Racquel-María. 2008. “Integrating Documentation and Formal Teaching of Kari’nja: Design and Use of Teaching Materials Based on Documentary Materials.” In Endangered Languages and Language Learning: Proceedings of FEL XXII, 24–27 September 2008, edited by Tjeerd de Graaf, Nicholas Ostler, and Reinier Salverda, 57–61. Bath, UK: Foundation for Endangered Languages.
Yamada, Racquel-María. 2010. “Speech Community-Based Documentation, Description, and Revitalization: Kari’nja in Konomerume.” PhD diss., University of Oregon, Eugene.
Yamada, Racquel-María. 2011. “Integrating Documentation and Formal Teaching of Kari’nja: Documentary Materials as Pedagogical Materials.” Language Documentation & Conservation 5: 1–30. http://hdl.handle.net/10125/4486.
Yamada, Racquel-María. 2014. “Training in the Community-Collaborative Context: A Case Study.” Language Documentation & Conservation 8: 326–344. http://hdl.handle.net/10125/24611.
Yamada, Racquel-María, Ferdinand Mandé, and Sieglien Jubithana. 2008. “Collaborative Linguistic Fieldwork: Kari’nja in Konomerume.” Paper presented at the Annual Meeting of the Society for Caribbean Linguistics, July 28–31, Chambre de Commerce et d’Industrie de la Guyane, Cayenne, French Guiana.

Chapter 10

Tools and Technology for Language Documentation and Revitalization

Keren Rice and Nick Thieberger

1. Introduction A visitor to the Hobart Museum can hear a recording of Fanny Cochrane Smith singing in a Tasmanian language, made on wax cylinders sometime before 1903.1 This two-and- a-half minute recording is the only one made when Tasmanian languages were still being spoken every day. It is of huge value to all of us, but, in particular, to the Palawa or Tasmanian Aboriginal people who are now working to reconstruct their languages from very poor records. The technology of the time captured her voice, and the signal has since been migrated to other media so that it can continue to be heard today. The lesson for those of us making recordings today is that we can do much better than was the case in the past. New methods and technological tools allow rich interlinked multi media records of language performance. A defining feature of new methods of documenting languages is the advance in tech nological means for recording, transcribing, annotating, managing, and analyzing lan guage records, which then facilitates delivering that documentation for use in various forms, in particular for language revitalization efforts. It is the affordances offered by these new methods that have expanded the possibilities of language documentation to create richer records, enabling collaboration over distance, both between linguists and between linguists and speakers (see Paterson 2015). Digital materials can serve multiple purposes, often unforeseen by the original recorder, such as the analysis of phonetic 1

http://aso.gov.au/titles/music/fanny-cochrane-smith-songs/clip1. Accessed June 27, 2016.

226 Keren Rice and Nick Thieberger detail, musicology, narratives, oral tradition, and grammatical features not previously observed, as well as heritage and revitalization purposes. Analog recordings had lim ited availability in single locations and access to timepoints within a recording was labor intensive and slow. Digital media on the other hand can be instantly accessed in many locations and permit citation to the level of a word or even a phoneme and thus offer verification of analyses with reference to the primary recordings. By engaging with new technological methods, linguists can be more responsible both to the people they rec ord, engaging in a kind of repatriation of the material recorded, and to their discipline, providing verifiable claims based in primary recordings. In this chapter, we take technology to be tools used to perform tasks that would other wise be done manually. New technological developments allow for the collection of more data, both audio and video, and for far more sophisticated and searchable systems of data organization and management (see Thieberger and Berez 2012). There are numerous tools, including software, websites, and machines (recorders, cameras, smart phones) resulting in digital files managed by digital language archives. This chapter first provides an over view of the use of technologies for language documentation (for more detailed work see Good 2010, Bowern 2015 (Chapter 2 on technology), or the collection of papers in Jones 2015) and then addresses how technology is being used for language revitalization. Online services like YouTube, Facebook, and Twitter2 are important for language communities and can also contribute to corpus building. 
The interaction between speakers using these tools and the use of language in this domain is likely to be important for the future prestige of the language, but it raises two issues that are addressed by linguists via services like the Open Language Archives Community (OLAC): how long will these records be available, and how can they be found by others interested in the same language? These questions are central to the use of new technological tools and will be discussed further below. Other strategies for increasing the use of small3 languages include localization of browser interfaces (Lazar 2007), the use of Google Translate4 for small languages, and the use of Natural Language Processing methods (Coler and Homola 2015). Some of these are further discussed in the second part of this chapter, where we also critically address the notion that technology can save languages (see also Thomason 2015, 167).

2 See Kevin Scannell’s An Crúbadán: http://crubadan.org/applications.
3 We use the term “small languages” rather than “endangered languages” following Dorian (2014).
4 Currently Google Translate claims to work with Hawaiian, Māori, and Samoan, among the 103 languages it supports (http://translate.google.com.au/about/intl/en_ALL/languages.html).

Language Documentation and Revitalization 227

2. Technology and language documentation

A major goal for the current language documentation effort is the creation of good records for the world’s languages (see Himmelmann 1998; Gippert, Himmelmann, and Mosel 2006; Woodbury 2011). This record should be as rich as possible, but, recalling the Cochrane Smith recording and its value today, every record is potentially of use. In fact, the value of language records to their speakers increases exponentially in relation to their scarcity: the fewer the records, the more each of them is likely to be treasured.5

Technology offers the promise of recording more than we have in the past and making better use of the recordings we can now make. Creating well-formed sets of texts permits distant reading (Moretti 2013), that is, automated visualization of the whole corpus, using corpus discovery tools to visualize the text; to contextualize syntactic constructions in order to better determine their use and meaning; to listen to examples from anywhere within the corpus; and, ideally, to allow exploration and discovery within the body of material to reveal patterns in a way that was not previously possible.

2.1. A brief history of language documentation technology

It makes sense to use the best technology available for the job at hand. Like any professional, linguists keep up to date with the best tools for their work. They have been early adopters of technologies to support capturing dynamic performance, annotating and analyzing it, and creating lexical databases (Antworth and Valentine 1998, Lawler and Aristar Dry 1998, Thieberger 2005). Technologies also facilitate access to recordings and research by community members. In some ways language documentation takes us back to the early twentieth-century anthropological linguistics of Boas and Sapir, with the emphasis on texts and dictionaries in addition to grammatical analysis. A major difference is that new tools allow linguists to qualitatively change their work through instant access to primary data, allowing it to play a greater role in the subsequent analysis than was previously the case (Thieberger 2009).

The continued increase in power and reduction in size and weight of modern recording equipment and storage media facilitates larger collections of recordings than was previously possible. Technologies for annotating recordings now allow online presentation of text and media that both enriches linguistic analysis by providing primary data for verification and makes the same media accessible to the source community. As long ago as the 1930s Malinowski noted the possible benefits being offered by what were then new technologies:

There is no reason whatever why, in the future, an exact and physiological study of speech should not use the apparatus of sound films for reproducing fully contextualised utterances. (Malinowski 1935, 26)

However, since 1935 there has not been a focus on using audio recordings as the basis for linguistic analysis, in part because of the difficulty of accessing specific points of an analog tape.

5 This observation was made by David Nash at the Warra Wiltaniappendi conference in Adelaide in September 2007.

Of course, technology isn’t just digital; for example, typewriters were a new technology that changed the practice of casual notetaking to more formal and edited manuscript preparation (Clifford 1990, 64), but a major technological change came with the development of personal computers, used by some field linguists since the 1980s. It became clear even in this early stage of computer use that changes in software and storage media meant that the data needed to exist independent of what media they were stored on and of the software used to create them. While there have been huge changes in computers and software, the data created then is still legible if it was copied to new media and new formats over time, just as the sound recorded on wax cylinders mentioned earlier can still be heard today. It is somewhat ironic to have created endangered digital materials of small or endangered languages, so, to avoid the digital equivalent of untended cupboards full of tapes or fieldnotes (see Bel and Gasquet-Cyrus 2015, 87), it is critical to build systems that can curate digital records.

Despite considerable discussion of the need to preserve linguistic records, disturbingly few of the primary records created during fieldwork are archived. Glottolog6 lists 1,708 grammars produced between 1967 and 2016, and, assuming that each of these grammars resulted from fieldwork, it should be possible to find primary records associated with the language that is the focus of the grammatical analysis.
Of the 683 grammars that Glottolog lists for the period from 2000 to 2016, 555 describe languages that have forty or fewer items in an OLAC7 repository.8 So, even following the publication of Himmelmann (1998) and the funding available for language documentation, there are very few language recordings being archived. This suggests an ongoing need to develop tools that assist with archiving, and for training and advocacy in appropriate methods, to ensure that the results of documentation projects are not lost.

The first problem for linguists seems to be creating their files in a form that other people can understand and that can therefore be archived. The files need metadata in a standard form, and they need to be in formats that others can read. Fortunately there are well-established standards for file formats (see Corti et al. 2014, 56) that linguists can adopt. But even if these files do make it into a curated repository, how long is digital data going to last? There is no proven storage medium, so the key to preservation of data is migration to the next storage system as appropriate. Data management is not specific to linguistics, so we can learn from a well-established existing research field (Corti et al. 2014). It is in fact usually possible to migrate data forward to new formats, despite the gloomy prognostication of a digital dark age that captures headlines.9 Even dictionaries made in word processors like Microsoft Word can be converted (for example, using the service provided by OxGarage10) into the kind of structured data that can be used in current lexicographic software like Fieldworks Language Explorer (FLEx11; cf. Rogers 2010) or TshwaneLex12 (cf. Bowern 2007). Archives are a key to supporting this migration and are also critical in ensuring that the extremely fragile digital records are safely curated until the next storage solution appears.

6 http://glottolog.org/.
7 The Open Language Archives Community is an international effort that provides finding aids for language materials. Using standard metadata, the service harvests information from some sixty archives worldwide and aggregates the information every day.
8 OLAC records include historical documents so that a score between 1 and 10 is not unexpected for most languages, even where no fieldwork has been conducted recently.
9 See, for example, Google’s vice-president Vint Cerf’s warning about data loss: http://www.bbc.com/news/science-environment-31450389
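The migration of files to a new storage system, described above as the key to preservation, can be illustrated with a minimal Python sketch. It covers only the simplest case, copying a file unchanged to new storage and confirming that the copy is bit-identical by comparing SHA-256 checksums; it does not convert between file formats. The function names and the directory layout are assumptions made for this illustration and are not the procedure of any particular archive.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_with_check(source: Path, target_dir: Path) -> Path:
    """Copy a file to new storage and confirm the copy matches the original.

    Hypothetical helper: raises OSError if the checksums differ.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    shutil.copy2(source, target)  # copies content and file timestamps
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"Copy of {source.name} does not match the original")
    return target
```

Calling `migrate_with_check(Path("old_disk/recording.wav"), Path("new_disk"))` (both paths hypothetical) returns the path of the verified copy. Storing the checksum alongside the file would also allow later checks that the data has not changed silently while in storage.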

2.2. Technology traps

As is usual with the introduction of each new technology, there is a need to understand its impact and the degree to which it may influence the course of research (for better or for worse). Unfortunately, it is easy to be seduced by the novelty of a new tool without knowing how to assess its potential usefulness. It is even easier to avoid having to learn how to use a new tool that may improve your practice.

Experience over the past thirty years of personal computer use shows the attraction of making multimedia displays with language content. Without listing the many examples, a common feature of high-quality multimedia packages is that they took some effort and expense to produce, and they sometimes served a useful language learning function (Rogers, Antworth, and Valentine 1998) or helped in raising the prestige of the language. A significant problem is that almost none of the products of such work were playable five years after they were created. For many of these products, the only primary material that existed was created for the software and was at risk of being lost when the software became unplayable, without some effort being put into extracting and converting it into a current format. A more recent version of the same problem is the creation of elaborate websites with no plans for long-term access. Once a project has ended, it is common for a host to abandon the website. One solution to the transient nature of websites is to reference the Internet Archive13 as the citation form for a website.

A foundational source to help avoid the kinds of traps that technologies can introduce is Bird and Simons (2003). It outlines seven principles for creating well-formed and re-usable linguistic data and concludes that technological solutions alone will be inadequate; instead, “the technological solutions must be coupled with a sociological innovation, one that produces broad consensus about the design and operation of common digital infrastructure for the archiving of language documentation and description” (Bird and Simons 2003, 580). The present chapter aims to support the building of this consensus.
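Returning to the suggestion in this section that the Internet Archive be used as the citation form for a website, the following Python sketch builds a Wayback Machine address from a live URL and an access date. It relies on the publicly documented address pattern `https://web.archive.org/web/<timestamp>/<url>`, with the timestamp written as YYYYMMDDhhmmss; the function name is an assumption for this illustration. The sketch only constructs the address. It does not check that a snapshot exists near that date, which would have to be confirmed separately.

```python
from datetime import datetime, timezone

WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_citation_url(url: str, accessed: datetime) -> str:
    """Build a Wayback Machine address for `url` near the `accessed` time.

    `accessed` should be timezone-aware; it is converted to UTC first.
    Hypothetical helper: no network request is made.
    """
    stamp = accessed.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{stamp}/{url}"
```

For example, passing `http://miromaa.org.au/` with an access date of December 20, 2016 (UTC) yields `https://web.archive.org/web/20161220000000/http://miromaa.org.au/`, which can be cited in place of the live address.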

10 http://www.tei-c.org/oxgarage/.
11 http://fieldworks.sil.org/flex/.
12 http://tshwanedje.com/tshwanelex/.
13 https://archive.org/.


2.3. Legacy and born digital documentation

Documentation is of two kinds: (1) legacy documentation, which exists already and is typically not digital, and (2) that which is recently created and is already digital. Each requires its own management strategies and technologies, as discussed in the next sections.

2.3.1. Legacy documentation

Legacy documentation, or recordings and notes collected in the past, needs to be discovered before it can be made usable again. Field tapes are virtually invisible in a researcher’s cupboard, attic, or filing cabinet. But for some languages they may be the only known record, as we saw with the Cochrane Smith recording. A digital archive can list these items in a catalog (cf. Thieberger 2016) to announce their existence to the world, even if the items themselves remain in an analog repository. It is then critically important to convert them to a format that can be accessed by current technology, which means they have to be digitized. Discussing the potential failure of audio tapes over time, and the diminishing number of playback machines, Schüller (2008, 5) notes:

Today audiovisual archives associations estimate the time window still open for the transfer of dedicated analogue and digital carriers into digital repositories to be not more than just 20 years.

2.3.1.1. Digitization changes everything

Technology can transform the use of analog manuscripts and dynamic media, and make them more accessible via digitization. Paper documents or media that are held in a single location are very difficult to access, but that changes once they can be distributed as digital files. For example, the 15,000 pages of notes that were left in Arthur Capell’s estate were only available by visiting his executor’s house in Sydney. PARADISEC14 took images of each page of notes and provided enough metadata to describe them and put them and the catalog of their contents online.15 While the documents have not been transcribed and their text cannot be searched, this is a useful first step. Large collections of text like this or the enormous collection of Harrington’s records of North American languages16 are difficult to deal with as paper but become more tractable as digital files.

A richer treatment of this kind of material would include text linked to images, as in the Ticha project (Lillehaugen et al. 2015) of transcribed texts in Colonial Zapotec, or the Bates project of early twentieth-century vocabularies of Australian languages (Thieberger 2016). In each of these, the text of the original document is typed or scanned using Optical Character Recognition (OCR) and linked back to the original so that a reader can search and annotate the primary documents.

14 http://paradisec.org.au.
15 The papers of Arthur Capell, who died in 1986, are available at http://paradisec.org.au/fieldnotes/AC2.htm.
16 http://anthropology.si.edu/naa/harrington/.

Legacy audio refers to analog recordings that are now often unplayable, due either to deterioration of the tape or to a lack of playback machines. The task of locating these tapes and digitizing them is urgent, especially as the researchers who created them are often now retired or deceased. An ongoing survey17 by the Digital Endangered Languages and Musics Archives Network (DELAMAN) solicits details of these “orphaned” collections and then works to digitize and accession suitable collections of recordings.

2.3.2. Recording, transcribing, annotating

Technology for recording dynamic performance has been an integral part of ethnographic research from the earliest days (cf. Malinowski’s comments above), but modern digital methods provide unparalleled access to any point within large collections of recordings. All current fieldwork recording uses digital recorders, and, as the technology for recording is constantly changing, it is worth noting the basic recommendations, which are to use uncompressed file formats (preferring wav format over mp3), to place a microphone as close to the speaker’s mouth as possible (maybe using a head-mounted microphone), and to avoid extraneous noise sources. A good discussion of recording methods for linguistic fieldwork is Margetts and Margetts (2012). Numerous websites outline best practices more generally for recording and standards for file formats. (See Boyd (2012) on digital recording methods and tools, and see Boyd and Hardy (2012) on microphone selection and use.)

Digital audio or video recorders are highly portable and storage media are increasing in capacity, with the potential that current fieldwork will produce more recorded material. This is in accord with one of the desiderata of the language documentation ethos, recording as wide a range of people and genres as possible (Himmelmann 1998). However, it also creates the need for better data management practices and for recognizing the order in which tools or methods are applied in a workflow that takes recordings through transcription, annotation, and analysis, and then to publication, archiving, and curating (see Thieberger 2004; Thieberger and Berez 2012). With new technological means to record more than was previously possible, and with video becoming a standard part of field recording, linguists can be more intrusive and so need to think even more carefully about the ethical implications of this work (Thieberger and Musgrave 2006).
17 http://www.delaman.org/project-lost-found.

Transcription allows us to search recordings of dynamic performance, be they speech, singing, gesture, or signing. Transcription should always be time-aligned; that is, it should include timecodes marking the beginning and end of a transcribed event, for example, a sentence or utterance unit. Tools for transcribing in this way have been in use since the early part of this century and include Elan (Wittenburg et al. 2006, cf. Berez 2007), Exmaralda (cf. Meißner and Slavcheva 2013), F4 (cf. Jones and German 2016), and CLAN (cf. Meakins 2007). Each of these facilitates typing a transcript and inserts timecodes automatically. Some allow many tiers of annotation to account for multiple participants or for notation of multimodal events. Typically they transcribe audio and video, and some allow multiple videos to be displayed simultaneously.
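The idea of time-aligned transcription described above can be made concrete with a small Python sketch. Each annotation carries a tier name and start and end timecodes, so that a search for a word returns the exact span of the recording in which it occurs. The class, the function names, the tier label, and the sample sentences are all invented for this illustration; the sketch does not reproduce the file format of Elan or of any other tool named above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Annotation:
    """One time-aligned unit, for example a sentence or utterance."""
    tier: str       # e.g. "transcription" or "translation"
    start_ms: int   # start timecode in milliseconds
    end_ms: int     # end timecode in milliseconds
    text: str


def find_word(annotations: List[Annotation], word: str,
              tier: Optional[str] = None) -> List[Annotation]:
    """Return annotations whose text contains `word` as a whole token."""
    target = word.lower()
    hits = []
    for ann in annotations:
        if tier is not None and ann.tier != tier:
            continue
        tokens = [t.strip(".,;:!?") for t in ann.text.lower().split()]
        if target in tokens:
            hits.append(ann)
    return hits


def cite(recording_id: str, ann: Annotation) -> str:
    """Format a citation pointing at a time span within a recording."""
    return f"{recording_id} {ann.start_ms / 1000:.3f}-{ann.end_ms / 1000:.3f} s"


# Invented placeholder data, not drawn from any real corpus.
SAMPLE = [
    Annotation("translation", 0, 2350, "The canoe arrived."),
    Annotation("translation", 2350, 5100, "We went to the shore."),
    Annotation("translation", 5100, 7800, "Everyone saw the canoe"),
]
```

With this sample, `find_word(SAMPLE, "canoe")` returns the first and third annotations, and `cite("REC01", ...)` on the first gives `REC01 0.000-2.350 s` (the identifier `REC01` is hypothetical). A citation of this kind is what allows a claim in a grammar to be checked against the primary recording.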

2.4. Metadata

The technology for working with masses of data files relies on a metadata description, essentially a catalog of their contents. Just as a catalog saves you from having to look through each book in a library, a good metadata description saves you from having to open each file in a collection. It should be simple enough to capture the information that is in your collection using standard terms while also allowing free text descriptions. The modern web relies on metadata, so the more this can be built into the workflow of fieldwork, the better integrated into a broader network the records created can be. The backlog of recordings that are not being archived (as noted in section 2.1) can, in part, be attributed to the difficulty of writing a description of a collection sometime after fieldwork is over.

The most basic type of information that should be noted is the names of the participants in the recording and when and where it was recorded, together with a summary of the contents. This can be done in a notebook at the time of the recording and then be transferred to a more structured format. Linguists have two systems of metadata provided for them (one is CMDI,18 the other was developed by OLAC19). It is worth keeping these standard metadata systems in mind when describing your own collection. There can be much richer descriptions in one’s own metadata notes than are required by these systems, but as long as the notes can be easily converted into a format that can be imported to the archive, that saves work and makes it more likely that a collection will actually be archived.

Also important for computational access to information is a consistent filenaming strategy, having unique (not duplicated) filenames in your collection and never changing filenames once they are assigned, so that any metadata you have recorded is clearly related to a particular file (see Thieberger and Berez 2012, 102). A further metadata item covers permissions for the use of the recorded material (Newman 2012), which should have been discussed with the people recorded, and their consent obtained, at the time of recording.
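The basic metadata items and the filenaming advice given in this section can be checked mechanically. The Python sketch below tests that each record names the participants, date, place, summary, and permissions, that filenames follow one consistent pattern, and that no filename is duplicated. The field names and the naming pattern (a collection code, a date, and an item number) are assumptions chosen for this illustration; they are not the CMDI or OLAC schemas, which define their own elements.

```python
import re
from collections import Counter
from typing import Dict, List

# Hypothetical field names based on the items listed in this section.
REQUIRED_FIELDS = ("filename", "participants", "date", "place",
                   "summary", "permissions")

# Hypothetical pattern: CODE-YYYYMMDD-NN.ext, e.g. XYZ-20160627-01.wav
FILENAME_PATTERN = re.compile(r"^[A-Za-z0-9]+-\d{8}-\d{2}\.(wav|mp4|eaf|txt)$")


def check_collection(records: List[Dict]) -> List[str]:
    """Return a list of problems found in a collection's metadata."""
    problems = []
    for record in records:
        name = record.get("filename", "")
        for field in REQUIRED_FIELDS:
            if not record.get(field):
                problems.append(f"{name or '<unnamed>'}: missing {field}")
        if name and not FILENAME_PATTERN.match(name):
            problems.append(f"{name}: does not follow the naming pattern")
    # Report each duplicated filename once, after the per-record checks.
    counts = Counter(r.get("filename", "") for r in records)
    for name, count in counts.items():
        if name and count > 1:
            problems.append(f"{name}: filename is not unique")
    return problems
```

An empty list means the notes are complete enough to be converted for an archive. Running such a check during fieldwork, rather than afterwards, addresses the difficulty noted above of describing a collection long after it was recorded.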

18 https://www.clarin.eu/content/component-metadata.
19 http://www.language-archives.org/tools.html.

2.5. Automated transcription

A severe limitation for scaling up (Bender and Good 2010) the effort of creating accessible records of the world’s languages is the time required to transcribe recordings. While not all recordings in a language need to be transcribed, enough should be transcribed to provide a key to the rest. Keep in mind that an exemplary set of recordings from current fieldwork may only total in the low tens of hours and for most fieldwork there are only a few hours recorded and fewer transcribed.20

The promise that technology offers is that of automated transcription, taking the known correlation of text and a segment of audio and then searching the rest of the audio for similar patterns and labeling that segment with the same text. It is already the case that forced alignment of text and media (in which an existing transcript can be aligned with the audio file it relates to, at the level of the phoneme) works for most tested languages (Strunk, Shiel, and Seifart 2014). The next step will be using that material to search unannotated audio files. Traditionally this kind of work required as large a training set as possible, but, given the small corpus that is normally created during fieldwork, it will be possible to recognize words in much smaller data sets.

With this project in mind, how can the corpus itself be expanded? We can assume that a method will be developed to identify the most commonly occurring words in a language and to either extract them from an existing set of recordings, or else record them in isolation, and then use them as the basis for searching unannotated recordings. Building a high-quality corpus of recordings is then the next challenge. The phone app Aikuma21 (Bird et al. 2014) aims to make it easier for speakers to record themselves while capturing simple metadata, and then to respeak a slower version or a translation, bypassing written transcription. There will soon come a time when recordings will be searchable even if they have no transcripts; that is, the entire process will be oral and will bypass the need for literacy at all.

2.6. The digital divide

Technology can be expensive and contribute to the “digital divide” between those with new technological tools and those without. And, as Stone (2015) observes, there is a risk that small languages can be left behind in digital innovation when the speakers are required to use a more dominant language, stressing the importance of adapting or being left behind. It is unclear how much, if at all, the kind of diglossia experienced by speakers of most of the world’s languages when using the internet contributes to the potential decline in use of their first language. The interface to digital tools or websites is in a metropolitan language, but the content of social media can be in any language and so provide a forum in the local language. In fact, there is good reason to think that speakers of small languages are able to turn these technologies to their own uses. In a study of the use of new technologies with the Tagish language of Canadian British Columbia, Moore and Hennessy (2006, 134) note that “Native communities are using digital technologies to regain control of their language resources, while conceptualizing indigenous ideologies to use in restoring these languages.”

20 These estimates come from NT’s experience in archiving collections of research material in PARADISEC.
21 http://aikuma.org.

We want to stress that there is no choice for linguists working in language documentation: digital tools and methods are now the only way in which proper records can be made. So we distinguish between digital methods for linguistic research and for community-based work, in which technologies appropriate to the local context need to be considered. Technology can assist in breaking down the lack of access to information currently experienced by people in remote areas who cannot get to research libraries or repositories, who may also benefit from the ability to create online language-learning materials (de Graaf, van der Meer, and Jongbloed-Faber 2015). In Papua New Guinea around 10% of the population in 2016 had access to the internet,22 but 80% of the population owned a mobile phone23 and so will increasingly have access to the internet.

3. Technology in community-based documentation and revitalization

In this section we turn from linguist-driven use of technology to community-driven use of technology for purposes of both documentation and revitalization.

3.1. Communities, documentation, and technology

Interest in documentation comes not just from academics but also from communities. In this section we focus briefly on some of the resources developed with community-based documentation in mind (see in addition Galla (2010), Gresczyk (2011), Hermes and King (2013), and Baldwin et al. (2013), among others).

The Miromaa Language Centre24 in Australia has developed a language technology program for use in documentation, conservation, and dissemination of traditional languages. The website includes the following statement: “Miromaa aims to meet the needs of a variety of users including language workers, language centres, and linguists. Miromaa can import and export from standard format text files, and the materials in Miromaa are easy to use, empowering, act as a single stop for written, audio and video evidence of language in one place, employ community protocols with security controls, create word lists and set the foundation for a strong recovery of the language.”25 While Miromaa is concerned with the development of materials for use in language communities, at the same time the website speaks of the importance of the cultures as an integral part of identity and society, as well as of the importance of respect for rules and practices for the transmission of heritage to future generations as essential to indigenous peoples’ ways of life. The technology is an aid, not an end in itself.

22 http://www.internetworldstats.com/sp/pg.htm.
23 https://www.budde.com.au/Research/Papua-New-Guinea-Telecoms-Mobile-and-Broadband-Statistics-and-Analyses.html?r=51.
24 http://miromaa.org.au/. Accessed December 20, 2016.
25 http://miromaa.org.au/miromaa/miromaa-features.html. Accessed December 20, 2016.

The First Peoples’ Cultural Council (FPCC)26 in British Columbia, Canada supports the revitalization of Indigenous languages, art, and culture in British Columbia. The FirstVoices27 arm of the FPCC is concerned with archiving and developing teaching resources that allow Indigenous communities to document their language for future generations. It provides technologies, training, and technical support to community language champions, with “teams of fluent Elders and technically savvy youth upload[ing] dictionaries, alphabets, songs, stories, words and phrases as well as audio and video to their community archives.” There are children’s games of various sorts: identifying words and phrases, word search, concentration, and others. The site also has pages for individual communities to present their materials.

Another Indigenous-led initiative, Grassroots Indigenous Multimedia (GIM),28 creates, produces, and distributes high quality indigenous language materials, stating that “Using tech in innovative ways, Grassroots Indigenous Multimedia aims to help close the gap between those who are trying to learn and the speakers of our indigenous languages. Since our founding, we have developed a strong base of knowledge for documenting elder language and creating accessible learning media for language regeneration. We have developed over 20 picture books, based on Ojibwe conversational archives. In this way, we are re-creating but still responsive to the way past generations spoke and thought.”

Miromaa, FirstVoices, and GIM all recognize the importance of documentation and the need for user-friendly software. They support local control and also recognize the need for material that integrates language and culture, not seeing language as a distinct entity in itself.

26 http://www.fpcc.ca/.
27 http://www.firstvoices.com/.
28 http://gim-ojibwe.org/. Accessed 29 January 2018.

3.2. Technology and language revitalization

In the remainder of this chapter, we focus on technology and language revitalization. Holton (2011) provides an overview of the role of information technology in supporting minority and endangered languages, reviewing the various products of language documentation, how technology can be used to foster language communication, and outlooks for the future. Holton (2011, 398) quotes Fishman (1991):

Although cyber-space can be put to use for [reversing language shift] purposes, neither computer programmes, e-mail, search engines, the web as a whole, chat boxes or anything directly related to any or all of them can substitute for face-to-face interaction with real family imbedded in real community.

Holton (2011, 398) responds to Fishman at twenty years’ remove:

in the nearly two decades which have passed since Fishman’s warning, information technologies have permeated our lives, becoming ubiquitous, leading us to rethink just what it means to be a “real” community. . . . We cannot expect a multimedia programme or website to create new speakers of an endangered language. What we can expect from information technology resources is that they contribute to the development and appreciation of endangered languages in new terms.

We address this theme in the remainder of this work. In what follows we review some of the roles that technology plays in revitalization, drawing largely from statements of people who are deeply engaged in revitalization. It is useful to begin with a few quotes from the media about the role of technology in language revitalization.

• One of the first things Harrison learned upon his arrival in Papua New Guinea was that the speakers of Matukar Panau wanted to get on the internet. . . . “The first time they ever saw the internet, they could hear the voices of their elders and see them speaking their language”. . . The moment was powerful because it showed these villagers “that their language is just as good as any other.”
• Noodin also thinks that the internet and other modern tools can give endangered languages a fighting chance. To keep Anishnaabemowin alive, she’s turned to a variety of tools that bring together people who are interested in the language. . . . she has created a web page with reading and teaching materials in the language. She’s also launched a Facebook page for people who speak the language.
• “We see technology as playing a wonderful role in enabling small languages, through texting, social media and YouTube.”29

Such quotes suggest that there is an important role for technology in language revitalization, particularly in encouraging and motivating people to value the language. On a large scale, technology has been instrumental in developing awareness of what is happening in terms of language transmission around the world and a sense of pride in being associated with a language and culture. In the fall of 2014 the Indigenous Language Challenge30 aimed to raise awareness of Native American languages through individuals posting videos in their language on YouTube, with most of the videos coming from adult second-language learners. In an article in Indian Country Today, Kearns reports that thousands of people were involved.31

29 https://student.societyforscience.org/article/saving-vanishing-tongues-3000-world-languagesface-extinction-apps-can-help-save-them.
30 http://www.huffingtonpost.com/colleen-m-fitzgerald/the-indigenous-language-c_b_5850364.html.
31 http://indiancountrytodaymedianetwork.com/2015/05/29/online-challenges-save-indigenous-languages-americas-160531.

Popular movies are being translated into indigenous languages. In 2013, Star Wars32 in Navajo was released, and Finding Nemo33 has been translated into Navajo. Movies are also being made in Indigenous languages; examples include the 2001 movie Atanarjuat: The Fast Runner34 in Inuktitut and the 2006 Ten Canoes,35 in languages of the Yolŋu Matha group in Australia.

There are also external developments of technology in aid of language revitalization. The Rosetta Stone is one of these, with language lessons developed for some North American endangered Indigenous languages (Navajo, Chitimacha, Mohawk, Inuttitut, and Inupiak, with the development of lessons for Chickasaw in progress).36 The 7,000 languages project,37 launched by Transparent Language in 2013, has the goal of making world-class technology available for learning and teaching to proponents and practitioners of under-resourced languages. Such efforts are important, partly because they valorize the language, giving it a place in the communities in which the language is or was spoken, and in the larger world as well.

Looking beyond the value of technology for valorization, what are the goals in developing technology for language revitalization? To some degree this depends on how one views the goal of revitalization.
For instance, Tsunoda (2005, 168) provides one definition, focusing on the sustainability of language in a narrow sense: language revitalization involves “restoration of vitality to a language that has lost or is losing this attribute.” The Aboriginal Language Revitalization Program website38 at the University of Victoria, Canada describes the objectives of revitalization somewhat differently: “The goal of the CALR program is to support communities in language revitalization initiatives, by strengthening understanding of the complex context and characteristics of language loss, maintenance, and recovery, and by developing knowledge of strategies and successes in language revitalization in communities. The program honours traditional knowledge and practices to reach a diverse group of learners. It provides the foundation for language activism, language learning, community language programming, and further study in linguistics, education, and related areas.” This sense of revitalization places language in its larger context, with revitalization not only sustaining a language, but also a community and its knowledge and culture.

It is in this latter context that we consider the role of technology. Communities and individuals are making use of various kinds of technology, including radio programming,39 YouTube videos, digital storytelling, talking books, word-of-the-day, bingo, quiz shows, literacy contests, songs on iTunes, video games, Facebook, language lessons, CDs, ELAN, Twitter, and much more. Such developments serve to give a language a visible and audible presence in the community.

32 http://www.hollywoodreporter.com/heat-vision/star-wars-navajo-translation-has-446533; http://www.nativepeoples.com/Native-Peoples/May-June-2016/Finding-Nemo-Finds-Its-Voice--in-Navajo/.
33 http://www.wsj.com/articles/navajo-version-of-finding-nemo-aims-to-promote-native-language-1419033583.
34 http://www.thecanadianencyclopedia.ca/en/article/atanarjuat-the-fast-runner/.
35 https://en.wikipedia.org/wiki/Ten_Canoes.
36 http://www.rosettastone.com/endangered.
37 http://www.transparent.com/about/7000-languages-project.html.
38 https://www.uvic.ca/humanities/linguistics/undergraduate/programs/calr/index.php.
39 See, for instance, Grounds (2016, 9) and http://www.cbc.ca/news/aboriginal/mohawk-broadcaster-janet-rogers-launches-ndns-on-the-airwaves-1.3444557. Accessed February 2016.

There is thus no doubt about the interest in technology for revitalization, and its value. The question then, one that frequently arises in the popular media, is whether technology can save a language. The considered response to this is clear: technology is a support, but it is not technology that saves a language; it is people. This is well understood by people involved in areas of language and technology. Inée Slaughter, long-time Executive Director of the Indigenous Language Institute in Santa Fe, New Mexico, discusses the use of technology in language revitalization, noting that “It can be a very powerful tool in helping revive or revitalize endangered languages,” but adding “What we caution is that these are purely tools, and they do not substitute for a person’s willpower and discipline to study and learn the language.”40 Valerie Alia (2009, 173), writing on new media and indigenous communication worldwide, focuses on the role of technology in empowering people, stating that “New technologies do not automatically empower people or improve communications. However sophisticated they may be, they are designed and run by people, and are only as effective as the people who are involved. In fact, new technologies offer empowerment and disempowerment in nearly equal measure.” Finally, Richard Grounds (2016, 8), the Executive Director of the Euchee Language Project, writes that “The very notion that these technological solutions somehow represent a kind of comprehensive and easy fix can itself become a problem that stands in the way of finding more effective directions for growing new fluent speakers.
And this too often leads to diverting energy away from more effective paths for restoring the strength of our languages.” Grounds further notes that technology cannot be an end in itself. He speaks of the value of radio for raising the prestige of a language, helping youth and elders to gain pride, but suggests that it is not a significant tool for advancing fluency. In fact, the digital tools, he asserts, are probably the least effective means of producing new speakers with real fluency and cultural competence.

Unlike the “saving language” rhetoric often found in the media, these authors recognize the complexities of what language revitalization means, and that technology is but one piece of this undertaking.

Training programs in language revitalization have as goals the understanding of learning, worldview, and research methods, among others. For instance, the Aboriginal Language Revitalization Program at the University of Victoria, mentioned earlier, offers several programs including a diploma,41 Bachelor of Education,42 graduate certificate, and Master’s degree.43 The courses cover topics in Indigenous education (immersion, curriculum, and instruction for Indigenous arts), Indigenous epistemologies, revitalization, second-language learning, field methods, issues in minority language maintenance, curriculum development in revitalization, and Indigenous research methods, among others. There are no courses with terms like “technology” in the title, although undoubtedly students are exposed to technology in their courses.

40 http://www.nydailynews.com/news/national/save-languages-tribes-turn-tech-article-1.1319153.
41 https://web.uvic.ca/calendar2018-01/undergrad/education/general-info.html.
42 https://web.uvic.ca/calendar2018-01/undergrad/education/general-info.html.
43 https://www.uvic.ca/education/assets/docs/2017_MILR%20Application%20Flyer.pdf.

The Resource Network for Linguistic Diversity44 runs a program called the Documenting and Revitalising Indigenous Languages (DRIL) Training Program45 which includes a module on “Using technologies” that covers topics such as operating a personal computer; searching and accessing existing language resources; making sound quality digital recordings; managing language resources; using digital literacy skills to access the internet; editing sound recordings. This is one of several modules; others include topics such as developing a language program, Master-Apprentice language learning programs, models for language revitalization, and creating language resources.

Recent work by Indigenous scholars examines the role of technology in language revitalization. Galla (2010) carried out a survey on the use of technology for language revitalization, finding that use was fairly low. She notes of Rosetta Stone that there are testimonials about it but not data available to support the claim that it is producing a new generation of speakers.46 Galla recognizes that technology is embedded in our lives, and we cannot get away from it, with technology literacy a standard for the work force, school, and business. Galla concludes that technology may not be enough to learn or teach a language, but some only have technology available to them for their language learning, making it an important component to consider in language revitalization (Galla 2010, 222).

Ozbolt (2014), in a study of community perspectives in language ideology and learner motivation in Chickasaw language programs, includes discussion of multimedia.
He reiterates a theme brought out by Buszard-Welcher (2001), Galla (2010), Holton (2011), and others, that, in the case of geographical dispersion, mass media offers a way to (re)create a sense of community. Ozbolt notes that technology can help to revalue the status of an endangered language and help counteract the view that the language is “backward” and “associated with the past” (citing Eisenlohr 2004, 24, 32). Technology also allows the community to represent itself (citing Coleman 2010, 491) and display and perform a sense of Chickasaw identity. Ozbolt further notes that multimedia offers more complete documentation of linguistic practices, it allows for easy duplication and distribution, and it privileges the development of self-learning methods.

In addition to providing an overview of the potential value of technology, Ozbolt presents the findings of a survey of attitudes of members of the Chickasaw community toward Chickasaw and attempts at its revitalization. Respondents expressed a desire for online and multimedia resources, mentioning a Chickasaw version of Rosetta Stone, online learning methods, and self-learning methods for people who cannot attend community classes (Ozbolt 2014, 98–99). At the time of the survey, people were interested in internet and multimedia for language learning rather than for language use. Ozbolt also asked what resources were being used to learn the language, and found that resources such as word-of-the-day, classes offered through a program designed to encourage people to take language classes by offering extra pay, dictionary/textbook/other printed material, and community classes were used but that multimedia did not rank very high in use (Ozbolt 2014, 106).

44 http://www.rnld.org/.
45 http://www.rnld.org/DRIL.
46 The Rosetta Stone Endangered Languages website (http://www.rosettastone.com/endangered) realistically states that they collaborate with communities to help revitalize languages; they add, however, that children are learning their heritage languages through Rosetta Stone software.

In information about a 2015 conference “To App or Not App: Looking at How Technology Impacts Language Learning,”47 speakers addressed a number of questions concerning tools and apps, asking whether tools have increased motivation and interest, and how they are used. Judging from their abstracts, the speakers focused on the importance of sound curriculum planning, tips for utilizing technology in language programs, ways of encouraging language learning (finding a young speaker of Diné (Navajo) to do the voiceover for the movie Finding Nemo), how technology can be used in language and cultural revitalization, and user-driven technology designs. Two focused more narrowly on technology, one on the role of technology in documentation (archiving, creating resources and catalogues of words, phrases, locations, oral history, and so on) and one on using the app Aikuma. These are both more associated with documentation than with revitalization.

The general consensus that can be drawn from groups and individuals concerned with revitalization is that technology is very valuable, but it cannot, at least in its current forms, in and of itself, sustain a language. One must ask then what it means to do this.

3.3. What does it take to learn a language? To “save” a language?

When people say that they are interested in learning a small language, they most probably think in terms of some level of fluency in day-to-day communication. This is not the only possibility. It could be that they want to have some vocabulary, particularly cultural vocabulary, or be able to introduce themselves, to greet people appropriately, or to say prayers. Depending on the goal, technology might be of more or less value. Thus, in revitalization, sustaining a language is not one thing; much depends on what the individual and community hope to achieve and how they see knowledge developing over time.

In the work surveyed above, and in much work beyond that, one major point emerges about what it takes to learn a language: human factors. This is seen in the quote from Slaughter given earlier, in which she stresses willpower and discipline.

47 http://www.ilinative.org/iliss/.

In McIvor’s study of her learning of her ancestral language (2012), she focuses on the importance of motivation, sacrifice to make room for learning, learning about second-language acquisition, finding mentors, finding ways to use the language, building the confidence to speak, finding outside support, having emotional support, and taking a long-term view. Hermes and King (2013), in a study of learning Ojibwe, repeat important points made by Fishman (2001): the importance of community initiative, investment, and commitment. They note that technology has been employed for communicative use, materials production, and documentation, and ask how within urban Ojibwe family homes technology might be used to promote face-to-face communication (2013, 126). The authors focus on using the Ojibwemodaa48 software developed through Grassroots Indigenous Multimedia to help promote face-to-face interpersonal interactions within families of Ojibwe heritage but where the language is not used. They note that the use of Ojibwemodaa created family discussion about the language, but that, during the eight weeks of testing, while the software “did not directly impact language use patterns with their children, it did seem to have supported family interactions and connections around the Ojibwe language” (2013, 136). Hermes and King (2013, 138) remark that Ojibwemodaa might jumpstart authentic language use, suggesting that the technological tool can provide scaffolding (2013, 139).

48 http://gim-ojibwe.org/software/. Accessed 29 January 2018.

Others speak to some of the challenges of language revitalization that technology, on its own, cannot meet. Daryl Baldwin (Baldwin et al. 2013, 8) writes that “Our work in understanding our language, developing cultural fluency, building culturally appropriate teaching models and technologies, building communal educational infrastructure, and developing proven educational pedagogy builds the kind of foundational understanding needed for a future community-wide immersion.” This is a long-term view, recognizing that language learning is slow and difficult, and is part of something broader, cultural fluency, but ultimately worthwhile for all that it gives to the learner.

Thus revitalizing a language is viewed not as learning the language alone; the language is a component of something larger: identity, culture, history, health, a sense of who the person is and where that person belongs in the world. Technology has an important role in this, as a tool. The authors cited above mention technology and speak of the need for culturally appropriate technologies.

Some authors focus on the importance of these technologies, especially for sharing the language with the world, helping to give the language the “cool” factor. Gresczyk (2011) writes of the value of creating a multi-media presence for Ojibwe people. He quotes Gaagigegaabaw, an Ojibwa language teacher, as saying “As curriculum has evolved in the past twenty years, I’ve tried to make it as widely available as possible using the internet, doing podcasts about vocabulary lessons or creating interactive multi-media web-pages and websites that our kids can go to at anytime, ipod touch, google, any cell phone, computer labs in any school, inspiring them by seeing it more often –on Facebook, embed in their my space page and just by creating that presence. I’m Anishinaabe. This is how my language sounds. It is really cool and we’re able to share it with the world” (Gresczyk 2011, 196–197). This does not touch on the actual use of the media but rather on the important cool factor of its very existence. Gresczyk (2011, 141) further writes that “There’s no website or no podcast or no e-log that can compare with another teacher that is committed and is interested in sharing the language with other people.”

3.4. Revitalization and technology, a conclusion

What are the goals in language revitalization? Valorization of a language is important, especially at early stages of revitalization, and technology plays a critical role in this. Technology allows for learning at a distance, helping to develop and sustain communities, as stressed by many authors. Technology provides access to oral language as well as to written language. Many apps, games, and other language learning devices are engaging: they make it fun to learn and motivate the learner. Technology gives the language a presence, viewed by many as especially important in the digital age.

But one cannot forget the drawbacks, at least of current technology. The development of technological tools requires expertise. The software can be fragile and needs ongoing attention. Recordings alone did not provide the complex language and socialization opportunities that McIvor was looking for, and played only a minor role in the studies of technology use by Galla and Ozbolt. As Grounds (2016, 8) writes, technology is never an end in itself, given the primary goal of developing new groups of young fluent speakers. Grounds (2016, 8–9) further notes that research has not shown that digital language teaching tools are successful in producing new fluent speakers; rather, he suggests, they represent the least effective approach for developing speakers with real fluency and cultural competence.

In undertaking revitalization, it is important to consider both immediate and longer-term goals, the expertise available, the financial situation, and attitudes toward technology in the setting. Some will choose to work closely with technology early on, some as revitalization develops, and some perhaps little or not at all. In the works on revitalization cited in this chapter, cultural fluency is a major goal.
King and Hermes (2014, 279), citing Gee (1991), write “Language competencies grow from meaningful interaction in the target language. For all learners, but particularly for learners of an endangered indigenous language, this process is rooted in identity and the quest to be accepted into a new Discourse community.” Can technology save a language when the goals have to do with healing, identity, spirituality, and cultural fluency? Such goals seem difficult, if not impossible, to meet with technology alone, at least with today’s technology. With dedicated individuals with a long-term view and the perseverance and patience to keep at it, technology provides a valuable, and invaluable, component of revitalization. It is simply not a magic bullet: it does not save a language; rather, people do.

4. The promise of technology in language documentation and revitalization

Language documentation outputs, or the results of fieldwork and research on the world’s small languages, rely on digital technologies, and have benefited hugely from those technologies in the past twenty years. New technologies have vastly improved the scale and kind of work that can be done to record, analyze, and preserve records of the world’s languages to create a “multipurpose” collection, meaning it can be used by speakers, linguists, and other researchers in ways not necessarily envisaged by the recorder. There is a divide between the access that many speakers of small languages have to digital technologies, on the one hand, and that of practitioners of language documentation, on the other. This divide needs to be understood and addressed in order to make records available to those speakers. As the techniques for using new technologies become more widespread they will result in a richer record of the world’s languages and better resources to support the ongoing use or revitalization of those languages.

References

Alia, Valerie. 2009. The New Media Nation: Indigenous Peoples and Global Communication (vol. 2). New York: Berghahn Books.
Antworth, Evan and Randolph J. Valentine. 1998. “Software for Doing Field Linguistics.” In Using Computers in Linguistics: A Practical Guide, edited by John Lawler and Helen Aristar Dry, 170–196. London and New York: Routledge.
Baldwin, Daryl, Karen Baldwin, Jessie Baldwin, and Jarrid Baldwin. 2013. “Myaamiaataweenki oowaaha: Miami Spoken Here.” In Bringing Our Languages Home: Language Revitalization for Families, edited by Leanne Hinton, 3–18. Berkeley, CA: Heyday Books.
Bender, Emily M. and Jeff Good. 2010. “A Grand Challenge for Linguistics: Scaling Up and Integrating Models.” Unpublished manuscript. http://bit.ly/GrandChallengeForLinguistics.
Bel, Bernard and Medéric Gasquet-Cyrus. 2015. “Digital Curation and Event-Driven Methods at the Service of Endangered Languages.” In Endangered Languages and New Technologies, edited by Mari C. Jones, 113–126. Cambridge: Cambridge University Press.
Berez, Andrea L. 2007. “Review of EUDICO Linguistic Annotator (ELAN).” Language Documentation & Conservation 1(2): 283–289.
Bird, Steven, Florian R. Hanke, Oliver Adams, and Haejoong Lee. 2014. “Aikuma: A Mobile App for Collaborative Language Documentation.” Workshop on the Use of Computational Methods in the Study of Endangered Languages, 1–5, Baltimore, MD. http://www.aclweb.org/anthology/W14-2201.pdf.

Bird, Steven and Gary Simons. 2003. “Seven Dimensions of Portability for Language Documentation and Description.” Language 79: 557–582.
Bowern, Claire. 2007. “Review of TshwaneLex Dictionary Compilation Software.” Language Documentation & Conservation 1(1): 94–99.
Bowern, Claire. 2015. Linguistic Fieldwork: A Practical Guide. 2nd ed. Hampshire: Palgrave Macmillan.
Boyd, Douglas A. 2012. “Digital Audio Recording: The Basics.” In Oral History in the Digital Age. Institute of Library and Museum Services, edited by Douglas Boyd, Steve Cohen, Brad Rakerd, and Dean Rehberger. http://ohda.matrix.msu.edu/2012/06/digital-audio-recording/.
Boyd, Douglas A. and Charles Hardy. 2012. “Understanding Microphones.” In Oral History in the Digital Age, edited by Doug Boyd, Steve Cohen, Brad Rakerd, and Dean Rehberger. http://ohda.matrix.msu.edu/2012/06/understanding-microphones/.
Buszard-Welcher, Laura. 2001. “Can the Web Save My Language?” In The Green Book of Language Revitalization in Practice, edited by Leanne Hinton and Ken Hale, 331–348. San Diego: Academic Press.
Clifford, James. 1990. “Notes on (Field)notes.” In Fieldnotes: The Makings of Anthropology, edited by Roger Sanjek, 47–70. Ithaca, NY: Cornell University Press.
Coler, Matt and Peter Homola. 2015. “Rule-Based Machine Translation for Aymara.” In Endangered Languages and New Technologies, edited by Mari C. Jones, 67–80. Cambridge: Cambridge University Press.
Coleman, E. Gabriella. 2010. “Ethnographic Approaches to Digital Media.” Annual Review of Anthropology 39: 487–505.
Corti, Louise, Veerle van den Eynden, Libby Bishop, and Matthew Woollard. 2014. Managing and Sharing Research Data: A Guide to Good Practice. London: Sage Publications.
de Graaf, Tjeerd, Cor van der Meer, and Lysbeth Jongbloed-Faber. 2015. “The Use of New Technologies in the Preservation of an Endangered Language: The Case of Frisian.” In Endangered Languages and New Technologies, edited by Mari C. Jones, 141–149. Cambridge: Cambridge University Press.
Dorian, Nancy C. 2014. Small-Language Fates and Prospects: Lessons of Persistence and Change from Endangered Languages: Collected Essays (Brill’s Studies in Language, Cognition and Culture, vol. 6). Leiden: Brill.
Eisenlohr, Patrick. 2004. “Language Revitalization and New Technologies: Cultures of Electronic Mediation and the Refiguring of Communities.” Annual Review of Anthropology 33: 21–45.
Fishman, Joshua A. 1991. Reversing Language Shift. Clevedon, UK: Multilingual Matters.
Fishman, Joshua A. 2001. Can Threatened Languages be Saved? Clevedon, UK: Multilingual Matters.
Galla, Candace Kaleimamoowahinekapu. 2010. “Multimedia Technology and Indigenous Language Revitalization: Practical Education Tools and Applications Used Within Native Communities.” PhD diss., Tucson: University of Arizona.
Gee, J. P. 1991. “What Is Literacy?” In Rewriting Literacy: Culture and the Discourse of the Other, edited by Candace Mitchell and Kathleen Weiler, 3–11. Westport, CT: Bergin & Garvey.
Gippert, Jost, Nikolaus P. Himmelmann, and Ulrike Mosel, eds. 2006. Essentials of Language Documentation. Berlin: Mouton de Gruyter.

Good, Jeff. 2010. “Valuing Technology: Finding the Linguist’s Place in a New Technological Universe.” In Language Documentation: Practice and Values, edited by Lenore A. Grenoble and N. Louanna Furbee, 111–131. Amsterdam: John Benjamins.
Gresczyk Sr., Richard A. 2011. “Language Warriors: Leaders in the Ojibwe Language Revitalization Movement.” PhD diss., Minneapolis: University of Minnesota.
Grounds, Richard A. 2016. “Indigenous Perspectives and Language Habitats.” Paper presented to the International Expert Group Meeting on Indigenous Languages: Preservation and Revitalization. United Nations, Department of Economic and Social Affairs, New York, January 19–21. http://www.un.org/esa/socdev/unpfii/documents/2016/egm/Paper_Grounds2.pdf.
Hermes, Mary and Kendall A. King. 2013. “Ojibwe Language Revitalization, Multimedia Technology, and Family Language Learning.” Language Learning and Technology 17(1): 125–144.
Himmelmann, Nikolaus P. 1998. “Documentary and Descriptive Linguistics.” Linguistics 36: 161–195.
Holton, Gary. 2011. “The Role of Information Technology in Supporting Minority and Endangered Languages.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 371–399. Cambridge: Cambridge University Press.
Jones, Caroline and Amit German. 2016. “Review of F4, a Simple Interface for Faster Annotation.” Language Documentation & Conservation 10: 347–355.
Jones, Mari C., ed. 2015. Endangered Languages and New Technologies. Cambridge: Cambridge University Press.
King, Kendall A. and Mary Hermes. 2014. “Why Is This so Hard?: Ideologies of Endangerment, Passive Language Learning Approaches, and Ojibwe in the United States.” Journal of Language, Identity, and Education 13: 268–282.
Lawler, John and Helen Aristar Dry, eds. 1998. Using Computers in Linguistics: A Practical Guide. London and New York: Routledge.
Lazar, Jonathan. 2007. Universal Usability: Designing Computer Interfaces for Diverse User Populations. Chichester, UK: John Wiley & Sons.
Lillehaugen, Brook Danielle, George Aaron Broadwell, Michel R. Oudijk, and Laurie Allen. 2015. “Ticha: A Digital Text Explorer for Colonial Zapotec.” 1st ed. Online: http://ticha.haverford.edu/
Malinowski, Bronislaw. 1935. Coral Gardens and Their Magic. Vol II. London: G. Allen & Unwin.
Margetts, Anna and Andrew Margetts. 2012. “Audio and Video Recording Techniques for Linguistic Research.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 13–53. Oxford: Oxford University Press.
McIvor, Onowa. 2012. “Îkakwiy Nîhiyawiyân: I Am Learning [to be] Cree.” PhD diss., Vancouver: University of British Columbia.
Meakins, Felicity. 2007. “Review of Computerized Language Analysis (CLAN).” Language Documentation & Conservation 1(1): 107–111.
Meißner, Cordula and Adriana Slavcheva. 2013. “Review of EXMARaLDA.” Language Documentation & Conservation 7: 31–40.
Moore, Patrick and Kate Hennessy. 2006. “New Technologies and Contested Ideologies: The Tagish FirstVoices Project.” The American Indian Quarterly 30(1): 119–137.
Moretti, Franco. 2013. Distant Reading. London: Verso.

Newman, Paul. 2012. “Copyright and Other Legal Concerns.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 430–456. Oxford: Oxford University Press.
Ozbolt, Ivan Camille. 2014. “Community Perspectives, Language Ideologies, and Learner Motivation in Chickasaw Language Programs.” PhD diss., Norman: University of Oklahoma.
Paterson, Hugh. 2015. “Keyboard Layouts: Lessons from the Meꞌphaa and Sochiapam Chinantec Designs.” In Endangered Languages and New Technologies, edited by Mari C. Jones, 49–66. Cambridge: Cambridge University Press.
Rogers, Chris. 2010. “Review of Fieldworks Language Explorer (FLEx) 3.0.” Language Documentation & Conservation 4: 78–84.
Rogers, Henry, Evan Antworth, and Randolph J. Valentine. 1998. “Education.” In Using Computers in Linguistics: A Practical Guide, edited by John Lawler and Helen Aristar Dry, 61–100. London and New York: Routledge.
Schüller, Dietrich. 2008. Audiovisual Research Collections and Their Preservation (Report for TAPE (Training for Audiovisual Preservation in Europe)). http://www.tape-online.net/docs/audiovisual_research_collections.pdf.
Stone, Adam. 2015. “The Internet and Language Education.” FEL Canada Newsletter, September/October, p. 1.
Strunk, Jan, Florian Schiel, and Frank Seifart. 2014. “Untrained Forced Alignment of Transcriptions and Audio for Language Documentation Corpora Using WebMAUS.” Proceedings of the Ninth International Conference on Language Resources and Evaluation. European Language Resources Association, Reykjavik, Iceland, 3940–3947.
Thieberger, Nicholas. 2004. “Documentation in Practice: Developing a Linked Media Corpus of South Efate.” Language Documentation and Description, vol. 2, edited by Peter K. Austin, 169–178. London: Hans Rausing Endangered Languages Project, School of Oriental and African Studies.
Thieberger, Nicholas. 2005. “Computers in Field Linguistics.” In Encyclopedia of Language and Linguistics, 2nd ed., edited by Keith Brown, 780–783. Amsterdam: Elsevier.
Thieberger, Nicholas. 2009. “Steps Toward a Grammar Embedded in Data.” In New Challenges in Typology: Transcending the Borders and Refining the Distinctions, edited by Patience Epps and Alexandre Arkhipov, 389–408. Berlin and New York: Mouton de Gruyter.
Thieberger, Nick. 2016. “What Remains to Be Done—Exposing Invisible Collections in the Other 7,000 Languages and Why It Is a DH Enterprise.” Digital Scholarship in the Humanities. http://dsh.oxfordjournals.org/content/early/2016/03/08/llc.fqw006.full.pdf.
Thieberger, Nick. 2016. “Daisy Bates in the Digital World.” In Language, Land and Song: Studies in Honour of Luise Hercus, edited by Peter K. Austin, Harold Koch, and Jane Simpson, 102–114. London: EL Publishing.
Thieberger, Nicholas and Andrea Berez. 2012. “Linguistic Data Management.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 90–118. Oxford: Oxford University Press.
Thieberger, Nicholas and Simon Musgrave. 2006. “Documentary Linguistics and Ethical Issues.” In Language Documentation and Description, vol. 4, edited by Peter K. Austin, 26–37. London: Hans Rausing Endangered Languages Project, School of Oriental and African Studies.
Thomason, Sarah G. 2015. Endangered Languages: An Introduction. Cambridge: Cambridge University Press.

Tsunoda, Tasaku. 2005. Language Endangerment and Language Revitalization. Berlin: Mouton de Gruyter.
Wittenburg, Peter, Hennie Brugman, Albert Russel, Alex Klassmann, and Han Sloetjes. 2006. “ELAN: A Professional Framework for Multimodality Research.” In Proceedings of LREC 2006, Fifth International Conference on Language Resources and Evaluation, Genoa.
Woodbury, Anthony C. 2011. “Language Documentation.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 159–186. Cambridge: Cambridge University Press.

Chapter 11

Corpus Compilation and Exploitation in Language Documentation Projects

Ulrike Mosel

1. Introduction

This chapter describes the content, form, and function of corpora in language documentation (LD) projects and aims for a better understanding of documentary linguistics from a corpus linguistic perspective. For this purpose, it starts with a definition of corpus linguistics and the notion of text. Then it describes how LD corpora fit into the common corpus typology in section 2; the content of LD corpora in terms of genres, registers, and topics in section 3; and how LD corpora are organized into subcorpora in section 4. Section 5 gives an overview of the types of data contained in LD corpora such as metadata, recordings, transcriptions, translations, and further annotations, whereas section 6 shows how these data can be exploited for grammatical and lexical analyses by corpus linguistic methods. The chapter concludes with a brief summary of those features that distinguish LD corpora from corpora of dominant, well-researched languages. The utilization of LD corpora for language maintenance measures is addressed in sections 3.1, 3.3, and 6.3. The chapter does not deal with technical aspects (see Austin 2006; Thieberger and Berez 2012; Rice and Thieberger, Chapter 10, this volume) and the limitations that are a consequence of ethical and juridical issues (see Himmelmann 1998, 172–176, 2006, 7; Dwyer 2006; Newman 2012; Rice 2012; McCarty, Chapter 4, this volume).

Corpus linguistics is defined by McEnery and Hardie (2012, 1–2) as follows: “We could reasonably define corpus linguistics as dealing with some set of machine-readable texts which is deemed an appropriate basis on which to study a specific set of research questions. The set of texts or corpus dealt with is usually of a size which defies analysis by hand and eye alone within a reasonable timeframe. It is the large scale of the data used that explains the use of machine-readable text.”

The term “text” is used in corpus linguistics in a very broad sense as “any artefact containing language usage—typically a written document (book, periodical, leaflet, sign, webpage, t-shirt slogan) or recorded and/or transcribed spoken text (speech, broadcast, conversation)” (McEnery and Hardie 2012, 252). In the context of LD corpora, this broad definition means that a corpus may not only consist of narratives, procedural texts, descriptions, speeches, and conversations on various topics but also collections of elicited example sentences (see section 2.5).

LD corpora usually consist of texts that are not sampled from already existing language materials but are collected in a fieldwork project. In contrast to large corpora of dominant languages, their purpose is not exclusively to provide data for theoretical and applied linguistic research but also to document the speakers’ memory of the past, their oral literature, or whatever they think is worthwhile to transmit to future generations in the form of annotated recordings and written texts (Seifart 2011).

2. Types of corpora and subcorpora in language documentations

2.1. Introduction

Drawing on corpus typologies in Hunston (2002, 14–16), McEnery and Hardie (2012, 6–13), and Tognini-Bonelli (2010, 20–26), this section provides an overview of the characteristic features of LD corpora and their subcorpora. The classification of corpora is based on:
1. the method of text collection (sections 2.2–2.5),
2. the number of languages and language varieties (sections 2.6–2.7), and
3. the number and kind of modalities of communication (section 2.8).

2.2. Dynamic and static corpora

A dynamic corpus, also called a monitor corpus, is a corpus that is “continually growing over time, as opposed to a static corpus, which does not change its size once it has been built. Dynamic corpora are useful in that they provide the means to monitor language change over time” (Baker, Hardie, and McEnery 2006, 64). An example of a dynamic LD corpus is the “Ju|’hoan Audio and Video Material 1970 to Present, Work in Progress” (https://elar.soas.ac.uk/Collection/MPI78647).

2.3. Sample and opportunistic corpora

As mentioned above in section 1, LD corpora are usually not built from already existing texts but are created during research projects that aim at the documentation of an endangered language and culture, which may include the production of a grammar, a dictionary, and educational materials. Since the focus of the project may change once the researchers have become more familiar with the language and culture and the local team members more aware of what is important to them, the kind of texts that are collected may change. In other words, the sampling framework of a LD corpus is not static, and the corpus as a whole would not qualify as a sample corpus that aims “for balance and representativeness within a specified sampling frame” (McEnery and Hardie 2012, 250). Rather it is an opportunistic corpus, which is defined as making “no pretension to adhere to a rigorous sampling frame,” but representing “nothing more or less than the data that it was possible to gather for a specific task” (McEnery and Hardie 2012, 11).

But the LD corpus may contain sampled subcorpora as, for instance, the language acquisition data that were regularly collected in the Chintang LD project in Nepal (http://dobes.mpi.nl/projects/chintang/) or the collections of grammatical or lexical data that are elicited by visual stimuli in rigorously planned elicitation sessions (see section 2.5).

2.4. Contrastive corpora

Contrastive corpora are two corpora or subcorpora that represent two registers, genres, or other varieties of the same language (Tognini-Bonelli 2010, 21–22). In the LD project of the Oceanic language Teop, we created two contrastive subcorpora consisting of spontaneously spoken folk tales on the one hand and the edited versions of the transcribed folk tales on the other (see section 3.1 and Mosel 2015a). Narratives, procedural texts, and object descriptions may form contrastive subcorpora (see section 6.1).

2.5. Subcorpora of elicited data

A corpus that consists of narratives, procedural texts, object descriptions, and conversations may not provide sufficient data for the analysis of the phonology, the grammar, or the semantics of certain lexical items, so that supplementary data have to be elicited. Such elicitations can be gathered in files and form a subcorpus. In corpus linguistics such subcorpora are called artificial corpora, because “the material has no social or cultural rationale for being collected” (Ostler 2008, 459). Common types of elicited data are
1. words, phrases, or stories elicited by showing pictures or video clips (see Chelliah and De Reuse 2011, 369–371; Hellwig 2006; Lüpke 2009; Majid 2012);
2. sentences spoken or written by native speakers who are given a list of words of the documented language and asked to form sentences with them in their language, or who are asked to paraphrase or transform already collected sentences (for an overview of such non-translational elicitation methods and references see Mosel 2012a, 82–84).

If the LD project is embedded in a larger comparative research program, the picture prompts and video clips used by the field workers are created outside the field site and thus may contain unsuitable features (Lüpke 2009, 70–72). For elicitations that are not meant for cross-linguistic comparisons, the pictures and video clips are better produced together with local research assistants.

2.6. Comparable corpora

In an article that mainly deals with parallel corpora, Aijmer (2008, 276) defines the notion of comparable corpus: “A comparable corpus . . . consists of texts from different languages which are similar or comparable with regard to a number of parameters such as text type, formality, subject-matter, time span, etc.” Such corpora can be created by LD projects that record texts by using the same kind of stimuli as, for instance, Chafe’s Pear film (Chafe 1975), the picture book Frog, where are you? (Mayer 1969), or the field manuals of the Language and Cognition Department of the Max-Planck-Institute for Psycholinguistics, Nijmegen (http://fieldmanuals.mpi.nl/).

Both the Pear Film and the Frog story are used worldwide as stimuli in comparative research projects where the test subjects are asked to retell the Pear Film they have seen before or tell the Frog story while looking at the pictures in the book. The value of such recordings is questionable, because both kinds of narratives are not indigenous genres (see Foley 2003).

A better way of creating comparable corpora is to choose texts of similar indigenous genres in different languages as, for instance, folk tales, and annotate them in the same way. An example of this kind is the French ANR HimalCo project, which built a corpus of aligned stories from the Kiranti mythological cycle in the three Himalayan Rai languages Koyi, Thulung, and Khaling. The stories of this corpus are transcribed in a practical orthography and translated. In addition, each story is marked for the similarities it shows with any of the other stories (http://himalco.huma-num.fr/corpus/comparable/index.htm).

A more detailed but totally different annotation format is found in Multi-CAST (Multilingual Corpus of Annotated Spoken Texts, https://lac.uni-koeln.de/en/category/multi-cast/), which uses the annotation format GRAID (Grammatical Relations and Animacy in Discourse). GRAID has been developed by Haig and Schnell and is used for research on grammatical relations (Haig and Schnell 2011, 2016; Haig, Schnell, and Wegener 2011; see section 5.4).

2.7. Parallel corpora

According to Aijmer (2008, 276), “parallel corpora consist of a source text and its translation into one or more languages.” Typical LD corpora look like parallel corpora because the texts of the documented language are translated into a dominant language; that is, the documented language is the source and the dominant language the target language. But there is an important difference: the translation is not done by professional translators but merely presents the translator’s understanding of the source text for the time being. It can be used for the grammatical and lexical analyses of the corpus but, unlike the parallel corpora of well-researched languages such as English and Norwegian, not for translational studies (see section 5.3 for translation and section 6.2 for grammatical analysis).

2.8. Multimodal corpora

Multimodal corpora are corpora with texts in an audio-visual format so that they not only record speech but, in addition, non-speech modalities of human communication as, for instance, facial expressions, head and eye movements, gestures, and body postures. The interaction of speech and gestures seems to be more researched than other possible interactions of speech and paralinguistic modalities, but a conventionalized transcription system comparable to the International Phonetic Alphabet is still missing. Seyeddinipur (2012) gives a brief introduction to the value and the methods of documenting gestures; an overview of transcription systems is found in Bressem (2014). Wittenburg (2008) describes the technical aspects of creating multimodal corpora.

3. Genres and registers in language documentations

3.1. Introduction

Communicative events comprise various kinds of text types which can be classified in different ways. For the analysis of corpora Biber and Conrad (2009, 2, 15–19) distinguish between genre and register as two approaches or perspectives. “In the genre perspective the focus is on the linguistic characteristics that are used to structure complete texts. These are conventional linguistic characteristics that usually occur only once in the text” (Biber 2010, 241). For example, a characteristic of English fairytales is that they are typically introduced by the phrase Once upon a time and end with happily ever after. Similarly, the folk tales of the Teop people of Papua New Guinea (dobes.mpi.nl/projects/Teop) consistently end with the non-translatable phrase kuhoo te kara tete (Magum et al. 2007). For a brief description of distinctive conventional genre markers, see Baumann (2001) and Biber and Conrad (2009, 69–71). A concise overview of distinctive genre markers and textual framing devices is given by Foley (1997, 359–378).

When texts are analyzed from the register perspective, the focus is on statistically significant preferences for certain phonological, grammatical, or lexical forms that occur throughout texts of specific situational characteristics such as the kind of participants, the medium, the topic, and the purpose of the communicative event; for a more comprehensive list of such situational characteristics, see Biber and Conrad (2009, 40). The use of ideophones is a case in point. When working on the documentation of Awetí, a Tupian language spoken in Brazil, Reiter (2011, 348, 383, 386) observed that ideophones are much more frequently used in narrative than in non-narrative texts.

Texts of a particular genre may belong to different registers. When native speakers edited the transcriptions of the oral Teop folk tales for a schoolbook, they kept the genre-specific opening and closing phrases but removed hesitation phenomena, reduced repetitions, consistently replaced loan words by Teop words, increased the number of complex syntactic constructions, and thus created a written register of Teop folk tales (Mosel 2006, 80; Mosel 2015a).

Since genres and registers are not universal but culture- and language-specific categories, the identification of genres and registers requires a thorough linguistic analysis of texts which starts with sorting the texts according to their production circumstances. Subsequently one searches these groups of texts for fixed expressions that mark their structure and for variant frequencies of certain linguistic features. A different, ethnolinguistic approach to genre and register analysis is followed by Senft (2010). In his book The Trobriand Islanders’ Way of Speaking he investigates their indigenous categorization of text types as reflected by their own metalinguistic vocabulary.

On the basis of the above-mentioned corpus-linguistic concepts, the following sections discuss the role that genres and registers play in theoretical articles about language documentation (section 3.2) and then look at the presentation of text types in LDs in section 3.3. We speak of text types rather than genres and registers wherever it is not quite clear whether the texts can be classified as genres or registers according to corpus linguistic criteria.
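The register perspective described in this section compares how often a linguistic feature occurs in groups of texts that were first sorted by their production circumstances. The following is only a minimal sketch of that second step, written in Python. The texts, the tokens, and the tag name "IDEO" are invented toy data, not material from any LD corpus, and the function name is a hypothetical helper. The sketch reports normalized frequencies only; it does not test whether a difference is statistically significant.

```python
from collections import Counter


def rate_per_thousand(texts, feature):
    """Return occurrences of `feature` per 1,000 tokens for each text type.

    `texts` is a list of (text_type, tokens) pairs, where each token is a
    (word, tag) pair. The tag set is an assumption made for this sketch.
    """
    hits = Counter()
    totals = Counter()
    for text_type, tokens in texts:
        totals[text_type] += len(tokens)
        hits[text_type] += sum(1 for _, tag in tokens if tag == feature)
    return {t: 1000 * hits[t] / totals[t] for t in totals if totals[t]}


# Invented toy data: two text groups, already sorted by production circumstances.
texts = [
    ("narrative", [("w1", "IDEO"), ("w2", "V"), ("w3", "N"), ("w4", "IDEO")]),
    ("procedural", [("w5", "V"), ("w6", "N"), ("w7", "N"), ("w8", "V")]),
]

rates = rate_per_thousand(texts, "IDEO")
for text_type, rate in sorted(rates.items()):
    print(f"{text_type}: {rate:.1f} per 1,000 tokens")
```

Normalizing by the number of tokens matters because the text groups in an LD corpus are rarely of equal size, so raw counts would not be comparable.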

3.2. Theoretical and methodological issues

In his seminal article “Documentary and descriptive linguistics” Himmelmann (1998, 166) wrote, “The aim of a language documentation, then, is to provide a comprehensive record of the linguistic practices characteristic of a given speech community.” Later, in the students’ textbook Essentials of Language Documentation (Gippert, Himmelmann, and Mosel 2006), he expanded his formulation of the aims of a language documentation (Himmelmann 2006a, 2): “. . . a language documentation should strive to include as many and as varied records as practically feasible, covering all aspects of the set of interrelated phenomena commonly called language. Ideally, then, a language documentation would cover all registers and varieties, social or local; it would contain evidence for language as a social practice as well as a cognitive faculty, it would include specimens of spoken and written language; and so on.”

With respect to this idealistic aim, Evans (2008, 343) makes the following critical comment in his review: “The question of how long documentation projects need to be to accomplish the ambitious goals set out in the book could well have been addressed more systematically with some hard facts about what takes how long to achieve, particularly as the demanding new standards of documentary linguistics substantially drive up the time it takes to process data in the initial phase of research.”

This question, however, cannot be answered because the ideal of covering “all registers and varieties” cannot be realized, which of course calls into question the usefulness of such an ideal. Himmelmann (1998, 172) himself explains that there are “limits to documentation” because “the interests and rights of contributors and the speech community should take precedence over scientific interests.”

The theoretical problem of Himmelmann’s ideal LD is that registers and other varieties of speech can only be identified by corpus linguistic analyses of language usage in different speech situations, which, obviously, presupposes the existence of a corpus. Only after texts have been recorded in different speech situations, transcribed, and translated can we formulate hypotheses about genre and register distinctions and start a genre and register analysis.

The small size of LD corpora also calls into question the recommendation that “a language documentation should strive to include as many and as varied records as practically feasible” (Himmelmann 2006a, 2; similar statements are made by Woodbury 2003, 48–49; Seifart 2008; Lüpke 2009, 55). The more diversified the corpus is, the more difficult it becomes to identify regular patterns of language use that are typical for certain genres or registers. If in the extreme case, for example, the corpus contains what in English would be called a legend, a personal narrative, a political speech, a sermon, a proverb, a cooking recipe, a joke, a poem, a children’s song, a discussion in the village council, and a chat between girls, it would be impossible to identify formal features in these texts that can be considered characteristic of a genre or register in the documented language.

The diversity of a corpus alone cannot count as a quality criterion for the assessment of a corpus. Rather a corpus should “be assessed relative to the focus and expertise of the projects and researchers, and also to opportunity” (Thieberger et al. 2015, 16).

3.3. Genres and registers in LD projects

The kinds of text type found in language documentation projects depend on multiple interacting factors:
1. the priorities of the speech community;
2. previous research and already existing text collections;
3. the aims of the project set by those team members who had successfully applied for the funding of the project;
4. the resources of the project in terms of time, money, and manpower.

As a result, the corpora found in the Dokumentation Bedrohter Sprachen (DoBeS) archive and the Endangered Languages Archive (ELAR) are heterogeneous, but we can distinguish between diversified and specialized corpora. The diversified corpora typically contain folk tales, procedural texts, descriptions, and elicitations (see the Awetí, Beaver, Savosavo, and Saliba corpora (http://dobes.mpi.nl/projects/)), whereas the specialized ones focus on a single text type. Examples in ELAR are the corpus of Australian Arandic songs (https://elar.soas.ac.uk/Collection/MPI1016619) and the Unagam Tunuu (Aleut Language) Conversation Corpus (https://elar.soas.ac.uk/Collection/MPI78647). The latter focuses on spontaneous conversations because an edition of narrative texts, a grammar, and a lexicon had already been published by Bergsland (Woodbury 2011, 166, 181).

The above-mentioned aim “to provide a comprehensive record of the linguistic practices characteristic of a given speech community” (Himmelmann 1998, 166) seems to imply that the corpus should not include text types that are created during the LD project. But the practice of LD inevitably leads to innovations. Besides various kinds of elicitations (section 2.5), at least two other kinds of new text types are not uncommon in LD projects: first, the already mentioned written versions of narratives and, second, what could be called hybrid documentary genres.

As mentioned in the introduction (section 1), a typical characteristic of LD corpora is that the production of texts and the building of corpora are simultaneous processes. This characteristic is sometimes formally indicated, first, by opening and closing phrases spoken by the person who does the recording and, second, by opening and closing phrases spoken by the person whose story is recorded. In the following example a Teop research assistant recorded an old man’s personal history. In the beginning she informs the potential audience about the speaker, the topic, and the date of the recording, and she closes the recording by thanking the speaker (as rendered below in her English translation).

Today is Friday morning on the 16th of May 2003. I am sitting in the village they call Hatana. The village belongs to Rum and his wife and their children. This morning I am going to ask Rum, Reuben Rum, to talk about his life. . . . Alright, I am going to give him the microphone now. . . . (Teop, Rum_01R.001-010)

Thank you very much, Rum. I am very happy (and appreciate) your talk . . . Thank you very much. (Teop, Rum_01R.477-483)

Within this frame, Reuben Rum talks about his life and marks the beginning and the end of his autobiography by the following phrases (here rendered in the English translation) that in this or similar forms can be observed in other personal histories:

I, my name is Reuben Rum. I was born on Teop Island. My mother and father are from Teop Island. . . . (Teop, Rum_01R.012-013)

. . . That is all, I am able to talk about. (Teop, Rum_01R.476)

Because of its double, nested framing, this recording presents a hybrid genre that may in general be typical for recordings in LDs. The outer framing provides the metadata of the enclosed story in the beginning of the recording and concludes the recording by a thank-you to its speaker, whereas the inner framing marks the beginning and the end of Reuben Rum’s story. In the edited version of Reuben Rum’s story (Rum 2014, 57–63), the outer frame is missing.

Procedural texts are probably also a new type of text in at least some LDs because, in many cultures, how to do things is not learned by explanations but by observing (Duranti 1997, 104–106; Mosel 2006, 74). The procedural texts of the Teop corpus do not qualify as a genre or register in the strict sense (section 3.1), because they lack conventionalized framing devices and show a conspicuous variation of person forms referring to the actor. Some speakers or writers use the first person inclusive plural, others the first person exclusive plural, the second person singular, or the third person plural. If these texts represented an indigenous register, one would expect a preference for one or the other person form.

4. The structure of language documentation corpora

LD corpora are embedded in archived language documentations and are organized in different ways. In the DoBeS-Archive (Trilsbeek and Wittenburg 2006, 322–323, dobes.mpi.nl/projects/) most corpora show a hierarchical structure such that a corpus is divided into subcorpora and further subdivisions according to text types like folk tales, conversations, and elicitations, or topics such as fishing and history as, for example, the corpus of the Papuan language Savosavo spoken in the Solomon Islands (see dobes.mpi.nl/projects/savosavo). In some LD projects, however, the data are classified according to the year of their creation, so that the user does not get any information about the content when browsing the websites of the LDs (see, for example, the languages of the Morehead LD project (dobes.mpi.nl/projects/morehead)).

In contrast, the corpora of the ELAR are not hierarchically organized, but the interface of the individual LDs facilitates the access to a corpus by the four options “Type,” “Genre,” “Topic,” and “Participants.” The category “Type” is subdivided into the standardized types of data format such as “Audio,” “Video,” “ELAN,” “Flex,” “Image,” or “Document,” whereas the subcategories “Genre” and “Topic” are organized by keywords attached to the documents. These categories are not mutually exclusive and not standardized, so that one and the same text may be listed in more than one category.
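The contrast between a hierarchical corpus structure and keyword-based access can be illustrated with a small Python sketch. It is not a description of how the DoBeS or ELAR systems are implemented. The records, identifiers, and field names below are invented for illustration. The point is only that each text has exactly one place in a hierarchy, whereas keyword categories overlap, so the same text can turn up under several of them.

```python
from collections import defaultdict

# Invented records: "path" is the text's single position in a hierarchy,
# "topic" is a free list of keywords attached to the text.
records = [
    {"id": "t01", "path": ("narratives", "folk tales"), "topic": ["history"]},
    {"id": "t02", "path": ("procedural texts",), "topic": ["fishing", "history"]},
    {"id": "t03", "path": ("conversations",), "topic": ["fishing"]},
]


def by_hierarchy(records):
    """Group text IDs by their one position in the subcorpus hierarchy."""
    tree = defaultdict(list)
    for rec in records:
        tree[rec["path"]].append(rec["id"])
    return dict(tree)


def by_keyword(records, facet):
    """Group text IDs by keyword; a text is listed once per keyword it carries."""
    index = defaultdict(list)
    for rec in records:
        for keyword in rec[facet]:
            index[keyword].append(rec["id"])
    return dict(index)


hierarchy = by_hierarchy(records)
topics = by_keyword(records, "topic")
print(hierarchy)
print(topics)
```

In this toy data the hierarchy lists three texts in total, while the topic index has four entries, because text t02 is filed under both of its keywords.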

5. Types of data

5.1. Introduction

The data contained in a LD corpus are classified into metadata (i.e., data about the data) (section 5.2), raw data (i.e., recordings and manuscripts), primary data that is directly derived from the raw data (i.e., transcriptions and translations) (section 5.3), and secondary data which is created by a basic grammatical and textual analysis of the primary data and presented by annotations in the corpus (section 5.4). Furthermore, the data is classified according to the data formats of recordings and texts (Austin 2006, 95–96; Thieberger and Berez 2012).

5.2. Metadata

In corpus linguistics the term “metadata” first refers to information about the characteristics of the situation in which the corpus has been created, and second, to the production circumstances of each text. In LDs, the information about the building of the corpus is given in the introduction to the LD project where the genealogical affiliation, the location and the socio-linguistic features of the language, the culture of its speakers, the aims of the project, its duration, its team members, and its funding are briefly described. The introduction may also contain a typological profile of the language, a description of the orthography, references to other works on the language and culture in question, and a list of abbreviations.

The second kind of metadata is directly linked to each text and gives information about the speakers and other people attending the recording, its topic, genre and/or register, location, date, and format (for a list of situational characteristics see Biber and Conrad 2009, 40). Some of this information may also be implied by the position of the texts in a structured corpus that is divided into subcorpora of text types (see section 4) and might be included in the titles of the texts and the abbreviations used in the utterance IDs (see section 5.3), e.g., an abbreviation for the speaker’s name and the date of the recording.

For brief descriptions of the content of metadata in LD projects, see Himmelmann (2006a, 11–14), Conathan (2011, 244–249), and Good (2011, 228–232). A detailed description of various kinds of metadata standards for corpora in general is found in Lehmberg and Wörner (2008, 492–501). Thieberger and Berez (2012, 105–109) describe how the metadata of the individual texts of a LD corpus can be organized in a relational database which can serve as a catalogue and facilitate complex searches across the whole corpus for texts sharing particular metadata.
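A relational catalogue of text metadata of the kind mentioned above might look roughly like the following sketch, which uses Python's built-in sqlite3 module. The table layout, column names, and all rows are assumptions made for illustration; they are not the schema described by Thieberger and Berez (2012) and do not follow any particular metadata standard.

```python
import sqlite3

# In-memory database; a real catalogue would live in a file.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE corpus_text (
    text_id TEXT PRIMARY KEY, title TEXT, genre TEXT,
    location TEXT, recorded_on TEXT
);
CREATE TABLE participation (
    text_id TEXT REFERENCES corpus_text(text_id),
    person_id INTEGER REFERENCES person(person_id),
    role TEXT
);
""")

# Invented sample rows.
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [(1, "Speaker A"), (2, "Speaker B")])
conn.executemany("INSERT INTO corpus_text VALUES (?, ?, ?, ?, ?)", [
    ("txt_001", "Life story", "personal history", "Village X", "2001-01-10"),
    ("txt_002", "Building a canoe", "procedural", "Village X", "2001-02-01"),
    ("txt_003", "A legend", "folk tale", "Village Y", "2001-03-15"),
])
conn.executemany("INSERT INTO participation VALUES (?, ?, ?)", [
    ("txt_001", 1, "speaker"), ("txt_001", 2, "recorder"),
    ("txt_002", 1, "speaker"), ("txt_003", 2, "speaker"),
])


def texts_by_person(conn, name, role="speaker"):
    """Return the IDs of all texts in which `name` took part in `role`."""
    rows = conn.execute(
        """SELECT t.text_id FROM corpus_text AS t
           JOIN participation AS p ON p.text_id = t.text_id
           JOIN person AS s ON s.person_id = p.person_id
           WHERE s.name = ? AND p.role = ?
           ORDER BY t.recorded_on""",
        (name, role),
    ).fetchall()
    return [row[0] for row in rows]


print(texts_by_person(conn, "Speaker A"))
```

Keeping people, texts, and their roles in separate tables is one possible design; it allows one person to appear in many recordings and in different roles without repeating their details.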

5.3. Raw and primary data

In their original unique form, spoken and written texts, i.e., recordings and manuscripts, are raw data, which in LD corpora need to be further processed to be of any use for either research or language maintenance efforts. In the case of audio and video recordings, Himmelmann (2012) distinguishes two processing stages which result in “primary data” and “structural data.” The first one involves the segmentation of the sound stream into utterance units, the transcription, and the translation. When audio and video recordings are processed in ELAN (Wittenburg et al. 2006; section 6.4), the primary data is organized on three annotation tiers that are time-aligned with the segmented sound stream:
1. the tier called utterance ID, on which the utterance units are labeled and numbered;
2. the transcription tier;
3. the translation tier.

Unless the corpus builders are native speakers trained in documentary linguistics, the creation of these three annotation tiers requires the close cooperation of linguists and native speaker research assistants.

The transcription of texts is done in a practical orthography for most texts in LD corpora, because phonetic transcriptions are very time consuming, are difficult to read for non-linguists, and are not practical for searches (see section 6.1). But for special purposes, subcorpora are transcribed in a phonological or phonetic transcription whose level of detail depends on the purpose and the intended users of this subcorpus. For the various kinds of transcriptions in LD projects, including transcriptions of multi-speaker and multi-lingual discourse, see Schultze-Berndt (2006, 219–232).

Transcribing recordings is one of the most important tasks in building a LD corpus. If the linguists are not native speakers, the first, preliminary transcription is, if possible, done by community members. Even if the native speakers are used to writing in their language, they need to be trained to accurately render the recording without changing the texts. Jung and Himmelmann (2011, 219) have observed, “. . . the transcription process itself, while often tedious and disliked by all parties involved, provides valuable insights into the linguistic knowledge of speakers: insertion, omission, or change of items show the range of the linguistic repertoire and (un)acceptable variation and thus complement the evidence otherwise gathered in elicitation tasks.” Consequently, a documentation of how native speakers change texts in their transcriptions would be a valuable resource for further research.

The practical orthography is not necessarily a standardized, totally consistent orthography, because standardization is such a complex endeavor that it can only be achieved in LD projects if special time and resources are available. The challenge of orthography development is that it has not only to consider linguistic problems but also pedagogical, socio-political, and technical aspects (Seifart 2006; Lüpke 2011).

The first translation of any text in the LD corpus will only be preliminary and needs to be revised later, because the meanings of words, lexical phrases, and constructions only become clear when a considerable number of texts have been analyzed and, if necessary, supplemented by elicited texts for the clarification of meanings or comments by native speakers. Evans and Sasse (2007, 73) remark that the process of translating is “fragmented and open ended, pointing both backward to earlier recordings, analyses and insights, and forward to questionings, analyses and attempts at translation that may continue to be worked through for a considerable and in principle unbounded time after the recording of the original vernacular text.” If the morphosyntax and the discourse structure of the documented source language and the target language are extremely different, a better understanding may be achieved by supplementing the free translation with a literal one with comments on the context (Schultze-Berndt 2006, 232–238).

In addition to audio and video recordings, the corpus may also contain texts that are written by native speakers. The handwritten manuscript of a legend or the first edited version of a legend would count as raw data, but when these handwritten documents are proofread and edited by other native speakers, translated, typed, and formatted to become part of the corpus, they are primary data. All primary data have in common that they are directly derived from raw data and that their annotation requires the linguistic competence of native speakers, so that these data reflect the native speakers’ observations and interpretations.

The term “structural data” is used by Himmelmann (2012, 199) for a wide range of kinds of data that are derived from the primary data: “descriptive statements, dictionary entry, interlinear glosses, frequency data, typological database, treebank, implicational universal.” Since Himmelmann’s notion of “structural data” does not distinguish between data that is annotated in the corpus and data that is presented otherwise, it is unsuitable for our purpose. Therefore, we speak of secondary annotation instead.

5.4. Secondary annotation

Secondary annotations have in common:
1. that their content is derived from the primary annotations and that they present some further analysis of the primary data to facilitate a better understanding of their form and meaning;
2. that they are aligned to the tiers of the primary data;
3. that they are done by linguists who do not need to be native speakers;
4. that they facilitate sophisticated searches for the investigation of particular research questions.

A widely practiced kind of secondary annotation is the morphological annotation, which adds two further tiers to the primary annotation. The first additional tier shows the morphological segmentation of the word forms, the second one the glosses of each morphological segment; for details see Schultze-Berndt (2006, 238–241). The abbreviations used for grammatical categories usually follow “The Leipzig Glossing Rules” (Bickel, Comrie, and Haspelmath 2004).

A more sophisticated annotation scheme, called GRAID (Grammatical Relations and Animacy in Discourse), has been developed by Haig and Schnell (Haig and Schnell 2011; Haig et al. 2011; Haig and Schnell 2016). In addition to the morphological segmentation and glossing, GRAID annotations show clause boundaries, zero arguments, syntactic functions of arguments, and the distinction between human and non-human arguments.

Apart from such morphological and syntactic annotations, tiers can be used for any kind of comments on the form or the content of the particular text segment they are aligned with.
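How primary and secondary annotation tiers hang together can be sketched as a simple data structure in Python. This is an illustration under stated assumptions, not the ELAN file format or its API: the class, the field names, and the helper function are hypothetical, the two utterances are invented English examples, and the glosses merely imitate Leipzig-style abbreviations. Each utterance carries the three primary tiers plus a segmentation tier and a gloss tier, and a small function searches any one tier with a regular expression.

```python
import re
from dataclasses import dataclass


@dataclass
class Utterance:
    utterance_id: str    # primary tier: label and number of the utterance unit
    start_ms: int        # time alignment with the recording
    end_ms: int
    transcription: str   # primary tier
    translation: str     # primary tier
    morphemes: list      # secondary tier: morphological segmentation
    glosses: list        # secondary tier: one gloss per morphological segment

    def __post_init__(self):
        # Segmentation and gloss tiers must line up segment by segment.
        if len(self.morphemes) != len(self.glosses):
            raise ValueError(f"{self.utterance_id}: tiers are misaligned")


def search_tier(utterances, tier, pattern):
    """Return the IDs of utterances whose `tier` matches the regex `pattern`."""
    regex = re.compile(pattern)
    hits = []
    for utt in utterances:
        value = getattr(utt, tier)
        items = value if isinstance(value, list) else [value]
        if any(regex.search(item) for item in items):
            hits.append(utt.utterance_id)
    return hits


# Invented English toy examples, glossed for illustration only.
corpus = [
    Utterance("demo_001", 0, 1500, "the dogs barked", "the dogs barked",
              ["the", "dog", "-s", "bark", "-ed"],
              ["DEF", "dog", "-PL", "bark", "-PST"]),
    Utterance("demo_002", 1500, 2800, "a dog barks", "a dog barks",
              ["a", "dog", "bark", "-s"],
              ["INDF", "dog", "bark", "-3SG"]),
]

print(search_tier(corpus, "glosses", r"PL$"))
print(search_tier(corpus, "transcription", r"\bbark\w*"))
```

Searching the gloss tier rather than the transcription tier is what distinguishes the plural -s in the first utterance from the homophonous third person -s in the second, which is the kind of query that secondary annotation is meant to make possible.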

6. Corpus analysis 6.1. Introduction In LD projects the two processes of corpus building and corpus analysis are not sharply separated, because when analyzing the corpus, the researchers may realize that they missed out on important data and, consequently, extend the corpus as illustrated below by two examples of the Teop LD project. The first example comes from the grammatical analysis of clause linkage, the second one from the definitions of plant names for the dic tionary. Both examples also show that the collection of grammatical data can also pro duce interesting data for a dictionary and that, vice versa, lexicographic work may lead to a better understanding of the grammar. This, however, does not come as a surprise, because as demonstrated by corpus linguistics, grammar and lexicon are not entirely dis tinct entities, as traditional linguistic theories have assumed. (see Conrad 2010, 229–231). When analyzing clause linkage in Teop, I realized that in narratives the sequence of actions is expressed by parataxis of the type they did X and then they did Y, but in procedural texts by constructions with adverbial clauses of the type when they have done X, then they do Y. Since the narratives and procedural texts were different in many other respects, I thought of creating a narrative and a procedural text about the very same event and thus reduce as many variables as possible. So I bought a rooster from a neighbor and asked him to butcher it, while I was taking photographs. He was helped by his four-year-old twins and his wife who watched the children. Later I used the series of photos as picture prompts for the story “How the twins helped their father butchering the rooster,” told by the mother, and for a procedural text about “How to butcher a rooster,” told by a woman who had not attended the butchering. 
The result is unambiguous: the procedural text contains nine adverbial clauses out of a total of forty clauses, whereas the narrative, which consists of fifty-three clauses, has none (Mosel 2014, 146–149). For the lexical database the butchering experiment provided names of the chicken’s body parts and six distinct cutting verbs.

The second example comes from our work on O Naono. English Plant Encyclopedia (Kamai et al. 2012), which like a monolingual dictionary defines word meanings in the source language Teop, but in addition provides an English translation of these definitions and, if possible, English translation equivalents of the plant names. The Teop definitions show that Teop is not an SVO-language, as we had assumed when analyzing narratives, but a verb-second language. The position before the verbal predicate is held by the topic, irrespective of its argument function and animacy. In the definition of plant names, the topic is the plant name, which may be either an intransitive subject or a transitive object (Mosel 2014, 151–153).

Neither a grammar nor a dictionary of a previously under-researched language can be comprehensive, but they should be adequate with respect to the data provided by the LD (see Good 2012, who makes the distinction between comprehensiveness and co-extensivity, i.e., adequacy with respect to the corpus data). This goal can only be achieved if the corpus is built with a tool like ELAN (see section 6.4), which facilitates annotations of primary and secondary data on several tiers and, correspondingly, searches for linguistic items on each of these tiers as well as any combination of tiers with the query language Regular Expressions (see Mosel 2012b; Kübler and Zinsmeister 2015, 207–216; Mosel 2015b).

Among other things, Regular Expressions facilitate complex searches on all annotation tiers for
1. words of a specific beginning or ending, e.g., all words starting with be- or all words ending with -ness,
2. discontinuous sequences of linguistic items, e.g., un_able as in unbelievable, unsearchable, or the ___ woman as in the old woman, the other woman, etc.,
3. two or more alternative expressions, such as all words that start with two vowels (air, eat, out), or all constructions in which look, looks, looked or looking is directly or at some distance followed by at.

Simultaneous searches on both the transcription and the translation tier are convenient if the words or constructions searched for are homonymous or polysemous and you only want to find them in one of their meanings in the corpus. But if you want to find all occurrences of a word form or construction with their translations, you simply insert the wildcard “.*”, which matches any number of characters, on the translation tier. And conversely, if you want to know how a word form or construction on the translation tier is expressed in the documented language, you insert the wildcard “.*” on the transcription tier and the particular word form or construction on the translation tier. For example, if you insert than on an English translation tier, you’ll find all comparative constructions translated by than in the corpus.
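The three search types and the two-tier search described above can be illustrated with a minimal sketch. It uses Python's standard `re` module rather than ELAN's own search dialog, whose regular-expression dialect may differ in detail. The sample sentence, the German and English toy units (standing in for a documented language and its translation tier), the field names `tx` and `ft`, and the helper names `find_all` and `search_tiers` are all assumptions made for illustration; they are not taken from the Teop corpus.

```python
import re

def find_all(pattern, text):
    """Return every non-overlapping match of pattern in text."""
    return re.findall(pattern, text)

# Toy English line standing in for one transcription segment (hypothetical data).
line = ("the old woman looked closely at an unbelievable act "
        "of kindness before eating out")

word_patterns = {
    # 1. words with a specific beginning or ending
    "starts with be-": r"\bbe\w*",
    "ends with -ness": r"\b\w+ness\b",
    # 2. discontinuous sequences
    "un_able": r"\bun\w+able\b",
    "the ___ woman": r"\bthe \w+ woman\b",
    # 3. alternatives: a vowel class, and look/looks/looked/looking
    #    followed by 'at' directly or after some intervening words
    "two initial vowels": r"\b[aeiou]{2}\w*",
    "look ... at": r"\blook(?:s|ed|ing)?\b(?:\s+\w+)*?\s+at\b",
}

for label, pattern in word_patterns.items():
    print(f"{label}: {find_all(pattern, line)}")

# Each unit pairs a transcription tier (tx) with a free-translation tier (ft).
# German serves here as a stand-in for the documented language.
units = [
    {"tx": "Er ist größer als sie.", "ft": "He is taller than her."},
    {"tx": "Das Schloss ist alt.", "ft": "The castle is old."},
    {"tx": "Das Schloss klemmt.", "ft": "The lock is stuck."},
]

def search_tiers(units, tx=".*", ft=".*"):
    """Keep units whose tiers match both patterns; '.*' acts as the wildcard."""
    return [u for u in units
            if re.search(tx, u["tx"]) and re.search(ft, u["ft"])]

# A homonymous word in only one of its meanings: Schloss as 'lock', not 'castle'.
print(search_tiers(units, tx=r"\bSchloss\b", ft=r"\block\b"))
# All units whose translation contains 'than', whatever the transcription says.
print(search_tiers(units, ft=r"\bthan\b"))
```

Leaving one tier at its default `.*` reproduces the wildcard strategy described in the text: the constrained tier selects the units and the other tier is simply returned alongside it.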


6.2. Grammatical analysis and description

In recent years many branches of theoretical linguistics have become interested in grammatical descriptions of endangered languages because these often belong to non-European language families that are under-researched. Although the grammars describing such languages are based on elicitations and texts that were collected during fieldwork, many writers and publishers are not aware of the fact that scientific accountability of grammars can only be achieved if the data are retrievable and the corpus is accessible so that the analysis is in principle replicable. In many grammars of the renowned series Mouton Grammar Library, for example, the origin of examples is not indicated by explicit references to an accessible corpus so that the reliability of the grammatical descriptions can be questioned.

Corpus-based grammaticography as it is now practiced in many LD projects has the following advantages:
1. linguistic phenomena can be described with reference to their context and co-text;
2. the data on which the grammatical description is based are retrievable so that the analysis and description can be scrutinized (Thieberger 2009; Mosel 2014);
3. the method of searching for specific linguistic forms can be made explicit so that it becomes replicable (see section 6.1);
4. the frequency of specific linguistic forms can be stated in exact figures and be statistically analyzed and interpreted (Haig et al. 2011; Mosel 2014, 139; Haig and Schnell 2016);
5. the grammatical description can, and should, indicate which genre and register the examples come from (see section 3);
6. the grammatical description can state which research questions cannot be answered on the basis of the corpus data and indicate what kind of corpus data are missing;
7. the description of the syntax of spoken language varieties can describe the interaction of syntactic structures and prosodic phenomena (Himmelmann 2006b; Simard and Schultze-Berndt 2011; Schultze-Berndt and Simard 2012);
8. digital grammatical descriptions can be linked to the corpus of recordings with aligned transcriptions and translations (Thieberger 2009).
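Point 4 can be made concrete with the clause counts reported in section 6.1 for the two Teop rooster texts (nine adverbial clauses in forty clauses of the procedural text, none in fifty-three clauses of the narrative). The following is only a sketch: the function names are hypothetical, and the one-sided Fisher exact test is an illustrative choice of statistic, not the method used in the cited studies. Clauses within one text are also not independent observations, so the resulting p-value should be read as a rough indication at most.

```python
from math import comb

def rate_per_100(hits, total):
    """Frequency of a form normalized to 100 clauses."""
    return 100 * hits / total

def fisher_one_sided(a, b, c, d):
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] with fixed margins."""
    row1 = a + b          # clauses in the first text
    col1 = a + c          # adverbial clauses in both texts together
    n = a + b + c + d     # all clauses
    upper = min(row1, col1)
    ways = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return ways / comb(n, col1)

# Counts taken from section 6.1: (adverbial clauses, all clauses).
procedural = (9, 40)
narrative = (0, 53)

print(rate_per_100(*procedural))   # adverbial clauses per 100 clauses
print(rate_per_100(*narrative))

p = fisher_one_sided(
    procedural[0], procedural[1] - procedural[0],
    narrative[0], narrative[1] - narrative[0],
)
print(p)
```

Because the counts and the search that produced them are both explicit, another researcher with access to the corpus can rerun the query and check the figures.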

6.3. Lexical analysis and lexicography

When translating the texts of an LD corpus, the corpus builders usually create a lexical database for words, morphemes, and multi-word expressions with their glosses and/or translation equivalents. The tools most commonly used are Toolbox or FLEx (FieldWorks Language Explorer), which have been developed by SIL International as data management and analysis tools for field linguists (see section 6.4). A lexical database of this kind can function as the basis of an electronic or a print dictionary, but the work of making a comprehensive dictionary should not be underestimated, because each lexical item requires a thorough corpus-based morphosyntactic and semantic analysis and a careful selection of examples that illustrate its various meanings and constructions (see Rehg, Chapter 13, this volume; Mosel 2016). Therefore, I recommend starting the lexicographic work with the creation of small dictionaries that only cover the lexical items of particular themes, preferably those the knowledge of which is in danger of being forgotten by the younger generations (Mosel 2011).

6.4. Tools for building and analyzing LD corpora

AntConc, a freeware corpus analysis toolkit for concordancing and text analysis. www.laurenceanthony.net/software/AntCon/. Accessed Oct. 10, 2016.
ELAN. Max Planck Institute for Psycholinguistics, The Language Archive, Nijmegen, The Netherlands. http://tla.mpi.nl/tools/tla-tools/elan/. Accessed Oct. 10, 2016.
Field Manuals and Stimulus Materials of the Language and Cognition Department of the Max Planck Institute for Psycholinguistics. Accessed October 10, 2016. http://fieldmanuals.mpi.nl/. Accessed June 26, 2016.
Field Linguist’s Toolbox (SIL International), which is now superseded by FLEx (see below), but still preferred by many researchers. http://www-01.sil.org/computing/toolbox/?_ga=GA1.2.1598941485.1456327377. Accessed October 10, 2016.
FieldWorks Language Explorer (FLEx) (SIL International). http://fieldworks.sil.org/flex/. Accessed Oct. 10, 2016.

7. Concluding remarks

The corpora of LD projects are heterogeneous with respect to their size, content, and structure (section 4), but they have a number of features in common that distinguish them from corpora of major languages:
1. Their aim is to contribute to the conservation of endangered languages and the speech communities’ knowledge of their cultural heritage insofar as it can be encoded in language (section 1).
2. The languages represented in LD corpora are under-researched and spoken by small speech communities.
3. The texts of the corpora are collected in fieldwork projects by teams that typically consist of a non-native speaker linguist and native speakers who are not linguists but receive some training during the project. The collections may be opportunistic or follow a rigorous sampling frame (section 2.3).
4. The potential users of LD corpora are not only linguists but also researchers of other fields of the humanities and social sciences and the speech communities (sections 1, 3.2, 6.1, 6.3).
5. The corpora are multimodal corpora (section 2.8) as they consist of audio and video recordings with time-aligned transcriptions and translations (section 5.3) and possibly further annotations giving information about the form and the meanings of the utterances (section 5.4).
6. Most corpora contain texts of various types, for example folk tales, procedural texts, and descriptions, which form contrastive subcorpora (section 2.4), but to what extent these text types can be classified as genres and registers is problematic and an issue to be investigated by corpus linguistic methods (section 3). A special type of text is elicited data. It can be integrated with its metadata into an LD corpus as a subcorpus (section 2.5) and included in corpus searches for grammatical or lexical items.
7. Texts of different languages that belong to similar text types and are annotated in the same way can be put together as a comparative corpus (section 2.6) and used for comparative research.

The corpora of language documentation projects provide an unprecedented variety of data from un- or under-researched languages that are accessible via the internet and facilitate analyses and descriptions that can be scrutinized and are replicable (section 6).

References

Aijmer, Karin. 2008. “Parallel and Comparable Corpora.” In Corpus Linguistics. An International Handbook, edited by Anke Lüdeling and Merja Kytö, 275–291. Berlin and New York: Mouton de Gruyter.
Austin, Peter. 2006. “Data and Language Documentation.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus Himmelmann, and Ulrike Mosel, 87–112. Berlin and New York: Mouton de Gruyter.
Baker, Paul, Andrew Hardie, and Tony McEnery, eds. 2006. A Glossary of Corpus Linguistics. Edinburgh: Edinburgh University Press.
Baumann, Richard. 2001. “Genre.” In Key Terms in Language and Culture, edited by Alessandro Duranti, 79–82. Oxford: Blackwell Publishing.
Biber, Douglas. 2010. “What Can a Corpus Tell Us About Registers and Genres?” In The Routledge Handbook of Corpus Linguistics, edited by Anne O’Keeffe and Michael McCarthy, 241–254. Abingdon, UK: Routledge.
Biber, Douglas and Susan Conrad. 2009. Register, Genre, and Style. Cambridge: Cambridge University Press.
Bickel, Balthasar, Bernard Comrie, and Martin Haspelmath. 2004. “The Leipzig Glossing Rules. Conventions for Interlinear Morpheme by Morpheme Glosses.” Leipzig: Max Planck Institute for Evolutionary Anthropology. Last modified May 31, 2015. https://www.eva.mpg.de/lingua/pdf/Glossing-Rules.pdf. Accessed October 10, 2016.

Bressem, Jana. 2014. “Transcription Systems for Gestures, Speech, Prosody, Postures, Gaze.” In Body-Language-Communication: An International Handbook on Multimodality in Human Interaction (Handbooks of Linguistics and Communication Science 38.1), edited by Cornelia Müller, Alan Cienki, Ellen Fricke, Silva H. Ladewig, David McNeill, and Sedinha Teßendorf, 1037–1058. Berlin and Boston: Mouton de Gruyter.
Chafe, Wallace. 1975. The Pear Film. http://www.linguistics.ucsb.edu/faculty/chafe/pearfilm.htm. Accessed June 1, 2016.
Chelliah, Shobhana L. and Willem J. De Reuse. 2011. Handbook of Descriptive Fieldwork. Dordrecht, Heidelberg, London, and New York: Springer.
Conathan, Lisa. 2011. “Archiving and Language Documentation.” In The Cambridge Handbook of Endangered Languages, edited by Peter Austin and Julia Sallabank, 235–254. Cambridge: Cambridge University Press.
Conrad, Susan. 2010. “What Can a Corpus Tell Us About Grammar?” In The Routledge Handbook of Corpus Linguistics, edited by Anne O’Keeffe and Michael McCarthy, 227–240. Abingdon, UK: Routledge.
Duranti, Alessandro. 1997. Linguistic Anthropology. Cambridge: Cambridge University Press.
Dwyer, Arienne. 2006. “Ethics and Practicalities of Cooperative Fieldwork and Analysis.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus Himmelmann, and Ulrike Mosel, 31–66. Berlin and New York: Mouton de Gruyter.
Evans, Nicholas. 2008. “Review of Gippert, Jost, Nikolaus Himmelmann, and Ulrike Mosel. 2006. Essentials of Language Documentation. Berlin, New York: Mouton de Gruyter.” Language Documentation & Conservation 2(2): 340–350. http://scholarspace.manoa.hawaii.edu/bitstream/10125/4353/7/evans.pdf. Accessed October 10, 2016.
Evans, Nick and Hans-Jürgen Sasse. 2007. “Searching for Meaning in the Library of Babel: Field Semantics and Problems of Digital Archiving.” In Language Documentation and Description, vol. 4, edited by Peter K. Austin, 58–99. London: Hans Rausing Endangered Languages Project, Department of Linguistics, School of Oriental and African Studies.
Foley, William A. 1997. Anthropological Linguistics. Oxford: Blackwell Publishers.
Foley, William A. 2003. “Genre, Register and Language Documentation in Literate and Preliterate Communities.” In Language Documentation and Description, vol. 1, edited by Peter K. Austin, 85–98. London: Hans Rausing Endangered Languages Project, Department of Linguistics, School of Oriental and African Studies.
Gippert, Jost, Nikolaus Himmelmann, and Ulrike Mosel, eds. 2006. Essentials of Language Documentation. Berlin and New York: Mouton de Gruyter.
Good, Jeff. 2011. “Data and Language Documentation.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 212–234. Cambridge: Cambridge University Press.
Good, Jeff. 2012. “Deconstructing Descriptive Grammars.” In Electronic Grammaticography, edited by Sebastian Nordhoff, 2–32 (Language Documentation & Conservation Special Publication No. 4). Honolulu: University of Hawai‘i Press. http://scholarspace.manoa.hawaii.edu/bitstream/10125/4528/1/good.pdf. Accessed October 10, 2016.
Haig, Geoffrey and Stefan Schnell. 2011. Annotations Using GRAID (Grammatical Relations and Animacy in Discourse): Introduction and Guidelines for Annotators. Version 7.0. https://www.uni-bamberg.de/fileadmin/aspra/Publications/GRAID7.0_manual.pdf. Accessed Oct. 10, 2016.
Haig, Geoffrey and Stefan Schnell. 2016. “The Discourse Basis of Ergativity Revisited.” Language 92(3): 591–618. https://muse.jhu.edu/issue/33989. Accessed Oct. 10, 2016.

Haig, Geoffrey, Stefan Schnell, and Claudia Wegener. 2011. “Comparing Corpora from Endangered Languages: Explorations in Language Typology Based on Original Texts.” In Documenting Endangered Languages. Achievements and Perspectives, edited by Geoffrey L. J. Haig, Nicole Nau, Stefan Schnell, and Claudia Wegener, 55–86. Berlin and Boston: Mouton de Gruyter.
Hellwig, Birgit. 2006. “Field Semantics and Grammar-Writing: Stimuli-Based Techniques and the Study of Locative Verbs.” In Catching Language. The Standing Challenge of Grammar Writing, edited by Felix K. Ameka, Alan Dench, and Nicholas Evans, 321–358. Berlin and New York: Mouton de Gruyter.
Himmelmann, Nikolaus. 1998. “Documentary and Descriptive Linguistics.” Linguistics 36: 161–195.
Himmelmann, Nikolaus. 2006a. “Language Documentation: What Is It and What Is It Good For?” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus Himmelmann, and Ulrike Mosel, 1–30. Berlin and New York: Mouton de Gruyter.
Himmelmann, Nikolaus. 2006b. “Prosody in Language Documentation.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus Himmelmann, and Ulrike Mosel, 163–185. Berlin and New York: Mouton de Gruyter.
Himmelmann, Nikolaus. 2012. “Linguistic Data Types and the Interface Between Language Documentation and Description.” Language Documentation & Conservation 6: 187–207. https://scholarspace.manoa.hawaii.edu/bitstream/10125/4503/1/himmelmann.pdf. Accessed October 10, 2016.
Hunston, Susan. 2002. Corpora in Applied Linguistics. Cambridge: Cambridge University Press.
Jung, Dagmar and Nikolaus Himmelmann. 2011. “Retelling Data.” In Documenting Endangered Languages. Achievements and Perspectives, edited by Geoffrey L. J. Haig, Nicole Nau, Stefan Schnell, and Claudia Wegener, 201–220. Berlin and Boston: Mouton de Gruyter.
Kübler, Sandra and Heike Zinsmeister. 2015. Corpus Linguistics and Linguistically Annotated Corpora. London, New Delhi, New York, and Sydney: Bloomsbury.
Lehmberg, Timm and Kai Wörner. 2008. “Annotation Standards.” In Corpus Linguistics. An International Handbook, edited by Anke Lüdeling and Merja Kytö, 484–501. Berlin and New York: Mouton de Gruyter.
Lüpke, Friederike. 2009. “Data Collection Methods for Field-Based Language Documentation.” In Language Documentation and Description, vol. 6, edited by Peter K. Austin, 53–100. London: Hans Rausing Endangered Languages Project, Department of Linguistics, School of Oriental and African Studies.
Lüpke, Friederike. 2011. “Orthography Development.” In The Cambridge Handbook of Endangered Languages, edited by Peter Austin and Julia Sallabank, 312–336. Cambridge: Cambridge University Press.
Magum, Horai, Joyce Maion, Jubilee Kamai, Ondria Tavagaga, Ulrike Mosel, and Yvonne Thiessen, eds. 2007. Amaa Vahutate vaa Teapu (illustrated by Rodney Rasin). Kiel: Seminar für Allgemeine und Vergleichende Sprachwissenschaft, Christian-Albrechts-Universität (Teop folk tales in the Teop language).
Kamai, Jubilee, Owen Kasinori, Enoch Horai Magum, Leah Arovi Magum, Joyce Maion, Naphtali Maion, Janet Nasin, Ruth Sima Rigamu, Ruth Saovana Spriggs, Ondria Tavagaga, and Jeremiah Vaabero with Ulrike Mosel, Marcia Schwartz, and Yvonne Schuth. 2012. O Naono. The Teop-English Plant Encyclopedia. Kiel: ISFAS, Christian-Albrechts-Universität.

Majid, Asifa. 2012. “A Guide to Stimulus-Based Elicitation for Semantic Categories.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 54–71. Oxford: Oxford University Press.
Mayer, Mercer. 1969. Frog Where Are You? https://www.philfak.uni-duesseldorf.de/fileadmin/Redaktion/Institute/Allgemeine_Sprachwissenschaft/Frogstory-2_01.pdf. Accessed October 10, 2016.
McEnery, Tony and Andrew Hardie. 2012. Corpus Linguistics. Cambridge: Cambridge University Press.
Mosel, Ulrike. 2006. “Fieldwork and Community Language Work.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus Himmelmann, and Ulrike Mosel, 67–85. Berlin and New York: Mouton de Gruyter.
Mosel, Ulrike. 2011. “Lexicography in Endangered Language Communities.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 337–353. Cambridge: Cambridge University Press.
Mosel, Ulrike. 2012a. “Morphosyntactic Analysis in the Field: A Guide to the Guides.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 72–89. Oxford: Oxford University Press.
Mosel, Ulrike. 2012b. “Advances in the Accountability of Grammatical Analysis and Description by Using Regular Expressions.” In Electronic Grammaticography, edited by Sebastian Nordhoff, 236–250 (Language Documentation & Conservation Special Publication No. 4). Honolulu: University of Hawai‘i Press. https://scholarspace.manoa.hawaii.edu/bitstream/10125/4537/1/mosel.pdf. Accessed October 10, 2016.
Mosel, Ulrike. 2014. “Corpus Linguistic and Documentary Approaches in Writing a Grammar of a Previously Undescribed Language.” In The Art and Practice of Grammar Writing, edited by Toshihide Nakayama and Keren Rice, 236–250 (Language Documentation & Conservation Special Publication No. 8). Honolulu: University of Hawai‘i Press. https://scholarspace.manoa.hawaii.edu/bitstream/10125/4589/1/9_Mosel.pdf. Accessed October 10, 2016.
Mosel, Ulrike. 2015a. “Putting Oral Narratives into Writing: Experiences from a Language Documentation Project in Bougainville, Papua New Guinea.” In Language Contact and Documentation [Contacto lingüístico y documentación], edited by Bernard Comrie and Lucia Golluscio, 321–342. Berlin, Munich, and Boston: Mouton de Gruyter.
Mosel, Ulrike. 2015b. “Searches with Regular Expressions in ELAN Corpora.” http://www.isfas.uni-kiel.de/de/linguistik/mitarbeiter/mosel/publications_mosel/elan-regular-expressions. Accessed October 10, 2016.
Mosel, Ulrike. 2016. “Dictionaries of Under-researched Languages.” In Linguistic Fieldwork and Language Documentation, A Course Book on Foundational Skills, edited by Firmin Ahoua, Dafydd Gibbon, and Stavros Skopeteas. http://www.uni-bielefeld.de/lili/forschung/ag_fachber/as/p07/chapters/D01MOSEL.pdf. Accessed October 10, 2016.
Newman, Paul. 2012. “Copyright and Other Legal Concerns.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 430–456. Oxford: Oxford University Press.
Ostler, Nicholas. 2008. “Corpora of Less Studied Languages.” In Corpus Linguistics. An International Handbook, edited by Anke Lüdeling and Merja Kytö, 457–483. Berlin and New York: Mouton de Gruyter.
Reiter, Sabine. 2011. “Ideophones in Awetí.” PhD diss., Kiel University, Germany. http://macau.uni-kiel.de/servlets/MCRFileNodeServlet/dissertation_derivate_00004920/Ideophones_Aweti.pdf;jsessionid=F1606ADBFE937D9376D8CB11A8143D58. Accessed October 10, 2016.

Rice, Keren. 2012. “Ethical Issues in Linguistic Fieldwork.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 407–429. Oxford: Oxford University Press.
Rum, Reuben. 2014. “A tootoo tenaa ae amaa kiu tenaa. My Life and Work.” In Amaa moon bara otei vaa Teapu [The Life and Work of Teop women and men], recorded, edited, and translated by Jubilee Kamai, Enoch Horai Magum, Shalom Magum, Joyce Maion, Ulrike Mosel, Yvonne Schuth, Ruth Saovana Spriggs, and Ondria Tavagaga, 57–63. Kiel: ISFAS. https://www.isfas.uni-kiel.de/de/linguistik/mitarbeiter/mosel/publications_mosel/amaa-moon-bara-otei-vaa-teapu. Accessed June 20, 2016.
Schultze-Berndt, Eva. 2006. “Linguistic Annotation.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus Himmelmann, and Ulrike Mosel, 163–181. Berlin and New York: Mouton de Gruyter.
Schultze-Berndt, Eva and Candide Simard. 2012. “Constraints on Noun Phrase Discontinuity in an Australian Language: The Role of Prosody and Information Structure.” Linguistics 50(5): 1015–1058. http://eprints.soas.ac.uk/21953/. Accessed October 10, 2016.
Seifart, Frank. 2006. “Orthography Development.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus Himmelmann, and Ulrike Mosel, 275–299. Berlin and New York: Mouton de Gruyter.
Seifart, Frank. 2008. “On the Representativeness of Language Documentations.” In Language Documentation and Description, vol. 6, edited by Peter K. Austin, 60–76. London: Hans Rausing Endangered Languages Project, Department of Linguistics, School of Oriental and African Studies.
Seifart, Frank. 2011. “Competing Motivations for Documenting Endangered Languages.” In Documenting Endangered Languages. Achievements and Perspectives, edited by Geoffrey L. J. Haig, Nicole Nau, Stefan Schnell, and Claudia Wegener, 17–32. Berlin and Boston: Mouton de Gruyter.
Senft, Gunter. 2010. The Trobriand Islanders’ Way of Speaking. Berlin and New York: Mouton de Gruyter.
Seyeddinipur, Mandana. 2012. “Reasons for Documenting Gestures and Suggestions for How to Go About It.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 147–165. Oxford: Oxford University Press.
Simard, Candide and Eva Schultze-Berndt. 2011. “Documentary Linguistics and Prosodic Evidence for the Syntax of Spoken Language.” In Documenting Endangered Languages. Achievements and Perspectives, edited by Geoffrey L. J. Haig, Nicole Nau, Stefan Schnell, and Claudia Wegener, 151–176. Berlin and Boston: Mouton de Gruyter.
Thieberger, Nicholas. 2009. “Steps Toward a Grammar Embedded in Data.” In New Challenges in Typology: Transcending the Borders and Refining the Distinctions, edited by Patricia Epps and Alexandre Arkhipov, 389–408. Berlin and New York: Mouton de Gruyter.
Thieberger, Nicholas and Andrea Berez. 2012. “Linguistic Data Management.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 90–118. Oxford: Oxford University Press.
Thieberger, Nicholas, Anna Margetts, Stephen Morey, and Simon Musgrave. 2015. “Assessing Annotated Corpora as Research Output.” Australian Journal of Linguistics 36: 1–21.
Tognini Bonelli, Elena. 2010. “Theoretical Overview of the Evolution of Corpus Linguistics.” In The Routledge Handbook of Corpus Linguistics, edited by Anne O’Keeffe and Michael McCarthy, 14–27. Abingdon, UK: Routledge.
Trilsbeek, Paul and Peter Wittenburg. 2006. “Archiving Challenges.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus Himmelmann, and Ulrike Mosel, 311–335. Berlin and New York: Mouton de Gruyter.

Wittenburg, Peter. 2008. “Processing Multimodal Corpora.” In Corpus Linguistics. An International Handbook, edited by Anke Lüdeling and Merja Kytö, 664–685. Berlin and New York: Mouton de Gruyter.
Wittenburg, Peter, Hennie Brugman, Albert Russel, Alex Klassmann, and Han Sloetjes. 2006. “ELAN: a Professional Framework for Multimodality Research.” In Proceedings of LREC 2006, Fifth International Conference on Language Resources and Evaluation. http://www.lrec-conf.org/proceedings/lrec2006/pdf/153_pdf.pdf. Accessed October 10, 2016.
Woodbury, Anthony. 2003. “Defining Documentary Linguistics.” In Language Documentation and Description, vol. 1, edited by Peter K. Austin, 35–51. London: Hans Rausing Endangered Languages Project, Department of Linguistics, School of Oriental and African Studies.
Woodbury, Anthony. 2011. “Language Documentation.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 337–353. Cambridge: Cambridge University Press.

References to Corpora on the internet

Awetí dobes.mpi.nl/projects/aweti/. Accessed June 24, 2016.

Beaver dobes.mpi.nl/projects/beaver/. Accessed June 24, 2016.

Chintang dobes.mpi.nl/projects/chintang/. Accessed June 24, 2016.

Dokumentation Bedrohter Sprachen Archive (DoBeS-Archive) dobes.mpi.nl/projects/. Accessed June 24, 2016.

Endangered Languages Archive (ELAR) www.elar-archive.org. Accessed June 24, 2016.

Arandic Songs: Aboriginal Verbal Art in Central Australia https://elar.soas.ac.uk/Collection/MPI1016619. Accessed October 16, 2016.

Ju|’hoan Audio and Video Material 1970 to Present, Work in Progress https://elar.soas.ac.uk/Collection/MPI854174. Accessed October 16, 2016.

Morehead dobes.mpi.nl/projects/morehead. Accessed June 24, 2016.

Multilingual Corpus of Annotated Spoken Texts (Multi-CAST) https://lac.uni-koeln.de/en/category/multi-cast/. Accessed January 31, 2018.

Parallel Corpora in Languages of the Greater Himalayas http://himalco.huma-num.fr/corpus/comparable/index.htm. Accessed June 24, 2016.

Saliba dobes.mpi.nl/projects/saliba/. Accessed June 24, 2016.

Savosavo dobes.mpi.nl/projects/. Accessed June 24, 2016.

Teop dobes.mpi.nl/projects/teop/. Accessed June 24, 2016.

Unangam Tunuu (Aleut Language) Conversation Corpus https://elar.soas.ac.uk/Collection/MPI78647. Accessed October 16, 2016.

Chapter 12

Writing Grammars of Endangered Languages

Amber B. Camp, Lyle Campbell, Victoria Chen, Nala H. Lee, Matthew Lou-Magnuson, and Samantha Rarrick

1. Introduction

This chapter aims to bring together and present recommendations for writing grammars, particularly for the purpose of documenting little-known and often endangered languages. While there are a number of publications that present recommendations for writing grammars (see the bibliography), none of these is comprehensive. The goal of this chapter is to present a full set of best-practice guidelines for grammar writing with specific recommendations. These recommendations are based on surveys of (1) a number of exemplary grammars, (2) various questionnaires aimed at aiding fieldwork and guiding grammar preparation, and (3) various publications that make recommendations for grammar writing.1

1 This chapter is a much revised version of the document, “Best-practices recommendations for grammar writing,” written as a team effort in the Grammar Writing graduate linguistics course taught by Lyle Campbell at the University of Hawai’i at Mānoa, Fall 2012.

A word of clarification about the role of grammars in language documentation is in order here at the outset. Some scholars follow earlier views of Himmelmann (1998, 2006) that “language documentation may be characterized as radically expanded text collection” (Himmelmann 1998, 2). They hold that grammars and dictionaries belong rather to language description and are not essential to language documentation. We, however, agree with Rehg (2007, 15; 2014, 53–55), Rhodes et al. (2007, 3), Campbell (2016, 249–251), Himmelmann (2012), and many others that grammars and dictionaries are valuable, if not central, to documenting a language (see Rhodes and Campbell, Chapter 5, this volume). A large text corpus is necessary but not sufficient for adequate language documentation. Kenneth Rehg tells of a project in Melanesia where scholars collected many hours of recordings but analysis of the data was left until after their return to the university, “whereupon they discovered that they did not have a single example of how to ask a question” (Rehg 2014, 56). Rehg cites as a basic rule-of-thumb:

analysis must be an on-going task, so that one has a clearer idea of where the holes are in the data. The idea that one could collect a sufficiently large corpus that would provide answers to any question one might have about the data is simply unrealistic. Analysis provides a check on the adequacy of the data, and writing a description of the data provides a check on the adequacy of one’s analysis. (Rehg 2014, 56)

In short, an adequate grammar is essential to and an integral part of adequate documentation of a language.

2. Before you begin

Before one begins writing the grammar, some decisions are needed involving such things as the goal, audience, kind of grammar, and so on. These are addressed in this section.

2.1. Audience and purpose

The first priorities are to determine the purpose of the grammar and its intended audience; the two are inherently interconnected.2 Grammars can be written for various purposes: to contribute to the documentation of a language, particularly a little-known language, to provide a theoretical analysis of the structure of the language, to facilitate learning of the language, etc., as seen in the kinds of grammars discussed below (see section 2.2). Grammars can also have varied kinds of intended audiences. Is the target audience for the grammar primarily linguists, other scholars, or members of the language community? Is the grammar intended for language learning or for a more specialized audience? Often grammar writers have both professional linguists and members of the language community in mind; however, it is difficult to succeed in reaching both with a single grammar (cf. Noonan 2005, 114–115; Payne 2014). For this reason, sometimes it is appropriate, perhaps even important, to attempt to provide two grammars: one that is more technical and written in English or the dominant language of the country where the language is spoken (Spanish, Portuguese, Russian, etc.), and another that is more practical, written in the dominant language or a local language of broader communication, and aimed at members of the language community, local teachers, and others who can benefit from a non-technical, less scientific presentation of the language (see section 2.2.5).

2 We assume here that the grammar writer already has selected the language of which a grammar is to be written. For advice about language selection and preparation for grammar writing, especially in the context of a grammar as a PhD dissertation, see Pawley (2014); see also Hauk and Heaton (2018). Also, here we do not focus on fieldwork or on how data for the grammar are obtained (whether by fieldwork, corpus analysis, combinations of these, etc.), but rather on its representation and presentation within a grammar.

2.2. Kinds of grammars The intended audience, the grammar’s purpose, and other factors such as the amount of time available for the task and the amount of information to be presented determine the type of grammar to be written. A brief description of some of the main kinds of grammars follows. These kinds of grammars are not necessarily fully distinct from one another and sometimes overlap. In discussion of kinds of grammars it can be difficult to differentiate between kinds of grammars per se and approaches to writing grammars; the discussion here involves both the kinds of grammars and approaches to writing grammars.

2.2.1. Descriptive grammar Descriptive grammars are not concerned with notions of “proper” or “correct” language; rather, they provide grammatical analyses based on actual usage instead of prescriptions about assumed proper usage.3 Descriptive grammars are contrasted with prescriptive grammars. A prescriptive grammar deals with what is believed to be correct grammar, combatting what it views as incorrect. That is, it is concerned with “good” or “bad” language usage; it describes the language according to how some people or institutions think it should be used. Prescriptive grammarians tend to provide guidelines to ensure that language users avoid making “errors.” Reference grammars and sketch grammars can have prescriptive tendencies, but usually do not; these grammars are typically descriptive grammars (see below). The prescriptive/descriptive distinction is not generally of much significance for writing a grammar of a language that has not been standardized and where there might be little or no tradition of literacy.4

Grammars of endangered languages or of previously less well-documented languages are mostly descriptive grammars of a spoken language that usually lacks a strong tradition of writing (Mosel 2006b, 42). In this sense, a descriptive grammar and a reference grammar (see below) may not be distinct. Grammars that contribute to documenting languages are typically written in a language of broader communication, not in the target language itself. The term “traditional grammar” refers to grammars that utilize grammatical concepts derived from the Greek and Latin traditions of describing grammar; traditional grammars are used mostly in language education, and are very often prescriptive in nature.

3 The term “descriptive grammar” thus is used in two senses. (1) It is a kind of grammar itself, which aims at grammatical analysis based on actual usage rather than prescriptions about assumed proper usage (a noun). (2) It refers to an attribute of grammars, being descriptive and not prescriptive (an adjective).

4 However, there is a sense in which even “descriptive” grammars of such languages can be taken as to some extent prescriptive, since they are typically based on one major dialect, on the speech of individuals judged (by the community or by the grammar writer, or by both) to be “good” speakers of the language. To the extent that a grammar is well-received by the community whose language it describes, it may become prescriptive for them, whether intended or not. Typically it is intended that a grammar to document a language be non-prescriptive and that it describe the language as actually used. However, such grammars are almost always also normative, that is, they describe norms of usage, and the very codification of these norms in a grammar can make it seem to some that this is the way they should speak, since typically descriptive grammars do not have the luxury of representing the range of variation in actual language use.

2.2.2. Reference grammar A reference grammar describes the structure of a language in depth and typically illustrates the constructions of the language with ample examples. Reference grammars are usually intended for linguists and other scholars with some general knowledge of grammar. They are typically more comprehensive than most other kinds of grammars.

2.2.3. Sketch grammar A sketch grammar is usually shorter than a reference grammar. The amount and type of information included in a sketch grammar depend on the intended audience or readership. Some different kinds of sketch grammars, following Mosel (2006a, 30), are:
1. A preliminary grammar, limited in scope and often based on a small corpus.
2. A grammar overview as a chapter or appendix that accompanies some other publication.
3. A summary of a larger grammar for specialized publications (for example, in handbooks that are concerned with several languages, or works that contain chapters on other aspects of a people or their culture, etc.).
4. A grammar overview or outline included with a dictionary.
5. A sketch grammar as part of a language documentation project aimed at collecting abundant primary information on and texts in the language, a foundation from which to work toward a more comprehensive reference grammar.
(See Mosel 2006a for discussion of sketch grammars generally.)

2.2.4. Pedagogical grammar A pedagogical grammar is a grammar for learners of the language. It is usually ordered for the ease of learning, so that simpler aspects of the language are introduced before more complex ones. Usually, a pedagogical grammar also includes exercises designed to facilitate language learning.

2.2.5. Practical grammar (community grammar) A practical grammar (also called a community grammar) is one that is not prepared for linguists or scholars but for those with little experience with grammars or descriptions of languages. It is especially intended to be accessible to and useful for members of the community or group whose language is described in the grammar, and for educators and others who work with people whose language is the subject of the grammar (cf. Noonan 2005; see Rehg 2014). Some language documentation projects take it as a given that they should provide both a more scholarly grammar and a more practical/community-oriented grammar. Some grammars have attempted to accommodate the various needs and interests of these different users by having a section of “notes to linguists” at the end of each chapter or as appendices, where these notes provide information likely to be more specifically of interest to linguists, keeping the bulk of the description more accessible to general readers (see Rehg 2014 for general treatment of community grammars).

2.2.6. Other grammars Some other types of “grammar” (including approaches to grammar, not necessarily different kinds of grammar per se) that are of less relevance here include the following.

Comparative grammar is not so much a kind of grammar but rather refers more to the examination in historical-comparative linguistics of what related languages have in common. The term “comparative grammar” has been used also to refer to the scholarly enterprise that determines the relationships among related languages by comparing the forms in their grammars. In more recent times, the term “comparative grammar” has also been used to refer to the activity of making cross-linguistic comparisons among grammatical structures for theoretical purposes; however, most scholars avoid this use of the term, since the term “comparative grammar” is mainly associated with historical-comparative linguistics and any other use of it is confusing.

Formal grammar refers to the treatment primarily of syntax in formal, theoretical approaches to linguistics, and is usually associated with generative grammar and approaches that derive from it. Some theoretical approaches to grammar (mostly syntax, not necessarily generative) include:
Construction grammar
Dependency grammar
Functional-typological grammar
Generalized Phrase Structure Grammar (GPSG)
Government and Binding (GB) (Principles and Parameters)
Head-Driven Phrase Structure Grammar (HPSG)
Lexical-Functional Grammar (LFG)
Minimalism
Role and reference grammar
Relational grammar
Transformational grammar

Still other approaches to grammars include tagmemics, stratificational grammar, etc., now no longer pursued. In principle, a descriptive reference grammar or sketch grammar could be written in almost any of these frameworks, although most have rarely if ever been used for the preparation of grammars for language documentation purposes or for general audiences, with the exception of functional-typological grammar, role-and-reference grammar, and tagmemics.

2.3. Authorship Often the author of a grammar of a language being documented turns out to be a linguist who is not a native speaker of the language being described and may not speak it. Much better is a team of authors, at least some of whom are native speakers, and all of whom have skills in aspects of language documentation or linguistic analysis, though this is rare (see Mosel 2006b, 44). The fieldwork and language analysis on which these grammars depend are always enhanced if the language consultants can be trained and become part of the team preparing the grammar.

A question that comes up more and more frequently now is the attribution of authorship. Should the scholar or scholars who do most of the analysis and write-up be considered the (only) author(s), or do key language consultants also deserve credit as authors? The answer may vary from case to case, depending particularly on the role and amount of contribution, and on the wishes of the consultants. Yet in general, as mentioned, grammars benefit from teamwork that involves native speakers, and many today feel it is a matter of ethics that key language consultants should be included as co-authors of the grammar.5

Of course non-native speaker scholars are not the only ones who write grammars. Some grammars are written by native speakers, some with little or no formal training in linguistics. However, today more and more we see native speakers of minority languages receiving formal linguistic training and bringing that training to bear on the description of their languages. The distinction between academic scholar and community member ceases to be pertinent in such cases. Clearly, linguistically trained native speakers are optimal for grammar writing, and of course in such situations, no issue arises as to the authorship status of the native speakers involved in the grammar-writing project.

5 It should not go unmentioned that involving multiple native speakers in the research is not without its own complications. There can be variation in their idiolects that complicates the analysis. Speakers can disagree about how the language should be analyzed and about language policy for the community (see Nakayama and Rice 2014, 2).

2.4. Kinds of examples cited in grammars Some scholars believe that only material from naturally occurring discourse, spontaneous speech, should be cited as examples in a grammar. Others rely more on examples from direct elicitation. Many prefer naturally occurring examples but also include examples from elicitation when they serve their purposes well. In Mithun’s (2014, 50) recommendation, examples should be:

drawn as much as possible from spontaneous connected speech, in a variety of genres critically including ample conversation. . . augmented by elicited examples for clear pronunciations of individual words, completeness of descriptions of allomorphy and paradigms, and illustration of contrasting structures.

This approach that combines examples from spontaneous speech and elicitation usually proves most productive, particularly since it can be difficult or even impossible to find sufficient appropriate examples for infrequent forms or constructions from naturally occurring discourse alone (cf. Pawley 2014, 11–12; see Rhodes and Campbell, Chapter 5, this volume). For example, in a language with thirty-five numeral classifiers and thus thirty-five different ways of counting from, say, one to nine, even with an extremely ample corpus, the likelihood of finding all of these forms in naturally occurring discourse is vanishingly small. Elicitation can be especially helpful for fleshing out paradigms and recurrent patterns. Recall the example above (section 1) of the large corpus of a Melanesian language which had no examples of how to ask questions in it. Examples such as these underline both the necessity for targeted elicitation and the potential inadequacy of a corpus-only attempt at adequate language documentation. (See section 4.3.3 for more on the citation of examples in grammars.)

3. Format and organization of grammar In this section we consider the organization of the grammar and the front matter and back matter, that is, all that is not part of the actual grammatical description itself.

3.1. Front matter Front matter refers to material that is presented before the main body of the grammar. It includes, among other things, information that may help the reader use the grammar and locate things in the grammar, as in the following subsections.

3.1.1. Table of contents The table of contents should be maximally informative. It should include the following:
Numbered headings: Headings and subheadings should be numbered (Noonan 2005, 117).
Nested headings: The table of contents should have a nested (hierarchical) format, with headings and subheadings of three to five levels of inclusion, numbered and optionally indented to indicate the level of inclusion (see Good 2004). For example:

4. Nouns

4.1. Noun classes

4.2. Noun morphology

4.2.1. Inflectional morphology

4.2.1.1. Case

4.2.1.2. Number

4.2.1.3. Gender

4.2.2. Nominal derivational morphology

4.2.3. Non-productive nominal morphology, irregularities, exceptions
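
For grammars drafted with software tools, a nested, numbered outline like the one above can be generated rather than typed by hand. The following is a minimal sketch in Python, not a method proposed in the sources cited here; the function name `number_headings`, the tuple-based outline structure, and the two-space indent are all assumptions made for illustration. The headings are those of the example outline above.

```python
# Minimal sketch: render a nested outline as numbered, indented headings.
# The data structure and function name are illustrative assumptions.

def number_headings(headings, prefix="", start=1, max_depth=5):
    """Return numbered lines for a list of (title, children) pairs."""
    lines = []
    for i, (title, children) in enumerate(headings, start=start):
        number = f"{prefix}{i}."
        depth = number.count(".")  # "4.2.1." is at depth 3
        if depth > max_depth:
            raise ValueError(f"heading {number} is nested deeper than {max_depth} levels")
        indent = "  " * (depth - 1)  # optional indentation per level
        lines.append(f"{indent}{number} {title}")
        # Subheadings restart at 1 under the parent's number.
        lines.extend(number_headings(children, prefix=number, max_depth=max_depth))
    return lines

# The example outline from this section (chapter 4 of a hypothetical grammar).
outline = [
    ("Nouns", [
        ("Noun classes", []),
        ("Noun morphology", [
            ("Inflectional morphology", [
                ("Case", []),
                ("Number", []),
                ("Gender", []),
            ]),
            ("Nominal derivational morphology", []),
            ("Non-productive nominal morphology, irregularities, exceptions", []),
        ]),
    ]),
]

toc_lines = number_headings(outline, start=4)
print("\n".join(toc_lines))
```

The `max_depth` check reflects the recommendation of three to five levels of inclusion; a project that prefers a different limit would change that parameter.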

3.1.2. Acknowledgements In this section, all who helped with the grammar should be thanked, in particular, language consultants and agencies that provided funding and other aid.

3.1.3. Abbreviations A list of abbreviations used and what they stand for should be included in the front matter of the grammar. While the list of abbreviations can adequately be given in the back matter, it is preferable to include this information at the beginning for easier access. Many readers find it more useful for the list to be presented early: they may examine the abbreviations first and so recognize an abbreviation upon encountering it later in the text, or, having seen the list, they know where to find explanations of unfamiliar abbreviations they encounter later.

3.1.4. User guide Some good grammars contain a user guide. This is a brief description of anything the reader may find useful, explaining any special features, formats, notations, etc., used in the grammar. For example, if the grammar uses conventions or notations that are unique and different from those of other grammars, the user guide can explain how and why they are used. Other information potentially useful to the reader about the form and function of the grammar and its contents can be included here.

3.2. Back matter This refers to material that is presented after the main body of the grammar. It includes such things as appendices, indexes, and the bibliography.

3.2.1. Bibliography The grammar should contain a comprehensive bibliography that references all works cited in the grammar. A bibliography may also include works not actually cited in the grammar but that are nevertheless relevant to the language and its grammar. These may include works written in the language, references on related languages, and ethnographic sources about the people who speak the language that mention aspects of the language in its cultural setting. If references that are not actually cited in the text are included in the bibliography, it should be explained that this is the case and that this was done to make the work a more comprehensive resource for those interested in the language. Some bibliographies of grammars also include annotations of some of the works cited in the bibliography; these annotations typically contain information regarding background, content or utility of the work, and sometimes its accuracy and limitations. Such annotation, however, is not a frequent practice.

3.2.2. Index The index should include all linguistic terms, names, and topics mentioned in the grammar (cf. Noonan 2005, 116).

Typological traits not found in the language: Some grammar-writing guides recommend that the index also include features which do not occur in the language that is described in the grammar or that are not treated in that particular grammar. This was done in Haspelmath’s (1993) grammar of Lezgian and many found it to be an appealing feature (see also Noonan 2005, 118). If included in the index, these non-occurring features should be clearly distinguished from items that do occur in the grammar, for example, with an asterisk, italics, or some other device to make them visibly distinct. A question is, how many and what kind of non-occurring traits should be included? Certainly, it is not practical or possible to include all the structural traits that are known from the languages of the world that do not play a role in the particular language being described. Rather, it is more useful to indicate typological features not found in the language but that might be expected to appear based, for example, on what is known from related languages, other languages in the area, and language typology generally.

3.2.3. Glossary Some grammars include a glossary of the grammatical and technical terms used in the grammar, especially terms that may not be common or conventional. To determine whether a glossary would be useful, we recommend that grammar writers consider what the intended audience may need in order to understand the terms used in the grammar. The glossary should include in particular any terms that are not generally known, terms particular to this grammar, or terms used only in the linguistic practice of a particular region or language family. Since these terms usually are described in the grammar itself and are not difficult to find from the table of contents or the index, many do not consider such a glossary necessary or particularly useful.

3.2.4. Vocabulary Good recent grammars contain a list of all the words with their glosses that are encountered in examples presented in the grammar and in any texts that are included with the grammar (cf. Noonan 2005, 120–121; Rice 2006, 390). This vocabulary list, sometimes called the lexicon, is sometimes like a small dictionary. The format of this list of lexical items should be made accessible and user-friendly, for example, by putting the word that is being defined in boldface to set it apart visually from its gloss or translation. Glosses in this vocabulary should be in the language in which the grammar is written and also in the language of wider communication in the country or region where the target language is spoken, if the language of presentation and the dominant language of the region are different. For example, a grammar written in English of an indigenous language of Guatemala should have glosses and translations both in English and in Spanish. Having the gloss/translation in both languages can often provide for a clearer identification of the meaning, hence making the vocabulary more useful to more readers. For example, a translation with “hog plum” in English may be less meaningful than Spanish “jocote, tejocote” in a region where most English speakers may have no idea what a “hog plum” is (or what Spondias purpurea, its scientific name, means), but where some will be familiar with the fruit via its Spanish name, “jocote, tejocote.”

It is also valuable to consider including a vocabulary look-up in reverse order, for example, not only from the subject language to English (or whatever language the grammar is written in) but also from English (or the language the grammar is written in) to the subject language of the grammar. Where relevant, it is also useful to provide alphabetical look-up lists in other languages, as well. The lexicon is more useful if it is not just presented in target language-English and English-target language but if it contains also an alphabetical look-up in any dominant language of the region (Spanish in the example given above). For example, for a grammar written in English about an indigenous language of Brazil, it will be valuable to have not only an alphabetical look-up list of the vocabulary in English but also another one in Portuguese, in addition to the main vocabulary of the indigenous language with its glosses in English and Portuguese.6
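
Where the vocabulary is kept in electronic form, the reverse look-up lists described above can be derived from the main vocabulary instead of being compiled separately. The sketch below, in Python, is only an illustration under stated assumptions: the entry layout, the function name `reverse_lookup`, and the placeholder headwords `HEADWORD-1` and `HEADWORD-2` are hypothetical (no actual forms of any language are given), and the second entry's glosses are invented for the example. Only the "hog plum" / "jocote, tejocote" glosses come from the text.

```python
from collections import defaultdict

# Main vocabulary: one entry per headword, with glosses in each language
# of presentation. Headwords are placeholders, not real language data.
entries = [
    {"headword": "HEADWORD-1",
     "glosses": {"en": ["hog plum"], "es": ["jocote", "tejocote"]}},
    {"headword": "HEADWORD-2",
     "glosses": {"en": ["house"], "es": ["casa"]}},
]

def reverse_lookup(entries, lang):
    """Build an alphabetical gloss -> headwords list for one gloss language."""
    index = defaultdict(list)
    for entry in entries:
        for gloss in entry["glosses"].get(lang, []):
            index[gloss].append(entry["headword"])
    # Plain case-insensitive sorting; a real project would likely need
    # language-specific alphabetical order for each look-up language.
    return {gloss: sorted(index[gloss]) for gloss in sorted(index, key=str.casefold)}

english_lookup = reverse_lookup(entries, "en")
spanish_lookup = reverse_lookup(entries, "es")

for gloss, headwords in spanish_lookup.items():
    print(f"{gloss}: {', '.join(headwords)}")
```

Because every look-up list is generated from the same entries, adding a further dominant language (Portuguese, in the Brazilian case mentioned above) would only require adding glosses under another language key.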

3.2.5. List of affixes It is valuable to provide a list of the grammatical affixes of the subject language with their meanings/functions. This can be done by: (1) providing a separate, independent list of affixes (with glosses/functions and the section numbers where they are described in the grammar); (2) listing these affixes interspersed alphabetically with the lexical items of the vocabulary list (together with the associated section numbers where they are described in the grammar); or (3) including them alphabetically interspersed through the index (and not in the vocabulary list). This last option is least favored. It is also possible to adopt more than one of these three options in a grammar, having, for example, both a separate list of affixes in the body of the grammar itself and also the affixes interspersed in the index or in the vocabulary list.

3.2.6. Texts Some texts (preferably representing several genres), presented with interlinear morpheme-by-morpheme glosses (following the Leipzig glossing rules, see below) and a flowing free translation, are generally considered essential for a grammar (cf. Noonan 2005, 121; Mosel 2006b, 52–53). Good recent grammars typically include at least some analyzed texts, often with examples used in the grammar selected from these texts, and accompanied by means to access and listen to the examples from the texts either in embedded sound clips or with location information for accessing the archived texts (see section 4.3.3).

3.3. Other material This section notes other materials that are not strictly front or back matter.

3.3.1. Maps, tables, charts, and figures Maps, tables, charts, and figures can be valuable aids for the audience. Tables and charts are especially useful for presenting paradigms and similarly patterned material, and can help to clarify points or make information more comprehensible and accessible (see Mosel 2006b, 50). Note that no image used, including maps, should violate copyright laws. The colors on each image should also be distinctive. These materials may have to be provided in black and white (with appropriate devices, such as shading, crosshatching, etc., to signal distinctions that the colors would show) if the grammar is to be published by a publisher that does not accommodate color.

Phonetic figures: It is very helpful to present figures with spectrograms and other acoustic measurements, particularly of sounds that may be unusual or perceptually difficult to distinguish. These can provide additional support for the analysis presented in the grammar (Noonan 2005, 118; Rice 2014). Again, as with images in general, the colors on all phonetic figures should be distinctive and should be provided in black and white if the grammar is to be printable in a format that does not accommodate color.

6 Descriptions of languages where speakers may have to deal with more than one dominant language can benefit from having lexical look-up in multiple languages. For example, in addition to English (if that is the main language of presentation), for languages spoken both in Brazil and in Colombia or Venezuela, having lexical look-ups in both Portuguese and Spanish is very valuable.

3.3.2. Historical information Though not commonly included in synchronic grammars, some grammars mention the historical linguistic facts behind some constructions and lexical items where known. For example, the source of loanwords is often indicated, as are instances of grammaticalization (see Rice 2006, 402).

3.3.3. Dialects If examples from different dialects (regional varieties) are utilized, the dialect/variety from which an example is taken needs to be clearly identified (Rice 2006, 397). Some grammars rely primarily on one dialect/variety, clearly specified in the introduction, and then provide the dialect source information only with those occasional examples taken from other dialects. Mixing examples taken from different dialects without proper identification of their provenance often creates problems and should be avoided.

3.3.4. Variation If there is known sociolinguistic variation among speakers for language phenomena in the grammar that correlates with social variables in the language being described, the variation should be represented in the grammar. However, while grammars often have brief treatment of salient sorts of variation, most grammars keep the amount of effort dedicated to the analysis and description of sociolinguistic variation to a minimum, since a full-scale variationist sociolinguistic investigation is usually not possible given the constraints of time and funding most language documentation projects are under. While the inclusion of such sociolinguistic variation can be valuable, cases of free variation do need to be described in the grammar.

4. Presentation of the grammar This section is about the core of any grammar—the actual grammatical description itself (minus the front and back matter).

4.1. Introduction

4.1.1. Goal It is important to tell the reader the purpose (or purposes) of the grammar and what to expect from this grammar.

4.1.2. Context Information on the setting of the language and its speakers is needed. This can include:
Status: number of speakers, the vitality of the language, etc.
Affiliation: genetic classification of the language (Noonan 2005, 121).
Social, cultural, and geographic information: the geographical distribution of the language, multilingualism and literacy among speakers, etc. (Noonan 2005, 121; Payne 2014).
Maps: Maps are very useful for making locations and distributions clear.
Photos: Most grammars have no photos, but some do, and recent ones tend to have more than older grammars.

4.1.3. Sources It is important to explain how the information was obtained and to provide information about the speakers consulted, and when and where the research upon which the grammatical analysis is based was undertaken. A summary of relevant fieldwork or any other research activity in which data were obtained is helpful. Any ideas and examples taken from other sources need to be clearly identified.

4.1.4. Previous work A survey of previous work on the language belongs in the grammar (see Noonan 2005, 122). It can be part of the introduction or a separate section or chapter. This can be fairly brief but should mention and assess any work of relevance to this grammar.

4.1.5. Typological profile Most good recent grammars include an overview of major features of the language (see, for example, Noonan 2005, 117; Payne 2014), often called a typological sketch or typological profile of the language. It can be part of the introduction or a separate section or chapter. These profiles range from five to forty or more pages in length. The overview does not need to repeat significant amounts of information from the body of the grammar (though it can be helpful to include the section numbers where the features mentioned in the overview are discussed in the grammar). The typological profile mentions major structural traits that characterize the language, such as its basic word order, verb alignment, how arguments of verbs are signaled, etc., and also unusual traits of the language that may be of particular interest.

4.2. Accessibility considerations This section treats several matters that relate to how accessible and user-friendly the grammar should be.

4.2.1. Orthography It is important that users of the grammar are actually able to read it, that is, are able to understand the system used in the presentation of examples (i.e., the orthography utilized). A chart should be provided which shows how the orthographic (and phonemic) symbols used in this grammar correspond to those of the IPA (International Phonetic Alphabet)7 (cf. Noonan 2005, 119; Rice 2006, 408, 2014). Where more than one writing system or orthography has been in use for the language, it is very useful to provide a chart showing the equivalences among the symbols of the different systems used to write the language as well as their IPA equivalents. If there is an established linguistically adequate orthography for the language, using it to present examples can make the grammar more accessible to members of the language community and other non-linguists. In some instances, it may be helpful for clarity and accessibility to present examples in both the practical orthography and in a more linguistically based orthography. An alternative sometimes used is to present the examples primarily in the practical orthography, giving the examples also in the linguistically based orthography in parentheses or in a separate line.

4.2.2. Non-theory-specific description It is commonly held that the description of grammatical elements and constructions should be as devoid of theory-specific jargon and notation as possible to make the grammar accessible to a wide audience and to make the grammar useful to readers after current theories may have shifted radically or been abandoned: “Much good linguistic work of past decades remains largely inaccessible to modern scholars simply because the frameworks employed have gone the way of the dinosaurs” (Payne 2014, 94). As Pawley (2014, 9) puts it:

If a reference grammar is to be readable generations from now it should use a descriptive framework (i.e. a body of analytic concepts and terms) that is familiar to most or all grammarians. This is easier said than done because all descriptions are to some extent theory-specific and specific theories of grammar are notorious for having a short shelf life – and even in their heyday are accessible only to specialists.

The analysis and description can be informed by theoretical considerations, but formal notation and terminology from specific theoretical frameworks should be avoided for the accessibility and greater longevity of the grammar. Attributes of the grammar that may have particular significance for theoretical claims can be pointed out in the grammar; however, it is best to publish the theoretical treatment of these matters elsewhere, in journal articles or online, where an appropriate audience will comprehend the theoretical analysis, argumentation, and claims for linguistic theory (cf. Noonan 2007, 113; Rice 2006, 395, 397, 403; Nakayama and Rice 2014, 2; see Rhodes and Campbell, Chapter 5, this volume). As Rice (2006, 403) notes, “the grammar should be informed by theory. This will help make it coherent, and it will allow questions to be asked that might not come up otherwise. But . . . theory is not the goal of a grammar.”8 As Genetti (2014, 121) put it, “how can a grammar writer do justice to the language-specific richness and variety of structural categories without being either straight-jacketed by typological and theoretical convention or overrun by it? It is all about finding balance.”

A related issue is how much argumentation the grammar writer should incorporate in order to justify the analyses presented in the grammar (Genetti 2014, 126). More theoretically oriented authors tend to give greater amounts of argumentation in their grammars. More practically oriented grammar writers tend away from argumentation in the grammar itself, though they may provide the argumentation in other publications aimed more specifically at linguists who expect and can understand such argumentation, or in the grammar itself as appendices or notes at the end of chapters aimed specifically at linguists. Some argumentation may be helpful in cases where more than one structural analysis is possible, where the language differs from what would otherwise be expected from a general typological perspective or violates claims in the literature, or where the language differs from what might be expected of languages in its language family or geographical area (Genetti 2014, 128–129).

7 Of course, the phonemes represented by the orthographic symbols may have more than one allophone each in specific environments. The phonetics of these allophones naturally need to be described and represented in the grammar with their IPA phonetic values made clear.

4.2.3. Terminology

The terminology should be appropriate for the intended audience. For a grammar intended for language documentation, very specific technical terminology should be avoided as far as possible, and if used at all, all technical terms and notational devices that are not generally known need to be defined and explained carefully. If non-standard terms are introduced in the grammar, they need to be defined clearly (Noonan 2005, 116l; Rice 2006, 397; cf. Mosel 2006b, 51). If terms used are common in works on other languages of the language family or of the geographical region but are not generally known, these should be explained and clarified for those who may not be familiar with them. It is best where possible to avoid terms that are only specific to a particular language family, region, or the particular language described in the grammar.

8 In the literature, this recommendation is often talked about as being “theory-neutral.” The sense of this is clear—do not describe the language in terms of some formal linguistic framework that makes the grammar inaccessible to a wider range of readers. However, in reality all descriptions are theory-specific to one degree or another (Pawley 2014, 9); there are no grammars that are entirely free of some theoretical assumptions underlying aspects of the description. The point is to not let these be a major obstacle to understanding.


4.3. Presentation of examples

Sufficient examples should be given to make the phenomenon being described clear and to support the analysis presented. It may be valuable to provide abundant examples, beyond the usual two or three found in many grammars (Noonan 2005, 117). More examples can make the material under discussion clearer, often more interesting, and in particular will be helpful for those who may wish to use the grammar for language learning or for preparing pedagogical materials.

How the examples are integrated in the grammar is also important. The relevance of particular examples to the point being made should be explained in the prose right before or after the example is presented. The presentation of description and of illustrating examples should be integrated, and lengthy descriptions followed by a lengthy series of examples should generally be avoided (Mithun 2014, 28; Weber 2007).

When a particular form in the language or a specific example is not clear, it is important to say so. In addition, including information where one’s analysis is not certain (with an acknowledgement of the uncertainty) is better than not mentioning it at all (Noonan 2005, 119–120). It is helpful to indicate when things are infrequent, non-productive, or irregular. Where options are available, it is helpful to mention whether the choices are equally frequent or whether some are infrequent while others are frequent (Noonan 2005, 120).

4.3.1. Glossing conventions

Examples should follow the Leipzig glossing rules,9 with morpheme-by-morpheme interlinear glossing and free translation (cf. Noonan 2005, 117; Rice 2006b, 50; Mithun 2014, 28–29). This most typically involves a three-line presentation format (as in example (1) below); however, scholars find that a different number of lines with differing amounts of interlinear information can be better suited for different languages or audiences. The different interlinear lines may show such things as morphological segmentation with morpheme glossing, underlying forms, phonetic form, direct word-by-word translation, etc. (cf. Mithun 2014, 50).

In the first line is the example written in the language being described, with morpheme boundaries indicated. The second line provides a morpheme-by-morpheme gloss of the line above, where there is a one-to-one correspondence between the number of words and morphemes in the subject-language line and in the gloss line, with words of the first line directly above and aligned with the corresponding words of the second line. Free translation is provided in the third line. A typical example might look like the following from Nivaclé (of Argentina and Paraguay):

9 The Leipzig glossing rules can be found online at http://www.eva.mpg.de/lingua/resources/glossing-rules.php [accessed June 29, 2016].

(1) y-oy         na’      sajech
    3act-escape  dem.vis  fish
    “the fish is escaping”

(The abbreviations in this example are: 3 “third person,” act “active,” dem “demonstrative,” vis “visible.”)
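The word-by-word alignment described above can be produced mechanically. The following is a minimal Python sketch, not a tool the chapter describes: the function names `display_width` and `align_interlinear` are hypothetical, and it assumes that every tier is a plain string with words separated by spaces and that a monospaced font is used.

```python
import unicodedata


def display_width(token: str) -> int:
    """Count the characters that occupy a column, skipping combining marks."""
    return sum(1 for ch in token if not unicodedata.combining(ch))


def align_interlinear(tiers: list[str], translation: str, gap: int = 2) -> str:
    """Pad the words of each tier so that corresponding words share a column.

    `tiers` holds the subject-language line(s) and the gloss line; every tier
    must contain the same number of whitespace-separated words.
    """
    rows = [tier.split() for tier in tiers]
    word_counts = {len(row) for row in rows}
    if len(word_counts) != 1:
        raise ValueError(f"tiers differ in word count: {sorted(word_counts)}")
    # Each column is as wide as its widest word across all tiers.
    widths = [max(display_width(w) for w in column) for column in zip(*rows)]
    lines = []
    for row in rows:
        cells = [w + " " * (width - display_width(w) + gap)
                 for w, width in zip(row, widths)]
        lines.append("".join(cells).rstrip())
    lines.append(f"“{translation}”")
    return "\n".join(lines)


if __name__ == "__main__":
    print(align_interlinear(
        ["y-oy na’ sajech", "3act-escape dem.vis fish"],
        "the fish is escaping",
    ))
```

Because the function raises an error when the tiers differ in word count, it also serves as a simple check on the one-to-one correspondence between the subject-language line and the gloss line.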

Sometimes other lines are also included. For example, some write the example in the subject language in the chosen orthography on the first line with the linguistically based, parsed representation on another line below that. Some put this first line in italics, some in a basic roman font, and some make it boldface, especially communities that “want to highlight the importance of the language being described” (Mithun 2014, 28). Some include a separate line that shows the example in a more phonetically oriented transcription represented by IPA symbols, to make it easier for linguists to interpret the line written in the orthography, as seen in (2), an expanded version of (1):

(2) joj          naʔ      saxeč
    y-oy         na’      sajech
    3act-escape  dem.vis  fish
    “the fish is escaping”

This practice is, however, not common. It is good practice to put the forms in the example which illustrate the point under discussion in boldface. In example (2) naʔ is in bold because the example is from a section of the grammar that discusses properties of this demonstrative.

Others add a line that reveals underlying forms of morphemes before certain phonological alternations (deletions, assimilations, mergers, etc.) modify the form, as seen in the Nivaclé example in (3):

(3) xa-p’aklan-eš     naʔ  ji-xpɑjič   ka-waʔ       wɑkɑ  lhaʔ-samuk-uj      [underlying]
    [xap’aklaneš      na   ixpɑič      ka-wa        wɑkɑ  ɫaʔsamkux]         [phonetic]
    ja-p’aklan-esh    na’  yi-jpôyich  ka-wa’       wôkô  lha’-samju-y       [orthography]
    1act-plaster-val  vis  1pos-house  rem-nhum.pl  cow   3pos-excrement-pl
    “I am plastering my house with cow dung”

(The abbreviations in this example are: val = “valency increasing morpheme,” vis = “visible demonstrative,” 1pos = “1st person singular possessive,” rem = “removed.demonstrative” [no longer extant], nhum = “non-human,” pl = “plural,” 3pos = “3rd person possessive.”)

In example (3), line 1 gives the underlying (morphophonemic) representation; line 2 gives the phonetic representation; line 3 represents the example in orthographic/phonemic form; line 4 provides the morpheme-by-morpheme gloss; and line 5 has the free translation. One finds in the grammar the explanation that the clitic -eš “valency-increasing morpheme” serves as an applicative in this context, thus “plaster with,” and in the phonology section the description of how glottal stops are lost at the end of words before other consonants (hence na and kawa in the demonstratives in line 2), how /j/ (orthographic y) is lost before /i/, and how /k/ after another consonant changes to /x/ (so /samkuj/ → /samxuj/, samjuy in the orthography). Again, it is rare to see this many lines for presentation of an example.

Others may also include in a separate line a flowing translation of the example in the dominant language of the region under the line with the free translation in the main language in which the grammar is written. An example is (4), an expansion of (1) above that adds a Spanish translation line after the English one:

(4) joj          naʔ      saxeč
    y-oy         na’      sajech
    3act-escape  dem.vis  fish
    ‘the fish is escaping’
    (‘el pescado se escapa’)

Finally, some grammars present the interlinear morpheme-by-morpheme and sometimes also the free translation lines in such examples in smaller font size (pitch) to fit more onto a single line, where publishers permit this. For example, the relevant parts of example (3) above might look like this:

(3’) ja-p’aklan-esh    na’  yi-jpôyich  ka-wa’       wôkô  lha’-samju-y
     1act-plaster-val  vis  1pos-house  rem-nhum.pl  cow   3pos-excrement-pl

4.3.2. Multilingual glossing

As seen in example (4), some grammars provide translations of examples not only in the language in which the grammar is written but also in the dominant language of the country where the language is spoken, or in a language of wider communication in the region if that language is different from the language in which the grammar is written. An example in K’iche’ (a Mayan language of Guatemala, where Spanish is the dominant language) is seen in (5):

(5) q’eq   nu-kye:x
    black  my-horse        [negro mi-caballo]
    ‘my horse is black’    [‘mi caballo es negro’]

In this instance, while English glosses are provided directly under the K’iche’ examples, Spanish glosses that directly correspond with the English ones are provided in brackets—though these could also be presented in a separate line below the English glosses. Most commonly the morpheme-by-morpheme glossing is given only in the principal language that the grammar is written in and just the free translation is given in the dominant language, Spanish in this example.


4.3.3. Sources of examples

Many grammar writers consider it important to provide an indication of the source of each example sentence used in the grammar. Examples in the grammar that come from texts can be coded to indicate the text from which each is taken. Examples from direct elicitation can, for example, identify the speaker, location, and date when recorded. Some believe scientific responsibility and replicability (verifiability) make it important to provide direct links to the recordings from which the examples are taken, to provide where possible “references to locations of the examples in texts and/or audio recordings that would allow the reader to see them in their discourse context or hear them” (Mithun 2014, 50). This can be done in web-accessible grammars with clickable embedded sound clips (or video clips) next to the written example that allow the user to hear (or hear and see) the example. It is also possible to indicate to the reader how an archived recording from which the example is taken can be accessed, with location information on where in the recording the example is found so the user can easily find the example.

Thieberger (2006) is an excellent example of a grammar with audio-linking of the examples and he discusses why this is important and how it was done in this particular grammar. Berez (2015) also emphasizes the importance of citing archived materials in language documentation. Lee’s (2014) grammar of Baba Malay illustrates this approach well. It provides speaker name, resolvable URL of a digital archive location, and starting and ending time codes of the utterance in the recording for every example. (See Mosel 2014 for general discussion of relations of a corpus to grammar writing.)

Of course, for all examples taken from other sources, proper citation must be provided (e.g., for examples taken from an academic publication, someone else’s field notes, a Bible translation, etc.).
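The three pieces of source information mentioned for every example (speaker name, resolvable archive URL, and starting and ending time codes) can be kept in a small record attached to each example. The sketch below is only an illustration under stated assumptions: the class `ExampleSource`, its field names, the millisecond time unit, and the `archive.example.org` URL are all hypothetical and are not taken from any of the grammars cited.

```python
from dataclasses import dataclass


def format_timecode(ms: int) -> str:
    """Render a millisecond offset as hh:mm:ss.mmm."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


@dataclass(frozen=True)
class ExampleSource:
    """Where an example sentence can be found in an archived recording."""
    speaker: str       # name or agreed pseudonym of the speaker
    archive_url: str   # resolvable location of the recording (placeholder here)
    start_ms: int      # start of the utterance, in milliseconds
    end_ms: int        # end of the utterance, in milliseconds

    def __post_init__(self) -> None:
        if self.start_ms < 0 or self.end_ms <= self.start_ms:
            raise ValueError("time codes must satisfy 0 <= start < end")

    def citation(self) -> str:
        """One-line source note to print beneath the example."""
        span = f"{format_timecode(self.start_ms)}–{format_timecode(self.end_ms)}"
        return f"{self.speaker}, {self.archive_url}, {span}"


if __name__ == "__main__":
    source = ExampleSource(
        speaker="Speaker A",                                 # hypothetical
        archive_url="https://archive.example.org/item/0001",  # hypothetical
        start_ms=83_450,
        end_ms=86_900,
    )
    print(source.citation())
```

Storing the time codes as numbers rather than as preformatted text makes it possible to generate both a printed source note and, in a web-based grammar, a link that starts playback at the right point.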

4.3.4. Appropriateness of examples

Wherever possible, examples should be avoided that may be offensive to readers of the grammar or that are taboo or inappropriate in the culture of speakers of the language. No example should be included if its content could result in negative consequences for the consultant or for members of the language community, or indeed for anyone. Mithun (2014, 27, referring to Weber 2007) adds that one should be certain that examples chosen “project a good image of the speakers and their culture, and do not embarrass particular individuals or groups.”

4.3.5. Complete sets

A recommendation followed by several grammar writers is that when categories of the grammar are discussed that naturally fall into patterned sets (for example, paradigms of verb conjugation or noun declension, etc.), wherever possible the full set of related forms should be presented and not just a few members of the set (Noonan 2005, 119). For example, in presenting a verb paradigm involving pronominal agreement markers, one should not present only forms illustrating just two or three of the pronominal persons but should give the whole set with all persons represented. In discussions of small closed classes of grammatical elements, all the members of the class should be given; for example, for evidentials, noun classifiers, or adpositions (prepositions and postpositions), entire sets should be presented rather than just a few members followed by “etc.”

4.3.6. Whole sentences

It is recommended that examples not present only a portion of a sentence that may illustrate the topic under discussion but rather that whole sentences be given. For example, examples of relative clauses should not include just the head noun and the relative clause modifying this noun but should be the whole sentence of which the relative clause is part. For example in English, (6a) is not as useful as (6b):

(6a) the dog that bit her son
(6b) the dog that bit her son ran away.

4.4. Recommendations for formatting and form of presentation

Recommendations for the formatting and form of presentation for grammars are considered in this section.

4.4.1. Order and structuring of topics in the grammar

Grammars primarily follow one of two approaches: (1) proceeding from more basic (simpler) constructions to more complex ones (see Rice 2006, 396; Payne 2014, 94); or (2) following a generalized format or standard template (for example, the Lingua Grammar Questionnaire10; see Mosel 2006b, 56–57 for discussion). The first follows a logical progression where basic phenomena are presented earlier in the grammar and then constructions that depend on the understanding of these more basic ones are presented later. In this approach, the order of presentation is adapted to fit the structural complexity of constructions in the language (cf. Rice 2006, 400–401; Pawley 2014, 14). The standard templates or questionnaires follow a fixed order of presentation, meaning that sometimes phenomena described earlier in the grammar will need cross-referencing to topics presented later for them to be fully clear (see sections 4.4.3. and 5.1 for cross-referencing). The order followed in a grammar might be:

phonology → morphology → syntax.

Or, stated with a bit more elaboration:

phonology → morphology → word classes (parts of speech) → phrase-level elements (noun phrases, verb phrases, etc.) → clause-level elements. (See Mosel 2006b, 48–49.)

10 The Lingua Grammar Questionnaire is available online at http://www.eva.mpg.de/lingua/tools-at-lingboard/questionnaire/linguaQ.php [accessed June 29, 2016].

It should be noted that some grammars and grammar templates place phonology last, almost as an appendix in some instances. Most linguists are of the opinion that phonology should be presented first, before descriptions of morphosyntactic properties of the language—it is important to understand from the outset how the language is pronounced and the conventions that the grammar follows for representing the sounds of the language in order for the reader to grasp examples fully.

Another question about structuring the topics presented in a grammar is whether the grammar should follow a form-to-function or a function-to-form approach, “whether the description should select particular functional domains, and show how they are expressed in the language, or it should select particular forms, and describe the range of functions associated with these forms” (Cristofaro 2006, 137; cf. also Cristofaro 2006, 140–148; Mosel 2006b). Most have found it advisable to approach the description of morphology from a form-to-function perspective and that of syntax from a function-to-form perspective (Payne 2014). With greater specificity, Payne (2014, 201) recommends a “form-driven approach for those areas of grammar that are the most controlled, systematic and rule-dominated, and a function-first approach for those areas that tend to cross-cut structural levels.” For him, the controlled, systematic, rule-dominated parts of language include: phonology (excluding intonation), morphophonemics, inventory of derivational morphology, inflectional inventory (the range of inflectional possibilities for person and number “agreement” and case marking), pronoun inventory, and lexical inventory. The function-first perspective is best for analyzing more pragmatic, semantic and subtle parts of language (from “a large body of naturally occurring text, supplemented by elicitation where necessary”; Payne 2014, 102). For Payne these include: intonation, constituent order, inflectional morphology, voice (alignment of grammatical relations and semantic roles of verbal arguments), sentence-level particles, clause combining (including relativization, complementation, adverbial clauses, and clause chaining), lexical semantics, and pragmatically marked structures, such as clefts, questions, etc. (Payne 2014, 101–103).

4.4.2. Formatting

The use of different formatting devices can help to make the grammar more user-friendly and accessible. For example, boldface, italics, indentation, different fonts, or font sizes (pitch) can make headings, sections, and the emphasized parts of examples much clearer.

Headings should be numbered. Highest-level headings should be in boldface; lower-level headings can be given in a different format to set them apart. Some grammars include a page header which indicates the section title/topic that is under discussion at the beginning of this page. A few even include boldface section labels in the page margins each time a section changes, although this is rare.

It is recommended that in examples, the part that illustrates the topic under discussion be put in boldface, as seen in example (2) above.


4.4.3. Cross-referencing

It is important to provide cross-references to sections elsewhere in the grammar which treat aspects of the phenomenon under discussion or items related to it (Mosel 2006b, 43). It is also preferable for cross-references to include specific section numbers rather than simply to state “see above” or “see below.” Multimedia and web-based grammars can easily incorporate links and anchors for cross-referencing (see section 5.1.).

4.4.4. Footnotes

Grammars often contain no footnotes, though the use of footnotes has become more common in recent grammars. They can be effective for presenting incidental, additional, or background information. Readers usually find footnotes at the bottom of the page more useful than endnotes, though some publishers require endnotes (notes presented at the end of each chapter or at the end of the entire grammar). Some grammars have included “notes to linguists” at the end of each chapter, or in an appendix at the end of the grammar, providing more technical information that might be of interest to linguists but not necessarily to other users of the grammar (see section 2.5.2).

4.4.5. Cultural comments

Often, understanding of a particular example or even a particular grammatical construction can be aided by including comments on relevant cultural information. It is not necessary to include a lot of ethnographic background—just enough to clarify matters. This sort of extra content and background information is often presented in footnotes, or can be tagged in the example with the cultural explanation given in parentheses after the example.

For example, in the grammar of Nivaclé (Campbell et al. forthcoming), an example illustrating evidentials from a text about evil sorcerers’ auxiliary spirits contains wat-shayk’u* “unpossessed-egg.” After the three-line interlinear presentation of the example in the grammar that contains this, the explanation is given in parentheses: *wat-shayk’u “unpossessed egg” also refers to a sorcerer’s auxiliary spirit/soul, souls from other people captured by evil shaman/sorcerers.11 For unfamiliar plants and animals, it can be useful to add the scientific name in this way, with the common name tagged with the asterisk in the text and the scientific name added after the example in parentheses.

4.4.6. Multimedia

As mentioned, grammars designed for online dissemination or for other electronic formats can be integrated with other aspects of the documentation, for example, via links to other sections of the grammar, to texts, a dictionary, audio files, video, databases, archives, etc.

11 This example is from the archived text, The Story of the Demon Jaguar.


4.4.7. On use of abbreviations

Some grammar writers prefer to avoid abbreviations, writing things out in full so that readers do not have to remember or look up what the abbreviations stand for. Others prefer to use abbreviations to save space. Often abbreviations are especially useful in the morpheme-by-morpheme glosses of examples so that individual examples do not end up being cumbersomely long, occupying several lines. In any event, if abbreviations are used, a list of all abbreviations and what they stand for must be provided (typically in the front matter, sometimes as an appendix, as mentioned in section 3.1.3.). Where the Leipzig Glossing Rules have recommended abbreviations, these conventional abbreviations should be used.
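The requirement that every abbreviation used be listed can be checked automatically before publication. The following Python sketch is one hedged way to do so; the function names are hypothetical, and it assumes that grammatical category labels in gloss lines are typed in capitals (standing in for the small capitals of the Leipzig Glossing Rules), that person markers are digits, and that lexical glosses are written entirely in lowercase.

```python
import re

# A label is either a run of digits (person) or a run of capital letters.
LABEL = re.compile(r"[0-9]+|[A-Z]+")


def gloss_labels(gloss_line: str) -> set[str]:
    """Collect the category labels used in one gloss line.

    Assumes lexical glosses are all lowercase, so they never match.
    """
    return set(LABEL.findall(gloss_line))


def undefined_labels(gloss_lines: list[str],
                     abbreviations: dict[str, str]) -> list[str]:
    """Return labels used in the glosses but absent from the abbreviation list."""
    used: set[str] = set()
    for line in gloss_lines:
        used |= gloss_labels(line)
    return sorted(used.difference(abbreviations))


if __name__ == "__main__":
    # Deliberately incomplete list: VIS is missing.
    abbreviations = {
        "3": "third person",
        "ACT": "active",
        "DEM": "demonstrative",
    }
    print(undefined_labels(["3ACT-escape DEM.VIS fish"], abbreviations))
```

A lexical gloss containing a capital letter (a proper name, for instance) would be misread as a label under these assumptions, so the output is best treated as a list of candidates to review by hand.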

4.4.8. Publisher requirements

It needs to be borne in mind that some publishers have format requirements that do not allow for the implementation of some of the recommendations listed here. Meeting publishers’ formatting and style guideline requirements is usually less important for online grammars. In any event, the formatting and referencing conventions used should be consistent throughout the grammar. Grammar writers who intend to seek a formal publisher would do well to find out what the publisher’s requirements are before they write the grammar.

5. For web-based and electronic grammars

For online grammars and others in electronic formats, separation of form from content becomes less relevant to the grammar writer. The content of the grammar is the analysis and data, while the form refers to the physical appearance on the page, including typesetting, fonts, font size (pitch), spacing, color, etc. (see Weber 2005, 418–419).

5.1. A web version of a grammar

A web-based grammar need not be so concerned with the order in which the information is presented but more with how particular content is related to other content. Hyperlinks between sections and examples can facilitate cross-referencing, and can integrate other kinds of media such as photos and other images, audio and video clips, etc. In addition, entries in a dictionary (if one exists) can be integrated with the grammar using hyperlinking.

An advantage of a web-based grammar is that it is possible to make different formats of grammars for different audiences and purposes, or to include more content for one kind of audience and less for another. If the grammar’s information is properly tagged, the author can quickly produce a sketch grammar that contains only basic information, specialized grammars that focus on particular groups of phenomena, or switch between languages in which the grammar is written, among other things.
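What “properly tagged” information might look like can be illustrated with a small Python sketch. It is only an assumption-laden illustration: the `Section` class, the tag names `sketch` and `reference`, and the function `select_sections` are hypothetical, and the section titles are borrowed from the template in section 6 purely as sample data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    """One numbered section of the grammar with its audience tags."""
    number: str
    title: str
    tags: frozenset[str]
    body: str = ""


def select_sections(sections: list[Section], tag: str) -> list[Section]:
    """Keep only the sections carrying the given tag, preserving their order."""
    return [section for section in sections if tag in section.tags]


if __name__ == "__main__":
    grammar = [
        Section("3.1.", "Phone chart, phoneme chart",
                frozenset({"sketch", "reference"})),
        Section("3.6.", "Phonological processes",
                frozenset({"reference"})),
        Section("4.1.", "Nouns",
                frozenset({"sketch", "reference"})),
    ]
    # A sketch grammar contains only the sections tagged as basic.
    for section in select_sections(grammar, "sketch"):
        print(section.number, section.title)
```

The same filtering could select sections by topic or by the language in which the prose is written, provided those properties are recorded as tags when the sections are authored.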

6. A grammar template

The grammar template presented here is intended to provide broad guidelines to help with writing grammars for endangered and other lesser-studied languages. It is essentially a broad possible table of contents or skeletal structure for what a grammar should contain. While following the full extent of the template can result in what is essentially a reference grammar, it is intended to be adapted to fit the needs of the particular language which the grammar describes and the audience for which it is intended; it is not to be perceived as a rigid formula to be followed slavishly. We agree with Keren Rice’s (2006, 400–401) advice against following a predetermined outline too rigidly because each language can demand its own strategy. Some may wish to follow this organization more closely. Others may want to reorder sections of the grammar to achieve a progression from simpler constructions to more complex constructions specific to the language being described. This template is far from comprehensive with respect to the kinds of grammatical categories and constructions that can be found in the various languages of the world. Its intent, rather, is to deal with broader groupings of such elements, principally those found in many languages. Likewise, not all constructions indicated here may be relevant to a particular language being described. Writers of individual grammars should add more specific categories and constructions not mentioned here as may be relevant to the particular language at hand, and ignore those that are not relevant. (Other templates and questionnaires also exist, e.g., Comrie and Smith 1977; Payne 2014, 104–108; among others.)
It should be noted that some scholars may prefer to present some aspects of their grammar in a different order and manner from that which is presented here—alternative approaches may be equally effective for presenting particular types of phenomena, and selecting between different approaches can be difficult. As a case in point, grammars differ in terms of where they place derivational morphology and how they treat it, and whether they present first the parts of speech (word classes) with their associated morphology (as done here) and then later present phrasal constructions (noun phrases, verb phrases, etc.) in subsequent sections, or whether they present the parts of speech, associated morphology, and phrasal constructions in the same section (e.g., nouns, nominal morphology, and noun phrases described together in the same section). In general, there is variation also with regard to whether some items are presented in subsections or are given independent sections or chapters of their own, and whether some items are presented in appendices at the end or are integrated into the introduction or in some other section of the grammar itself.

This template is offered with little commentary on its individual parts. It is designed to be used, flexibly, in conjunction with the above recommendations for grammar writing.

I. Preliminaries (front matter, preliminary information that is not part of the main body of the grammar)
1.1. Acknowledgments (acknowledge language consultants, individuals who helped, sources of funding, etc.)
1.2. Table of Contents (organized hierarchically nested in 3 to 5 levels of subcategories with numbered headings and subheadings)
1.3. List of maps, figures, and tables
1.4. Abbreviations (list of abbreviations and what they stand for)
II. Introduction [section heading]
2. Introduction (cf. Mosel 2006b, 47)
2.1. Title of the grammar
2.2. Name of the language (with any alternative names, endonym(s), exonym(s), ethnonym(s) of the speakers, etc.)
2.3. Purpose of grammar (goals, intended audience)
2.4. User guide (how to use the grammar, description of user-friendly features of the grammar, structure of the grammar, explanation of grammatical terminology and of any notational conventions, format of example sentences and glossing, any unusual features of the grammar)
2.5. Data collection methods, sources, consultants, corpus, etc.
2.6. Previous work on the language (overview and evaluation)
2.7. Context (about the language and its speakers)
2.7.1. Geographical location (and ethnic population, demographic information), map(s)
2.7.2. Classification (affiliation, subgrouping membership, shared areal traits)
2.7.2.1. Borrowing and the impact of language contact on the structure of the language
2.7.3. Language status/vitality (information about number of speakers, domains of language use, language socialization, language attitudes, intergenerational transmission)
2.7.4. Sociocultural context (ethnographic information about the language and its use, multilingualism, language attitudes)
2.7.5. Dialects (regional and social variants) (Rice 2006)
2.8. Writing systems used for the language (brief characterization of different writing systems that have been used for the language [details of orthographies and writing systems belong in the phonology chapter])
2.9. Typological profile (structural overview) (including basic word order, verb alignment, notable features of the language, unusual or unique traits of the language)

III. Body of the Grammar
3. Phonology (cf. Rice 2014.)
3.1. Phone chart, phoneme chart (segmental inventories)
3.2. Phonemes and allophones: list of phonemes with all their allophones (contextual variants) (described in terms of place and manner of articulation, with examples)
3.2.1. Minimal pairs
3.3. Orthographic representations of the sounds of the language (chart showing equivalencies between symbols of orthographic systems that have been used for the language, including the orthography used in the grammar, together with IPA equivalents)
3.4. Suprasegmental phonology (stress, tones, intonation; best if presented with pitch tracks)
3.5. Phonotactics: Syllable structure and distributional restrictions
3.6. Phonological processes (morphophonemic alternations—rules, processes, constraints)
3.6.1. Phonological differences between fast speech and slow speech (if relevant)
3.7. Exceptions, irregularities, and complications (sounds limited in occurrence, with complexities; exceptions and irregularities)
3.8. Others
4. Parts of speech (i.e., word classes) (with emphasis on morphology)
4.1. Nouns
4.1.1. Noun classes
4.1.2. Noun morphology
4.1.2.1. Inflectional morphology: nouns (e.g., case, number, gender, class, etc.)
4.1.2.2. Derivational morphology: nouns (nominalizations, etc.)
4.1.2.3. Non-productive noun morphology, irregularities, and exceptions
4.1.2.4. Nominal compounds
4.2. Pronouns (and pronominal morphology)
4.2.1. Personal pronouns
       Independent pronouns; bound pronominal affixes
4.2.2. Possessive pronouns
4.2.3. Reflexive pronouns, reciprocal pronouns
4.2.4. Interrogative pronouns
4.2.5. Others (Note that bound pronominal agreement markers may need to be mentioned in more than one place in the grammar with cross-referencing to whichever section gives the fuller description, for example, pronominal verb agreement markers might be mentioned both here with cross-reference to the verb morphology section where they may be described in greater detail, or vice versa.)
4.3. Verbs

4.3.1. Verb classes
4.3.2. Verb morphology
4.3.2.1. Inflectional morphology: verbs
4.3.2.2. Derivational morphology: verbs
4.3.2.3. Non-productive verb morphology, irregularities, and exceptions
4.3.3. Verbal compounds
4.4. Adjectives
4.4.1. Adjective morphology
4.4.1.2. Inflectional morphology: adjectives
4.4.1.3. Derivational morphology: adjectives
4.4.2. Others (e.g., adjective classes, exceptions, irregularities, and non-productive morphology involving adjectives, adjectival compounds, etc.)
4.5. Adverbs
4.5.1. Adverb morphology
4.5.2. Others
4.6. Determiners, demonstratives, numerals, quantifiers
4.6.1. Articles
4.6.2. Demonstrative pronouns, demonstrative adjectives
4.6.3. Numerals
4.6.4. Quantifiers
4.6.5. Others
4.7. Adpositionals (prepositions, postpositions)
4.8. Conjunctions, complementizers, and particles (can also be dealt with in a section on complex sentences of multiple clauses)
4.9. Clitics (and their distribution, hosts, and functions)
5. Syntax
5.1. Phrases (phrase-level constructions) (Some items listed here may be signaled by bound morphology in some languages and in phrasal syntax in others.)
5.1.1. Noun phrases
5.1.1.1. Elements of noun phrases, associated agreement and other morphology, e.g.,
         Demonstratives and determiners in noun phrases
         Adjectival modifiers
         Dative constructions
         Classifiers: numeral classifiers, noun classifiers, gender, etc.
         Order of elements in noun phrases
         Nominal possession
5.1.2. Verb phrases and related phenomena
5.1.2.1. Copula constructions (“to be”)
         Existential constructions, presentational constructions, kinds of complements of copulas (nominal, adjectival)

5.1.2.2. Auxiliaries, modal verbs, etc.
5.1.2.3. Tense, aspect, and mood
    Grammatical tenses may include such categories as: present, past (anterior past, immediate past, recent past, remote past, historical past), future (near future, remote future), non-past, etc.
    Grammatical aspects may include such categories as: perfective/completive, imperfective/incompletive, iterative, continual, inchoative, inceptive, durative, progressive, habitual, desiderative, prospective, etc.
    Grammatical moods may include: indicative, subjunctive, conditional, optative, jussive, potential, imperative, interrogative, realis/irrealis (also less commonly: hortative, permissive, necessitative, prohibitive, dubitative, hypothetical, evidential, etc.)
5.1.2.4. Voice and valency
    Grammatical voice includes such categories as: active, passive, middle voice (reflexive, mediopassive), impersonal, antipassive, causative voice, applicative voice, inverse systems, etc.
    Valency (about the number of arguments of intransitive verbs, of transitive verbs, of ditransitive verbs)
    Valency increasing categories (such as causative, applicative, instrumental verbs, etc.) (any construction which adds a core argument to a verb)
    Valency decreasing categories (such as passive, antipassive, anticausative, etc.) (any object-removing or agent-removing construction)
    (Note: Voice and valency are treated differently by some scholars but are not distinct in many treatments)
5.1.2.5. Negation (verb phrase negation) (Note: Negation of all sorts is given its own section or chapter in some grammars, with abundant cross-referencing)
5.1.2.6. Serial verb constructions
5.1.3. Adjectival phrases
5.1.3.1. Comparative and superlative constructions
5.1.3.2. Modification of adjectives
5.1.4. Adverbial phrases (see also adverbial clauses) (temporal, manner, location [adpositional phrases], degree, intensifier, etc.)
5.1.4.1. Modification of adverbs
5.1.5. Others
5.2. Clauses and sentence types (clause-level syntax)
5.2.1. Word order (constituent order)
    Order of subject, object, and verb (SOV, SVO, VSO, etc.)
    Order of adjective and noun (Noun-Adjective/Adjective-Noun)
    Order of genitive (possessor) and noun (Genitive-Noun/Noun-Genitive)
    Order of adposition and noun (Noun-Postposition/Preposition-Noun)
    Other word orders (e.g., adverb and verb, head noun and relative clause, order in comparative constructions, etc.)

5.2.2.

5.2.3.

5.2.3.1.

5.2.3.2. 5.2.3.3. 5.2.3.4. 5.2.3.5. 5.2.3.6.

5.2.3.7. 5.3. 5.4. 5.4.1. 5.4.2. 5.4.3. 5.4.4. 5.5.

Grammatical relations and verb alignment (nominative-accusative, ergative-absolutive, active-inactive, symmetrical voice system, tripartite) Argument structure: roles of (and definitions of) subjects, direct objects, indirect objects, obliques Aspects of transitivity (if not covered in other sections) (different kinds of intransitive verbs, e.g., unergative vs. unaccusative) Clause types (declarative clauses, interrogative clauses, imperative clauses, adverbial clauses, relative clauses, other adjectival clauses, etc.) Main clauses Subordinate clauses Finite vs. non-finite subordinate clauses Relative clauses Restrictive vs. non-restrictive relative clauses Finite vs. non-finite restrictive relative clauses Order (head noun-relative clause or relative clause-head noun, or other) The role of the head noun (shared noun) in the relative clause (subject of the embedded clause, direct object, oblique [indirect object], possessor) How the role of the head noun (shared noun) is shown in the relative clause Kinds of relative clauses by strategies: Gap strategy (no overt indicator of the role of the head noun in the relative clause) Relative pronoun strategy Pronoun retention strategy (“resumptive” pronoun strategy) Non-reduction strategy (head-internal relative clauses) Other non-finite clauses: participles, infinitives, gerunds, etc. Complement clauses (predicate complements) Other subordinate clauses Coordination Clause conjoining: clause-combining constructions (how are clauses combined): Gapping, raising, extraposition (extraction, movement), switch reference, etc. (What processes/changes do verbs and their arguments undergo in subordinate clauses with respect to location, deletion, case or other marking, etc.) Others Topicalization and focus devices Topic-comment, new vs. old information, etc. Questions Yes-no questions Content questions (Wh-questions) Indirect questions Other kinds of questions Direct and indirect speech

5.6. Negation (negation can affect various areas of morphology and syntax; cross-referencing is expected as it may be mentioned in different chapters and sections of the grammar)
5.6.1. Negative statements
5.6.2. Negative questions
5.6.3. Negative imperatives
5.6.4. Other kinds of negatives
5.7. Grammatical requisites of different speech styles
    Honorifics and reverentials
    Grammatical attributes of formal speech, casual speech
6. Unresolved problems (directions for future research)
IV. Back Matter
7.1. Texts (with interlinear morpheme-by-morpheme glosses and free translation)
7.2. Vocabulary (lexical concordance)
7.2.1. All vocabulary from all examples and texts in the grammar
7.2.2. Swadesh basic vocabulary list (optional)
7.2.3. Appendices (if any)
7.3. Bibliography
7.4. Index

7. Some Bibliography

The following are works cited in this chapter and other works that have recommendations that are relevant for grammar writing:

Ameka, Felix, Alan Dench, and Nicholas Evans, eds. 2006. Catching Language: The Standing Challenge of Grammar Writing. Berlin: Mouton de Gruyter.
Austin, Peter. 2006. “Data and Language Documentation.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus P. Himmelmann, and Ulrike Mosel, 87–112. Berlin: Mouton de Gruyter.
Berez, Andrea L. 2015. “Reproducible Research in Descriptive Linguistics: Integrating Archiving and Citation into the Postgraduate Curriculum at the University of Hawai‘i at Mānoa.” In The Oxford Handbook of Linguistic Fieldwork, edited by Amanda Harris and Nick Thieberger, 90–118. Oxford: Oxford University Press.
Campbell, Lyle. 2016. “Language Documentation and Historical Linguistics.” In Language Contact and Change in the Americas: Studies in Honor of Prof. Marianne Mithun, edited by Andrea L. Berez, Diane M. Hintz, and Carmen Jany, 249–271. Amsterdam: John Benjamins.
Cristofaro, Sonia. 2006. “The Organization of Reference Grammars: A Typologist User’s Point of View.” In Catching Language: The Standing Challenge of Grammar Writing, edited by Felix K. Ameka, Alan Dench, and Nicholas Evans, 137–170. Berlin: Mouton de Gruyter.
Genetti, Carol. 2014. “Walking the Line: Balancing Description, Argumentation and Theory in Academic Grammar Writing.” In The Art and Practice of Grammar Writing, edited by Toshihide Nakayama and Keren Rice, 121–134 (Language Documentation & Conservation Special Publication No. 8). Honolulu: University of Hawai‘i Press. http://nflrc.hawaii.edu/ldc/?p=715/.
Gippert, Jost, Nikolaus P. Himmelmann, and Ulrike Mosel, eds. 2006. Essentials of Language Documentation. Berlin: Mouton de Gruyter.
Good, Jeff. 2004. “The Descriptive Grammar as a (Meta)database.” http://emeld.org/workshop/2004/jcgood-paper.html/.
Haspelmath, Martin. 1993. A Grammar of Lezgian. Berlin: Mouton de Gruyter.
Himmelmann, Nikolaus. 1998. “Documentary and Descriptive Linguistics.” Linguistics 36: 161–195.
Himmelmann, Nikolaus. 2006. “Language Documentation: What Is It and What Is It Good For?” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus P. Himmelmann, and Ulrike Mosel, 1–30. Berlin: Mouton de Gruyter.
Himmelmann, Nikolaus. 2012. “Linguistic Data Types and the Interface Between Language Documentation and Description.” Language Documentation & Conservation 6: 187–207. http://nflrc.hawaii.edu/ldc/.
Mithun, Marianne. 2014. “The Data and the Examples: Comprehensiveness, Accuracy, and Sensitivity.” In The Art and Practice of Grammar Writing, edited by Toshihide Nakayama and Keren Rice, 25–52 (Language Documentation & Conservation Special Publication No. 8). Honolulu: University of Hawai‘i Press.
Mosel, Ulrike. 2006a. “Sketch Grammar.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus Himmelmann, and Ulrike Mosel, 301–309. Berlin: Mouton de Gruyter.
Mosel, Ulrike. 2006b. “Grammaticography, the Art and Craft of Writing Grammars.” In Catching Language: The Standing Challenge of Grammar Writing, edited by Felix Ameka, Alan Dench, and Nicholas Evans, 41–68. Berlin: Mouton de Gruyter.
Mosel, Ulrike. 2014. “Corpus Linguistic and Documentary Approaches in Writing a Grammar of a Previously Undescribed Language.” In The Art and Practice of Grammar Writing, edited by Toshihide Nakayama and Keren Rice, 135–157 (Language Documentation & Conservation Special Publication No. 8). Honolulu: University of Hawai‘i Press. http://nflrc.hawaii.edu/ldc/?p=715/.
Nakayama, Toshihide, and Keren Rice, eds. 2014. The Art and Practice of Grammar Writing (Language Documentation & Conservation Special Publication No. 8). Honolulu: University of Hawai‘i Press. http://nflrc.hawaii.edu/ldc/?p=715/.
Nakayama, Toshihide, and Keren Rice. 2014. “Introduction.” In The Art and Practice of Grammar Writing, edited by Toshihide Nakayama and Keren Rice, 1–6 (Language Documentation & Conservation Special Publication No. 8). Honolulu: University of Hawai‘i Press. http://nflrc.hawaii.edu/ldc/?p=715/.
Newman, Paul. 2000. “Writing a Reference Grammar of an African Language: Conceptual and Methodological Issues.” In Proceedings of the 2nd World Congress of African Linguistics Leipzig, edited by Ekkehard H. Wolff and Orin D. Gensler, 33–47. Cologne, Germany: Rüdiger Köppe.
Noonan, Michael. 2005[2007]. “Grammar Writing for a Grammar-Reading Audience.” Studies in Language 30: 351–365. (Reprinted 2007 in Perspectives on Grammar Writing, edited by Thomas Payne and David Weber, 351–365. Amsterdam: John Benjamins.)
Pawley, Andrew. 2014. “Grammar Writing from a Dissertation Advisor’s Perspective.” In The Art and Practice of Grammar Writing, edited by Toshihide Nakayama and Keren Rice, 7–24 (Language Documentation & Conservation Special Publication No. 8). Honolulu: University of Hawai‘i Press. http://nflrc.hawaii.edu/ldc/?p=715/.
Payne, Thomas. 2014. “Toward a Balanced Grammatical Description.” In The Art and Practice of Grammar Writing, edited by Toshihide Nakayama and Keren Rice, 91–108 (Language Documentation & Conservation Special Publication No. 8). Honolulu: University of Hawai‘i Press. http://nflrc.hawaii.edu/ldc/?p=715/.
Payne, Thomas, and David Weber, eds. 2007. Perspectives on Grammar Writing. Amsterdam: John Benjamins.
Rehg, Kenneth L. 2007. “The Language Documentation and Conservation Initiative at the University of Hawai‘i at Mānoa.” In Documenting and Revitalizing Austronesian Languages, edited by D. Victoria Rau and Margaret Florey, 13–24 (Language Documentation & Conservation Special Publication No. 1). Honolulu: University of Hawai‘i Press.
Rehg, Kenneth L. 2014. “On the Role and Utility of Grammars in Language Documentation and Conservation.” In The Art and Practice of Grammar Writing, edited by Toshihide Nakayama and Keren Rice, 53–67 (Language Documentation & Conservation Special Publication No. 8). Honolulu: University of Hawai‘i Press. https://scholarspace.manoa.hawaii.edu/bitstream/10125/4584/1/4_Rehg.pdf.
Rice, Keren. 2006[2007]. “A Typology of Good Grammars.” Studies in Language 30: 385–415. (Reprinted 2007 in Perspectives on Grammar Writing, edited by Thomas Payne and David Weber, 143–172. Amsterdam: John Benjamins.)
Rice, Keren. 2014. “Sounds in Grammar Writing.” In The Art and Practice of Grammar Writing, edited by Toshihide Nakayama and Keren Rice, 69–89 (Language Documentation & Conservation Special Publication No. 8). Honolulu: University of Hawai‘i Press. http://nflrc.hawaii.edu/ldc/?p=715/.
Rhodes, Richard, Lenore A. Grenoble, Anna Berge, and Paula Radetzky. 2007. Adequacy of Documentation (a preliminary report to the CELP). Washington, DC: Linguistic Society of America Committee on Endangered Languages and Their Preservation.
Thieberger, Nicholas. 2006. A Grammar of South Efate: An Oceanic Language of Vanuatu (Oceanic Linguistics Special Publication 33). Honolulu: University of Hawai‘i Press.
van Driem, George. 2002. “A Holistic Approach to the Fine Art of Grammar Writing: The Dallas Manifesto.” http://www.himalayanlanguages.org/files/driem/pdfs/DM_Yogendra.pdf/.
Weber, David J. 2006[2007]. “Thoughts on Growing a Grammar.” Studies in Language 30: 417–444. (Reprinted 2007 in Perspectives on Grammar Writing, edited by Thomas Payne and David Weber, 173–198. Amsterdam: John Benjamins.)
Weber, David J. 2007. “The Linguistic Example.” In Perspectives on Grammar Writing, edited by Thomas E. Payne and David J. Weber, 199–213. Amsterdam: John Benjamins.

Other Grammar Outlines/Overviews/Templates

Comrie, Bernard, and Norval Smith. 1977. “Lingua Descriptive Studies: Questionnaire.” Lingua 42: 1–72.
Healey, Joan, and Liisa Järvinen. 1998. Grammar Notebook. Ukarumpa, PNG: SIL.
Mead, David. 2005. “Grammar Sketches.” Unpublished manuscript. (Earlier version 2004 “Growing a grammar: The brief grammatical summary,” Section 8.1 in Linguistics Resource Manual, edited by Howard Sheldon et al., prepared for the Indonesia Branch of SIL, Jakarta.)
Payne, Thomas E. 2006. “A Possible Outline for a Balanced Formal/Functional Grammatical Description; appendix to ‘A Grammar as a Communicative Act’.” Studies in Language 30: 367–383.
Roberts, Jim. 2004. “Grammar Sketch Outline.” SIL.

A Few Useful Grammars

Bauer, Winifred. 1993. Maori (Routledge Descriptive Grammars). London: Routledge.
Bowern, Claire. 2012. A Grammar of Bardi. Berlin: Mouton de Gruyter.
Cook, Eung-Do. 1984. A Sarcee Grammar. Vancouver: University of British Columbia Press.
Dayley, Jon P. 1985. Tzutujil Grammar (University of California Publications in Linguistics, 107). Berkeley: University of California Press.
Epps, Patience. 2008. A Grammar of Hup. Berlin: Mouton de Gruyter.
Georg, Stefan. 2007. A Descriptive Grammar of Ket (Yenisei-Ostyak). Leiden: Brill.
Givón, Talmi. 2011. Ute Reference Grammar. Amsterdam: Benjamins.
Guillaume, Antoine. 2008. A Grammar of Cavineña. Berlin: Mouton de Gruyter.
Haspelmath, Martin. 1993. A Grammar of Lezgian. Berlin: Mouton de Gruyter.
Hauk, Bryn, and Raina Heaton. 2018. “Triage: Setting Priorities for Endangered Language Research.” In Cataloguing the World’s Endangered Languages, edited by Lyle Campbell and Anna Belew. London: Routledge.
Hill, Jane H. 2005. A Grammar of Cupeño (University of California Publications in Linguistics, 136). Berkeley: University of California Press.
Huttar, George L., and Mary L. Huttar. 1994. Ndyuka. London: Routledge.
Konnerth, Linda Anna. 2014. “A Grammar of Karbi.” PhD diss., Eugene: University of Oregon.
Kung, Susan Smythe. 2007. “A Descriptive Grammar of Huehuetla Tepehua.” PhD diss., University of Texas at Austin.
Lee, Nala H. 2014. “A Grammar of Baba Malay with Sociophonetic Considerations.” PhD diss., University of Hawai‘i at Mānoa.
Maganga, Clement, and Thilo C. Schadeberg. 1992. Kinyamwezi: Grammar, Texts, Vocabulary. Cologne, Germany: Rüdiger Köppe.
Merlan, Francesca. 1981. Mangarayi (Lingua Descriptive Studies, Vol. 4). Amsterdam: North Holland Publishing Company.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik. 1974. A Grammar of Contemporary English. London: Longmans.
Rice, Keren. 1989. A Grammar of Slave. Berlin: Mouton de Gruyter.
Stenzel, Kristine. 2004. “A Reference Grammar of Wanano.” PhD diss., Boulder: University of Colorado.
Ring, Hiram. 2015. “A Grammar of Pnar.” PhD diss., Singapore: Nanyang Technological University.
Thieberger, Nicholas. 2006. A Grammar of South Efate: An Oceanic Language of Vanuatu (Oceanic Linguistics Special Publication, 33). Honolulu: University of Hawai‘i Press.
Thompson, Sandra A., Joseph Sung-Yul Park, and Charles N. Li. 2006. A Reference Grammar of Wappo (University of California Publications in Linguistics, 138). Berkeley: University of California Press.
Valentine, J. Randolph. 2001. Nishnaabemwin Reference Grammar. Toronto: University of Toronto Press.
van der Voort, Hein. 2004. A Grammar of Kwaza. Berlin: Mouton de Gruyter.
Watahomigie, Lucille, J. Bender, P. Watahomigie Sr., and A. Y. Yamamoto, with E. Mapatis, M. Powshey and J. Steele. 2001. Hualapai Reference Grammar (revised and expanded). Osaka, Japan: Endangered Languages of the Pacific Rim.
Zhang, Sihong. 2013. “A Reference Grammar of Ersu: A Tibeto-Burman Language of China.” PhD diss., Cairns, Queensland, Australia: James Cook University. http://researchonline.jcu.edu.au/31252/
Zeitoun, Elizabeth. 2007. A Grammar of Mantauran Rukai (Language and Linguistics Monograph Series, No. A4-2). Taipei: Academia Sinica.

Other sources on grammars

Glottolog, at: http://www.glottolog.org/langdoc.
OLAC (Open Language Archives Community). http://www.language-archives.org/.
The Internet Archive includes many grammars. http://archive.org/details/texts/.

Chapter 13

Compiling Dictionaries of Endangered Languages

Kenneth L. Rehg

1. Introduction

On several occasions, when speaking to American audiences about the task of compiling a dictionary, I began my presentation by providing the following six names on a PowerPoint slide.

Elbridge Gerry
Daniel Tompkins
George Dallas
Thomas Hendricks
Levi Morton
Noah Webster

I then asked the audience which of these names they recognized. As I anticipated, virtually no one knew the first five. However, everyone had heard of Noah Webster. So, who were these first five people? They were all Vice Presidents of the United States. Who was Noah Webster? He was a lexicographer (1758–1843), a maker of dictionaries.1

I employed the activity described above as a whimsical means of impressing upon my audience the fact that the compiling of a dictionary is a worthy undertaking of long-term significance. Dictionaries are products linguists can construct that can be useful to the general public.

Relatively few of the world’s 7,099 languages,2 however, have dictionaries. How many do? That question, of course, is difficult to answer, for two major reasons. First, it is unclear what counts as a “dictionary.” Where, for example, does one draw the boundary between a word list, a dictionary, and an encyclopedia? Second, no one, so far as I am aware, is compiling such statistics.3 In spite of these uncertainties, one can nevertheless venture a guess. Among the languages with which I am most familiar (the Austronesian languages belonging to the Oceanic subgroup, spoken in Polynesia, Micronesia, and Melanesia), probably only about 20% have stand-alone dictionaries, most of which are quite small.4 I would venture to guess that this percentage is probably fairly representative of all the world’s languages.

Dictionaries are clearly important and useful documents, with both symbolic and functional value. They can enhance the prestige of a minority language, and they can be put to use by the speakers of such languages. One might then ask why there are so few. What is the problem? One problem, of course, is that many of the world’s languages are either undescribed or are so poorly described that compiling a dictionary for them is not feasible. Other major problems include the availability of resources and time. The problem addressed in this chapter, however, is that relatively few linguists receive any training in how to write a dictionary. In American universities and colleges, for example, regular course offerings in lexicography are uncommon.

The purpose of this chapter, then, is to address the latter problem. While it is impossible to do justice to the complexity of compiling a dictionary in a single short chapter like this one, it is possible, I believe, to provide a preliminary guide to this task in the form of a conceptual framework that can serve as a starting point for such an undertaking.

I would like to thank Joel Bradshaw and Kimi Miyagi for their comments on a preliminary draft of this chapter.

1 The fact that Americans commonly recognize the name “Noah Webster” (actually Noah Webster Jr.) is obviously a consequence of the fact that his last name is still employed in the title of many dictionaries of American English. Only the Merriam-Webster dictionaries, though, can trace their lineage to his pioneering work.

2. How does one compile a dictionary?

It is useful, I think, to approach the task of compiling a dictionary as that of building a useful product. The creation of any successful product entails at least the five following steps:

(1) Research
(2) Preliminary Planning
(3) Design and Construction
(4) Distribution
(5) Support

2 From the twentieth edition of Ethnologue (www.ethnologue.com).
3 Glottolog (glottolog.org) provides much useful information about which languages have dictionaries.
4 By a “stand-alone dictionary” I mean one that exists as a book, rather than as an appendix to a grammar or another document.

I will discuss each of these five steps in the sections that follow. I should note at the outset, however, that my views of these matters have been significantly shaped by my work on a dictionary of Pohnpeian5 and my familiarity with dictionaries of other Pacific Island languages. I will also focus primarily on bilingual dictionaries, since monolingual dictionaries typically come into existence after the development of bilingual ones.6 Nevertheless, I hope that much of what I have to say here is applicable to the task of developing a dictionary for any small language. The common problems encountered in compiling dictionaries for endangered languages are given prominence in these sections.

3. Research

Two kinds of research are required before one sets about building a dictionary: basic research and applied research.

3.1. Basic research

To produce a high-quality dictionary of a language, one must have a solid understanding of nearly all of its major grammatical features. As Ron Moe has noted, dictionaries of the major world languages integrate nearly all of the major fields of linguistics, providing information about orthography, phonology (in the form of pronunciation guides), morphosyntax (by providing parts of speech labels), inflectional morphology (for example, walked, walking, walks), derivational morphology (walker), semantics (in the form of one or more definitions), sociolinguistics (for example, by specifying levels of usage), historical linguistics (in the form of etymologies), and more.7 Not all dictionaries, of course, are nearly as comprehensive as those for major world languages. Dictionaries for endangered and poorly documented languages are typically far more limited in scope and size.
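The kinds of information listed above can be pictured as the fields of a single dictionary entry. The following is only a minimal sketch in Python, not a format proposed in this chapter: the class names `Entry` and `Sense`, the field names, and the `render` function are all hypothetical, and the definition given for "walk" is an illustrative gloss rather than a quotation from any dictionary.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sense:
    definition: str
    usage_label: str = ""  # level of usage, e.g. "formal"; empty if unmarked


@dataclass
class Entry:
    headword: str        # spelling in the chosen orthography
    pronunciation: str   # pronunciation guide, here in IPA
    part_of_speech: str  # morphosyntactic label
    inflected_forms: List[str] = field(default_factory=list)
    derived_forms: List[str] = field(default_factory=list)
    senses: List[Sense] = field(default_factory=list)
    etymology: str = ""  # left empty when the history is unknown


def render(entry: Entry) -> str:
    """Format an entry as one line of plain text, skipping empty fields."""
    parts = [f"{entry.headword} /{entry.pronunciation}/ {entry.part_of_speech}."]
    if entry.inflected_forms:
        parts.append("(" + ", ".join(entry.inflected_forms) + ")")
    for number, sense in enumerate(entry.senses, start=1):
        label = f"[{sense.usage_label}] " if sense.usage_label else ""
        parts.append(f"{number}. {label}{sense.definition}")
    if entry.derived_forms:
        parts.append("Derived: " + ", ".join(entry.derived_forms) + ".")
    if entry.etymology:
        parts.append(f"Etymology: {entry.etymology}.")
    return " ".join(parts)


# Illustrative entry built from the forms mentioned in the text.
walk = Entry(
    headword="walk",
    pronunciation="wɔːk",
    part_of_speech="v",
    inflected_forms=["walked", "walking", "walks"],
    derived_forms=["walker"],
    senses=[Sense("to move on foot at a moderate pace")],
)
print(render(walk))
```

A dictionary of a poorly documented language would probably fill in far fewer of these fields, which is why every field other than the headword, pronunciation, and part of speech is optional in this sketch.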

3.2. Applied research

In the area of applied research, one needs to know at least the basics of lexicography, the principles and practices of dictionary making. Landau (2001, 153) writes: “. . . lexicography is a craft, a way of doing something useful.”

5 Pohnpeian is not currently classified as an endangered language, though it is clearly threatened by English, which is encroaching in all domains.
6 The earliest dictionaries of English were typically bilingual, between Latin and English, for example.
7 Ron Moe, Lexicography Course Introduction, Slides 6–23 (http://www-01.sil.org/computing/ddp/DDP_lex.htm).

When I first took a course in lexicography in the early 1970s, we had no textbook. Today, however, there are many excellent introductory book-length works on lexicography. Among those that I recommend are Béjoint (2000), Landau (2001), Hartmann (2001), Frawley, Hill, and Munro (2002a), Fontenelle (2008), Atkins and Rundell (2008), Svensén (2009), Fuertes-Olivera and Bergenholtz (2011), Jackson (2013), and Durkin (2016). I have used Svensén and Atkins and Rundell in lexicography seminars that I taught in the Linguistics Department at the University of Hawaiʻi at Mānoa, along with readings from a variety of other sources.

There are, of course, many other useful resources. An excellent one from SIL is Coward and Grimes’ Making dictionaries: A guide to lexicography and the multi-dictionary formatter,8 designed as a manual to accompany Toolbox. Another valuable resource is the lexicography course that is part of SIL’s Dictionary Development Process to which I made reference previously. Brief introductions to the making of dictionaries can also be found at the websites for InField 2008, CoLang 2012, CoLang 2014, and CoLang 2016, all of which can be located by a Google search.9 A dated but nevertheless useful bibliography of lexicography can be found at the site referenced in this footnote.10

Ultimately, though, as anyone who has attempted to compile a dictionary can confirm, you need to know much more than just linguistics and lexicography. As you begin to build a dictionary, you will, in fact, come to feel that you need to know everything: anthropology for kinship terms, astronomy for star names, botany for plant names, ichthyology for fish names, zoology for animal names, and much, much more. It is therefore important to read widely about the people and place where you will be working to see what useful information might be available from other disciplines. If possible, try to partner with scholars from other disciplines who might be willing to contribute to your work (see section 4.4 for further discussion).

4. Planning

Careful planning is essential in the compilation of a dictionary. You will save countless hours in the future if you first carefully determine what kind of a dictionary you are going to build, how you are going to do it, and what it will contain. The first matter you should consider, of course, is what makes a good dictionary? The simple answer to that question is that a good dictionary is one in which you can find what you are looking for. To build a dictionary that will satisfy that criterion, though, requires careful planning.

As any experienced lexicographer will tell you, good planning for a dictionary entails making a great many decisions. Most of these, however, fall into two major categories: preliminary decisions, considered in this section, and decisions about dictionary design and construction, discussed in section 5. I strongly recommend that you document all your decisions and that you create a flow chart for your work and a style manual for the content of the dictionary.

8 See http://www-01.sil.org/computing/shoebox/MDF_2000.pdf?_gat=1&_ga=GA1.2.1903758458.1485801336.
9 All of these sites also list useful books and articles on lexicography.
10 http://euralex.pbworks.com/f/Hartmann+Bibliography+of+Lexicography.pdf.

4.1. Audience

The primary consideration in compiling a dictionary is that of “audience.” Who is going to use the dictionary and for what purpose? To this end, it is essential that you consider the priorities of the speakers of the language. Of course, it is not always easy to determine what these priorities are. Linguists commonly talk about serving the “community,” but what exactly does this mean? There appears to be an implicit assumption that all members of the speech community are of one mind. I doubt that this is ever true (see Genetti, Chapter 36, this volume). What one can do, though, is consult with respected members of the community. Who these people are cannot be determined without investigation. Talk to people within the community as well as those from outside who have lived and worked there. Ask who you should consult. What you may find is that there are political tensions within the community that are far beyond your control. It is important that you be aware of such issues so that you can hope to produce a product that will not come to be exploited as a means to political ends. My use of the word “community” in subsequent sections of this chapter assumes consultation of this nature (see Hill 2002; Rehg 2004).

Although some might not agree, I think it is also important to keep in mind the needs of linguists, who are likely to constitute the second largest audience for the dictionary. If one is compiling a dictionary of an undocumented language, whose linguistic affiliations are unknown or uncertain, it would, in my opinion, be irresponsible not to attempt to include lexical items that will be useful to historical linguists in their efforts to determine the genetic affiliations of the language, and what that information might tell us about the prehistory of the language and its speakers. In my experience working on Pohnpei, information of this nature is of great interest to the community. Of course, it may well be the case that the speakers of the language and linguists do not agree on the goals and design of the dictionary. With judicious planning, however, such tensions can usually be resolved (see Hinton and Weigel 2002).

There are other possible audiences for a dictionary, including academics from other disciplines and, most especially, learners of the language. If one of the goals of the community is to revitalize the language, then the importance of a good dictionary cannot be overestimated. Larry Kimura, one of the initiators of the highly regarded Hawaiian revitalization effort, wrote about the Hawaiian dictionary compiled by Pūkuʻi and Elbert (1986):

    When Mrs. Mary Kawena Pūkuʻi began writing down Hawaiian words and their meanings, she knew the value of preserving Hawaiian for generations to come. When news of her project reached the public, some of her most severe critics were from among her own Hawaiian people. Though this reaction very much saddened Mrs. Pūkuʻi, she persevered, saying that this dictionary was being written for her critics’ own grandchildren as well as for the generations to come. The Hawaiian Dictionary by Mary Pūkuʻi and Sam Elbert has been positively pivotal for the survival and revitalization of the Hawaiian language today. (personal communication, July 8, 2016)

4.2. The scope of the dictionary

In deciding what kind of dictionary to produce, it is essential that you realistically consider both the time and the resources available for it. These concerns necessitate priority setting. The compiling of a dictionary, even a small dictionary, requires an enormous amount of work and almost always takes longer than one would expect. Dictionaries for major world languages have typically taken decades to compile, with ongoing work being the norm. Dictionaries of endangered languages, however, are typically produced within highly restricted time frames and are poorly funded, or, often, receive no direct funding at all. Consequently, dictionaries compiled under such constraints vary widely in scope. They might be thematic, focusing on one or several semantic domains, corpus-based, with the content of the dictionary restricted to just those words that occur in a corpus, or they might attempt to be more comprehensive, in the form of a general purpose dictionary, which includes as much information as possible within the constraints of the project.

Because of the constraints under which most dictionaries of endangered languages are produced, Mosel (2011; Chapter 11, this volume) recommends the initial production of thematic dictionaries (or thesauri as they are sometimes called). If this is what the community wants, then there are sound reasons for proceeding in this way. Such dictionaries can be produced within a limited period of time, they can serve as pilot projects for a more comprehensive work, educators can often make use of them, and they can serve to satisfy impatient communities and funding agencies which typically have little understanding of how long it takes to produce a dictionary.11
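The difference between a thematic and a corpus-based scope can be sketched as two filters over the same working lexicon. This is a hedged illustration only: the names `LexicalItem`, `thematic`, and `corpus_based` are hypothetical, and the English placeholder words stand in for real lexical data, which the chapter does not supply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalItem:
    headword: str
    domain: str  # semantic domain assigned by the compiler, e.g. "fish"


def thematic(lexicon, domains):
    """Keep only the items that belong to the chosen semantic domains."""
    return [item for item in lexicon if item.domain in domains]


def corpus_based(lexicon, corpus_tokens):
    """Keep only the items whose headword is attested in the corpus."""
    attested = {token.lower() for token in corpus_tokens}
    return [item for item in lexicon if item.headword.lower() in attested]


# Placeholder data (assumption): English words standing in for real entries.
lexicon = [
    LexicalItem("shark", "fish"),
    LexicalItem("tuna", "fish"),
    LexicalItem("aunt", "kinship"),
    LexicalItem("canoe", "transport"),
]
corpus_tokens = "The aunt saw a shark".split()

fish_dictionary = thematic(lexicon, {"fish"})
attested_dictionary = corpus_based(lexicon, corpus_tokens)
print([item.headword for item in fish_dictionary])
print([item.headword for item in attested_dictionary])
```

A general purpose dictionary corresponds to applying neither filter. The sketch matches corpus tokens to headwords by exact spelling, which a real project would need to replace with some form of lemmatization.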

4.3. Orthography

While most linguists view themselves as being concerned with describing how a language is used rather than prescribing its usage, the fact is that the production of a dictionary necessarily entails making decisions about the standardization of the language, especially when deciding on the orthography to be used (see, for example, Rehg 2004).

11 There are, of course, many other kinds of dictionaries—children’s dictionaries, learner’s dictionaries, pronouncing dictionaries, pictorial dictionaries, slang dictionaries, etc.

Compiling Dictionaries of Endangered Languages 311

There is a very substantial body of literature dealing with orthography design (see Cahill, Chapter 14, this volume). Consequently, I will not further comment on this topic, except to say that, while the old maxim about “one sound/one symbol” or “one phoneme, one grapheme” is where one should start, it is usually far too simplistic to serve as the sole criterion upon which to develop a spelling system. One reason is that linguists who are new to a language typically underestimate the amount of variation that might be present in it. Writing a word the way you say it in such circumstances does not work. Who is saying it? While one might attempt to get around this problem by restricting the dictionary to a single dialect, assuming one knows what constitutes a dialect within the language, there can still be a very considerable range of pronunciations even within a single dialect. Consequently, hard decisions must be made about how much of this variation is to be included, for example, in the form of alternate spellings (see section 5.2.1 for further discussion).

Dictionaries can be embraced or rejected on the basis of orthography alone. Given these challenges, it is essential that very careful consideration be given to the spelling system used in the dictionary. If no standard spelling system exists for the language, it is nevertheless possible to begin work on the dictionary using a working or tentative orthography, but considerations of spelling will need to be carefully addressed before the dictionary is placed in the hands of the community.

4.4. Staffing

The staffing of a dictionary for an endangered language typically entails significant challenges. The linguist who is a member of the team may not speak the language, or may not speak it well, and the community members available to work on the dictionary may have limited facility in the language of the linguist. Consequently, dictionaries are sometimes produced via one or more translators who speak a lingua franca known to both the linguist and the native speakers. Obviously, this is not an optimal practice, but it sometimes cannot be avoided.

It is useful to think of the staffing of the dictionary in terms of “core” staff and “support” staff. The selection of the core staff should be done with input from the community. A simple question like “who speaks the language well” will often result in information that is useful for choosing members of the team. In my experience, it is important to include older members of the speech community; they are likely to have larger vocabularies (acquired through longer exposure to the language), and their opinions may carry greater weight than those of younger speakers. The support staff is typically made up of consultants with special areas of expertise, such as fishing, hunting, building, farming, healing—whatever is considered to be important in the community. Other members of the support staff might include individuals with computer and recording expertise, as well as scholars from other disciplines such as anthropology, botany, ichthyology, astronomy, etc. (see McClatchey, Chapter 31, this volume; Holton, Chapter 33, this volume).

If a team approach is used, with many speakers making contributions to the dictionary, it is essential that one, or at most two, members serve as the primary editors. Typically these will be the linguist and the primary consultant. Their task is to maintain consistency and give final approval to what goes in the dictionary. One should also make every effort to train key members of the team in what they need to know to produce useful results. All members of the team who are writing entries should be given a copy of the style sheet used for the dictionary. Further, everyone who contributes to the project should be given recognition in the final product. Since not all team members will contribute equally, one might want to establish degrees of recognition by including names under labels such as “editor(s),” “associate editor(s),” and “contributors”—or “compiled by,” “in collaboration with,” and “contributors.”

4.5. Software selection

Do not use a word processing program to compile your dictionary. There are a number of dedicated software packages that can be used for this purpose. Lists of those currently available can be found at the InField and CoLang sites previously referenced. With sufficient funding, you might wish to invest in custom-designed software that attends to the specific needs of your project. The software programs most commonly used by linguists, for example, Lexique Pro, Toolbox, FLEx, TshwaneLex, etc., are designed to be all things to all people and consequently are overly complex and may not, in fact, suit your purposes. Nevertheless, many of these programs are quite good and one need not be hesitant about using them.

The journal Language Documentation & Conservation (online, open access) has published reviews and papers dealing with software that can be used to compile a dictionary and is a good place to begin to get an idea of their strengths and weaknesses, at least as they were when the reviews or papers were published. You can also learn from these reviews who might have special expertise in the software you might wish to use.

It is useful to envision the usage of this software as entailing the creation of a database from which one might extract dictionaries of various types—for example, a thematic dictionary, a dictionary for children, a spelling dictionary, a dictionary of polite vocabulary—as well as lists of words belonging to specific grammatical categories—e.g., nouns, verbs, numerals, etc. That is, while it is essential that you carefully consider audience in the production of a dictionary, the database from which it is extracted should be designed so that it can potentially serve multiple purposes.
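The idea of one database serving several outputs can be illustrated with a minimal Python sketch. It does not reproduce how any of the packages named above store their data. The headwords and glosses are Pohnpeian examples from this chapter, while the record layout, the part-of-speech labels, and the domain tags are assumptions made only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Record:
    headword: str    # L1 form
    gloss: str       # L2 definition
    pos: str         # part of speech (labels assumed for illustration)
    domains: tuple   # semantic domain tags (hypothetical)

# Headwords and glosses come from examples in this chapter;
# the pos and domain values are illustrative assumptions.
DATABASE = [
    Record("kidi", "dog", "noun", ("animals",)),
    Record("usu", "star", "noun", ("sky",)),
    Record("pap", "to swim, of people, non-marine animals, and turtles",
           "verb", ("motion",)),
]

def thematic_dictionary(records, domain):
    """Return the records tagged with one semantic domain, sorted by headword."""
    return sorted((r for r in records if domain in r.domains),
                  key=lambda r: r.headword)

def word_list(records, pos):
    """Return the headwords that belong to one grammatical category."""
    return sorted(r.headword for r in records if r.pos == pos)

nouns = word_list(DATABASE, "noun")
animals = thematic_dictionary(DATABASE, "animals")
```

Because both outputs are views over the same records, a correction made once in the database carries over to every dictionary extracted from it.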

4.6. Other considerations

As part of the preliminary planning of a dictionary, one must also consider how the project is going to be funded, as well as how the final product will be distributed. The sources of funding for dictionary projects depend upon the individual circumstances of the linguist compiling the dictionary. How the dictionary might be distributed or published will be discussed in section 6 of this chapter. It should be briefly noted here, though, that it is important to consider these matters when applying for funding. If the goal is to produce a hard copy of the dictionary, then you need to be aware that few publishers, even academic publishers, are willing to publish a dictionary of an endangered language without a substantial publication subsidy. There are, however, a number of companies by which one can self-publish books that are sold on demand at reasonable prices; examples are Lulu.com and Amazon.

5. Dictionary design and construction

General purpose bilingual dictionaries compiled by American linguists for small languages are typically organized into at least two parts, with an optional third. Here, and elsewhere in this chapter, I will use L1 to represent the target language of the dictionary, and L2 for the language into which the L1 entries are being translated. In this chapter, I use English as the L2 default language. (English is also the default language of most dictionary software programs.) However, in reading this chapter, one can substitute the name of any language as the L2, which will typically be a regional, national, or international lingua franca. Further, as previously noted, I assume that the dictionary will be bilingual rather than monolingual. Trilingual dictionaries are also possible, and many dictionary software programs can accommodate them. However, I would strongly recommend that no more than three languages be used in your dictionary, the primary reason having to do with problems of inter-translatability.

The typical parts of a bilingual dictionary produced by using the kinds of software available to linguists are then as follows:

1. Main Body: Target language (L1) → English (L2)
2. Finder List: English (L2) → Target language (L1)
3. (Thesauri)

The main body of the dictionary lists target language (L1) headwords, which are provided with definitions in English (L2). The second part, the finder list, typically includes a list of just those L2 words that are treated as keywords in the definitions in the first part of the dictionary. Finally, the dictionary may include one or more thesauri that list words in the target language according to semantic or grammatical domains (assuming the dictionary itself is not a thesaurus).

Note that a dictionary organized along these lines is not intended for translation work. It is not meant to be a true bilingual dictionary, for reasons that will become clearer in section 5.2.2, which discusses how words are defined in a dictionary of this type.

Another way of viewing the design of a dictionary, and the one that will be employed in the remainder of this section, is that of considering the dictionary in terms of its macrostructure, microstructure, and megastructure (Svensén 2009).
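The relationship between the main body and the finder list can be sketched in a few lines of Python. This is a hedged illustration, not a description of any particular package: it assumes that an editor has marked which L2 words in each definition count as keywords, and the tuple layout and function name are invented for the example. The entries are Pohnpeian examples from this chapter.

```python
from collections import defaultdict

# Each main-body entry: (L1 headword, L2 definition, editor-chosen L2 keywords).
# The keyword selections are assumptions made for illustration.
MAIN_BODY = [
    ("kidi", "dog", ["dog"]),
    ("usu", "star", ["star"]),
    ("masamwahu", "good-looking, pretty, handsome",
     ["good-looking", "pretty", "handsome"]),
    ("amas", "1. raw, uncooked. 2. sober", ["raw", "uncooked", "sober"]),
]

def build_finder_list(entries):
    """Invert the main body: map each L2 keyword to the L1 headwords it points to."""
    index = defaultdict(list)
    for headword, _definition, keywords in entries:
        for keyword in keywords:
            index[keyword].append(headword)
    # Sort keywords, and the headwords under each keyword, for stable output.
    return {kw: sorted(hws) for kw, hws in sorted(index.items())}

finder = build_finder_list(MAIN_BODY)
```

A finder list built this way only points the user back to the main body; it carries no definitions of its own, which is one reason such a dictionary is not suited to translation work.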


5.1. Macrostructure

The term “macrostructure” refers to the list structure used to enable one to locate information within a dictionary. As Svensén (2009, 368) notes, there are two main types of macrostructure, one in which information is organized alphabetically, and one in which the information is organized semantically or by some other strategy. In both types of list structure, information is organized in association with headwords. A headword, also sometimes called a lexeme or a lemma, is a morpheme, word, or phrase that occurs at the beginning of an entry. An entry is the unit in the dictionary that consists of a headword and all the accompanying information.

The basic challenge in building a dictionary is that of compiling a list of headwords. This is by no means a trivial task, especially when one gets beyond the first 1,000 or 2,000 entries. The question is, where do they come from? An excellent source of headwords is a large corpus (see Mosel, Chapter 11, this volume). For endangered languages, however, the size of such corpora is likely to be quite small in comparison to those available for major world languages. The Oxford English Corpus, for example, contains nearly 2.5 billion words and is growing. The corpora available for endangered languages are minuscule by comparison. The reasons for this disparity obviously have to do with available resources, past and present. Compiling corpora for the language you are working on, however, should be considered an essential part of the documentation process and an integral task in building a dictionary.

While ideally all the headwords that are included in a dictionary should come from the corpora available for the language, in practice this is rarely feasible for endangered languages, given the limited sizes of such corpora.
Unless one is compiling a dictionary of only those words that occur in a corpus, the overt elicitation of headwords consequently remains an essential part of the dictionary building process. There are, in fact, many strategies that one can employ to collect headwords, including elicitation. A few examples follow.

a. If they exist, utilize earlier dictionaries or word lists of the language, and/or dictionaries of related languages (but see section 7 on legal and ethical issues).
b. Elicit headwords by semantic domains; that is, collect all the words your consultants can think of that are related, for example, to plants, fishes, body parts, tools, transportation, diseases, etc. Ron Moe’s Dictionary Development Process provides tools that can be used for this purpose, but be aware that entries collected using groups of speakers are certain to require extensive editing.12
c. Elicit all the words that are part of a grammatical paradigm—for example, pronouns, inflected forms of verbs, numerals, etc.
d. Utilize books about plants, fish, mammals, etc., especially those with pictures, which are available for the region in which you are working. Ideally, one would want to include information about traditional taxonomies, but discovering what these are is no simple matter. You should be aware, however, that pictures rarely provide a means for determining the size of the things being depicted, and thus can lead to misidentification. (See Holton, Chapter 33, this volume.)
e. If a standardized word list exists for the region in which you are working, as they do in some parts of the world, utilize it. Swadesh wordlists modified for particular regions of the world or language families might also exist.
f. Take a hike! Taking a walk in a variety of environments with one or more consultants is an excellent way of eliciting words that might otherwise be overlooked. If the language is written, in whatever form, also pay close attention to signs, graffiti, advertising, etc.
g. If the phoneme inventory of the language you are working with is relatively small, then you might find computer generated word lists to be useful. Such lists might include all possible words of specific canonical shapes—for example, all monosyllables of the shapes CV, CVC, VC, etc. The task then is to discover among these forms which words actually occur in the language. So far as I am aware, the earliest user of such a strategy was Vern Carroll, who employed it in compiling a dictionary of Nukuoro (Carroll 1996). A program for this purpose can currently be found on William Poser’s website.13
h. Create a “culture calendar.” Find out from your consultants what important events occur throughout the year—holidays, ceremonies, optimal times for planting, fishing, hunting, etc. It is likely that a product of this nature will be valued by the community, and you will learn much.
i. Stimulus kits are also useful for the purpose of eliciting entries for a dictionary.14
j. There are, of course, many other means of collecting headwords. If you speak the language, or gain some fluency in it, you will often discover new words that are opportunistically encountered in everyday uses of the language.

12 http://rapidwords.net/es.
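Strategy (g) above, generating every possible form of a given canonical shape, can be sketched as follows. This is an illustrative Python sketch and not the program referenced in the footnote; the phoneme inventory is hypothetical and the function name is invented for the example.

```python
from itertools import product

def generate_candidates(consonants, vowels, shapes):
    """List every string that fits one of the given canonical shapes.

    A shape is a string of 'C' and 'V' symbols, e.g. "CVC".
    """
    slots = {"C": consonants, "V": vowels}
    candidates = []
    for shape in shapes:
        # One choice per slot; product() enumerates all combinations.
        for combo in product(*(slots[symbol] for symbol in shape)):
            candidates.append("".join(combo))
    return candidates

# Hypothetical inventory, chosen only to keep the output small.
CONSONANTS = ["p", "t", "k", "m"]
VOWELS = ["a", "i", "u"]

forms = generate_candidates(CONSONANTS, VOWELS, ["CV", "CVC", "VC"])
```

With four consonants and three vowels, the three shapes yield 12 + 48 + 12 = 72 candidate forms. The list grows multiplicatively with inventory size and shape length, which is why the strategy suits languages with relatively small phoneme inventories. Each candidate still has to be checked with speakers to find out which are actual words.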

It is important to try to determine if there are words the community might want to exclude from the dictionary. Should you include loanwords? As linguists, we would probably want to include them if they have been fully assimilated into the language, as evidenced, for example, by the fact that they are employed by monolinguals who otherwise do not speak the donor language (if such speakers exist). However, this may not reflect the wishes of the community, especially when the loanwords are replacive rather than additive; that is, they are replacing indigenous words that have essentially the same meaning, rather than being words for new things or concepts that have been incorporated into the culture from an external source. You should also try to determine if it is permissible to include swear words, taboo words, slang, or words that might be considered the property of a certain group of people within the community, such as healers, shamans, etc. While working on a dictionary of Pohnpeian, I was on several occasions approached by people who wanted to tell me about a word they were confident I did not know (which I did not); in all such instances, these were words associated with forms of magic. I was always instructed, however, not to share these words with others. That is, these words were intended as gifts, not to be included in the dictionary. What does one do under such circumstances? You accept the gift and respect the intellectual property rights of the giver by not putting the words in the dictionary—not an easy thing for a lexicographer to do.

13 www.billposer.org/Software/WordGenerator.html.

14 See https://www.eva.mpg.de/lingua/tools-at-lingboard/stimulus_kits.php.

There are also concerns about how many headwords you should plan to include in your dictionary (see Mosel, Chapter 11, this volume). In the Pacific region, probably most “general purpose” dictionaries contain 4,000 to 6,000 entries. Among Oceanic languages a “significant” dictionary is one with more than 10,000 (Andrew Pawley, personal communication). Of course, apart from the constraints of available time and resources, the number of headwords a dictionary might contain depends upon what is considered to be a candidate for a headword. The size and scope of dictionaries of languages with complex morphology cannot easily be reckoned simply by counting headwords, as discussed in the next section. It might also be noted that, of course, one never completes a dictionary. Instead, one simply stops when work on the dictionary is no longer possible, or when one decides to move on to other projects. Even small dictionaries, however, are valuable and represent time well spent.

As previously noted, Svensén recognizes two main types of list structure—one in which entries are organized alphabetically, and one in which entries are organized by other means. Ordering entries alphabetically is a common list strategy.
If a new spelling system is being employed in the dictionary, then it may be necessary to decide upon an appropriate ordering of the graphemes, especially when diacritics or digraphs (or trigraphs) are employed. In determining alphabetical order, the order used in the dominant lingua franca of the region should be considered, as well as traditional ordering systems that might exist in the region. For example, many Oceanic languages list all words beginning with vowels first, and then all words beginning with consonants. When systems are in conflict, then a decision needs to be made by representatives of the community, presumably by an orthography committee. If the dictionary is to be online and the software you are using permits it, then multiple ordering options might be provided to the user, including even the possibility of reverse order by which a user can list words according to their last rather than their first letter, an option that might be especially appealing to linguists (Bickford 2015, 158).

Alphabetical ordering, however, is not appropriate for all languages, for a variety of reasons to be briefly discussed here. Many languages use affixes in combination with stems to form words.15 In the case of English verbs, for example, the verb “pass” can occur as “pass,” “passing,” or “passed.” Typically, a single headword can be used to represent all of these forms, in this case “pass.” English is a relatively simple language in this respect: it has no inflectional prefixes and relatively few irregular verbs and nouns, such as “go/went/gone” or “woman/women.” The challenge in using alphabetical ordering arises in association with languages in which roots only occur in combination with prefixes, or when the language has complex morphophonemics. For languages like this, there are at least three possible solutions.

1. Create a root-based dictionary that includes numerous minor entries that steer the user to the entry where the root is listed; for example, entries can be listed that include inflectional prefixes but are cross-referenced to the root which serves as the main entry.
2. A better solution, if possible, is to publish an electronic version of the dictionary with a built-in parser, or one can produce a finder list of all possible inflected forms so that a user can click on a form that links to the main entry.
3. Or, as is commonly the case, one can abandon alphabetical ordering and organize the dictionary by semantic domains.

In trying to decide which strategy to employ, it is helpful to consider how dictionaries of related languages are structured, assuming they exist. As Hinton and Weigel (2002, 163) note: “The choice of the citation form and its role in the dictionary are highly language-specific matters.”

There are, of course, other possible ordering systems, as for example, for languages that use symbols to represent moras (like Japanese) or morphemes (like Chinese). In such cases, one needs to investigate what might be the common practice for such languages in the region in which you are working.

15 This discussion is an adaptation of information presented in the Dictionary Development Process, previously referenced in footnote 7.
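The ordering decisions discussed in this section (digraphs treated as single letters, vowels listed before consonants, and reverse order) cannot be handled by a plain character sort. The Python sketch below shows one way to build a custom sort key. The alphabet and the word forms are hypothetical strings invented for the example, and the greedy longest-match tokenizer is an assumption; a real orthography in which a digraph is ambiguous with a sequence of two letters would need more care.

```python
def make_sort_key(alphabet):
    """Build a sort key from an ordered list of graphemes (digraphs allowed)."""
    rank = {g: i for i, g in enumerate(alphabet)}
    by_length = sorted(alphabet, key=len, reverse=True)  # try longest graphemes first

    def tokenize(word):
        graphemes, i = [], 0
        while i < len(word):
            for g in by_length:
                if word.startswith(g, i):
                    graphemes.append(g)
                    i += len(g)
                    break
            else:
                raise ValueError(f"unknown grapheme at {word[i:]!r}")
        return graphemes

    def key(word):
        return [rank[g] for g in tokenize(word.lower())]

    return key

# Hypothetical alphabet: vowels first, then consonants, with "ng" ordered after "n".
ALPHABET = ["a", "e", "i", "o", "u", "k", "m", "n", "ng", "p", "s", "t"]
key = make_sort_key(ALPHABET)

# Made-up forms, used only to show the ordering.
words = ["nta", "nga", "kupa", "usi", "ngu", "ama", "nu"]

forward = sorted(words, key=key)
# Reverse order: compare words from their last grapheme backwards.
reverse_order = sorted(words, key=lambda w: key(w)[::-1])
```

Under this alphabet the vowel-initial forms come first, and "nga" sorts after "nta" because "ng" is a separate letter that follows "n". A plain character sort would place "nga" before "nta".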

5.2. Microstructure

The term “microstructure” refers to the internal composition of dictionary entries. Some common core elements of an entry for small dictionaries are the following.

1. headword
2. alternate spelling(s)
3. pronunciation
4. usage label (slang, honorific, etc.)
5. part of speech
6. definition
7. phrase or sentence example
8. etymology
9. cross-reference
10. semantic domain

Each of these types of information is stored individually in what are called “fields.” There are, of course, many other types of information that one might wish to include in a dictionary, including pictures, and, if one is creating an online dictionary, sound files. I have found it useful also to include fields for personal notes, the source of the entry, the date of the entry, cultural information, and dialect information.16 Good dictionary software will allow you to select those fields that you wish to appear in the printed or online version of the dictionary. I will not further discuss the possible content of these fields here, except for (2) alternate spellings, (6) definitions, and (7) phrase or sentence examples. Again, I encourage you to create a style manual so that you and others working with you are able to maintain consistency throughout the dictionary.
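As a rough sketch of what storing information in fields and selecting fields for output might look like, consider the following Python example. The field names follow the list of core elements above; the class, the render function, the separator, and the working note are hypothetical, and the sample entry uses the headword ahia “rainbow” and an example sentence given elsewhere in this chapter.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class Entry:
    headword: str
    alternate_spellings: list = field(default_factory=list)
    pronunciation: str = ""
    usage_label: str = ""
    part_of_speech: str = ""
    definition: str = ""
    example: str = ""
    etymology: str = ""
    cross_reference: str = ""
    semantic_domain: str = ""
    # Working fields that would normally be hidden from the published dictionary.
    notes: str = ""
    source: str = ""
    date_entered: str = ""

def render(entry, fields):
    """Join the non-empty values of the selected fields, in the order given."""
    data = asdict(entry)
    parts = []
    for name in fields:
        value = data[name]
        if isinstance(value, list):
            value = ", ".join(value)
        if value:
            parts.append(str(value))
    return " | ".join(parts)

entry = Entry(
    headword="ahia",
    definition="rainbow",
    example="Ma ke idih ahia, sendin pehmwen pahn kensda.",
    notes="check with second consultant",  # invented working note
)

print_view = render(entry, ["headword", "part_of_speech", "definition", "example"])
```

The working note stays in the record but never reaches the rendered view, because the list of fields passed to `render` decides what is published.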

5.2.1. Alternate spellings

As Rice and Saxon (2002) observe, there are two widely accepted desiderata for the design of orthographies for languages. One, as previously noted in section 4.3, is that of representing each phoneme with a single grapheme. The second is that of employing one spelling per word. Both of these approaches are motivated by “the pedagogical argument that reading and writing come easier if words are consistently spelled. . . .” (Rice and Saxon 2002, 130). But as Rice and Saxon also note, variation, which is present in every language, presents problems for implementing standardization along these lines. The standardization of the alphabet can usually be dealt with, but the standardization of the spelling of words is a separate problem. The question for the lexicographer, therefore, is how much variation to include. One approach is to decide on a single spelling for each word, and then stick with it. In this case, however, one might well encounter significant resistance to the dictionary from those speakers whose pronunciations are not represented. The second is to deal with such variation in the form of alternate spellings, but then one encounters the question of how many alternates to include. This is a question for which there is no simple answer. Ultimately, one would hope that a solution could come from the community, or at least from those members of the community who are contributing to the dictionary. For example, the orthography committee for Pohnpeian decided to deal with the problem of variation by using the northern dialect as a basis for spelling words17 (Rehg 1981, 378–379), but even within that dialect there is a great deal of variation, some of which we have now included in an online version of the Pohnpeian-English dictionary (under construction).
As I noted in Rehg 2004, standards for writing languages typically evolve over a long period of time, with a laissez-faire attitude toward spelling being the norm before standardization on the order of that for English comes into being, as evidenced, in fact, by English. Rigid approaches to standardization in the early stages of literacy can in fact have an inhibitive effect. Speakers, and especially teachers, become reluctant to write the language for fear they will make a mistake. This is a common problem that must be considered by the lexicographer, lest his or her dictionary become an obstacle to literacy.

16 See Coward and Grimes (2000) for an extensive list of the kinds of fields one might include in a dictionary.

17 This decision was later contested for essentially political reasons. See Rehg 2004 for further discussion.

5.2.2. Definitions in a bilingual dictionary

Learning to write good definitions is challenging. It is a skill that is highly regarded in the world of commercial dictionaries. The meanings of words, as I previously noted, emerge from the contexts in which they are used. Consequently, the makers of large commercial dictionaries rely exclusively on the use of corpora in the process of defining words. However, because the size of the corpora available for endangered languages is typically quite small, it is usually necessary to supplement corpora by eliciting meanings, ideally gathering such information from multiple sources. (These elicitation sessions can, of course, become part of a corpus.) Group discussions about meanings with native speakers are especially useful, but they are also extremely time-consuming and should probably be restricted to just those entries for which meanings are difficult to determine.

In the following discussion, I will comment on the task of writing definitions, with all examples coming from Pohnpeian. As will be shown, providing meanings for headwords necessarily ranges from straight translation to the creation of original definitions.

In some cases, true equivalents exist between the target language (L1) and the lingua franca (L2), at least in terms of their denotations (though probably rarely, if ever, in their connotations). Information of the latter type, however, is not typically included in a dictionary, except perhaps under a usage label. Some simple examples of Pohnpeian words with equivalent denotations in English are:

aio “yesterday”
kidi “dog”
usu “star”

You will also encounter words where a translational equivalent exists but is not known to you. Such is often the case for body parts, diseases, species, and other events or things that occur in the natural environment. Examples are:

edin marer “fontanel, any of the spaces covered by membrane between the bones of a fetal or young skull”
kens “yaws, a tropical disease that first affects the skin and later the bones”
rawahn “tree sp., false durian, Pangium edule, edible fruit”

If you use Latin binomials in your definition, it is important to exercise caution. Unless you are certain about the identity of the species, you should qualify your definition, for example, by using wording like “possibly X,” or avoid binomials altogether, as recommended by Holton (Chapter 33, this volume).

There are also likely to be many words that have near-equivalents in the L2, but that do not occupy the same semantic range. Examples are:

pap “to swim, of people, non-marine animals, and turtles”
nohno “mother, any person one’s mother or father would call sister”
sinopwunopw “fat, healthy looking, of infants or young domestic animals”

In addition, there will be many words for which there are no equivalents in L2. Such words, typically culture-specific, have to be fully defined. You will likely find that the number of such words that you encounter will increase as the size of your dictionary increases (see Mackenzie and Wade, Chapter 34, this volume). Examples are:

kapwilihda “to wake someone up by pulling the hair on his/her big toe”
songmaterek “to fish for the first time after someone’s burial to test whether the spirit of the deceased will bring good or bad luck”
ilewe “to anchor a canoe by placing a pole between the outrigger and the hull and sticking it into the ocean bed”

Problems of semantic range also occur when a single headword corresponds to a number of L2 equivalents or synonyms. For example, the Pohnpeian adjective masamwahu (literally “face-good”) could be translated by any of the following English words: “pretty, attractive, beautiful, cute, good-looking, adorable, nice-looking, gorgeous.” It is preferable, however, to avoid the gratuitous listing of synonyms. Instead, one might define this word as follows:

masamwahu “good-looking, pretty, handsome”

“Good-looking” can be used for both genders, while “pretty” and “handsome” are conventionally more gender-specific.

Words can also have two or more related but distinct meanings. This is called polysemy. Examples from Pohnpeian follow.

kamehlel “1. to be verified, to be believed to be true. 2. final proof; final heat of a race”
kahka “1. desiring peace and quiet while under the influence of kava. 2. (Biblical) to show respect, to honor”
amas “1. raw, uncooked. 2. sober”

Homophony, unlike polysemy, involves words with distinct meanings and origins, but having the same pronunciation. For example:

ele₁ “to draw a line, to inscribe”
ele₂ “perhaps, maybe, possibly”

Homophones are listed separately in the dictionary, typically with subscripts or superscripts.
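Once the editors have decided that two entries are homophones rather than senses of one polysemous word, the numbering itself is mechanical. The short Python sketch below assumes that each homophone is stored as a separate entry with the same headword, and it uses Unicode subscript digits; both the data layout and the function name are illustrative assumptions.

```python
from collections import Counter

# Translation table from ordinary digits to Unicode subscript digits.
SUBSCRIPTS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")

def number_homophones(entries):
    """Add subscript numbers to headwords that occur more than once."""
    totals = Counter(headword for headword, _ in entries)
    seen = Counter()
    labelled = []
    for headword, definition in entries:
        if totals[headword] > 1:
            seen[headword] += 1
            label = headword + str(seen[headword]).translate(SUBSCRIPTS)
        else:
            label = headword  # unique headwords are left unnumbered
        labelled.append((label, definition))
    return labelled

# Entries taken from the Pohnpeian examples in this section.
ENTRIES = [
    ("ele", "to draw a line, to inscribe"),
    ("ele", "perhaps, maybe, possibly"),
    ("amas", "1. raw, uncooked. 2. sober"),
]

labelled = number_homophones(ENTRIES)
```

The polysemous entry amas keeps a single headword with numbered senses inside its definition, while the two ele entries receive distinct subscripts. The decision about which case applies remains an editorial one that code cannot make.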

5.2.3. Sample phrases or sentences in a bilingual dictionary

Sample phrases or sentences that illustrate how a word is used in context are also often included in an entry. Some linguists take the position that every headword should be exemplified by one or more sample sentences. If you are producing a hard copy of the dictionary, however, this might not be feasible for a variety of reasons, not the least of which is that it will increase the size and consequently drive up the cost of the dictionary.

There are also differences of opinion concerning whether sample sentences should be “real,” that is, extracted from a corpus of natural speech, or “constructed,” created for the purposes of the dictionary. The use of real examples is optimal, but sometimes the sentences taken from a corpus in which the headwords appear are too long to be useful, or are unclear when taken out of context. Consequently, carefully constructed example sentences, provided by a native speaker, can also be employed.

Bickford (2015, 162) notes: “Ideally, an example sentence should provide enough context that if the entry word is omitted it is possible (for a speaker of the language) to guess what word should fill in the blank.” This criterion can only be satisfied, of course, if the speaker knows the word. Two other considerations in selecting or constructing sample sentences are that they should justify the definition assigned to the word, and, if possible, place that word in a context that provides information not only about the word but about the culture as well. Consider, for example, the following possible example sentence for the headword ahia “rainbow.”

Ahia me lingan. “Rainbows are beautiful.”

This is a poor example sentence, for several reasons. First, the slot occupied by ahia could be filled by any of a very large number of nouns. Second, it may be culturally inappropriate. On Pohnpei, as in many other places in the world, rainbows are not something to be admired but rather feared and respected. The following example sentence is much better and illustrates a Pohnpeian belief about rainbows.

Ma ke idih ahia, sendin pehmwen pahn kensda. “If you point at a rainbow, your finger will ulcerate.”

Pointing at rainbows, in many of the world’s cultures, is something to be avoided (Blust 1999).

The primary challenge of using constructed sample phrases or sentences is one of finding someone who can write them. My experience is that such individuals are quite rare. Consequently, it is not uncommon to find dictionaries in which the sample sentences are unimaginative and reveal little about how a word is used and the contexts in which it might occur.


5.2.4. Bilingualized dictionaries

The discussion of definitions and sample sentences raises another important issue about the microstructure of dictionaries. Thus far, we have been assuming that the end product of a dictionary project will be a bilingual dictionary. However, if words are defined in L2, the definitions will be of no use to monolingual speakers of L1 and of limited use to those who have a poor command of L2. One option, then, is to produce a monolingual dictionary in L1. However, unless the primary compilers of the dictionary are fully fluent in the target language, the production of a monolingual dictionary is not feasible. Further, if one plans to produce a hard copy of such a dictionary, it might be very difficult to find a publisher for it.

An alternative is to produce a “bilingualized” dictionary. Native speakers of L1 are not going to use the dictionary to look up the meaning of high-frequency words. They might want to check the spellings of such words but not their meanings. In the case of unfamiliar vocabulary, however, the speakers will want to know what the words mean. In such cases, either (a) the words can be defined in both languages, or (b) one or more sample sentences or phrases can be provided that clearly illustrate what the word means.

5.3. Megastructure

The term “megastructure” is used with reference to all the components of the dictionary, including both the front and back matter. That is, a dictionary typically consists of the following major parts: the front matter, the body of the dictionary, and the appendices.

The front matter of small dictionaries typically includes a title page, perhaps a dedication, a table of contents, the names of all contributors, a preface and/or introduction, a section on how to use the dictionary, a list of abbreviations or symbols, and possibly maps. Ideally, the front matter of the dictionary should be provided in both L1 and L2, with the L1 content presented first.

From the user’s perspective, the most important part of the front matter is the material on how to use the dictionary. Unfortunately, users are not likely to read this information unless it is presented in such a way that it is easily and immediately accessible. In a print dictionary, a condensed version of this information should occur early, if feasible, on the inside of the front and back covers. A diagram illustrating the structure of an entry should be provided, along with the alphabet used in the dictionary. If the alphabetic order is one that is likely to be unfamiliar to the user, then it might be helpful to include the alphabet as a footer, or to make it readily accessible in an online dictionary.

Other materials that might be included in the front matter are a pronunciation guide; an explanation of how to find words, especially if the form of the headword that serves as a main entry can be obscured by the presence of affixes; information about spelling conventions, including information about word division; and an explanation of the labels used for parts of speech, levels of usage, etc. In short, the front matter of the dictionary should include information about the language that facilitates the use of the dictionary.
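For an electronic dictionary, an unfamiliar alphabetic order also has to be built into the sorting of headwords. The following is only a minimal sketch of one way this might be done in Python. The alphabet, its order, and the sample headwords are hypothetical placeholders chosen for illustration (digraphs such as "mw" and "ng" are treated as single letters); a real project would substitute the order agreed on with the community.

```python
import re

# Hypothetical alphabet order, for illustration only. Digraphs count as
# single letters, and "d" is deliberately placed after "s".
ALPHABET = ["a", "e", "i", "o", "u", "h", "k", "l", "m", "mw",
            "n", "ng", "p", "r", "s", "d", "t", "w"]

# Try longer letters first so that "ng" is not read as "n" followed by "g".
_LETTER_RE = re.compile(
    "|".join(sorted((re.escape(letter) for letter in ALPHABET),
                    key=len, reverse=True))
)
_RANK = {letter: position for position, letter in enumerate(ALPHABET)}


def collation_key(word):
    """Turn a word into a list of alphabet positions for sorting.

    Characters that are not in ALPHABET are skipped in this sketch.
    """
    return [_RANK[letter] for letter in _LETTER_RE.findall(word.lower())]


def sort_headwords(words):
    """Sort headwords according to ALPHABET rather than Unicode order."""
    return sorted(words, key=collation_key)


# Invented sample headwords.
sample = ["tih", "dahl", "ngil", "nan", "mwahu", "mal"]
print(sort_headwords(sample))
```

With this assumed order, words beginning with "mw" follow all words beginning with "m", and words beginning with "d" follow those beginning with "s", which a default computer sort would not do.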

Dictionaries differ in terms of what is included in the front versus the back matter, but some of the types of information that might be included in the back matter are thesauri, which might include lists of words belonging to specific semantic domains or closed lexical sets; a sketch grammar; an ethnographic sketch; labeled illustrations (for example, of house parts or canoe parts); maps; and place names (if not included in the body of the dictionary). That is, the back matter of a dictionary can serve as a repository for information about the language and culture for which there might not be another outlet.

6. Distribution and support

It is also essential that careful consideration be given to how the dictionary will initially be distributed and ultimately placed in the hands of its audience. Ideally, this will entail three steps.

(1) Archiving: All work on the dictionary should be backed up and archived on a regular schedule, with copies distributed to all its compilers as well as to others who might have a stake in the work (see Berez-Kroeker and Henke, Chapter 15, this volume).
(2) Field Testing: The dictionary should be field tested early and more than once while under construction to help insure that it will be accepted by its potential users.
(3) Publishing: The final step is that of placing the dictionary in the hands of the general public via hard copies, CDs or DVDs, cell phones, a web-based version, or some combination of the above.

The decision about how to publish the dictionary depends upon who will be using it and the resources available to them. If the intended users have access to computers and the internet, there are many advantages to producing an electronic dictionary. Two major ones from the users’ perspective are that electronic dictionaries can include sound files and hyperlinks. Hyperlinks allow users to move easily between entries as well as to different sections of the dictionary. From the compilers’ perspective, it is easy to edit and add to the dictionary on an ongoing basis, which is not true of hard copies.

It is all too often the case that compilers of dictionaries for endangered languages assume their work is done once the dictionary has been published. As with any product, however, it is important that you plan to provide support to its consumers. For example, you might build an educational component into your dictionary project, working with a local education department or NGO (nongovernmental organization) to provide training in how to use the dictionary. Ideally, your support effort should also include plans to insure that work on the dictionary will be continued, even after the original project ends.


7. Legal and ethical issues

It was previously noted that one way to gather headwords for a dictionary is to consult earlier word lists or dictionaries of the language, or of closely related languages. All lexicographers do this. For this reason, however, the compiling of dictionaries has sometimes been viewed as an extended act of plagiarism. Commercial publishers of dictionaries of major languages are fully aware of this concern and sometimes include “ghost words,” words that are made up to see if other publishers are copying them. For example, the New Oxford American Dictionary (2010) includes the following fake entry.18

“esquivalience n. the willful avoidance of one’s official responsibilities; the shirking of duties”

Interestingly, this made-up word did indeed later turn up in competitors’ dictionaries.19

To some extent, of course, what might be considered “plagiarism” is unavoidable. If an earlier dictionary includes a word meaning “dog,” is it plagiarism if a subsequent dictionary includes the same word with the same meaning? Nevertheless, one should endeavor insofar as possible to avoid directly copying from earlier work.

There are other legal and ethical issues that are relevant to lexicographers, not all of which can be considered in this short chapter. A good place to begin, however, is by reading Paul Newman’s (2007) Copyright Essentials for Linguists. You should also keep in mind the fact that dictionaries are often viewed as reflecting social values. Be careful to insure that your work reflects, or at least does not violate, the values of the community in which you are working.

8. Conclusion

As Terry Crowley (1999) has observed, linguists are typically interested in grammars, whereas communities are interested in dictionaries. But a good dictionary is not only a means of giving back to a community; it is also an essential part of creating a lasting record of a language. For this reason, I began this chapter with the claim that compiling a dictionary is a project of significance to both linguists and the general public. Let me then conclude this chapter with the following quote.

A dictionary is a thousand pages of ideas and history, a guide to the mind and the world of a people. No book—except for, perhaps, religious documents, themselves guides to the mind and world of a people—has a shelf life longer than a dictionary. Surely that must be worth something. (Frawley et al. 2002b, 22)

18 For additional information, see https://en.wikipedia.org/wiki/New_Oxford_American_Dictionary#Fictitious_entry.
19 http://articles.chicagotribune.com/2005-09-21/features/0509200275_1_electronic-dictionary-new-oxford-american-dictionary-new-yorker.

References

Atkins, B. T. Sue and Michael Rundell. 2008. The Oxford Guide to Practical Lexicography. New York: Oxford University Press.
Béjoint, Henri. 2000. Modern Lexicography: An Introduction. New York: Oxford University Press.
Bickford, J. Albert. 2015. “Review of the Marshallese-English Online.” Language Documentation & Conservation 9: 158–163. http://hdl.handle.net/10125/24638.
Blust, Robert. 1999. “The Fox’s Wedding.” Anthropos 94: 486–499.
Carroll, Vern. 1996. “Generative Elicitation Techniques in Polynesian Lexicography.” Oceanic Linguistics 5(2): 59–70.
Coward, David F. and Charles E. Grimes. 2000. Making Dictionaries: A Guide to Lexicography and the Multi-dictionary Formatter. Dallas, TX: SIL International. http://www.sil.org/computing/shoebox/MDF.html.
Crowley, Terry. 1999. “The Socially Responsible Lexicographer in Oceania.” Journal of Multilingual and Multicultural Development 20(1): 1–12.
Durkin, Philip, ed. 2016. The Oxford Handbook of Lexicography. New York: Oxford University Press.
Fontenelle, Thierry, ed. 2008. Practical Lexicography: A Reader. New York: Oxford University Press.
Frawley, William, Kenneth C. Hill, and Pamela Munro, eds. 2002a. Making Dictionaries: Preserving Indigenous Languages of the Americas. Berkeley: University of California Press.
Frawley, William, Kenneth C. Hill, and Pamela Munro. 2002b. “Making a Dictionary: Ten Issues.” In Making Dictionaries: Preserving Indigenous Languages of the Americas, edited by William Frawley, Kenneth C. Hill, and Pamela Munro, 1–22. Berkeley: University of California Press.
Fuertes-Olivera, Pedro A. and Henning Bergenholtz. 2011. e-Lexicography. London: Bloomsbury.
Hartmann, Reinhard R. K. 2001. Teaching and Researching Lexicography. New York: Taylor and Francis.
Hill, Kenneth C. 2002. “On Publishing the Hopi Dictionary.” In Making Dictionaries: Preserving Indigenous Languages of the Americas, edited by William Frawley, Kenneth C. Hill, and Pamela Munro, 299–311. Berkeley: University of California Press.
Hinton, Leanne, and William F. Weigel. 2002. “A Dictionary for Whom? Tensions Between Academic and Nonacademic Functions of Bilingual Dictionaries.” In Making Dictionaries: Preserving Indigenous Languages of the Americas, edited by William Frawley, Kenneth C. Hill, and Pamela Munro, 155–170. Berkeley: University of California Press.
Jackson, Howard, ed. 2013. The Bloomsbury Companion to Lexicography. London: Bloomsbury.
Landau, Sydney I. 2001. Dictionaries: The Art and Craft of Lexicography. Cambridge: Cambridge University Press.
Mosel, Ulrike. 2011. “Lexicography in Endangered Language Communities.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 337–353. Cambridge: Cambridge University Press.
Newman, Paul. 2007. “Copyright Essentials for Linguists.” Language Documentation & Conservation 1(1): 28–43. http://hdl.handle.net/10125/1724.
Pūku’i, Mary Kawena, and Samuel H. Elbert. 1986. Hawaiian Dictionary: Hawaiian-English, English-Hawaiian. Honolulu: University of Hawaiʻi Press.
Rehg, Kenneth L. 1981. Ponapean Reference Grammar (PALI Language Texts: Micronesia). Honolulu: University of Hawai‘i Press.
Rehg, Kenneth L. 2004. “Linguists, Literacy, and the Law of Unintended Consequences.” Oceanic Linguistics 43: 498–518.
Rice, Keren and Leslie Saxon. 2002. “Issues of Standardization and Community in Aboriginal Language Lexicography.” In Making Dictionaries: Preserving Indigenous Languages of the Americas, edited by William Frawley, Kenneth C. Hill, and Pamela Munro, 125–154. Berkeley: University of California Press.
Stevenson, Angus and Christine A. Lindberg, eds. 2010. New Oxford American Dictionary, 3rd ed. New York: Oxford University Press.
Svensén, Bo. 2009. A Handbook of Lexicography: The Theory and Practice of Dictionary-Making. Cambridge: Cambridge University Press.

Chapter 14

Orthography Design and Implementation for Endangered Languages

Michael Cahill

1. Introduction

Orthography applied to endangered languages has attracted increased attention in recent years, even an entire forthcoming volume (Jones and Mooney 2017). In spite of the quite limited focus of that book and this chapter, it must be emphasized from the start that orthography is not an isolated topic. An orthography is both a tool and a symbol. It is a tool through which local literacy can happen, including preservation of traditions of the elders, ethnobotanic knowledge, folk tales, health books, etc. It is also an emblem of identity for a language community. The functions of an orthography are both pragmatic and symbolic, and we will see both of these strongly represented in relation to endangered languages.

The title of this chapter is deliberately “Orthography design AND implementation.” Design has attracted the lion’s share of writing about orthography, but without implementation, developing an orthography may be merely an isolated exercise, with no lasting impact in either academia or the language community. Implementation has two major components: testing and establishing the orthography itself, and promoting its use in literacy in the community. As complex as developing an orthography is, actually implementing it in practical literacy is at least as complex. Due to space limitations, the focus of this chapter will be development and “implementation as establishment,” with only a brief outline of what is needed for literacy.

Thanks to a number of people who gave valuable input on this chapter, including Elke Karan, Paul Lewis, Leila Schroeder, and the editors of this volume.

Orthography has clear connections to some other chapters in this book. If a grammar of an endangered language is to be written, and its audience includes the people speaking that language (what has been termed a “pedagogical grammar”), then a good orthography will help immensely (Campbell and Rehg, Introduction, this volume, as well as Camp et al., Chapter 12, this volume). If a dictionary is written for the use of people speaking an endangered language (Rehg, Chapter 13, this volume), developing a good orthography is a wise use of time. Both grammars and dictionaries have been written solely for academic audiences, of course, and these are valuable in their own right. But grammars and dictionaries can also further people’s awareness of, and the value they place on, their own language, and eventually become an aid to boosting the vitality of a language. The use of corpora (Mosel, Chapter 11, this volume) may potentially be simplified if done in a practical orthography rather than phonetic transcription. Though efforts in language revitalization (chapters in Part III of this volume) fundamentally focus on increasing the use of language in various domains, and most of these domains are spoken, a written form of the language also expands the domains of usage, and orthography development is often appropriate for such efforts.

Developing orthographies for relatively unknown languages is not exclusively a modern phenomenon. For the last few centuries, a number of missionaries and interested colonial officials who have doubled as amateur field linguists have been paying attention to unwritten languages. They have developed writing systems for these, published grammars and dictionaries (the missionary Christaller 1875, 1881 for Akan and the administrator Carbou 1913 for Chadian Arabic), and Bibles (the uneven but redoubtable William Carey for many languages in India). Also, though schools run by governments or missionaries are frequently and justly criticized for their often harsh policies toward local language use, some missionary schools of the nineteenth century did use local languages to produce learning materials. Starting in the twentieth century, SIL International1 has done orthography development as a natural accompaniment to Bible translation for over eighty years, having involvement in over 1,300 languages.

In recent decades, more attention has been paid to orthographies in wider academic circles, especially as language endangerment has come to the attention of the academic community. This is particularly so since the groundbreaking paper of Krauss (1992) in Language, which jolted linguists into considering that they were possibly presiding over the demise of over 90% of their subject domain.2

The intersection of orthography and endangered languages is relatively new. Besides Jones and Mooney (2017), the Cambridge Handbook of Endangered Languages has an article on orthographies (Lüpke 2011), and Grenoble and Whaley (2006) not only have a chapter on orthography but one on literacy as well.

Having an adequate orthography does not guarantee the continued existence of an endangered language, but it can be of help when connected to a vital literacy program. Literacy is one of nine factors that UNESCO lists as helpful in improving the vitality of endangered languages, asserting “Education in the language is essential for language vitality” (UNESCO 2003, 6).

Orthography and concomitant literacy can also help in improving attitudes of local people who have considered their language to be inferior. Cases abound where smaller, economically underprivileged groups have been denigrated by more powerful neighbors for decades, and to find that their language can be written at all is a wonderful revelation.

The Paumarí people of Brazil, for example, had been told for decades by Portuguese-speaking river traders that they didn’t even speak a real language; it was just animal sounds. When they first saw field linguists transcribing their language, they were excited; this showed them that Paumarí had the same value as Portuguese. When reading was finally introduced to the long-waiting Paumarí, it was immediately popular. This was the beginning of a reversal of the negative attitude which they had of themselves and their language. They began to speak their language in front of outsiders. The Paumarí now say “Our language is just like Portuguese” (Cahill 2004).

The practical versus symbolic necessity for an orthography varies with the degree of endangerment of a language. If a language is moribund, with no potential for active native speaker literature or involvement, then an orthography may be of only academic interest. But if the ethnic community still has a strong sense of identity, an orthography may well serve as a symbol of that identity. In another situation, if a language is endangered but still has active speakers, then the orthography and corresponding literacy materials and activities can boost the community’s esteem and use of the language in a variety of domains. These are discussed more fully in section 3.

In this chapter, common conventions are used for different representations. Orthographic representations are enclosed in angle brackets < >, phonetic representations in square brackets, e.g., [pʰul], and phonemic representations between forward slashes, e.g., /pul/.

1 Formerly known as the Summer Institute of Linguistics.
2 The percentages are open to challenge, especially since the most-studied endangerment situations have been in North America and Australia, which have arguably the most precarious language situations of the world. There is no doubt, however, as to the magnitude of the threat of losing a multitude of the world’s languages (Lewis, Simons, and Fennig 2016). See also Simons and Lewis (2013) for an updated estimate of world language endangerment, and the Endangered Languages Project (http://www.endangeredlanguages.com/) for a wealth of resources on specific endangered languages.

2. Orthography design

This section will assume the case of a language that has never had an accepted orthography. The focus in this section is specifically on how orthographies are designed for any unwritten language, endangered or not. (Section 3 specifically discusses what difference it makes if the language is endangered.)

The subject of reform of an existing orthography is a topic in its own right. The factors of good orthographies discussed here are relevant in reform as well, and the need or desire for reform will emerge from a perceived lack in one or more of these areas.

Orthography reform is often more difficult than initial orthography development because of various inertial factors. First, there is likely to be literature in the old orthography. This either has to be redone in the new orthography, or people will need to control both old and new orthographies, or the old literature will just become irrelevant to new readers, eventually disappearing. Second, in smaller languages, native speakers often know the developer of the old orthography, and may have high respect for that person. Changing the orthography can be perceived as disrespectful to the creator of the old orthography. If one is dealing with an existing orthography, the remainder of this chapter will help identify potential areas of improvement. However, the emphasis here will be on previously unwritten languages.

2.1. What is an orthography?

A foundational concept, sometimes unexamined, is what an orthography consists of. It is a misconception that an orthography consists solely of symbols (graphemes) that represent the speech sounds of the language. An orthography is more than just individual symbols.

One issue is capitalization: not only the shapes of the capital letters, but also decisions on when to capitalize. Most languages using a Roman-based script capitalize the first word of a sentence. But what about proper names, or language names, or locations? German capitalizes nouns in general, while French does not capitalize names of languages; is this desirable? This is largely an issue of preference of the local speakers.

Where to put word breaks is a major issue in some languages. It makes a huge difference in meaning at times (“psychotherapist” ≠ “psycho the rapist”) and in fluency in other cases. This is largely a linguistic issue, discussed in section 2.2.2. Related to this is hyphenation. Hyphens can be used to join certain types of compounds or clitics. But hyphens are also used to split long words at the end of lines.

Hyphens are only one sort of punctuation. Questions involve another. Would the local speakers prefer only one question mark at the end of the question, as in English, or an additional inverted question mark at the beginning, as Spanish does? Are exclamation points desired (with the same kinds of issues as question marks)? How are quotation marks used? These are represented by different characters in French than in English.

Diacritics on graphemes are also part of the orthography. These are extra marks attached to a base grapheme that modify the pronunciation of that particular grapheme. These may be tone or accent marks (e.g., Kɔnni daáŋ, with acute accent marking high tone), the various French diacritics on vowels (e.g., côte), Yoruba vowel underdots (e.g., ọ̀gẹ̀dẹ̀), and the diaeresis in the English spelling naïve. Non-Latin scripts such as Arabic or the family of Devanagari scripts may have their own diacritics.

To claim that one has developed an orthography, all these (basic graphemes, diacritics, punctuation, capitalization, and word break conventions) must be included. For many languages, principles of how to spell loanwords are also necessary.


2.2. What is a good orthography?

Various lists of what makes an effective orthography have been proposed. However, it is helpful to consider the very broad categories of “usability” and “acceptability” (very close to the terms “readability” and “acceptability” in Dawson 1989).

Usability refers to how well native speakers of a language are able to read materials written in that orthography. (This of course assumes adequate training and orientation.) Linguistics and logistics are the main factors that are relevant here.

Acceptability deals with whether native speakers want to use the orthography. Various political and social reasons often come into play powerfully. An orthography may be rejected by local people for an astonishing variety of reasons, even if linguistically it is well-designed. So if the goal is to actually have people use the orthography (and of course it is), then the category of acceptability can often be more important than strict usability.

Smalley (1959, reprinted in 1964) has been cited multiple times for his proposal of five criteria for effective orthographies, as follows:

1. maximum motivation for the learner, and acceptance by his local society
2. maximum representation of speech: actual spoken language is represented
3. maximum ease of learning: not too complicated to learn
4. maximum transfer: same symbol is used in the local language as in a larger or national language
5. maximum ease of reproduction: typing and printing facilities are available

Of these, the broad “acceptability” category would include 1, and “usability” includes 2, 3, and 5. Interestingly, 4 “maximum transfer” can go in either category. If local speakers want their orthography to look like the national language and the symbols to have the same value as the national language, then maximum transfer is a plus on the acceptability side. On the other hand, since having a common symbol in both local and larger languages produces a less steep learning curve, this is also a usability issue.

Upon even brief reflection, it is obvious that there is potential for clashes between these principles, and someone (ideally, members of the local community) must decide which is more important to them in that particular situation. Such conflicts are examined in section 2.4. Below we discuss a number of acceptability and usability issues.

2.2.1. Acceptability issues for a good orthography

A foundational concept is that an orthography is a representation of the identity of a local speech community (Lewis 1993; Sebba 2007; and many others). It can be regarded by the local people as symbolic of their distinctiveness and individuality.

A language group’s identity connects to a whole host of related issues. Does the group consider itself connected to its neighbors or not? Does the group have a sense of confidence in who they are, or are they an oppressed minority group with little sense of self-worth? Are they united or fractured? Do they have a strong religious identity as a group? These are all attitudinal factors that need examination; they can make the difference between enthusiastic acceptance, lukewarm assent, indifference, or outright rejection of an orthographic system.

It has often been said that “all orthographies are political,” and this captures a deep and pervasive truth. Let us turn to some ingredients of politics.

An obvious political factor, though sometimes overlooked, is governmental policy on new orthographies. If an orthography facilitator is not a native citizen, he may not be aware of such policies in place, but it is crucial to check this. If one is a guest in the country, one wants to be a polite guest, one the government would welcome back. In an extreme case, ignoring or defying government policies may lead to one’s being invited to leave the country! Some governments, such as Indonesia, have no policy on orthographies. Others, such as Cameroon, may have a list of graphemes which the orthography must choose from, not introducing novel shapes (Tadadjeu and Sadembouo 1984). Others, such as Ethiopia, may have a governmental office which must approve all orthographies.3

The religion or mix of religions of a group may be a powerful factor influencing acceptability. While in some cases it is not important, in others it is critical. Divisions of Protestant versus Catholic, Christian versus Buddhist, Muslim or not Muslim can all influence acceptability. Even the choice of script (whether Latin-based, Arabic, Cyrillic, etc.) may be closely tied to one religion.

A crucial question is whether the language group has a positive or negative view of the already-written neighboring languages, or of the regional or national language. The history of a group’s interaction with its neighbors is relevant here. Consider the case where Group X and Group Y have been rivals for decades, perhaps centuries, so Group X does NOT want their orthography to look like Group Y. More happily, it may be that Group X admires Group Y, or considers them to have a higher status. In this case, Group X would want their orthography to look like Group Y. Even if relations are cordial, there could be such a strong sense of self-identity of Group X that they want their orthography to appear distinct. For example, the Komba in Ghana did not want to include a distinctively Konkomba grapheme in their orthography (Cahill 2014). On a larger scale, in past decades the USSR imposed Cyrillic scripts on its subject countries. When these countries achieved independent statehood and had a choice, many of them rejected Cyrillic script in favor of a Latin-based script, as a symbolic rejection of their former rulers, and a proclamation of their independent identity (Clement 2008; Hatcher 2008).

When a language has multiple dialects, politics again arises. When I first traveled to the various Kɔnni-speaking villages in northern Ghana, I found that one village had a distinctly different dialect. On my very first visit there, when I expressed an interest in developing writing for Kɔnni, their first question was, “Are you going to write it how we speak it or how they speak it?” At that point, I had no idea! If there is one dominant dialect that everyone agrees should be the standard, then the decision is simplified considerably; there will be a unilectal orthography. If there are rival dialects, then decisions must be made. Will one be chosen (again, unilectal), or a combination of dialects (multilectal orthography)? Evaluation of the relative vitality of different dialects may tip the scales toward the most vital dialect. It is clear from this that a careful dialect survey is a highly desirable step in orthography development.

It is also possible that an orthography may be developed by a single person with no community backing or input. With no community involvement in its development, the chances of widespread acceptability are lower. Some ways in which communities can be involved are discussed below (section 2.3).

To sum up, acceptability issues are crucial in designing an orthography. If speakers don’t accept it, they will not use it, no matter how linguistically sound it is.

3 Since government policies change from time to time, the reader should check the current policy of the relevant country. The policies mentioned here were current at the time of this writing, but there is no guarantee that they will be the same in the future.

2.2.2. Usability issues for a good orthography The usability category includes factors that make it possible for readers to read easily and fluently, and writers to write easily and fluently. This does not happen in isolation, of course. In most circumstances, some sort of ad equate instruction must occur, whether the famous “each one teach one” approach of Laubach literacy (Laubach and Laubach 1960), a more structured class, or whatever mix is locally and culturally appropriate. The first and most obvious factor relating to usability, to linguists at least, is that of phonemic match—precisely one grapheme for one phoneme. This is often phrased as “one sound per symbol, and one symbol per sound.” The idea is that if a reader sees a grapheme, then it is clear how that grapheme is pronounced. The concept of phonemes in phonological theory is not universally accepted in theoretical circles today (e.g., the “Richness of the Base” concept in Optimality Theory in Smolensky 1996, as well as other writings, has undermined it somewhat, though see Dresher 2011 for a vigorous recent defense of the phoneme). However, for practical, real-life orthographic purposes, there is no substitute for an analysis that gives a clear idea of what contrasts exist, at least as a starting point. This starting point may often need to be modified because of acceptability factors, as has been long recognized (e.g., Nida 1964) and amplified in section 2.2.1 above. If the one-for-one matching of phonemes and graphemes is not present, then we have either underdifferentiation or overdifferentiation in the system. Underdifferentiation occurs when some contrasts are not represented. It is not unu sual for an orthography to represent fewer vowels than there are vowel contrasts in the language, or to not mark tone when tone is contrastive. In cases like these, the reader encounters a written word with two (or more) possible pronunciations—and meanings. 
Sometimes context may disambiguate these, but sometimes not. The reader must slow down to consider possible interpretations, and sometimes is forced into guessing:

One African language was written with only five vowels (though it has seven), and without tones (though it needs full tone marking). When trying to read, the people would half read and half guess. In church, they needed to read a Scripture verse in English first, and only then read it in their own language. Most native speakers thought this was “normal.” This orthography was eventually reformed. (Kutsch Lojenga, personal communication 2015)
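The guessing described in this account can be illustrated with a minimal sketch. It assumes a hypothetical seven-vowel language written with five vowel letters; the particular mergers (/ɛ/ written as e, /ɔ/ written as o), the word list, the glosses, and the function names are all invented for illustration and are not drawn from the language in the anecdote. Tone is left out entirely. The sketch spells each phonemic form with the reduced alphabet and reports the spellings that more than one word shares, which are the places where a reader would have to guess.

```python
from collections import defaultdict

# Hypothetical spelling rule: seven vowel phonemes written with five letters.
# The specific mergers are an assumption made for illustration only.
VOWEL_SPELLING = {"i": "i", "e": "e", "ɛ": "e", "a": "a", "ɔ": "o", "o": "o", "u": "u"}


def spell(phonemic_form: str) -> str:
    """Write a phonemic form in the five-vowel orthography (tone is not modeled)."""
    return "".join(VOWEL_SPELLING.get(segment, segment) for segment in phonemic_form)


def ambiguous_spellings(lexicon: dict[str, str]) -> dict[str, list[str]]:
    """Group phonemic forms by spelling; keep spellings shared by more than one form."""
    by_spelling = defaultdict(list)
    for phonemic_form in lexicon:
        by_spelling[spell(phonemic_form)].append(phonemic_form)
    return {s: forms for s, forms in by_spelling.items() if len(forms) > 1}


# Invented word list (phonemic form -> gloss); not data from any real language.
LEXICON = {
    "bele": "goat",
    "bɛlɛ": "river",
    "toko": "basket",
    "tɔkɔ": "knife",
    "mali": "salt",
}

collisions = ambiguous_spellings(LEXICON)
for spelling, forms in collisions.items():
    print(spelling, "<-", forms)
```

In this toy lexicon two spellings each stand for two different words, while the word without a merged vowel remains unambiguous. Run against a real word list, the number of such collisions would give a rough first impression of how much reading depends on the unwritten contrast.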

The writer, on the other hand, is not handicapped by underdifferentiation. He will write a symbol which may represent one of two sounds, but he does not have to stop and think which sound he is symbolizing.[4]

Overdifferentiation occurs when two graphemes are used to represent one phoneme. Commonly, these graphemes represent allophones, and so there is a systematic pattern of which grapheme is used when. Since native speakers are not generally aware of allophonic variation, they need to be specifically taught two symbols for what they perceive as one sound. (An exception for awareness is when local speakers have learned another language which does have these sounds in contrast and they are represented by separate graphemes.) If well taught, overdifferentiation is not generally as big a problem for the local reader as underdifferentiation. However, writers, in particular new ones, must consciously think “which letter should I use?” Until they develop a word-specific memory, overdifferentiation slows writing.

It is possible for an orthography to exhibit both underdifferentiation and overdifferentiation in different parts of the orthography. For example, a language can underdifferentiate by not marking tone, but overdifferentiate by using two distinct graphemes to represent the single phoneme /d/. A mix of overdifferentiation and underdifferentiation should not be automatically disparaged, but each case should be evaluated on its own merits.

We see, then, that a basic phonological analysis is foundational for usability of an orthography. This includes all contrasts found in the language in question: not only consonants and vowels but tone, nasality, and other vowel qualities.

Tone is frequently a challenge to represent. It is especially thorny in Africa, where almost all languages are tonal, but strategies for representing tone in an orthography are quite varied on the continent. Much more research needs to be done on adequate tone marking.
(Bird 1999 is often cited as showing that not marking tone at all gives better fluency than some types of tone marking.) A good start in explaining how language structure typology affects the need to mark lexical or the often more crucial grammatical tone, as well as examples of various tone-marking strategies, is in Kutsch Lojenga (2014a).

[4] Underdifferentiation can be very serious or less so, depending on the functional load of the phonemes in question. The concept of functional load can be approached from a variety of angles, which there is no space for here. For one example, though, if there are very few words distinguished by one phoneme versus another, then it may not affect reading fluency much if the distinction is not represented.

The level of phonological depth is an issue that arises when there are alternations in morphemes as a result of regular phonological processes. English (definitely not a paragon of excellent orthographic practices . . .) sometimes represents the result of a lexical process, as in nasal place assimilation in “im-possible” versus “in-tangible.” But “dog-s” and “cat-s” both keep a constant representation of the plural morpheme, though the phonetic output in the first is [z] rather than [s]. The concept of keeping a “constant word image” or “constant morpheme image” is often invoked in order to facilitate recognition of the morphemes, and this seems to help fluent readers especially.

However, there is also phonological theory that can assist in decision making. Snider (2014) points out that there has been surprisingly little systematic application of phonological theory to orthography since the basic generative phonology of Chomsky and Halle (1968). In keeping with the principle that orthography should reflect speakers’ awareness, Snider advocates using the lexical/postlexical distinction espoused in the theory of Lexical Phonology. Though the theory as a whole has dropped from favor, the distinction between lexical and postlexical processes is still frequently invoked. The relevant issue here is that speakers are aware of the output of lexical processes, but they are not aware of the distinctions produced by postlexical processes. This is similar to the phoneme/allophone distinction, but does not always give the same answers, as Snider amply illustrates.

Another crucial linguistic issue related to usability is word breaks. Are particular morphemes to be joined together, divided by space, or joined with a hyphen? Word break decisions affect readability in at least two opposing ways: long words are difficult for beginning readers to decode; however, breaking up grammatical units can make comprehension difficult. Basic principles of word break decisions are detailed in Kutsch Lojenga (2014b), with a rich set of examples from different languages.
She gives a specific checklist of methodology which can be summarized as follows. The phonology, morphology, and syntax need to be studied and tentative decisions made as to the status of different morphemes: affixes, clitics, or free morphemes. Write affixes joined to stems, write free morphemes as independent, and check native speaker reaction and acceptance. Clitics can be more complex, and joining or separating them can depend on whether they exhibit phonological dependence on their host, and also on whether they are proclitics or enclitics (Kutsch Lojenga 2014b: 83–84). Finally, revise as necessary, keeping in mind that some issues will become clearer only with time and testing.

Not all usability factors have to do with linguistics. The visual representation of graphemes can help or hinder good reading. One aspect of this is how many diacritics are used, and what they visually contrast with. For example, a grapheme carrying three diacritics that must contrast with other, visually similar graphemes is going to be a challenge. The relative size of diacritics may also be an issue at times, as is the positioning of a diacritic (since those above a base grapheme are usually more visually salient than those placed below).

Finally, the impact and use of electronic devices must be considered. In many places in the world, local people have skipped the personal computer stage and gone straight to smart phones for personal messaging and access to the internet. If a language’s orthography uses only standard ASCII characters, there will be no problem. But if there are non-Roman graphemes, even basic ones, then there are challenges ahead. Anecdotal evidence is starting to mount (and more studies should be done) that local language speakers are creatively using keyboards to communicate with each other, perhaps substituting a readily available character for one that is hard to type, or using custom abbreviations of the kind English users are becoming familiar with for phrases such as “see you later.” Cahill (2014) cites a case where, in an alphabet development workshop in Papua New Guinea, local participants experimented with their phones to see which characters required the fewest keystrokes to implement.

The bottom line on usability is that if an orthography is badly designed, people will have difficulty reading, often going back and forth to figure out what it means, and it takes more motivation to break through the barrier. There will be problems for writers as well.
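The point about standard ASCII characters lends itself to a quick check during orthography design. The following is a minimal sketch, assuming a hypothetical grapheme inventory; the inventory and the function name are invented for illustration. It uses Python's built-in `str.isascii()` to flag graphemes that contain any character outside the ASCII range, which are the ones likely to need a special keyboard layout or a workaround on a basic phone.

```python
def non_ascii_graphemes(graphemes: list[str]) -> list[str]:
    """Return the graphemes that contain at least one non-ASCII character."""
    return [g for g in graphemes if not g.isascii()]


# Hypothetical inventory mixing plain letters, a digraph, and special characters.
INVENTORY = ["a", "b", "e", "ɛ", "ng", "ŋ", "o", "ɔ"]

flagged = non_ascii_graphemes(INVENTORY)
print(flagged)
```

A check like this says nothing about how many keystrokes a given phone needs for each flagged character; that still has to be tried on the devices people actually use, as the Papua New Guinea workshop participants did.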

2.3. A few notes on methodology

The question of who develops an orthography is not trivial. In past decades, the assumption has been that an outside linguist, whether a fully trained PhD linguist or an armchair linguist whose main role was that of a missionary or colonial administrator, would unilaterally design the orthography (Smalley 1959; Sebba 2007). Sebba and others have termed this an “autonomous” orthography approach. This is definitely not the usual pattern today (Dawson 1989; Cahill and Rice 2016). Decades ago, many minority language groups were isolated, monolingual, and lacking formal education. Such conditions are increasingly rare in today’s world. Minority language groups commonly have quite a lot of contact with the world outside their group, multilingualism is the norm, and education levels have risen. The result is that local language speakers, by inclination as well as capability, are much more likely to be closely involved in developing their own languages.

These days much of the initial work in orthographies is carried out in workshops of two or more weeks’ duration, involving mother-tongue speakers, and there are two common approaches.

In the InField course on Orthography, which was first taught in 2008 and has been taught every other year since (since 2012 under the CoLang label), Rice and various co-presenters, especially Stenzel in the original course, present what they term the “Midwife Approach” (Cahill and Rice 2016 is the most recent version). An outside linguist acts metaphorically as a midwife, helping the local people, but largely drawing out their native speaker intuitions and preferences about their own language. Local language participants initially write out stories. They compare their spelling, and problems are discussed and addressed. The linguist points out consequences of one decision versus another, but leaves final decisions to them.
This approach depends on having participants who are already literate, probably in the national language. This method has the advantage of being relatively quick, not requiring a full foundational phonological analysis, but focusing only on the problem areas.

The SIL branch in Papua New Guinea had developed almost exactly this approach in 1993, there termed the Alphabet Design Workshop. Again, this approach starts with speakers writing their own stories, and “problem areas are not based on the problems or questions that we have as linguists, but rather the problems the language speakers have in reading, writing or teaching their language” (Easton 2003). In 1998, the government asked SIL to develop orthographies quickly for sixty-eight languages. People from other languages demanded equal time, and 103 languages had trial orthographies developed between 1998 and 2002 (Easton 2003; Mark Onken, personal communication, 2015). As with all “trial orthographies,” there were generally details that needed to be worked out in due course.

Also in the mid-1990s, Kutsch Lojenga started championing a “participant methodology” approach in SIL (Kutsch Lojenga 1996) which had some points in common with the methods above. It also depends crucially on native speakers’ intuition and judgments about their own language. One difference between this and the other approaches is that this “participatory” approach starts not with texts, but with collecting about 1,000 individual words on slips of paper (with glosses), written however they can. If there are literate speakers, they should write the words. An outsider can also write them, but this is not ideal. The advantage of a local speaker writing the words is that mistakes in transcription tend to be systematic and therefore can reveal something about the local language’s phonology and native speaker perception and intuition. If an outsider transcribes, mistakes tend to be more random. The papers are grouped by criteria such as word length, all-identical versus differing vowels in the syllables, etc. Often starting with vowels, the workshop coordinator asks participants to sort the words into piles which have the same sound. If they feel the vowel is different, they start a new pile, even if they initially transcribed it the same way. A similar process is followed for consonants and tones. In this way, local speakers’ judgments on what are psycholinguistically different sounds, the phonemes, are affirmed.
Kutsch Lojenga herself has done workshops with this method with over 100 languages. A bonus of this method is that it also lays the basis for a dictionary.

Note that both of the above approaches depend on having a core of local speakers who are already literate in some language. For the ever-diminishing instances where there are absolutely NO local literates, the outside linguist must of necessity take a more active role.

Some orthographies are relatively simple, and the two to three weeks typical for the above workshops is sufficient to work out the basics. However, languages with complex vowels and grammatical tone, with difficult word break issues, etc., will not be done in that time. And some issues only become evident after the workshop is completed. So it must be emphasized to the local people that what they have at the end of any of these workshops is a draft orthography, and suggestions for change are welcome, not forbidden.

Thus a polished practical orthography cannot be produced in two weeks. It will definitely take more time than that, and often a few years before it settles down to a final form (Karan 2014).
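The pile-sorting step of the participatory method described above can be pictured with a minimal sketch. It assumes each slip has been recorded as a triple of the word as first written, the vowel letter used, and the pile the speakers placed it in; the slips, the pile labels, and the function name are invented for illustration, and real workshops do this by hand with paper. The sketch reports every letter whose words ended up in more than one pile, that is, every letter that may be hiding a contrast the speakers hear.

```python
from collections import defaultdict

# Invented slips: (word as first written, vowel letter used, pile chosen by speakers).
SLIPS = [
    ("pele", "e", "pile_1"),
    ("tele", "e", "pile_1"),
    ("mele", "e", "pile_2"),  # written with the same letter, but felt to sound different
    ("tolo", "o", "pile_3"),
    ("kolo", "o", "pile_3"),
]


def letters_hiding_contrasts(slips):
    """Map each letter to its piles, keeping only letters spread over several piles."""
    piles_by_letter = defaultdict(set)
    for _word, letter, pile in slips:
        piles_by_letter[letter].add(pile)
    return {
        letter: sorted(piles)
        for letter, piles in piles_by_letter.items()
        if len(piles) > 1
    }


candidates = letters_hiding_contrasts(SLIPS)
print(candidates)
```

Here the letter e is split across two piles, so it is a candidate for needing two graphemes, while o is not. The decision itself still rests with the speakers' judgments; the tally only summarizes them.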

2.4. Conflicts between factors or people

Conflicts in orthography development should be expected, not come as a surprise. Various factors of usability and acceptability often point in opposite directions. Some specific clashes might be anticipated, depending on the amount of background research an investigator has done. Others may ambush the outsider.

Here is a hypothetical situation, but based on very real cases, where a clash is predictable. We invoke here what we will label Fiinaka, a fictional East African language. Fiinaka has seven phonemic vowels (/i, e, ɛ, a, ɔ, o, u/). It is a tonal language; tone distinguishes a few words, but also several verbal aspects (e.g., the difference between “come” and “don’t come” is purely tonal). It seems obvious to a linguistic consultant that seven vowels and at least grammatical tone should be included somehow in the orthography. However, Swahili is a powerful influence in this area. Many Fiinaka speak and write it, and they would like Fiinaka to look like Swahili. Unfortunately, Swahili has only five vowels, and is not tonal, thus no tone marking. Linguistics points one way; the people’s preference points another way. Usability and acceptability are in direct conflict.

This is obviously a delicate situation, and depending on depth of feelings and negotiation skills, decisions could go either way. If the outside linguist can sensitively and persuasively show that five vowels and no tone would confuse people, and local people are not too committed to Swahili, then it may be that usability will come out on top. But if people are quite attached to Swahili, then linguistic demonstrations will have little impact. Recall that if an orthography is not acceptable, then people will not use it. So in the interest of having people actually engage with and use the orthography, linguistic factors should probably give way in favor of acceptability, though the outside linguist may feel quite uncomfortable with this. Again, orthographies can change over time, and it is highly desirable to revisit an orthographic decision a year or so later, after local speakers have had a chance to actually try it.
A case which may catch an outsider unaware would be discovering that two local parties (e.g., two religious groups, two influential chiefs, a teacher versus a district official) have radically differing viewpoints on what constitutes an acceptable orthography. Until the outsider is actually involved in talking to people personally, this situation could well be invisible. If there seems to be resistance to an orthography with no apparent reason, it could be that there are specific personalities involved which are not initially obvious to an outsider.

The interested reader may find accounts of conflicts in various places, but the aptly titled “Orthography Wars” by Hinton (2014), as well as Adams’s (2014) account of a very messy and complex situation in Southeast Asia, are especially recommended.

3. Factors pertaining specifically to endangered languages

When orthography is mentioned in the same breath as language endangerment, sometimes the assumption is that orthography is relevant solely because it is irreducibly connected with education in the mother tongue. Thus the “real” discussion is how literacy in the mother tongue interacts with language viability and vitality. However, this is not always the case. As we will see, an orthography may function as a symbol of language identity even without extensive literature development.

With that caveat, the important role that literacy in the mother tongue (which necessarily entails orthography development) plays in language preservation and revitalization has been cited by a variety of researchers. Karan (2006) highlights the dual influences of UNESCO’s Education for All movement, as well as the growing interest in preventing the death of minority languages through language development, as motivators for orthography development. UNESCO’s (2003) paper listing local language literacy as a positive factor in increasing language vitality has been previously mentioned. Fishman (1991), though noting that language shift is caused by a complex network of factors, views the availability of education in the community language as a key factor in reversing the process of language shift. Skutnabb-Kangas (2000) asserts that though the availability of mother-tongue education in itself cannot prevent language death, a lack of schools teaching in the mother tongue can be a leading cause of language death. Landry and Allard (1992) explain that subtractive bilingualism is often the result of minority-language children accessing education only in the majority language. Strong bilingualism and ethnolinguistic vitality, in their view, require that the minority language be used in as many domains as possible, including in the home and school.

Thus, the goals of educational development and language preservation are closely related. Language communities, seeing either or both of these goals as felt needs, may choose to undertake the task of orthography development as one of their language development activities (Page 2013).
In spite of the endorsements above, it needs to be acknowledged that literacy in a local language is not a guarantee of that language’s survival. We need look no further than the case of Manx, which died as a living language in 1974 in spite of having a fair amount of literature. (Manx is currently undergoing lively revitalization efforts.)

Also, orthography and literacy are not always valued by all the members of a language community. Oberly et al. (2015), for example, noted that within the endangered Ute community, there were different attitudes toward writing as an aid to revitalization. Younger people, not fluent in Ute, tended to value writing as an aid to memorization, while fluent speakers were not interested in learning to read and write Ute but favored more immersion in orality.

It is also possible that the entire language community may be not very interested in writing. Henne (1991, 3) notes this for some Mayan communities (not endangered ones), and observes that this viewpoint is common when literacy has not been a part of people’s traditional lives. Her observation is that writing is viewed as abstract, while the voiced prayer of a pastor or shaman is concrete communication and carries authority. The Ute and Mayan cases show that reasons for aversion to literacy depend on language ideology and the particular goals of the project or subgroup.

The necessity for an orthography is partly determined by the degree of endangerment of a language; a legitimate question is whether an orthography or literacy is needed at all. There are at least two major functions of an orthography. One is practical, as an aid to active reading and writing; the other is symbolic, a mark of identity for that language community. Let us examine how both of these play out in three broad situations: a moribund language, a seriously endangered language, and an “at risk” language. Though these three labels are general, the EGIDS scale[5] provides a reasonably objective guide to the degree of endangerment of a language.

If the language is moribund or nearly extinct, and the language is predictably going to die in the next decade, it is unrealistic to expect that orthography development will be helpful in preserving that language. Even with no realistic potential for native speaker reading, there are still at least two possible positive functions of an orthography.

One is that an orthography may be of academic interest to outsiders even with no active use by speakers of the language. There may still be a place for orthography in documenting the language (Seifart 2006), especially in recording texts in usable form. It could be used by either the native speaker or an outside recorder to transcribe stories, songs, conversations, and other textual materials. The spoken language may disappear soon, but a record of materials can still be preserved for future study. In some cases of moribund languages, however, where time is running out fast, the priority may be to do audiovisual recordings, with an orthography being a luxury.

The other function is symbolic for the language community, if the community is based on a common heritage and ethnic identity rather than a current common language. In this case, an orthography may serve as representational of identity, whether or not it ever is used for actual written communication. An intermediate situation is when the language’s words may be used on local town signs, a very limited usage.

An orthography of a moribund language does not have everyday usage as its purpose.
Thus linguistic usability issues such as accurate phonemic representation and word breaks are of lessened importance, since the primary purpose is not actual reading. They do not require attention as rigorous as in cases where the orthography is intended as an active vehicle for communication.

If, on the other hand, a language is not moribund but seriously endangered, it will have a number of fluent speakers, often with quite a low view of their language, and a shift to another language is taking place. In this case, an orthography with corresponding literacy materials and activities has potential to boost the community’s esteem and actual use of the language in a variety of domains, as exemplified by the Paumarí example in section 1. The challenges are quite significant but often surmountable. When literacy is effective, people’s view of their language can rise, literacy usage may be incorporated in several domains, and there is potential to move up the EGIDS scale more than one notch.

[5] The Expanded Graded Intergenerational Disruption Scale, or EGIDS (Lewis and Simons 2010), is an expanded version of Fishman’s GIDS scale. Each language in the Ethnologue (Lewis et al. 2016) is tentatively assigned an EGIDS number. Relevant to our purposes here are the middle numbers: 5 = Developing, 6a = Vigorous, 6b = Threatened, 7 = Shifting, 8a = Moribund, 8b = Nearly Extinct. “Developing” assumes literature and some degree of orthographic standardization.

With a slightly endangered language, perhaps 6b on the EGIDS scale, speakers still have an excellent grasp of their language and children are still learning it. However, a survey of attitudes may reveal a negative mind-set toward their language, and they may be wondering what future it has in the context of larger and more economically powerful languages they are in contact with. In this case, an orthography and accompanying literacy may increase the perceived status of the language, helping it to be used more frequently in more domains. People may start to feel more pride in their language, being more positive about using it and teaching it to their children. And that language has just become less endangered.

Another major factor in how orthographies can develop is the presence of matrix languages, those by which the endangered language is surrounded and influenced, populous languages which have well-established orthographies. If speakers of an endangered language have largely shifted to the matrix language, then the effect of this “neighboring” language may be much more powerful than the usual effect discussed in section 2.1. Several cogent examples of this are illustrated in Hinton (2014). The most detailed case is Yurok in California, in which not many speakers were fluent in what could be termed their “heritage language.” They highly valued it as a mark of cultural identity, but most were to all intents and purposes fluent native English speakers, and readers. In this case, where few were fluent in Yurok, the main audience for an orthography was language learners. It was totally unworkable to propose a deep orthography with a constant word image, or even a phonemic orthography. They adopted orthographic conventions from English (for example, in how /i/ is represented). This was done consistently enough so that it turned out to function well; it was usable.
Since the local people themselves had come up with this, it was also acceptable. Acceptability and usability were both achieved, though not in a way the outside linguists had proposed. The various case studies in Jones and Mooney (2017) illustrate how different groups can prioritize different factors differently.

4. Implementation: Literacy and more

In this section, we discuss the question “What comes after an orthography is devised?” This has two broad areas: testing/refining and publicizing the orthography and then actually putting it to use in a literacy effort.

Once an orthography is developed, whether in a workshop or another setting, the developer(s) must realize that even after all their work, it is not perfect. Testing, whether formal or informal, should be done to make sure it works. This is primarily a test of usability, how well people are able to read material written in this orthography. However, it may happen that some people, perhaps educated in literacy of other languages, may find features that are objectionable not because of usability but because of acceptability. The issues discussed in section 2.2 may arise at this point, however offputting this is to the developers.

Publicizing the new orthography is a necessary step in enhancing acceptability. There may be some societal mechanism in place to carry this out relatively easily, as in through the school system, if one exists. If not, then a public relations campaign should be planned. Rybka (2015) notes that lack of publicity, as well as lack of standardization, was a significant hindrance to literacy in the endangered language Lokono.

Revising is part of orthography development. Again, patience is called for; orthographies often take years to settle down into some standardized form (Karan 2006, 2014) and premature standardization should be resisted. Also, it should be communicated to people, especially to testers in the beginning of the process, that it is permitted to point out where the orthography could be better, and actually, they would be doing their fellow speakers a favor by doing so.

If there are a number of changes being proposed, it might be well to call a second conference or workshop for reforming or revising the orthography. The same principles of acceptability and usability should be applied to any new proposal.

Finally, orthography development is just a part of a total literacy program. Documented spelling rules, materials development, training of teachers, funding, and instruction in writing are all needed to develop vibrant literacy in a language.

The above discussion assumes the desirability of standardization of an orthography. This may not be the most suitable approach in all situations with endangered languages. Essegbey (2015), based on experience in Ghana, advocates standardization when a language will be used in schools.
However, with endangered languages that are unlikely to ever be incorporated into a school system, he advocates that speakers be encouraged in what he calls “vernacular literacy,” where people more freely “write as they speak.”

A good number of papers that discuss orthography do not specifically discuss literacy as a natural consequence of orthography development. Lüpke (2011), for example, mentions literacy but devotes little space to it. Grenoble and Whaley (2006), in contrast, devote an entire chapter to literacy issues before their chapter on orthographies. They take a balanced look at issues such as literacy as hastener of language death versus literacy as language preserver, literacy as culture preserver versus damager, etc. Space does not allow a complete discussion of these issues here, but the reader is reminded that orthography does not exist in a cultural vacuum.

A somewhat dated but still useful resource that spells out many of the practical considerations of a literacy effort in some detail is Waters (1998). Pedagogical grammars (Campbell et al., Chapter 12, this volume) and dictionaries (Rehg, Chapter 13, this volume) are useful outputs. However, entertaining materials such as local and translated stories, and life-helpful materials such as health and agricultural materials, may be more popular. The details of implementing an orthography depend on whether people are literate in another language already, or are basically illiterate to start with, but space does not allow detailed discussion of this.


5. Concluding remarks

The orthography developer should make an explicit goal of writing and publishing an orthography statement. Statements have different degrees of depth. Some, appended to a phonology write-up, consist of one to three pages, listing phonemes and their orthographic representations. Others are thankfully more comprehensive, including graphemes, word divisions, rules of hyphenation and capitalization, etc. Again, the depth of coverage to aim at depends on the audience and their background. If the community is relatively well educated in another national language, the statement would focus on differences between the local language and the national one.

This chapter has implicitly assumed alphabetic scripts rather than syllabaries or logographic systems. Following Grenoble and Whaley (2006), alphabetic systems are recommended in most situations, especially for endangered languages whose speakers are embedded in other languages and are familiar with them, but may no longer have full native fluency in their heritage language.

Orthography development is by nature a multi-disciplinary task, involving linguistics, literacy and education principles, and sociopolitical angles. Those engaged in it must be willing to engage with disciplines that perhaps they were not specifically trained for. The return on such an investment, though, will be substantial in benefiting the language communities.

References Adams, Larin. 2014. “Case Studies of Orthography Decision Making in Southeast Asia.” In Developing Orthographies for Unwritten languages, edited by Michael Cahill and Keren Rice, 231–250. Dallas, TX: SIL International. Bird, Steven. 1999. “When Marking Tone Reduces Fluency: An Orthography Experiment in Cameroon.” Language and Speech 42(1): 83–115. Cahill, Michael. 2004. “From Endangered to Less Endangered: Case Histories from Brazil and Papua New Guinea.” SIL Electronic Working Papers 2004-004. (revised from 1999 version). http://www.sil.org/resources/publications/entry/7880. Cahill, Michael. 2014. “Non-linguistic Factors.” In Developing Orthographies for Unwritten languages, edited by Michael Cahill and Keren Rice, 1–12. Dallas, TX: SIL International. Cahill, Michael and Keren Rice, eds. 2014. Developing Orthographies for Unwritten Languages. Dallas, TX: SIL International Cahill, Michael and Keren Rice. 2016. “Orthography Development: The ‘Midwife’ Approach” (Orthography Course Presented to CoLang Institute, June 27–30). Fairbanks: University of Alaska-Fairbanks. Carbou, Henri. 1913. Méthode pratique pour l’étude de l’arabe parlé au Ouaday et à l’est du Tchad. [Practical method for studying the Arabic spoken in Waddai and the east of Chad]. Paris: Librairie orientaliste Geuthner. Chomsky, Noam and Morris Halle. 1968. The Sound Pattern of English. New York: Harper and Row.

Christaller, J. G. 1875. A Grammar of the Asante and Fante Language Called Tshi, (Chwee, Twi). Basel: Basel Evangelical Society.
Christaller, J. G. 1881. A Dictionary of the Asante and Fante Language Called Tshi (Chwee, Twi): With a Grammatical Introduction and Appendices on the Geography of the Gold Coast and Other Subjects. Basel: Basel Evangelical Society.
Clement, Victoria. 2008. “Emblems of Independence: Script Choice in Post-Soviet Turkmenistan.” International Journal of the Sociology of Language 192: 171–185.
Dawson, Jean. 1989. “Orthography Decisions.” Notes on Literacy 57: 1–13.
Dresher, B. Elan. 2011. “The Phoneme.” In The Blackwell Companion to Phonology, Volume I, edited by Marc van Oostendorp, Colin J. Ewen, Elizabeth Hume, and Keren Rice, 241–266. West Sussex, UK: Blackwell Publishing.
Easton, Catherine. 2003. “Alphabet Design Workshops in Papua New Guinea: A Community-Based Approach to Orthography Development.” Paper presented at Conference on Language Development, Language Revitalization and Multilingual Education in Minority Communities in Asia. Bangkok, Thailand, November 6–8. http://www-01.sil.org/asia/ldc/parallel_papers/catherine_easton.pdf. Accessed June 15, 2016.
Essegbey, James. 2015. “‘Is This My Language?’: Developing a Writing System for an Endangered-Language Community.” In Language Documentation and Endangerment in Africa, edited by James Essegbey, Brent Henderson and Fiona McLaughlin, 153–176. Amsterdam: John Benjamins.
Fishman, Joshua. 1991. Reversing Language Shift: Theoretical and Empirical Foundations of Assistance to Threatened Languages. Clevedon, UK: Multilingual Matters.
Grenoble, Lenore and Lindsay J. Whaley. 2006. Saving Languages: An Introduction to Language Revitalization. Cambridge: Cambridge University Press.
Hatcher, Lynley. 2008. “Script Change in Azerbaijan: Acts of Identity.” International Journal of the Sociology of Language 192: 105–116.
Henne, Marilyn. 1991. “Orthographies, Language Planning and Politics: Reflections of a SIL Literacy Muse.” Notes on Literacy 65: 1–18.
Hinton, Leanne. 2014. “Orthography Wars.” In Developing Orthographies for Unwritten Languages, edited by Michael Cahill and Keren Rice, 139–168. Dallas, TX: SIL International.
Jones, Mari C. and Damien Mooney, eds. 2017. Creating Orthographies for Endangered Languages. Cambridge: Cambridge University Press.
Karan, Elke. 2006. “Writing System Development and Reform: A Process.” MA thesis, Grand Forks, North Dakota: University of North Dakota. http://arts-sciences.und.edu/summer-institute-of-linguistics/theses/_files/docs/2006-karan-elke.pdf.
Karan, Elke. 2014. “Standardization: What’s the Hurry?” In Developing Orthographies for Unwritten Languages, edited by Michael Cahill and Keren Rice, 231–250. Dallas, TX: SIL International.
Krauss, Michael. 1992. “The World’s Languages in Crisis.” Language 68: 4–10.
Kutsch Lojenga, Constance. 1996. “Participatory Research in Linguistics.” Notes on Linguistics 73: 13–27.
Kutsch Lojenga, Constance. 2014a. “Orthography and Tone: A Tone-System Typology with Implications for Orthography Development.” In Developing Orthographies for Unwritten Languages, edited by Michael Cahill and Keren Rice, 49–72. Dallas, TX: SIL International.
Kutsch Lojenga, Constance. 2014b. “Basic Principles for Establishing Word Boundaries.” In Developing Orthographies for Unwritten Languages, edited by Michael Cahill and Keren Rice, 73–106. Dallas, TX: SIL International.

Landry, Rodrigue and Réal Allard. 1992. “Ethnolinguistic Vitality and the Bilingual Development of Minority and Majority Group Students.” In Maintenance and Loss of Minority Languages, edited by William Fase, Koen Jaspaert, and Sjaak Kroon, 223–251. Amsterdam: John Benjamins.
Laubach, Frank C. and Robert S. Laubach. 1960. Toward World Literacy, the Each One Teach One Way. Syracuse, NY: Syracuse University Press.
Lewis, M. Paul. 1993. “Real Men Don’t Speak Quiché: Quiché Ethnicity, Ki-che Ethnic Movement, K’iche’ Ethnic Nationalism.” Language Problems and Language Planning 17(1): 37–54.
Lewis, M. Paul and Gary F. Simons. 2010. “Assessing Endangerment: Expanding Fishman’s GIDS.” Revue roumaine de linguistique (RRL) 55(2): 103–120. http://www-01.sil.org/~simonsg/preprint/EGIDS.pdf.
Lewis, M. Paul, Gary F. Simons, and Charles D. Fennig, eds. 2016. Ethnologue: Languages of the World. 19th ed. Dallas, TX: SIL International. Online: http://www.ethnologue.com.
Lüpke, Friederike. 2011. “Orthography Development.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 312–336. Cambridge: Cambridge University Press.
Nida, Eugene A. 1964. “Practical Limitations of a Phonemic Alphabet.” In Orthography Studies: Articles on New Writing Systems. Helps for Translators 6, edited by William A. Smalley, 22–30. London: United Bible Societies.
Oberly, Stacey, Dedra White, Arlene Millich, Mary Inez Cloud, Lillian Seibel, Crystal Ivey, and Lorelei Cloud. 2015. “Southern Ute Grassroots Language Revitalization.” Language Documentation & Conservation 9: 324–343. http://nflrc.hawaii.edu/ldc/?p=603.
Page, Christina Joy. 2013. “A New Orthography in an Unfamiliar Script: A Case Study in Participatory Engagement Strategies.” Journal of Multilingual and Multicultural Development. doi:10.1080/01434632.2013.783035. http://dx.doi.org/10.1080/01434632.2013.783035.
Rybka, Konrad. 2015. “State-of-the-Art in the Development of the Lokono Language.” Language Documentation & Conservation 9: 110–133. http://nflrc.hawaii.edu/ldc/?p=603.
Sebba, Mark. 2007. Spelling and Society: The Culture and Politics of Orthography Around the World. Cambridge: Cambridge University Press.
Seifart, Frank. 2006. “Orthography Development.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus Himmelmann, and Ulrike Mosel, 275–279. Berlin: Mouton de Gruyter. https://www.academia.edu/3248986/Orthography_development.
Simons, Gary F. and Lewis, M. Paul. 2013. “The World’s Languages in Crisis: A 20 year Update.” In Responses to Language Endangerment. In Honor of Mickey Noonan. New Directions in Language Documentation and Language Revitalization, edited by E. Mihas, B. Perley, G. Rei-Doval, K. Wheatley, & E. van Gelderen, 3–19. Amsterdam and Philadelphia: John Benjamins.
Skutnabb-Kangas, Tove. 2000. Linguistic Genocide in Education–Or Worldwide Diversity and Human Rights? Mahwah, NJ: Lawrence Erlbaum.
Smalley, William A. 1959. “How Shall I Write This Language?” The Bible Translator 10(2): 49–69. http://www.ubs-translations.org/tbt/1959/02/TBT195902.html?num=49.
Smalley, William A. 1964. “How Shall I Write This Language?” (reprint of Smalley 1959). In Orthography Studies: Articles on New Writing Systems. Helps for Translators 6, edited by William A. Smalley, 31–52. London: United Bible Societies.
Smolensky, Paul. 1996. “The Initial State and Richness of the Base in Optimality Theory.” Rutgers Optimality Archive 154. roa.rutgers.edu/files/154-1196/roa-154-smolensky-2.pdf.

Snider, Keith. 2014. “Orthography and Phonological Depth.” In Developing Orthographies for Unwritten Languages, edited by Michael Cahill and Keren Rice, 27–48. Dallas, TX: SIL International.
Tadadjeu, Maurice and Etienne Sadembouo. 1984. General Alphabet of Cameroon Languages. Bilingual Version. Propelca 1. Yaoundé: University of Yaoundé.
UNESCO Ad Hoc Expert Group on Endangered Languages (Matthias Brenzinger, Arienne M. Dwyer, Tjeerd de Graaf, Colette Grinevald, Michael Krauss, Osahito Miyaoka, Nicholas Ostler, Osamu Sakiyama, María E. Villalón, Akira Y. Yamamoto, Ofelia Zepeda). 2003. “Language Vitality and Endangerment.” Document submitted to the International Expert Meeting on UNESCO Programme Safeguarding of Endangered Languages, Paris, March 10–12. Online: http://www.unesco.org/culture/ich/doc/src/00120-EN.pdf.
Waters, Glenys. 1998. Local Literacies: Theory and Practice. Dallas, TX: SIL International.

Chapter 15

Language Archiving

Andrea L. Berez-Kroeker and Ryan E. Henke

1. Introduction

Awareness of the endangerment of languages goes hand in hand with the awareness of the vulnerability of language data, especially digital language data. In the time of fast-changing technologies for creating and sharing digital records of endangered languages, it can be easy to forget just how ephemeral data is without proper attention to how it will be stored, described, discovered, and migrated to new storage media in the long term. At the same time, the need to enable access to materials for the appropriate stakeholders for mobilization for language maintenance and research is a central tenet of language documentation, and a general concern for those interested in the advancement of endangered languages.

From these concerns, and from a tradition of archiving the physical records of ethnographic and anthropological research, the field of digital archiving of records of endangered languages has arisen. Language archiving, once an endeavor limited to scholars at the end of their careers (and to the archive users lucky enough to travel to view materials in person), is now an integral part of the widespread response to language endangerment and an expected part of the language documentation workflow.

In this chapter, we examine the evolving role of language archiving in endangered-language scholarship. In section 2 we explore the history of archiving for endangered languages, from the age of Boas and the archiving of analog materials through the rise of the endangered-language movement and the development of best practices for digital language archiving to the current era of established archiving standards. In section 3 we then discuss a potential future for language archiving, that of the participatory model of language archiving, which is radically user-centered and draws on trends in the archival sciences. In section 4 we present some of the extant archives for language documentation and the members of the Digital Endangered Languages and Music Archiving Network. Finally, because archiving is an activity that is now available to anyone undertaking endangered-language work, we close in section 5 by presenting the steps one would take to work with an archive to deposit one’s own materials.

2. History of archiving

In this section we consider some historical developments and trends in archiving related to endangered languages, primarily throughout the United States, Canada, and Australia. In general, this history can be divided into several overlapping periods, which are explored in turn below. This timeline, and its implications for the field of language documentation, are explored more fully in Henke and Berez-Kroeker (2016).

2.1. Archiving before the 1990s

The tradition of archiving in modern linguistic work goes back to at least the nineteenth century, when Franz Boas and his anthropologist compatriots began the task of documenting the indigenous languages and cultures of the Americas (Golla 1995). Boas and his Americanist successors like Edward Sapir deposited their materials with trusted institutions such as archives, universities, and museums (Johnson 2004). Journals and monographs served as de facto archives for linguistic records as well (Woodbury 2011); for example, in 1918, a year after it was founded, the International Journal of American Linguistics published a collection of Penobscot texts as a way to make those texts permanent and accessible.

Prior to the digital revolution of the late twentieth century, the linguistic material that found its way into archives consisted primarily of written materials, especially lexical lists, texts, and file slips (Golla 1995). As technology progressed and provided the capability to capture sound, audio materials like wax cylinders, vinyl records, and magnetic tapes also became part of the linguistic archival record (Johnson 2004).

The concept of an archive remained stable for many decades: a brick-and-mortar institution that holds and preserves physical items, where access is available only to a select few with the permissions and capabilities to travel to the archive. Furthermore, not all linguists were careful to deposit their materials with archives, which meant that most texts, recordings, and photographs remained in private offices or homes (Trilsbeek and Wittenburg 2006; Conathan 2011). For more than a century, most endangered-language collections were not necessarily created and curated to be open and accessible to the larger scientific community or to speakers of indigenous languages. This held true even after the advent of digital archives in the 1960s (Doorn and Tjalsma 2007), as many archives did not have enough support or resources to usher linguistic data smoothly into a digital era of proper curation and open access (Golla 1995; Johnson 2004; Trilsbeek and Wittenburg 2006).


2.2. Language documentation and the archive

The rise of modern language documentation in the late twentieth century changed the way that many linguists regarded the importance and necessity of archiving. Naturally, documentation efforts place an emphasis squarely on depositing documentary materials in a linguistic archive (Conathan 2011; Austin 2014). After all, only a dedicated archive can consistently provide the guidance, infrastructure, and support necessary to maintain a “lasting, multipurpose record of a language” (Himmelmann 2006, 1). This is especially true in the face of the progression of digital technology, which by the dawn of the twenty-first century had come to enable to an unprecedented extent the collection, storage, and distribution of vast amounts of linguistic information (Bird and Simons 2003a; Woodbury 2003; Johnson 2004).

It was not long before linguists engaged in language documentation, and very importantly the funders of such work (Woodbury 2011; Austin 2014), came to recognize the inextricability of archives and documentary linguistics. The archive had become an integral part of endangered-language work. As Austin and Grenoble (2007, 19) stated, “All documentation projects should be conceived with an eye toward the ultimate deposit of the recorded data and analysis in an archive.”

2.3. The “best” way to archive?

Beginning in the early 2000s, this new conceptualization of the place of the archive in documentary linguistics led to major developments in establishing best practices for digitally archiving endangered-language work. Altogether, efforts focused on finding the best ways to make data as long-lasting and usable as possible (Bird and Simons 2003a). This included devising approaches to metadata aimed at facilitating the discovery and usage of archived data, such as creating the Open Language Archives Community (OLAC) and the ISLE Meta Data Initiative (IMDI) (Bird and Simons 2003a; Conathan 2011). The literature also hosted discussions devoted to standards and workflows for collecting, managing, and storing data from documentary projects (e.g., Austin 2006; Thieberger 2010; Good 2011; Thieberger and Berez 2011). Furthermore, organizations and initiatives, such as the Electronic Metastructure for Endangered Language Data (E-MELD) project and the Digital Endangered Languages and Music Archives Network (DELAMAN), arose to help develop and propagate best practices (Boynton et al. 2006; Austin and Grenoble 2007). The first decade of the twenty-first century also saw the founding of digital language archives dedicated to best practices, such as the Archive of the Indigenous Languages of Latin America (AILLA), the Pacific and Regional Archive for Digital Sources in Endangered Cultures (PARADISEC), the Endangered Language Archive (ELAR), and Kaipuleohone (Albarillo and Thieberger 2009; Nathan 2010, 2014; Thieberger and Barwick 2010, 2012; Berez 2013; Thieberger, Harris, and Barwick 2015a).

These developments, of course, engendered critical reactions to the ideas and assumptions behind best-practice efforts. Many of these responses (e.g., Berez and Holton 2006; Bowden and Hajek 2006) revolved around an important question: Does a one-size-fits-all set of “best practices” make sense, given the wide spectrum of language documentation situations and the variegated challenges that researchers encounter in the field? Still others (e.g., Johnson 2004; Nathan and Austin 2004; Nathan 2009) argued that existing best practices were not deep and comprehensive enough. Perhaps the richest critical dialog regarding best practices centered on ethical issues involved with endangered-language data, such as informed consent, privacy, ownership, rights, and access (e.g., Dwyer 2006; Trilsbeek and Wittenburg 2006; Thieberger and Musgrave 2007; Macri and Sarmento 2010; O’Meara and Good 2010; Robinson 2010). See Austin (2014) for additional discussion regarding critical responses to best practices.

2.4. This decade and beyond

Recent years have seen further developments related to archiving endangered-language data. For instance, matters related to best practices are not set in stone; advocates (e.g., Thieberger 2012) continue to develop and proselytize methods for collecting, storing, and sharing data from documentary projects, and critical responses to such efforts continue as well (e.g., Austin 2013; Dobrin and Holton 2013; Bow, Christie, and Devlin 2015).

Furthermore, some archives are moving toward more participatory models, which break from the traditional “one-way” model where depositors put materials into an archive and then the archive serves as the gatekeeper for others seeking materials (Nathan 2014). This new scheme follows on the heels of, for example, the rise of more collaborative, community-centered approaches to language documentation in general (e.g., Yamada 2007; Czaykowska-Higgins 2009) as well as endeavors to increase the audiences and usage of existing language archives (e.g., Holton 2012; Trilsbeek and König 2014; Woodbury 2014). These participatory developments are explored in more depth in section 3.

Finally, efforts are under way to help documentary linguists receive proper credit for the work they put into creating, stewarding, and sharing their archived endangered-language data (e.g., Linguistic Society of America 2010a, 2010b; Gawne et al. 2017; Thieberger et al. 2015b). This will ensure that archived collections count toward job consideration and promotion, and it also aims to incentivize documentary linguists to adopt best practices to further the interests of both scientists and language communities.

3. Looking ahead: Participatory archives

For many decades, the model of archiving related to endangered-language work was essentially “one-way” (Nathan 2014); depositors placed materials into the hands of archivists, and anyone interested in the materials had to go through the archive to get them. There was little to no interaction between depositors and users. This was not a situation unique to linguistics, though, as the rest of the archiving world followed the same model for a long time (Theimer 2014). However, recent years have seen important developments in documentary linguistics toward participatory archiving, which reconfigures the relationships, interactions, and boundaries between depositors, archivists, and users (Henke and Berez-Kroeker 2016; Nathan and Austin 2014; Holton 2016).

3.1. What is a participatory archive?

Huvila (2008, 25) describes a participatory archive as having the following characteristics: 1) decentralized curation, where the archive harnesses the knowledge of users to contribute to the curatorial process; 2) radical user orientation, where the archive revolves first and foremost around the user’s needs and experience, rather than just preserving materials; and 3) contextualization of both records and the entire archival process, which means that the archive makes “an explicit attempt to capture a wider context of archival material beyond its provenance.” More simply, Theimer (2011, 2014) defines a participatory archive as:

An organization, site or collection in which people other than the archives professionals contribute knowledge or resources resulting in increased understanding about archival materials, usually in an online environment.

Although the tenets of a participatory archive bear a resemblance to Web 2.0 concepts like crowdsourcing or social engagement, the archive goes beyond simply outsourcing labor: it requires meaningful contributions from the public to the archival process and establishes methods for responding to users and communities (Simon 2011; Spindler 2014).

Proponents of participatory approaches (e.g., Shilton and Srinivasan 2007) claim they offer a variety of benefits for communities, including addressing “commonly expressed forms of dissatisfaction with cultural institutions” (Simon 2011, 21). This includes making the archive more relevant to the lives of its users, offering changing user experiences, presenting voices and stories from multiple perspectives, and providing a platform for discussion (Simon 2011). Furthermore, Theimer (2014) elaborates some specific ways that an archive may become more participatory. For example, she recommends facilitating engagement online through social media outlets, inviting the public to make contributions to the archive by transcribing documents, and drawing on the experience and knowledge of individuals by asking them to identify archived photographs. Altogether, participatory archives employing such an approach depart radically from the traditional model of archiving.


3.2. How did participatory archiving come to linguistics?

The development of participatory archives in language documentation stems from the confluence of various factors. Three of the most important are described below.

First, the field has seen a rise in the number of research projects grounded in collaboration with speakers, and documentary linguists increasingly proclaim the importance of such enterprises (e.g., Dwyer 2006; Yamada 2007; Dobrin and Holton 2013; Wilbur 2014). Of particular significance has been Cameron et al.’s (1992) advocacy for “empowering research,” which is undertaken on, for, and with language communities. The same can be said for Czaykowska-Higgins’ (2009, 24) community-based language research (CBLR) model, which “involves a collaborative relationship, a partnership, between researchers and (members of) the community within which the research takes place.” These developments have helped reorient the basic foundations of endangered-language work and allow the formerly passive subjects of research to become active drivers and participants.

Second, participatory models have gained steam due to the increasing self-empowerment of indigenous communities in language and cultural endeavors reliant upon archives, particularly language revitalization (Hinton 2001; Krebs 2012; Gehr 2013). This includes Native people not only using existing language archives for their own purposes (Holton 2012) but also establishing their own archives (ANA 2005; Berez, Finnesand, and Linnell 2012; Ormond-Parker and Sloggett 2012; Berez 2013). In short, the communities long regarded as objects of language research are instead taking the reins as creators and stewards of such efforts themselves.

Finally, the development of participatory language archives has been precipitated by the rise and propagation of participatory archives outside the discipline of linguistics. Huvila’s (2008) seminal paper officially established the concept of a participatory archive, which he built upon the participatory archiving model advocated by Shilton and Srinivasan (2007). These efforts find their origins at least as far back as participatory approaches in disciplines like information systems and education in the mid-1990s (Tammaro 2016), but they likely gained significant momentum from the emergence of efforts in the mid-2000s to incorporate crowdsourcing in the museum world (Simon 2011; Spindler 2014). Moreover, participatory archives have been spurred by the online revolution in general, where people have unprecedented, easy access to information as well as choices about where they get information pertaining to their interests (Theimer 2014). This means that archives now, for their very survival, have to compete with other sources of information and find ways to attract users and enrich their experiences (Theimer 2014). See Spindler (2014) for a helpful summary and evaluation of the origins, motivations, authority, quality, methods, and examples of participatory archives. By 2014 at the latest, fields like archival science and museum studies had a robust dialogue explicating and critiquing participatory models, and such models were gaining popularity with various archives around the world (Spindler 2014; Theimer 2014; Tammaro 2016).


3.3. Participatory archiving and endangered languages

Language archives began adopting participatory models by 2010, when ELAR approached incorporating social networking models to allow depositors and users to interact directly rather than having to use the archive as a mediator (Nathan 2010). This approach reconceptualized the archive as “a forum for conducting relationships between information providers (usually the depositors) and information users (language speakers, linguists and others)” (Nathan 2011, 271). Furthermore, linguists around this time had begun to ask themselves who uses language archives and why (e.g., Austin 2011; Holton 2012), and they were also searching for ways to expand the audiences and uses of existing archives (Schwiertz 2012; Trilsbeek and König 2014).

Not long after, the literature hosted the emergence of discussions advocating a participatory approach and explicating particular models for language archiving. The epicenter for these developments was a single volume of Language Documentation and Description (Nathan and Austin 2014) in which several pieces proposed ways to implement participatory methods. For example, Woodbury (2014) provides advice to increase the frequency and expand the purposes of usage for language archives. For archives, he suggests following an “art museum model” that includes the creation of collection guides and the involvement of guest curators and exhibits. Garrett (2014, 69) argues that archives have for too long focused only on developing relationships with depositors, and so he proposes a participant-driven language archiving (PDLA) model that establishes “direct, web-based, relationships between participants and archives, minimizing the use of depositors as proxies.” Linn (2014) outlines a specific proposal for a Community-Based Language Archive (CBLA), which involves the language community in shaping every aspect of the archiving process, and she describes her own experiences implementing such a model at the Sam Noble Oklahoma Museum of Natural History. As a final example, Gardiner and Thorpe (2014) describe the development of the Aboriginal and Torres Strait Islander Data Archive, which operates with participatory principles focused on collaboration and relationship-building with language communities.

Of course, as took place in the archival sciences, a critical dialogue is emerging around participatory archiving in endangered-language work. For example, Stenzel (2014) presents a variety of “pitfalls” associated with her attempts to incorporate a participatory approach to her documentary work in the Amazon.

For now, discussions around participatory archives in language documentation are still new. Even in the archival sciences such models are still new, and researchers are still fleshing out particular models, navigating challenges, and undertaking critical evaluations (Tammaro 2016). No doubt, participatory models are gaining momentum for endangered-language work (Holton 2016), and we can look to such discussions for guidance, engagement, and comparison. Many more language archives will be going in this direction, and we will also see important critical debate about specific participatory models, their benefits, and their shortcomings.


4. DELAMAN archives for language documentation

There are multiple networks and organizations of endangered-language archives. These include DELAMAN,1 which is comprised of representatives from the mainly academic or scientific institute-based endangered-language archives; OLAC, which is a standards-setting community for issues concerning metadata for endangered-language data (Bird and Simons 2003b; Simons and Bird 2003; Good 2010); and other national library-based organizations like the Association for Tribal Archives, Libraries, and Museums2 and the American Indian Library Association.3 Below we discuss DELAMAN and the DELAMAN archives.

DELAMAN is an international consortium of digital archives of endangered language and music that provides a forum for discussion among archive directors about standards and developments in the field of language archiving. When it was formed in 2003, most of the founding member archives were directed not by archivists trained in archival and library sciences but rather by fieldwork-based linguists and ethnomusicologists, who were also in charge of administering digital archival collections of endangered-language and music data. As discussed above, in the late 1990s and early 2000s, a digital model for language archives began to augment and replace analog archiving done primarily by libraries and museums. DELAMAN became “a forum for presenting problems and sharing solutions to a gamut of practical matters in archiving. Among these were questions of how to develop standards for collecting catalog metadata, how to best provide access to digital materials, and how to make endangered-language archiving relevant for mainstream linguistics” (Berez-Kroeker 2015).

Today, representatives from the DELAMAN member archives meet at least annually, and advise bodies like the National Science Foundation Documenting Endangered Languages program on best practices for archiving for language documentation. Archives interested in joining DELAMAN must meet minimum standards for scope and infrastructure. The DELAMAN constitution defines an eligible digital language or music archive as the following:

A DELAMAN archive is a trusted digital repository created and maintained by an institution with a demonstrated commitment to permanence and the long-term preservation of archived resources with suitable rights management practices to allow access to as much of its collection as possible. It focuses on languages and musics for which there is little recorded material, particularly from endangered cultures. It holds primary data and may also hold other kinds of information (like secondary analyses, transcripts and so on). Digital ethnographic archives typically focus on material like oral tradition or musical performance and house collections arising from fieldwork, usually linguistic or musicological fieldwork. Ideally the catalog of the collection is available via the internet, and is harvested by services like the Open Language Archives Community or the Open Archives Initiative. (DELAMAN 2016a)

1 http://www.delaman.org/.
2 http://www.atalm.org/.
3 http://ailanet.org/about/.

DELAMAN archives are digital collections of endangered-language and music materials, including multimedia materials, annotations including transcripts and translations, databases, teaching materials, and the like. DELAMAN archives have a proven commitment to long-term data preservation and usually participate in a well-documented international standard for collecting and sharing metadata (like OLAC or the Open Archives Initiative).

The DELAMAN board vets applicant archives for their policies regarding collection, preservation, and access and will grant full or associate membership to successful applicants. Full membership is available to organizations that are actively involved in the long-term curation and management of digital endangered-language materials. Full members will have fully developed policies for collection, preservation, and access and a demonstrated commitment to following international standards for metadata collection and sharing. Associate member archives, on the other hand, are those archives that are either newly established or have less stable technological infrastructure, e.g., archives located in remote areas (DELAMAN 2016c).

At the time of writing, DELAMAN has fourteen member archives. These are discussed in turn below.

4.1. Archive of the Indigenous Languages of Latin America

AILLA,4 at the University of Texas at Austin, was founded in 2001 and is home to digital materials including audio and video recordings, texts, and images related to nearly 300 languages in twenty-two Latin American and Caribbean countries (Kung and Sherzer 2013). The mission of AILLA is to preserve and make available, via the internet, “irreplaceable linguistic and cultural resources in and about the indigenous languages of Latin America, most of which are endangered” (Kung and Sherzer 2013, 380). Notable collections include Nahuatl materials collected by linguist Jonathan Amith, Jonathan Hill’s Curripaco collections, and the Terrence Kaufman collection of more than twenty indigenous languages of Latin America.

4.2. Alaska Native Language Archive

ANLA,5 at the University of Alaska Fairbanks, is home to more than 15,000 physical and digital items in or on Alaska’s twenty Native languages. The materials were originally collected under the aegis of the Alaska Native Language Center, which was created by the Alaska state legislature in 1972 (Krauss 1974). In 2009 ANLA became an independent entity in order to better preserve and consolidate these materials and others that had been created “by a wide range of individuals and institutions [and were] scattered in archives, libraries, and attics across the globe” (ANLA 2014).

4 http://www.ailla.utexas.org/site/welcome.html. 5 https://www.uaf.edu/anla/.

4.3. California Language Archive

CLA6 is a collaboration between the Berkeley Language Center and the Survey of California and Other Indian Languages at the University of California, Berkeley. The Berkeley Language Center is an archive of audio recordings collected since the 1950s on heritage and endangered languages from around the Americas; the Survey is an archive of paper materials (more than 2,450 distinct items of indigenous languages of California and beyond) collected since 1950 that are in the process of being digitized (Campbell et al. 2011). The CLA catalog is searchable online; some materials can also be accessed online, though others are available only in person.

4.4. Endangered Languages Archive

ELAR7 is housed in the School of Oriental and African Studies at the University of London and serves as the archive for materials collected under the Endangered Languages Documentation Programme (ELDP). ELAR is notable for its protocols for access management: depositors with ELAR designate materials as being accessible by individuals of different profiles; these can be (i) ordinary users, (ii) researchers, (iii) community members, or (iv) subscribers (to a particular collection). Potential users must then register with ELAR and can access materials based on their user profile (ELAR 2016).
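The profile-based protocol described above can be pictured with a short sketch. This is not ELAR's actual system: the class names, fields, and the `can_access` function are all hypothetical, and the only facts taken from the text are the four user profiles, the registration requirement, and the idea that subscription is tied to a particular collection.

```python
from dataclasses import dataclass
from enum import Enum


class Profile(Enum):
    # The four profiles named in the text.
    ORDINARY_USER = "ordinary user"
    RESEARCHER = "researcher"
    COMMUNITY_MEMBER = "community member"
    SUBSCRIBER = "subscriber"


@dataclass(frozen=True)
class User:
    # Hypothetical user record: registration status, profiles held,
    # and the collections the user subscribes to.
    registered: bool
    profiles: frozenset
    subscriptions: frozenset = frozenset()


@dataclass(frozen=True)
class Item:
    # Hypothetical archived item: its collection and the profiles
    # the depositor has designated as allowed.
    collection: str
    allowed: frozenset


def can_access(user, item):
    """Return True if a registered user holds a profile the depositor allowed."""
    if not user.registered:
        return False
    for profile in user.profiles & item.allowed:
        if profile is not Profile.SUBSCRIBER:
            return True
        # A subscriber profile only counts for the subscribed collection.
        if item.collection in user.subscriptions:
            return True
    return False


story = Item("collection-1", frozenset({Profile.RESEARCHER, Profile.SUBSCRIBER}))
researcher = User(True, frozenset({Profile.RESEARCHER}))
print(can_access(researcher, story))
```

The point of the design is that the depositor's decision lives with the item, while the archive only has to match it against the profile of a registered user.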

4.5. Kaipuleohone

Kaipuleohone8 is the digital archive for endangered-language materials collected by affiliates of the University of Hawai‘i, although Kaipuleohone will preserve materials about the languages of Asia and the Pacific region collected by others as well. The collection is part of ScholarSpace, the University of Hawai‘i institutional repository, and is administered by the Department of Linguistics (Berez 2013).

6 http://cla.berkeley.edu/. 7 http://www.elar-archive.org/. 8 http://ling.hawaii.edu/kaipuleohone-language-archive/.


4.6. Native American Languages Collection at the Sam Noble Museum of Natural History

NALC9 is home to materials on 175 languages of Native America, especially the languages of Oklahoma. It contains significant materials on Osage and Navajo. NALC also provides active outreach to endangered-language communities through participation in the Breath of Life and the annual Oklahoma Native Youth Language Fair.

4.7. Pacific and Regional Archive for Digital Sources in Endangered Cultures

PARADISEC10 is an Australia-based digital archiving consortium that includes the University of Melbourne, the University of Sydney, and the Australian National University. Founded in 2003, PARADISEC curates digital materials from endangered languages worldwide. At the time of writing, PARADISEC is home to materials from more than 1,000 languages. Furthermore, PARADISEC has in recent years become an international center for training activities, software development, and standards development (Thieberger, Harris, and Barwick 2015a).

4.8. The Rosetta Project

The Rosetta Project11 is the library of human languages of The Long Now Foundation.12 It holds documents and recordings of over 2,500 languages and is housed in a special collection in The Internet Archive.13 The aim of the Rosetta Project is to explore “very long-term archiving. It serves as a means to focus attention on the problem of digital obsolescence, and ways we might address that problem through creative archival storage methods” (Rosetta Project 2016).

4.9. The Repository and Workspace for Austroasiatic Intangible Heritage

RWAAI14 is housed at Lund University and is dedicated to digitizing, preserving, and providing access to materials related to languages of the Austroasiatic language family of India and mainland Southeast Asia. RWAAI has been in operation since 2012.

9 http://samnoblemuseum.ou.edu/collections-and-research/native-american-languages/. 10 http://www.paradisec.org.au/. 11 http://rosettaproject.org/. 12 http://longnow.org/. 13 https://archive.org/details/rosettaproject. 14 http://projekt.ht.lu.se/rwaai.


4.10. The Language Archive

TLA15 is a unit of the Max Planck Institute for Psycholinguistics and is home to one of the world’s largest collections of endangered-language materials: the OLAC website lists over 100,000 items in TLA at the time of writing (OLAC 2016). The best-known portion of the archive is the materials collected as part of the “Documentation of Endangered Languages (Dokumentation bedrohter Sprachen, or DOBES)” project of the VolkswagenStiftung.

4.11. Center for Native American and Indigenous Research of the American Philosophical Society

The APS16 has long been a major collector of information on indigenous languages of the United States and beyond, including especially the linguistic papers of Franz Boas and his contemporaries and students. The APS has also been funding linguistic fieldwork through its Phillips Fund for Native American Research since 1945. In 2014 CNAIR was founded to reconnect “these legacy archival materials in digital form to the indigenous communities from which they came, and to forming ongoing partnerships with these Nations, often in collaboration with linguists and other scholars actively working with these communities” (DELAMAN 2016b).

4.12. Digital Himalaya

Digital Himalaya17 is a digital archive for linguistic and ethnographic materials related to the Himalayan region. It is currently housed at the University of British Columbia.

15 https://tla.mpi.nl/. 16 https://amphilsoc.org/cnair. 17 http://www.digitalhimalaya.com/.

5. Preparing to work with an archive

In earlier times, becoming a depositor of endangered-language materials to an archive was not a commonplace activity. The permanent archiving of one’s collected materials was generally only available to university scholars and usually only took place toward the end of one’s scholarly career. However, this is no longer the case. The availability of inexpensive technologies for high-quality recording and storage of language materials, as well as growing awareness that digital data is far more ephemeral than physical data like notebooks and wax cylinders, means that endangered-language specialists today are expected to regularly deposit the materials they collect in a trusted digital language repository.

In some cases, archiving may be a professional requirement. Funding agencies like the Documenting Endangered Languages program of the US National Science Foundation and the US National Endowment for the Humanities; the Endangered Languages Documentation Programme; and the Phillips Fund for Native American Research require grantees to archive their collected materials as a condition of acceptance of the award. Some PhD programs in linguistics (e.g., the University of Hawai‘i at Mānoa; see Berez 2015) require their graduate students to archive the data upon which their dissertations are based. Even if archiving of one’s collected endangered-language materials is not externally required, most endangered-language specialists consider archiving an appropriate and ethical activity in order to ensure that precious language materials are preserved for the stakeholder audience in the long term.

Although becoming a depositor to a language archive requires some forethought, it is not a prohibitively difficult or expensive undertaking, provided one takes care to prepare early for archiving as an integral part of an overall plan for data management. Thieberger and Berez (2012, 100) advocate archiving as a regular step in the language documentation process:

Many people think of archiving as the last step in a research program, to be done years after the fieldwork is complete, after everything has been transcribed and mined in service of analysis—essentially, something to be done after we are “finished” with the data. Here, we advocate the exact opposite: data should be archived immediately and often [ . . . ] It is quite possible—and becoming increasingly common—to archive recordings periodically from the field as they are collected. Then, whenever you finish a transcription, archive it. When you write up an analysis, archive it.

This section summarizes the steps required to prepare to deposit endangered-language materials in an archive: choosing an archive, formatting materials properly, collecting metadata, and considering access rights and conditions.

5.1. Choosing an archive

Building a relationship with a proper digital language archive (that is, a dedicated repository with a trained staff and an institutional commitment to preserve your data) is key. Websites, hard drives, and computers are not true archives. You should look for an archive that uses appropriate descriptive metadata, assigns persistent identifiers to deposited items, and has a data backup and migration plan in place.

You may have an archive already available to you. Does the language community have a sustainable archive, for example, in a cultural center or library? Does the local university have a language archive (for example, Kaipuleohone at the University of Hawai‘i)? If you are affiliated with a university, does your library have an institutional repository that is willing to accept language data? Is there already an established archive for materials from the region of the world your language is from? Does the funding agency have an archive? If not, the list of DELAMAN archives above contains several options for archives that accept language materials with no geographical or affiliation restrictions (e.g., PARADISEC). Other options for public data repositories also exist, including Dataverse18 or the linguistics-specific Tromsø Repository of Language and Linguistics,19 though these may be less suitable for endangered-language data because of their completely open-access policies (see section 5.4 below).

Once you have identified an archive you would like to work with, make contact with the archivist as soon as possible so that she can guide you on the archive’s policies for deposit. If your funding agency requires a letter of support from an archivist, be sure to ask for this early so that the archivist has time to understand your needs for archival storage and access.20

18 http://dataverse.org/. 19 https://opendata.uit.no/. 20 See http://www.linguisticsociety.org/comment/1225#comment-1225 for more on how to request a letter of support from an archivist.

5.2. Versions of materials

While a discussion of all the developments in technologies for digital file formats and standards for endangered-language documentation over the past fifteen years is beyond the scope of this chapter, a few key points about file formats and versions with regard to archiving are worth keeping in mind.

In general, endangered-language specialists who are working to archive, preserve, and share language materials will implement three versions of those materials: the archival version, the working version, and the presentation version. That is, for each particular item in one’s corpus of documentary materials (a recording, for instance, or a transcript, or a scan of a field notebook), the linguist should plan to create each of the three versions. Each has a different use but is part of a whole plan for preserving, improving, and sharing archival language materials.

The archival version of an item should be thought of as the “master version.” It is intended to be the highest-resolution, most complete version of an item; it is intended for the longest-term preservation and is the version that future generations can refer to if the related working and presentation versions are ever lost or deemed to be incorrect in some way. Archival versions of materials should be deposited into an archive as soon as possible after they are collected, and they should be complete, lossless, and unedited to the extent possible.

By complete, we mean the version should be preserved in its entirety without subdividing files (i.e., cutting into clips in the case of audio or video recordings), digitally combining files (i.e., splicing separate recordings together), or shortening files (i.e., removing sections of silence or disfluency). In other words, the archival version should be as close as possible to what comes directly out of the recording device. There are, however, some exceptions. First, in some cases ethical or privacy concerns will prevent the complete version from being the archival version. In such cases, when the wishes of the speaker or potential harm to others conflict with the general principle of archiving a complete version of a file, the removal or digital obfuscation of such sections is considered to be good practice. Second, some language workers will add post hoc verbal metadata, such as the date of the recording, the location, the language name, the names of the speakers, and the topic, by splicing a recording of themselves to the beginning of a documentary recording (see the section on metadata below). Adding verbal metadata where none exists is also considered good practice for the archival version of a file.

By lossless we mean that digital file formats should be used that do not lose information in the file compression process. Lossless formats for audio include WAV and FLAC; lossless formats for images include PNG, GIF, and some TIFF formats; lossless video formats include H.264-lossless.

By unedited we mean that the archival version of a file should not be altered in ways that might later be used to improve presentation quality but that could obscure or otherwise affect some quality of the language. For example, background noises should not be scrubbed, background music should not be added, and photographs should not be retouched.

Once the archival version has been deposited into the archive, the linguist will then turn her attention to the working version of the item. Working versions are just that: the version that the linguist works on in the course of normal annotation and analysis and most likely alters in the process.
For example, a linguist may choose to convert a high-resolution video to something a little smaller so that she can work with speakers to transcribe and translate it. A lexical database, while it is being developed, is considered to be a working version, as are transcripts and translations that are in process. Working versions are not necessarily archived, although they certainly can be, provided they are marked as such. It would not be uncommon, for example, for a linguist to archive a date-stamped version of her lexical database once a month.

Finally, presentation versions are those that are intended for consumption by an audience of stakeholders. In some cases the presentation version may have considerable value added over the archival version: for example, a highly polished transcription of a story presented in PDF form, or a CD of endangered songs with commentary. But a presentation version could also be simply a down-sampling of the archival version that is added to an archive for ease of access in low-bandwidth areas.

To exemplify archival, working, and presentation versions, we can imagine that a particular item in an archived collection, let’s say a recording of an oration in an endangered language, may have the following files included in the archive:

1. A high-resolution, lossless audio recording (e.g., WAV) of the oration event in its entirety (archival version)
2. An MP3 of the archival recording that is easier for community members in low-bandwidth areas to download (presentation version)
3. Several date-stamped versions of the transcription file in the ELAN XML format (working versions)
4. A PDF of the final, highly polished transcript and translation of the oration (presentation version)

Of course, other configurations are also possible. The key is that attention is paid such that all three versions are included, and that each version fulfills its particular role in the entire collection.
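The oration example can be turned into a small self-check that a depositor might run before upload. The sketch below is only an illustration: the file names are hypothetical, the role labels are assigned by hand, and treating `.mp3` as the marker of a lossy file is an assumption drawn from the example in the text rather than a rule any archive publishes.

```python
from pathlib import Path

# The three roles named in the text; every item should cover all of them.
ROLES = {"archival", "working", "presentation"}

# Assumption: MP3 is the only lossy audio extension we screen for here,
# because it is the lossy format used in the text's example.
LOSSY_AUDIO = {".mp3"}


def missing_roles(files):
    """Return the version roles that no file in the item fulfills."""
    return ROLES - {role for _, role in files}


def lossy_archival_audio(files):
    """Return archival files whose extension marks them as lossy audio."""
    return [
        name
        for name, role in files
        if role == "archival" and Path(name).suffix.lower() in LOSSY_AUDIO
    ]


# Hypothetical file names for the oration item described in the text.
oration = [
    ("oration.wav", "archival"),
    ("oration.mp3", "presentation"),
    ("oration_2016-07-01.eaf", "working"),
    ("oration_2016-08-01.eaf", "working"),
    ("oration_transcript.pdf", "presentation"),
]

print(missing_roles(oration))
print(lossy_archival_audio(oration))
```

A check like this cannot judge whether a file is truly complete or unedited; it only catches the two mechanical slips that are easy to detect, a missing role and a lossy file labeled as the archival version.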

5.3. Descriptive metadata

A key component of an archived collection is descriptive metadata, which provides important information about the collection and the items in it. Metadata is how potential users of archival materials not only discover the existence of items in an archive but also determine whether or not the item is relevant and of enough interest to access it. Archives will use metadata to create catalogs of their holdings, as well as finding aids and subject guides, all of which may be made available for remote searching through the archive’s website or by publishing to an aggregated search engine like the OLAC search engine.21

International standards for metadata categories exist and are used by many digital language archives. The two most common standards are those defined by OLAC, which are based on the Dublin Core22 metadata terms (OLAC 2008), and IMDI, the ISLE Metadata Initiative standard for describing multimedia resources and collections (IMDI 2003, 2009). The archive director will instruct you on how to best prepare your metadata for deposit, including specific categories and controlled vocabulary; but even if you are not yet working with an archive, you should regularly record important metadata about your collection and your individual items, either electronically or in a notebook. Conathan (2011) provides helpful minimal lists of metadata categories to guide language workers in collecting descriptive metadata.
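As an illustration of what an electronic record might look like, the sketch below writes a handful of Dublin Core elements to XML using Python's standard library. It is a simplified example and not a valid OLAC or IMDI record: the wrapper element name `record` is an assumption, every value is an invented placeholder, and a real deposit should follow the categories and controlled vocabulary the archive specifies.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_record(fields):
    """Build a flat XML record from (Dublin Core element name, value) pairs."""
    root = ET.Element("record")  # wrapper name is an assumption for this sketch
    for name, value in fields:
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return root


# Invented placeholder values, not a real deposit.
record = build_record([
    ("title", "Oration recorded at a community gathering"),
    ("creator", "Example Linguist"),
    ("contributor", "Example Speaker"),
    ("date", "2016-07-01"),
    ("language", "und"),  # "und" is the ISO 639 code for an undetermined language
    ("format", "audio/x-wav"),
    ("rights", "Access conditions agreed with the speaker"),
])

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Keeping records like this from the first day of fieldwork, even in a simpler form such as a spreadsheet, makes the later conversion to the archive's own schema largely mechanical.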

5.4. Access

Depositors also need to consider how and by whom the materials in the archive can be accessed and used. This is an important part of the deposit process, and depositors should take care to consider the question from the earliest stages of work with speakers. Discussions about archiving and sharing may be brought up as part of the informed consent procedure (see Robinson 2010), and are often complex, involving conflicting desires of various stakeholders in the collection.

21 http://search.language-archives.org/index.html. 22 http://dublincore.org/.

On the one hand, because of the urgency of language endangerment and the desire of many stakeholders to encourage language education and maintenance, open access to archived materials for non-commercial uses is desirable. On the other hand, depositors may feel some sense of ownership over collections and the knowledge contained therein, and may wish to stop others from accessing their materials to protect them from potential misuse. However, “misuse” is a broad category, and concerns should be examined more closely when making decisions to keep collections restricted.

First, there may be legitimate concerns for either the protection of traditional knowledge, or for the non-promulgation of libelous statements. Some cultures may have specific ways of sharing knowledge, including restrictions on which group insiders (and group outsiders) have access to information. And some individual speakers may make statements during a recording session that they (or others) may later wish they had not made. Such cases need to be handled carefully and in communication with the speakers involved, and omitting such items from an archival collection, or carefully limiting access to them, may be warranted. However, as Conathan (2011) points out, “material that is truly culturally private comprises only a very small part of the language documentation material. Linguists should consider at the outset whether the documentation of such domains is within the scope of their research project” (2011, 252).

Second, some depositors with a stake in academic publication about the materials may wish to keep deposited collections closed to avoid getting scooped, or to avoid others “drawing the wrong conclusions” from the data.
In some cases, a time-limited embargo may be appropriate: collections may be kept restricted to other researchers for a very limited and clearly defined period (for example, five years after graduation in the case of doctoral students’ dissertation data), with copies being made available to the language community immediately. It is generally understood among the DELAMAN archiving community that “gatekeeping” for strictly academic reasons is contrary to the values of endangered-language maintenance and scientific principles of reproducibility, and should be discouraged.

A final point to consider is that access is a different issue from copyright, and depositors should take care to understand how copyright works in their countries and balance it with the additional ethical obligation of determining appropriate conditions for access and use of archived materials. Newman (2012), the clearest publication on copyright issues for a linguistics audience, points out that copyright laws are generally concerned with economic rights. But since most language documentation materials are not profitable, the “Moral Rights” indicated in the Berne Convention are the aspects of copyright that are of most concern to endangered-language workers. Newman writes: “[n]atural languages are not copyrightable. As far as intellectual property is concerned, languages are not owned by the communities that speak them, and thus native speakers have no legal basis for restricting access by others to materials written in or about their languages” (Newman 2012, 432), but acknowledges that “[o]f course traditional peoples do have a valid interest in protecting their ‘intangible cultural heritage’ from exploitation by the rich and the powerful” (Newman 2012, 432 n. 2). Again, ongoing dialog between depositors, archive directors, endangered-language speakers, and endangered-language communities is the best way to come to resolution about issues of access to archived materials.
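The time-limited embargo mentioned above (for example, five years after graduation, with immediate access for the language community) amounts to a simple date rule. The sketch below shows one way to state it; the function names and the audience labels "community" and "researcher" are hypothetical, and actual archives define their own access categories and procedures.

```python
from datetime import date


def embargo_end(start, years):
    """Return the date on which an embargo of the given length ends."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # A February 29 start date has no counterpart in a non-leap year.
        return start.replace(year=start.year + years, day=28)


def may_access(audience, today, end):
    """Community members have access at once; others wait until the embargo ends."""
    if audience == "community":
        return True
    return today >= end


# Hypothetical graduation date and the five-year period used in the text.
end = embargo_end(date(2016, 5, 14), 5)
print(end)
print(may_access("community", date(2017, 1, 1), end))
print(may_access("researcher", date(2017, 1, 1), end))
```

Writing the end date down at deposit time, rather than leaving the restriction open-ended, is what keeps an embargo from turning into the kind of gatekeeping the text warns against.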

6. Conclusion

In this chapter we have seen how digital language archives have developed over the decades, from analog, paper-and-tape collections from ethnology and anthropology to digital collections geared specifically toward the needs and concerns of depositors and users interested in endangered languages. We have also presented a description of currently existing archives for language documentation and discussed the steps one would need to take to begin working with an archive.

As archival science and endangered-language scholarship continue to converge, archives can become the locus not only of the preservation of language-related resources but also of the sharing and mobilization of those resources for the purposes of education, cultural maintenance, and language revitalization.

References

Administration for Native Americans (ANA). 2005. “Native Language Preservation: A Reference Guide for Establishing Archives and Repositories.” www.aihec.org/our-stories/docs/NativeLanguagePreservationReferenceGuide.pdf. Accessed January 7.
Alaska Native Language Archive (ANLA). 2014. “About the Archive.” https://www.uaf.edu/anla/about/. Accessed September 30, 2016.
Albarillo, Emily E. and Nick Thieberger. 2009. “Kaipuleohone, the University of Hawai‘i’s Digital Ethnographic Archive.” Language Documentation & Conservation 3: 1–14.
Austin, Peter K. 2006. “Data and Language Documentation.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus P. Himmelmann and Ulrike Mosel, 87–112. Berlin: Mouton de Gruyter.
Austin, Peter K. 2011. “Who Uses Digital Language Archives?” Endangered Languages and Cultures Blog, April 29. www.paradisec.org.au/blog/2011/04/who-uses-digital-language-archives.
Austin, Peter K. 2013. “Language Documentation and Meta-documentation.” In Keeping Languages Alive: Documentation, Pedagogy and Revitalization, edited by Mari C. Jones and Sarah Ogilvie, 3–15. Cambridge: Cambridge University Press.
Austin, Peter K. 2014. “Language Documentation in the 21st Century.” JournaLIPP 3: 57–71.
Austin, Peter K. and Lenore Grenoble. 2007. “Current Trends in Language Documentation.” In Language Documentation and Description, vol. 4, edited by Peter K. Austin, 12–25. London: School of Oriental and African Studies.
Berez, Andrea L. 2013. “The Digital Archiving of Endangered Language Oral Traditions: Kaipuleohone at the University of Hawai‘i and C’ek’aedi Hwnax in Alaska.” Oral Tradition 28: 261–270.
Berez, Andrea L. 2015. “Reproducible Research in Descriptive Linguistics: Integrating Archiving and Citation into the Postgraduate Curriculum at the University of Hawai‘i at Mānoa.” In Research, Records and Responsibility, edited by Amanda Harris, Nick Thieberger, and Linda Barwick, 39–51. Sydney: University of Sydney Press.
Berez-Kroeker, Andrea L. 2015. “About DELAMAN, the Digital Endangered Languages and Music Archives Network.” CELP Discussion Blog, June 11. http://www.linguisticsociety.org/celp_blog.
Berez, Andrea L., Taña Finnesand, and Karen Linnell. 2012. “C’ek’aedi Hwnax, the Ahtna Regional Linguistic and Ethnographic Archive.” Language Documentation & Conservation 6: 237–252.
Berez, Andrea and Gary Holton. 2006. “Finding the Locus of Best Practice: Technology Training in an Alaskan Language Community.” In Sustainable Data from Digital Fieldwork, edited by Linda Barwick and Nicholas Thieberger, 69–86. Sydney: University of Sydney Press.
Bird, Steven and Gary Simons. 2003a. “Seven Dimensions of Portability for Language Documentation and Description.” Language 79: 557–582.
Bird, Steven and Gary Simons. 2003b. “Extending Dublin Core Metadata to Support the Description and Discovery of Language Resources.” Computers and the Humanities 37: 375–388.
Bow, Catherine, Michael Christie, and Brian Devlin. 2015. “Shoehorning Complex Metadata in the Living Archive of Aboriginal Languages.” In Research, Records and Responsibility: Ten Years of PARADISEC, edited by Amanda Harris, Nick Thieberger, and Linda Barwick, 115–131. Sydney: Sydney University Press.
Bowden, John and John Hajek. 2006. “When Best Practice Isn’t Necessarily the Best Thing to Do: Dealing with Capacity Limits in a Developing Country.” In Sustainable Data from Digital Fieldwork, edited by Linda Barwick and Nicholas Thieberger, 45–56. Sydney: University of Sydney Press.
Boynton, Jessica, Steven Moran, Anthony Aristar, and Helen Aristar-Dry. 2006. “E-MELD and the School of Best Practices: An Ongoing Community Effort.” In Sustainable Data from Digital Fieldwork, edited by Linda Barwick and Nicholas Thieberger, 87–98. Sydney: University of Sydney Press.
Cameron, Deborah, Elizabeth Frazer, Penelope Harvey, M. B. H. Rampton, and Kay Richardson, eds. 1992. Researching Language: Issues of Power and Method. London: Routledge.
Campbell, Amy, Andrew Garrett, Hannah Haynie, Justin Spence, Ronald Sprouse, and John Sylak. 2011. “Geographical Metadata in the California Language Archive.” Poster presented at the Linguistic Society of America Annual Meeting, Pittsburgh, Pennsylvania, January 6–9.
Conathan, Lisa. 2011. “Archiving and Language Documentation.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 235–254. Cambridge: Cambridge University Press.
Czaykowska-Higgins, Ewa. 2009. “Research Models, Community Engagement, and Linguistic Fieldwork: Reflections on Working Within Canadian Indigenous Communities.” Language Documentation & Conservation 3: 15–50.
Digital Endangered Languages and Music Archiving Network (DELAMAN). 2016a. “Constitution.” http://www.delaman.org/constitution/. Accessed June 8, 2016.
Digital Endangered Languages and Music Archiving Network (DELAMAN). 2016b. “The Library of the American Philosophical Society (APS).” http://www.delaman.org/members/library-american-philosophical-society/. Accessed July 18, 2016.
Digital Endangered Languages and Music Archiving Network (DELAMAN). 2016c. “Joining DELAMAN.” http://www.delaman.org/members/joining/. Accessed July 21, 2016.

Dobrin, Lise M. and Gary Holton. 2013. “The Documentation Lives a Life of Its Own: The Temporal Transformation of Two Endangered Language Archive Projects.” Museum Anthropology Review 7: 140–154.
Doorn, Peter and Heiko Tjalsma. 2007. “Introduction: Archiving Research Data.” Archival Science 7: 1–20.
Dwyer, Arienne M. 2006. “Ethics and Practicalities of Cooperative Fieldwork and Analysis.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus P. Himmelmann, and Ulrike Mosel, 31–66. Berlin: Mouton de Gruyter.
Endangered Languages Archive (ELAR). 2016. “ELAR’s Protocols for Access Management.” http://www.elar-archive.org/using-elar/access-protocol.php. Accessed July 18, 2016.
Gardiner, Gabrielle and Kirsten Thorpe. 2014. “The Aboriginal and Torres Strait Islander Data Archive: Connecting Communities and Research Data.” In Language Documentation and Description, vol. 12 (Special Issue on Language Documentation and Archiving), edited by David Nathan and Peter K. Austin, 103–119. London: School of Oriental and African Studies.
Garrett, Edward. 2014. “Participant-Driven Language Archiving.” In Language Documentation and Description, vol. 12 (Special Issue on Language Documentation and Archiving), edited by David Nathan and Peter K. Austin, 68–84. London: School of Oriental and African Studies.
Gawne, Lauren, Andrea L. Berez, Barbara Kelly and Tyler Heston. 2017. “Putting Practice into Words: Fieldwork Methodology in Grammatical Descriptions.” Language Documentation & Conservation 11: 157–189.
Gehr, Susan. 2013. “Breath of Life: Revitalizing California’s Native Languages Through Archives.” MA thesis, San Jose State University, California.
Golla, Victor. 1995. “The Records of American Indian Linguistics.” In Preserving the Anthropological Record, edited by Sydel Silverman and Nancy J. Parezo, 143–157. New York: Wenner-Gren Foundation for Anthropological Research.
Good, Jeff. 2010. “Valuing Technology: Finding the Linguist’s Place in a New Technological Universe.” In Language Documentation: Practice and Values, edited by Lenore A. Grenoble and N. Louanna Furbee, 111–131. Amsterdam and Philadelphia: John Benjamins.
Good, Jeff. 2011. “Data and Language Documentation.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 212–234. Cambridge: Cambridge University Press.
Henke, Ryan E. and Andrea L. Berez-Kroeker. 2016. “A Brief History of Archiving in Language Documentation, with an Annotated Bibliography.” Language Documentation & Conservation 10: 411–457.
Hinton, Leanne. 2001. “Language Revitalization: An Overview.” In The Green Book of Language Revitalization in Practice, edited by Leanne Hinton and Kenneth Hale, 3–18. San Diego: Academic Press.
Holton, Gary. 2012. “Language Archives: They’re Not Just for Linguists Anymore.” In Potentials of Language Documentation: Methods, Analyses, and Utilization, edited by Frank Seifart, Geoffrey Haig, Nikolaus P. Himmelmann, Dagmar Jung, Anna Margetts, and Paul Trilsbeek, 111–117. Honolulu: University of Hawai‘i Press.
Holton, Gary. 2016. “Language Archiving: Where We’ve Been and Where We’re Going.” Paper presented at the Workshop on User-Centered Design of Language Archives, Denton, Texas, February 20–21.

Huvila, Isto. 2008. “Participatory Archive: Towards Decentralised Curation, Radical User Orientation, and Broader Contextualisation of Records Management.” Archival Science 8: 15–36.
ISLE Metadata Initiative (IMDI). 2003. “Part 1: Metadata Elements for Session Descriptions.” Last modified October. https://tla.mpi.nl/wp-content/uploads/2012/06/IMDI_MetaData_3.0.4.pdf.
ISLE Metadata Initiative (IMDI). 2009. “Part 1B: Metadata Elements for Catalogue Descriptions.” Last modified August. https://tla.mpi.nl/wp-content/uploads/2012/06/IMDI_Catalogue_3.0.0.pdf.
Johnson, Heidi. 2004. “Language Documentation and Archiving, or How to Build a Better Corpus.” In Language Documentation and Description, vol. 2, edited by Peter K. Austin, 140–153. London: School of Oriental and African Studies.
Krauss, Michael E. 1974. “Alaska Native Language Legislation.” International Journal of American Linguistics 40: 150–52.
Krebs, Allison Boucher. 2012. “Native America’s Twenty-First-Century Right to Know.” Archival Science 12: 173–190.
Kung, Susan Smythe, and Joel Sherzer. 2013. “The Archive of the Indigenous Languages of Latin America: An Overview.” Oral Tradition 28: 379–388.
Linguistic Society of America. 2010a. “Resolution Recognizing the Scholarly Merit of Language Documentation.” January 8. www.linguisticsociety.org/resource/resolution-recognizing-scholarly-merit-language-documentation. Accessed July 18, 2016.
Linguistic Society of America. 2010b. “Resolution on Cyberinfrastructure.” January 8. http://www.linguisticsociety.org/resource/resolution-cyberinfrastructure. Accessed July 18, 2016.
Linn, Mary S. 2014. “Living Archives: A Community-Based Language Archive Model.” In Language Documentation and Description, vol. 12 (Special Issue on Language Documentation and Archiving), edited by David Nathan and Peter K. Austin, 53–67. London: School of Oriental and African Studies.
Macri, Martha and James Sarmento. 2010. “Respecting Privacy: Ethical and Pragmatic Considerations.” Language & Communication 30: 192–197.
Nathan, David. 2009. “The Soundness of Documentation: Towards an Epistemology for Audio in Documentary Linguistics.” Journal of the International Association of Sound Archives 33: 50–63.
Nathan, David. 2010. “Archives 2.0 for Endangered Languages: From Disk Space to MySpace.” International Journal of Humanities and Arts Computing 4: 111–124.
Nathan, David. 2011. “Digital Archiving.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 255–273. Cambridge: Cambridge University Press.
Nathan, David. 2014. “Access and Accessibility at ELAR, an Archive for Endangered Languages Documentation.” In Language Documentation and Description, vol. 12 (Special Issue on Language Documentation and Archiving), edited by David Nathan and Peter K. Austin, 187–208. London: School of Oriental and African Studies.
Nathan, David and Peter K. Austin. 2004. “Reconceiving Metadata: Language Documentation Through Thick and Thin.” In Language Documentation and Description, vol. 2, edited by Peter K. Austin, 179–187. London: School of Oriental and African Studies.
Nathan, David and Peter K. Austin. 2014. Introduction to Language Documentation and Description, vol. 2, edited by Peter K. Austin, 4–16. London: School of Oriental and African Studies.

Newman, Paul. 2012. “Copyright and Other Legal Concerns.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 430–456. Oxford: Oxford University Press.
O’Meara, Carolyn and Jeff Good. 2010. “Ethical Issues in Legacy Language Resources.” Language & Communication 30: 162–170.
Open Language Archives Community (OLAC). 2008. “OLAC Metadata.” Last modified May 31. http://www.language-archives.org/OLAC/metadata.html.
Open Language Archives Community (OLAC). 2016. “Summary Statistics for The Language Archive’s IMDI Portal.” http://www.language-archives.org/metrics/www.mpi.nl. Accessed July 18, 2016.
Ormond-Parker, Lyndon and Robyn Sloggett. 2012. “Local Archives and Community Collecting in the Digital Age.” Archival Science 12: 191–212.
Robinson, Laura. 2010. “Informed Consent Among Analog People in a Digital World.” Language & Communication 30: 186–191.
Rosetta Project. 2016. “About.” http://rosettaproject.org/about/. Accessed July 18, 2016.
Schwiertz, Gabriele. 2012. “Online Presentation and Accessibility of Endangered Languages Data: The General Portal to the DOBES Archive.” In Potentials of Language Documentation: Methods, Analyses, and Utilization, edited by Frank Seifart, Geoffrey Haig, Nikolaus P. Himmelmann, Dagmar Jung, Anna Margetts, and Paul Trilsbeek, 126–128. Honolulu: University of Hawai‘i Press.
Shilton, Katie and Ramesh Srinivasan. 2007. “Participatory Appraisal and Arrangement for Multicultural Archival Collections.” Archivaria 63: 87–101.
Simon, Nina. 2011. “Participatory Design and the Future of Museums.” In Letting Go?: Sharing Historical Authority in a User-Generated World, edited by Bill Adair, Benjamin Filene, and Laura Koloski, 18–33. Philadelphia: The Pew Center for Arts & Heritage.
Simons, Gary and Steven Bird. 2003. “The Open Language Archives Community: An Infrastructure for Distributed Archiving of Language Resources.” Last modified June 10. https://arxiv.org/abs/cs/0306040.
Spindler, Robert P. 2014. “An Evaluation of Crowdsourcing and Participatory Archives Projects for Archival Description and Transcription.” https://repository.asu.edu/attachments/135630/content/Research%20Paper%20v3.pdf. Accessed July 20, 2016.
Stenzel, Kristine. 2014. “The Pleasures and Pitfalls of a ‘Participatory’ Documentation Project: An Experience in Northwestern Amazonia.” Language Documentation & Conservation 8: 287–306.
Tammaro, Anna Maria. 2016. “Participatory Approaches and Innovation in Galleries, Libraries, Archives, and Museums.” International Information & Library Review 48: 37–44.
Theimer, Kate. 2011. “Exploring the Participatory Archives: What, Who, Where, and Why.” Paper presented at the Annual Meeting of the Society of American Archivists, Chicago, Illinois, August 21–26. www.slideshare.net/ktheimer/theimer-participatory-archives-saa-2011. Accessed December 18, 2015.
Theimer, Kate. 2014. “The Future of Archives Is Participatory: Archives as Platform, or a New Mission for Archives.” Paper presented at the Offene Archive 2.1 conference, Stuttgart, Germany, April 3–4. http://archivesnext.com/?p=3700. Accessed July 20, 2016.
Thieberger, Nicholas. 2010. “Anxious Respect for Linguistic Data: The Pacific and Regional Archive for Digital Sources in Endangered Cultures (PARADISEC) and the Resource Network for Linguistic Diversity (RNLD).” In Endangered Languages of Austronesia, edited by Margaret Florey, 141–158. Oxford: Oxford University Press.

Thieberger, Nicholas. 2012. “Using Language Documentation Data in a Broader Context.” In Potentials of Language Documentation: Methods, Analyses, and Utilization, edited by Frank Seifart, Geoffrey Haig, Nikolaus P. Himmelmann, Dagmar Jung, Anna Margetts, and Paul Trilsbeek, 129–134. Honolulu: University of Hawai‘i Press.
Thieberger, Nicholas and Linda Barwick. 2012. “Keeping Records of Language Diversity in Melanesia: The Pacific and Regional Archive for Digital Sources in Endangered Cultures (PARADISEC).” In Melanesian Languages on the Edge of Asia: Challenges for the 21st Century, edited by Nicholas Evans and Marian Klamer, 239–253. Honolulu: University of Hawai‘i Press.
Thieberger, Nicholas, and Andrea L. Berez. 2011. “Linguistic Data Management.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 90–118. Oxford: Oxford University Press.
Thieberger, Nicholas, Amanda Harris, and Linda Barwick. 2015a. “PARADISEC: Its History and Future.” In Research, Records and Responsibility: Ten Years of PARADISEC, edited by Amanda Harris, Nicholas Thieberger, and Linda Barwick, 1–15. Sydney: Sydney University Press.
Thieberger, Nicholas, Anna Margetts, Stephen Morey, and Simon Musgrave. 2015b. “Assessing Annotated Corpora as Research Output.” Australian Journal of Linguistics 36: 1–21.
Thieberger, Nicholas and Simon Musgrave. 2007. “Documentary Linguistics and Ethical Issues.” In Language Documentation and Description, vol. 4, edited by Peter K. Austin, 26–37. London: School of Oriental and African Studies.
Trilsbeek, Paul and Alexander König. 2014. “Increasing the Future Usage of Endangered Language Archives.” In Language Documentation and Description, vol. 12 (Special Issue on Language Documentation and Archiving), edited by David Nathan and Peter K. Austin, 151–163. London: School of Oriental and African Studies.
Trilsbeek, Paul and Peter Wittenburg. 2006. “Archiving Challenges.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus P. Himmelmann, and Ulrike Mosel, 311–336. Berlin: Mouton de Gruyter.
Wilbur, Joshua. 2014. “Archiving for the Community: Engaging Local Archives in Language Documentation Projects.” In Language Documentation and Description, vol. 12 (Special Issue on Language Documentation and Archiving), edited by David Nathan and Peter K. Austin, 85–101. London: School of Oriental and African Studies.
Woodbury, Tony. 2003. “Defining Documentary Linguistics.” In Language Documentation and Description, vol. 1, edited by Peter Austin, 35–51. London: School of Oriental and African Studies.
Woodbury, Anthony. 2011. “Language Documentation.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 159–211. Cambridge: Cambridge University Press.
Woodbury, Anthony C. 2014. “Archives and Audiences: Toward Making Endangered Language Documentations People Can Read, Use, Understand, and Admire.” In Language Documentation and Description, vol. 12 (Special Issue on Language Documentation and Archiving), edited by David Nathan and Peter K. Austin, 19–36. London: School of Oriental and African Studies.
Yamada, Racquel-María. 2007. “Collaborative Linguistic Fieldwork: Practical Application of the Empowerment Model.” Language Documentation & Conservation 1: 257–282.

Chapter 16

Tools from the Ethnography of Communication for Language Documentation
Simeon Floyd

1. Introduction

Dell Hymes described his program of the ethnography of communication as one that “fills the gap between what is usually described in grammars, and what is usually described in ethnographies” (Hymes 1962, 14). For linguists looking for tools for documenting not just grammar but language usage, this sounds like an attractive proposition. Linguists do not generally receive training in ethnography, and ethnographic methods attuned to the speech situation can be useful for documentation efforts to include cultural documentation along with language. However, there has been less overlap between the ethnography of communication and language documentation than might have been expected, given what seems like a natural affinity. There are historical reasons, both in anthropology and linguistics, for why this pairing has never fully been realized, some of which I will review in the following sections. Then the chapter will summarize the main concepts of the ethnography of communication, outlining ways in which documentary linguists could draw on them to develop how we approach the sociocultural contexts around the linguistic forms we document. Finally, this potential is illustrated through accounts of experiences from two language documentation efforts with the Cha’palaa and Imbabura Quechua languages of Ecuador.

1.1. Integrated documentation of language and culture

While linguistics programs may not tend to provide formal training in ethnography, documentary linguists certainly acquire many useful rough-and-ready ethnographic skills through the practical work of cultural documentation. Modern language documentation funding programs have often explicitly stated that cultural information should be included in language documentation projects. This example comes from the information for applications issued by the Volkswagen Foundation’s DoBes language documentation program (Volkswagen Foundation 2011):

“All languages are intimately interlinked with the culture of their speakers, and all languages and cultures represent specific expressions of human thought and social organization. Therefore, with every language which becomes extinct priceless intellectual values will be lost forever” . . . “it is intended that the data collection should encompass linguistic phenomena as part of an extensive cultural and social complex.” (1–3)

Yet while the cultural side of language documentation is prioritized, it is still expected that field linguists pick up ethnographic abilities on the fly. We have few scientific guidelines for what counts as a good cultural record, and any notion of creating a “complete” cultural documentation is both practically and conceptually impossible. While of course documenting a “whole language” is also a flawed concept, linguists at least have a practical idea about what it means to collect a record that documents all major types of sounds, words, and constructions in a language. We have no similar “checklist” for culture. Documentary linguists follow their own interests, or those of community members, or simply rely on luck to determine the variety of sociocultural information that makes it into a corpus. We know that a good record of a language includes high-quality recordings that can be used for phonetics and phonology and a large quantity of speech that makes it possible to find many examples for morphosyntactic analysis. But how do we know what makes a good record of a culture? It is here that field linguists can draw on ideas from ethnography; this chapter will provide something that approximates a practical cultural “checklist” for language documentation from the ethnography of communication.

1.2. Earlier paradigms: salvage anthropology

These days, documentary linguists are mainly on their own when it comes to cultural documentation, because anthropology has for the most part already given up on this endeavor. It is important for documentary linguistics to recall that it is not the first discipline to follow a program of documentation of indigenous societies; cultural anthropology went through a period of “salvage anthropology,” with the explicit purpose of documenting endangered cultural practices, as one of the formative goals of Boas and his contemporaries. In the late nineteenth and early twentieth centuries, ethnographers rushed to “salvage” information from traditional societies that were under pressure from settlement and expansion that was having increasingly genocidal effects (see Hester 1968). While much of salvage ethnography’s concern was with archaeology and material culture, practitioners also created documents like photographs, film, sound recordings, grammars, dictionaries, and text collections. Here we can see some of the roots of modern documentary linguistics.

Salvage anthropology came to be seen as misguided and became marginalized within cultural anthropology during the second half of the twentieth century. Researchers made off with sensitive cultural items and disturbed graves in search of museum pieces, their visits often coinciding with oncoming projects like dams or roads that destroyed communities (Hester 1968, 132). Salvage anthropologists fell into a cultural evolutionary thinking that assumed a linear path of acculturation, and this became an increasingly problematic position as anthropology moved from more static, structuralist views of culture to more dynamic, post-structural, and post-colonial viewpoints (Beck 2010). While communities value documenting older traditions, they also may want to document current experiences of cultural change, creativity, and resistance that were erased in these early accounts. Critiques of salvage anthropology are related to more general twentieth-century “crises” in anthropology in which the role of anthropologists in colonialism and the way that Western voices tended to drown out native voices in ethnography complicated ethnographic research (Bunzl 2005). The indirect effect of this critique was to direct anthropologists’ interest away from linguistic and cultural documentation.

The decline of salvage anthropology left the responsibility for linguistic and cultural documentation largely in the hands of documentary linguists. The relationships between field linguists and communities are much more collaborative than during those days, hopefully making it possible to continue documentation efforts on more even footing, work which despite its complexities is too important to stall. Yet documentary linguistics has not been totally immune to the critiques leveled at salvage anthropology. One polemical paper argues that when documentation efforts succeed in preserving records of languages that are no longer spoken, these amount to “zombie languages,” living dead languages that are only good for scientists but that do not address the cultural pressures on the community during language shift (Perley 2012). Most documentary linguists would hope that corpora could be useful for communities even after languages have ceased to be spoken, but as we take on more responsibilities as ethnographers we will have to be more and more sensitive to these issues.

1.3. The ethnography of communication

The ethnography of communication has historical connections to salvage anthropology, in part due to a regional focus in native North America, but it arose out of other specific developments within anthropology and linguistics in the mid-twentieth century. Earlier descriptions of indigenous languages by Boas, Sapir, and others were done in the four-field tradition of anthropology, and linguistics was one component of a program that complemented other interests, such as archaeology. The founding of anthropology departments somewhat pre-dated modern linguistics departments, but by the second half of the twentieth century, a new paradigm of “formal” linguistics developed within newly independent linguistics departments. This research program, associated with the ideas of Noam Chomsky at MIT, explicitly excluded language usage and sociocultural context as aspects of “performance” of language, focusing on language “competence” and abstract universal notions (Chomsky 1965). One side effect of this approach was that the diverse languages studied in more anthropologically oriented approaches became irrelevant for many linguists, who believed they could get sufficient results from English and other familiar languages. This left work on indigenous languages and their cultural contexts in the hands of a mixture of field linguists, who continued to describe indigenous languages despite the prevailing climate, and linguistic anthropologists. One of these latter was Dell Hymes, who proposed ways to fill the gap left by the separation of language and society seen in formal linguistics. Hymes’s ethnography of communication was inspired by Prague School linguist Roman Jakobson’s analysis of the functions of language, which laid out a much broader range of functions beyond the “referential” function that was the main interest of formal linguistics (Jakobson 1960). A linguistic anthropologist following the Americanist tradition of Boas and Sapir, Hymes first outlined the “ethnography of speaking” program in 1962, proposing that the study of language should include not just grammar in the abstract but should treat it as integrated into society (Hymes 1962). Later Hymes would modify the name to the “ethnography of communication” in recognition of other communicative channels besides speech.

Key to the ethnography of communication for Hymes and later researchers is the perspective that language occurs empirically in a time and a place, among people and in sociocultural context. In Hymes’s terms, sentences are speech acts that occur within specific “speech events,” set against broader “speech situations” (Hymes 1974). Researchers in this tradition frequently focus on the analysis of one or more speech events, approaching linguistic forms through rich sociocultural analysis: “for a given semantic paradigm, the ethnographer of speaking might ask such questions as, when, why, in what form, and by whom is the paradigm used in speaking?” (Bauman and Sherzer 1975, 97). Later work went on to develop the ideas of “performance” (Bauman 1975) and “discourse” (Sherzer 1987; Urban 2000) to uncover ways that language and culture interact through close ethnographic description of speech events.

Hymes advocated the study of repertoires of local speech practices, which he called “ways of speaking” (Hymes 1989). Sherzer’s in-depth Kuna Ways of Speaking took up the challenge of describing virtually the full range of speech forms of the Kuna people of Panama (and see Senft 2010 for a more recent account of the Trobriand islands in a similar framework). Approaching indigenous oral traditions in their full complexity led to new methods for looking at the internal aesthetics of their “poetic” function, in Jakobson’s terms. Expression in indigenous languages was approached not just as a window to grammar but as an oral literature tradition on par with Western literary traditions (Hymes 1987). Dennis Tedlock wrote insightfully about approaching spoken word literature, representing it textually and transcribing it (Tedlock 1983). A related school of “ethnopoetics” or “verbal art” looks at the poetics of local ways of speaking (Sherzer 2002). This work has led to the development of methods of careful, ethnographically informed transcription and translation of linguistic expression (e.g., Sammons and Sherzer 2000).

Seen from an ethnography of communication perspective, the recorded “resources” preserved in documentary archives are representations of speech events, where linguistic and cultural elements come together. The imperative for cultural, not just linguistic, documentation calls for methods for looking at such speech events, since they turn out to be our primary object of study and preservation. Of course, simultaneously with developments in the ethnography of communication, descriptive and anthropological linguistics continued. While not usually doing in-depth ethnography, these traditions developed practical methods for linking information about sociocultural context to recordings, and annotating them with cultural information.

Toward the end of the twentieth century, researchers like Michael Krauss and Ken Hale brought issues of language diversity and endangerment back into the spotlight for mainstream linguistics (Hale et al. 1992), and a period of more intensive global language documentation began to take shape. While language loss became the main rallying point for these renewed efforts, linguists found themselves working in communities that had suffered the worst of colonialism, inequality, and poverty, and which in many cases were in danger of disappearing as unique social groups or otherwise losing cultural agency, and cultural documentation is one way to address this. This chapter highlights some key concepts from the ethnography of communication that could be further incorporated into the language documentation toolkit to address this imperative for cultural documentation.

2. Linking the ethnography of communication to language documentation

The language documentation literature already features a number of engaging studies about the role of ethnography, and the goal of this chapter is to complement rather than replicate those contributions. Many of the authors in the central volume Essentials of Language Documentation (Gippert, Himmelmann, and Mosel 2006) address ethnographic themes, in particular the chapters by Hill (2006) and Franchetto (2006). Hill outlines three basic levels on which ethnographic approaches can contribute. First, she recommends drawing on the ethnography of communication to understand the cultural dimensions of the speech events recorded for documentation. Second, she proposes attention to the sociocultural nature of creating documentary recordings, including the factors introduced by a researcher. Third, she recommends paying attention to language ideologies, including attitudes about language that may relate to language endangerment (Hill 2006, 117). Franchetto, in the same volume, specifies additional intersections of language and culture on which documentary linguists might focus: greetings, songs, onomastics, toponyms, men’s and women’s speech, native metalinguistic terms, conversational turn-taking, uses of parallelism, issues of child language socialization, social identity, community history, culture-specific semantics, and other areas (Franchetto 2006, 189). Linguists have also contributed ideas of how to further incorporate ethnography into documentary linguistics (Harrison 2005). Readers should refer to these sources, as well as to general reviews of the ethnography of communication (Hymes 1964; Bauman and Sherzer 1975; Keating 2001; Saville-Troike 2008) for more information.

2.1. The speech event

What exactly is a documentary corpus as seen from an ethnographic perspective? From a linguistic perspective, we hope that a corpus includes a variety of speakers and speech styles and represents the major linguistic elements. While we recognize that this is not actually a “complete” linguistic record, it is close enough for most practical purposes. Things are not so straightforward from a cultural perspective. Documentary recordings are not so much recordings of “culture” as recordings of events that reflect culture. Because of this, documentary linguists are confronted with the “speech event,” a key concept of the ethnography of communication that captures real instances of language usage. Hymes’s model of the speech event includes a number of components among which the “message form,” including linguistic elements, is only one. In his first formulation of the ethnography of speaking (Hymes 1962), he points out that in addition to grammatical elements, any speech event has some configuration of these components: (1) There will be at least one “sender” or “addresser,” the speech act participant role commonly referred to as “speaker” (the “animator,” in the terms of Goffman 1981; also see Levinson 1988). (2) There will be at least one “receiver,” the participant role commonly called “addressee.” (3) The communicative acts that occur, produced by speakers and perceived by addressees, will have a “message form,” including linguistic and non-linguistic elements. (4) The message will be conveyed through a “channel,” which might be auditory (including not just speech but also whistles and drumming; Meyer 2015; Sicoli 2016; Stern 1957), visual (sign, gesture, writing), or sometimes another sensory channel (e.g., tactile; Mesch 2002; Edwards 2015). (5) The linguistic elements will be in a particular “code,” which corresponds both to “language”-level classifications as well as to “dialect” and “register.” (6) When speakers encode messages in a channel, they do it about something, with social and practical purpose; this is the “topic.” (7) The situation in which all of these components come together is the “setting” or “scene.” Many of these dimensions of the speech event are included in Hymes’s SPEAKING model (Hymes 1972), discussed in detail below.
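
For readers who manage corpus metadata, the seven components listed above can be sketched as a simple record type. The following Python sketch is only an illustration under stated assumptions: the class, field, and channel names are hypothetical and do not correspond to any existing metadata standard or to Hymes’s own notation.

```python
from dataclasses import dataclass
from enum import Enum


class Channel(Enum):
    """Channels mentioned in the text; this enumeration is an assumption."""
    AUDITORY = "auditory"  # speech, whistles, drumming
    VISUAL = "visual"      # sign, gesture, writing
    TACTILE = "tactile"    # e.g., tactile signing


@dataclass
class SpeechEvent:
    """Hypothetical record of one event, one field per component (1)-(7)."""
    senders: list[str]        # (1) speakers / addressers
    receivers: list[str]      # (2) addressees
    message_form: str         # (3) linguistic and non-linguistic form
    channels: list[Channel]   # (4) one or more channels
    code: str                 # (5) language, dialect, register
    topic: str                # (6) what the message is about
    setting: str              # (7) setting or scene

    def missing_components(self) -> list[str]:
        """Return the names of components left empty in this record."""
        return [name for name, value in vars(self).items() if not value]


# Invented example values, for illustration only.
event = SpeechEvent(
    senders=["speaker A"],
    receivers=[],
    message_form="narrative with co-speech gesture",
    channels=[Channel.AUDITORY, Channel.VISUAL],
    code="",
    topic="fishing trip",
    setting="household, evening",
)
print(event.missing_components())
```

A record like this does not replace ethnographic description, but it makes visible which components of an event have not yet been described at all.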

2.2. The speech community

In Hymes’s model, members of a “speech community” engage together in speech events with the components above (see also Gumperz 1968). Speech communities are made up of people who share norms of communicative competence (to some degree). They can be large or small, overlap, and nest one inside the other. This concept is notoriously difficult to apply because speech communities rarely have clear boundaries or exclusive membership. The idea has been re-invented in different forms, including as a more dynamic “community of practice” (Eckert and McConnell-Ginet 1992). Generally language documentation uses practical methods for identifying speakers of the “same” language, assuming that neighbors from the same region who can converse with each other are members of the same speech community. This pragmatic solution is mostly unproblematic, although cases of second-language speakers, passive bilinguals, and speakers of related varieties living in the same communities bring up questions about heterogeneity within speech communities. Documentary linguists are also well aware of the problems with “language”-level tags (i.e., between “family” and “dialect” levels), and the fact that these can be obscured by standardized codes for labeling languages, varieties, or “languoids” (Nordhoff and Hammarström 2011). The concept of “speech community” helps us to think about the not always straightforward relation of such labels to the sociolinguistic contexts they attempt to describe.

2.3. The SPEAKING model

One of Hymes’s best known contributions is the SPEAKING heuristic, which uses the letters of the word “speaking” as a mnemonic device to help in the analysis of speech events. When applying this mnemonic, it is important to recognize that it does not assume that there is a finite list of eight dimensions of language. It would be possible to construct longer or shorter checklists of different dimensions (e.g., see a much longer inventory in Preston 2009). These eight dimensions are simply a selection of many of the major, salient details that an analyst approaching language in a sociocultural context should consider, many of which correspond to fields of linguistic metadata that are already used. Below I will review each, making specific connections to documentary linguistics.
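
Because the mnemonic works as a checklist, it can be applied mechanically to session notes to see which dimensions a record leaves undescribed. The Python sketch below is a minimal illustration, not an established tool: the function name and the dictionary-of-notes format are assumptions, and the eighth entry, “genre,” is the dimension conventionally associated with the final letter of Hymes’s mnemonic.

```python
# The eight SPEAKING dimensions, keyed by their mnemonic letter.
SPEAKING = {
    "S": "setting and scene",
    "P": "participants",
    "E": "ends",
    "A": "act sequence",
    "K": "key",
    "I": "instrumentalities",
    "N": "norms",
    "G": "genre",
}


def undocumented_dimensions(session_notes: dict[str, str]) -> list[str]:
    """Return the SPEAKING dimensions that have no non-blank note."""
    return [
        name
        for name in SPEAKING.values()
        if not session_notes.get(name, "").strip()
    ]


# Invented notes for one hypothetical recording session.
notes = {
    "setting and scene": "household, evening",
    "participants": "two adults, one child",
    "key": "   ",  # whitespace only, so it counts as undocumented
}
print(undocumented_dimensions(notes))
```

Since the list of dimensions is not meant to be exhaustive, the dictionary can be lengthened or shortened to suit a given project.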

2.3.1. Setting and scene

The setting refers to the time and place of an event, and the scene is the social backdrop. In language documentation we record date and location in standard metadata, and sometimes also limited descriptions of the scene. Documentary linguists also might wish to aim for representing diversity in settings, including situations like informal household interaction, formal meetings or rituals, work environments, and so on. If we expect that linguistic practices vary across such contexts, then capturing a diversity of settings is important.

2.3.2. Participants

Participants in an event can be described in terms of demographic data (names, ages, genders, professions) or their participant roles (speaker, addressee, overhearer). Language documentation metadata generally includes the former, and has paid less attention to the latter, although interactional information can be extracted from good time-aligned transcriptions. Language use clearly varies depending on who is present, for example, depending on whether children are present, if genders are mixed, if participants are close acquaintances versus strangers, and so on. This means that a good corpus will feature a diversity of participants.

2.3.3. Ends

While the analysis of the purposes and goals involved in the speech events recorded in language documentation corpora is not easy to approach in terms of metadata, it is something that should be considered in terms of the recording setting itself, including aspects of the researcher’s intervention. Field researchers should be aware of how their own goals may align with or cross-cut those of the participants, an area where Hill proposes honing our ethnographic awareness (2006, 119–124). Keeping a careful balance between the concerns of the field researchers and the concerns of the community members is already one of the most developed areas of documentary linguistics methodology, often discussed in terms of “collaborative” documentation (Dwyer 2006; Mosel 2006; Yamada 2007; Chelliah and Reuse 2010, 161–192; Seifart 2011; see also Sapién, Chapter 9, this volume). Continuing to emphasize this area will help prevent repeating the mistakes of salvage anthropology, when the ends of researchers and communities were often drastically opposed.

2.3.4. Act sequence

The act sequence is the temporal structure of the event. Language documentation recordings are slices of time, so they may need to be situated by temporal metadata. Internally to recordings, time-linked transcription software is one way to approach this type of linearity, which can then be zoomed in upon for sequential analysis. Methods ranging from the ethnography of stages of rituals to the close sequential analysis seen in conversation analysis (Schegloff 2007; Sidnell 2010; Clift 2016) and interactional linguistics (Ochs, Schegloff, and Thompson 1996; Selting and Couper-Kuhlen 2001) may also be pursued in this area.
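As a rough illustration of the kind of sequential analysis that time-aligned transcription makes possible, the following Python sketch computes the gap or overlap between consecutive turns by different speakers. The `Annotation` fields, the speaker labels, and the millisecond values are assumptions invented for illustration; they do not correspond to any particular transcription tool or corpus.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """One time-aligned transcription unit (hypothetical structure)."""
    speaker: str
    start_ms: int
    end_ms: int
    text: str


def turn_transitions(annotations):
    """For each pair of consecutive annotations by different speakers,
    return (previous speaker, next speaker, gap in ms).
    A negative gap means the next turn began in overlap."""
    ordered = sorted(annotations, key=lambda a: a.start_ms)
    transitions = []
    for prev, nxt in zip(ordered, ordered[1:]):
        if prev.speaker != nxt.speaker:
            transitions.append((prev.speaker, nxt.speaker, nxt.start_ms - prev.end_ms))
    return transitions


# Invented timings for a three-turn exchange between speakers A and B.
sample = [
    Annotation("A", 0, 1800, "request"),
    Annotation("B", 2000, 2400, "clarification question"),
    Annotation("A", 2350, 2600, "confirmation"),
]
print(turn_transitions(sample))
```

A table of this kind could help locate quick responses, overlaps, or long silences in a recording, which an analyst would then inspect in the transcript itself.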

2.3.5. Key

The key is how participants know “what is going on,” based on cues that help distinguish types of situations, their degree of seriousness or humor, and so on. Noting the tone of speech in a recording may be important for language documentation, for example, if recordings of a sacred or private event necessitate extra care in handling, or in cases where a recording may be considered offensive or disrespectful.

2.3.6. Instrumentalities

The instrumentalities that participants use include languages, dialects, speech styles, and registers. Language documentation corpora tend to include metadata about the languages involved, often relying on standardized language codes (Nordhoff and Hammarström 2011; Lewis, Simons, and Fennig 2016). Sometimes dialect information is included, but speech styles and registers are not usually described. Additionally, the concept of instrumentalities includes non-verbal forms of expression (McNeill 1992; Kendon 2004; McNeill 2005; Enfield 2009); since modern corpora tend to include video, documentary linguists have become very accomplished at making records of multimodal communication, if not always at tagging or analyzing gesture.

2.3.7. Norms

The study of social norms is a core element of many areas of the social sciences, particularly sociology (e.g., Hechter and Opp 2001), although sociologists do not tend to study the structure of speech events outside the subfield of conversation analysis, which draws on concepts of “normativity” inspired by the work of Garfinkel and Goffman (Heritage 2001). As for linguistics, any field linguist working for a significant amount of time in a community will begin to develop an intuitive sense of norms, knowing what communicative behaviors are appropriate or inappropriate, polite or impolite, encouraged or prohibited, and so on. However, this knowledge is rarely formalized in the corpora we collect. Ideally, a sketch of local ways of speaking based on long-term ethnographic observation and metalinguistic interviews would be a standard part of language documentation corpora; collaborative analysis drawing on community members’ insider knowledge is crucial for this.

2.3.8. Genre

Most metadata systems include some basic fields or categories for marking speech genres with tags like “conversation,” “narrative,” or “song.” Some investigation into local terms and categories is worthwhile so that these might also be reflected in metadata. For example, in a classic article Sherzer describes “Namakke, sunmakke, kormakke: three types of Cuna speech event,” using local metalinguistic terms as an entry point into the analysis of genre (Bauman and Sherzer 1974, 263–282). Uncovering such categories can help to flesh out the way language documentation approaches issues of genre.

As pointed out above, this list is neither exhaustive nor naturally emerging; it is a historical artifact created by Dell Hymes. But it is a useful artifact, because Hymes put a lot of thought into the nature of language and communication beyond grammar; the mnemonic is impressive in the range it covers, including some areas where language documentation is already focusing, and others where it has yet to venture. The following sections show what this model might look like in action.
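The components listed above could also be captured as structured session metadata. The following Python sketch is one hypothetical way to do so: the class name, field names, and sample values are assumptions made for illustration (the sample values are loosely modeled on the clothes-washing recording discussed in section 3.1) and do not reproduce any existing metadata standard.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass, field


@dataclass
class SpeakingMetadata:
    """Hypothetical session record with one field per SPEAKING component."""
    setting: str                    # physical time and place
    scene: str                      # culturally defined occasion
    participants: dict[str, str]    # participant code -> role in the event
    ends: list[str]                 # goals of participants and researcher
    act_sequence: list[str]         # ordered stages of the event
    key: str                        # tone, e.g. "humorous" or "serious"
    instrumentalities: list[str]    # languages, dialects, registers, gesture
    norms: list[str] = field(default_factory=list)
    genres: list[str] = field(default_factory=list)

    def missing_components(self) -> list[str]:
        """Return the names of components that were left empty."""
        return [name for name, value in asdict(self).items() if not value]


# Illustrative values only; participant codes and roles are invented.
session = SpeakingMetadata(
    setting="river bank, morning",
    scene="washing clothes",
    participants={"SPK1": "speaker", "SPK2": "addressee"},
    ends=["washing clothes", "affiliative conversation"],
    act_sequence=["scrubbing", "rinsing", "squeezing"],
    key="humorous",
    instrumentalities=["Cha'palaa"],
    genres=["informal conversation"],
)
print(session.missing_components())
```

A check like `missing_components` is only a prompt for the researcher: since not every component is equally relevant to every recording, an empty field may be a deliberate choice rather than a gap.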


3. Case studies in ethnographic language documentation

Here I describe two long-term projects with indigenous languages of Ecuador to illustrate how Hymes’s SPEAKING model can help to flesh out the cultural side of language documentation. In both cases concepts such as “speech community,” “speech event,” and “speech genre” relate differently to the goal we might call “cultural representativity,” the idea that a documentary corpus represents not only a grammar but also its sociocultural contexts. While there is no way to fully document something so open-ended as culture, it is possible to distinguish a corpus with more diverse sociocultural dimensions from one with less sociocultural depth. After describing each project, I will run through the full SPEAKING heuristic to see what avenues for deeper ethnographic language documentation it points to.

3.1. Cha’palaa

The Chachi people of northwestern Ecuador live in the lowland rainforests of the Cayapas River basin and surrounding areas, between the Andean foothills and the Pacific coast, with a population of around 10,000 (Instituto Nacional de Estadística y Censos, Ecuador 2010). They speak Cha’palaa, one of several remaining Barbacoan languages, from a family that used to cover much of the Andean highlands before the introduction of Quechuan languages and Spanish. Several different lines of evidence, including oral history (Floyd 2010), archaeology (DeBoer 1996), and early colonial documents (Jijón y Caamaño 1914), confirm that the Chachis migrated to the lowlands out of the Andes, and they are linguistically and culturally linked to both highland and lowland societies. Currently the Cha’palaa language is being transmitted to most children, but migration combined with increasing road connections is transforming Chachi society in ways that are putting pressure on the language.

The Cha’palaa documentation project began with preliminary data collection in 2006, with a sustained effort to record social interaction in everyday household and community contexts in 2010–2015. The corpus primarily features informal conversation, but it also includes recordings of traditional stories, ethnographic interviews, community events and ceremonies, and more. Here I will begin by considering an example of an informal conversational recording.

3.1.1. Setting and scene

Most of the Cha’palaa corpus consists of recordings of everyday, mundane interactions among family and friends in community contexts. These settings include activities such as cooking and eating, resting, or working on chores or handcrafts. In the recording selected here, a group of women have met up at the river bank to wash clothes. This type of setting and scene reflects a practical task where no strong formal or institutional elements apply (see Figure 16.1). In Hymes’s terms, the setting of this recording is “on the beach in the morning” and the scene is “washing clothes.” As I had been filming primarily inside houses during that period, filming at the river was an intentional choice to vary the scene and setting.

Figure 16.1. Unattended camera filming Chachi women washing clothes.

3.1.2. Participants

The main participants in this recording are three women from the same community whose washing times have overlapped. They are all in their twenties or thirties, are married, and have families that live in houses in the village. The women engage in sustained conversation, while other participants come in and out of the interaction: some of the women’s children are nearby, a few men and boys come to bathe, neighbors pass by in canoes, and so on. In transcriptions, each participant is assigned a unique identifier (name, pseudonym, or code), which is one common way that language documentation tracks participants.

Figure 16.2. Chachi women attending to practical tasks while conversing.

Participant metadata is another area where documentary linguists already have standard methods for recording demographic data like age, gender, community of origin, languages spoken, and other fields (in practice, however, not all of this information is always possible to recover). Ethnographic information about kinship and genealogy could add even more here. For example, linking participant metadata in a genealogy network would show that two of the women on the beach are sisters-in-law, which may be relevant for understanding the recording.
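To make the idea of linking participant metadata to a genealogy network more concrete, here is a minimal Python sketch. The participant codes and kin links are invented for illustration and do not come from the corpus; the definition of “sibling-in-law” used here (a spouse’s sibling or a sibling’s spouse) is also an assumption, and local kinship categories may well be drawn differently.

```python
# Hypothetical kin links keyed by invented participant codes.
SPOUSES = {("P1", "P4")}     # unordered pairs of spouses
SIBLINGS = {("P2", "P4")}    # unordered pairs of siblings


def partners(pairs, person):
    """Return everyone linked to `person` in a set of unordered pairs."""
    return {b if a == person else a for a, b in pairs if person in (a, b)}


def are_siblings_in_law(x, y, spouses=SPOUSES, siblings=SIBLINGS):
    """True if x is married to a sibling of y, or y to a sibling of x."""
    return bool(
        partners(spouses, x) & partners(siblings, y)
        or partners(spouses, y) & partners(siblings, x)
    )


print(are_siblings_in_law("P1", "P2"))  # P1's spouse P4 is P2's sibling
print(are_siblings_in_law("P1", "P3"))  # no recorded link between P1 and P3
```

Keeping kin links in a separate table keyed by the same participant codes used in transcriptions would allow relationships of this kind to be looked up for any recording without repeating them in each session’s metadata.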

3.1.3. Ends

Activities like washing clothes have clear practical goals, but during such activities participants also pursue more social ends, including socially “affiliative” conversation (Lindström and Sorjonen 2012; Steensig 2012) in which they talk about community affairs, generally agreeing with each other’s assessments and building on them. For example, in (1) speaker C is giving an extended commentary on the child-raising practices of other community members, and during pauses D and L indicate that they affirm C’s assessment; in this case both of them give affiliative responses at the end of C’s turn in quick succession, almost simultaneously, showing how they both orient to it in similar ways.

(1) CHSF2012_08_04S3_1353130

C:  tsaa-ren ya-chi añuñuu-nu m-alara-’mityaa-shee
    then 3-poss small-acc again-come.up-because-aff
    So just like that, because she came as a child,

    antsa añuñuba paree ma junku ma-a-mi
    that small-also the.same again there again-come-decl
    that child is about to turn out the same.

D:  mm
    uh huh

L:  tsaawe uwain
    like.that-n.ego right
    It’s like that, right.

Throughout the recording, conversation with more social ends is mixed with interactions with more practical, task-oriented ends, like requests for objects (see below). To these we can add, following Hill (2006), the ends of the researcher, although due to their long-term experience with language documentation recordings, these participants engage minimally in “camera behavior” and generally go about their business. We can compare this case with the Imbabura Quechua cases discussed in section 16.3.2 below, in which recordings were planned to document specific areas of cultural knowledge, and the ends of language documentation are more salient in the speech event.

3.1.4. Act sequence

The general sequence of actions in this recording is tied to completing the macro-task of washing, but there are numerous meaningful act sequences that occur at a more local level; participants make observations and jokes that receive sequential responses, parents try to get children to respond, and people engaged in tasks ask for assistance. In (2), L asks D for help passing a plastic tub, and D quickly complies (see Figure 16.3).

(2)

L:  daira ña-a i-nu tina ka-’ ee-de
    daira 2-foc 1-acc tub grab-sr pass-imper
    Daira grab that tub and pass it to me.

D:  entsa-a
    this-q
    This one?

L:  jee
    yes

D:  ((throws tub to L))

Many such small sequences in which one participant requests or “recruits” (Kendrick and Drew 2016; Floyd, Rossi, and Enfield in preparation) the assistance of another occur throughout the recording, corresponding to different stages of scrubbing, rinsing, and squeezing. While this recording has only a loose, practical act sequence, in other cases, such as in recordings of community-wide rituals held at special ceremonial centers (see Figure 16.4), act sequences may be rigidly prescribed over a period of days according to the ritual calendar.

Figure 16.3. A sequential response to a request.

Figure 16.4. Chachi ceremonial center.

3.1.5. Key

One of Hymes’s observations about the “key” to speech events is that the meaning of speech cannot be fully understood without knowing its tone, for example, if it is serious or not. In the clothes washing recording, there is a great deal of humorous back-and-forth between the women and the men who come to bathe. A close analysis of the interaction can use clues like laughter to determine what should be taken seriously and what is a joke. For example, in (2), L, G, and H are teasing E about his haircut; the laughter is evidence that this stretch of talk is not serious.

(2) CHSF2012_08_04S4_2129930

L:  ñu-chi-bain juntsa’ llashkapa-a-ba
    2-poss-also that forehead-foc-also
    Your (hair) is like this on your forehead ((gesturing))

G:  ((laughter))

E:  tsaaren iya enu guileechi entsankenu juba
    like.that-exactly 1-foc here razor-instr this-do-inf be-cntr
    But I do it like this with a razor.

L:  ((laughter))

G:  eso
    that’s it

H:  tsaa ya-’ nejtun ti-na-ju-min monje de shaolin de-ti-n
    like.that 3-poss because what-q-be-mem monk of shaolin pl-say-q
    so he’s um, what is it called? Is it called “Shaolin monk”?

E:  eso monje de shaoli kekenuju
    that monk of shaoli do-do-inf-be
    Yes, I’m doing “Shaolin monk.”

G:  ya
    ok

H:  juntsa llashkapa kikinujuba
    that forehead do-do-inf-be-cntr
    You have to do it in the front though.

A:  ((laughter))

H:  ((laughter))

E:  ura tsaa-nu-bain tsejtu
    good like.that-inf-also in.that.case
    good, (I’ll do it) just like that, in that case.

G:  kaa mono
    dim monkey
    little monkey

These references to monkeys and martial arts movies are not meant to be taken literally, and the participants are “keyed” in to the humor, which contrasts with a more serious tone elsewhere. For example, had D in the request sequence of example (2) in section 3.1.4 responded to L’s request for the plastic tub with laughter instead of by passing L the tub, this would probably not have been well received.

3.1.6. Instrumentalities

The linguistic forms involved in this recording belong to the Rio Zapallo variety of Cha’palaa. The strategy taken in recording the corpus was that of long-term involvement in one relatively remote, monolingual community, which has great advantages in terms of relationships, but which also has the drawback of primarily representing a single dialect. While ideally it is best to represent the diversity of the language as a whole, it may be difficult in some cases for a single researcher to cover all areas. To address this, I made several targeted trips to collect samples from other dialects. Luckily in this case, the variety spoken on the Rio Zapallo is not drastically different from other varieties, but in cases of more extreme dialect variation it is challenging to represent all of the different instrumentalities collapsed under standard language labels.

3.1.7. Norms

Sociocultural norms linked to dimensions like age and gender structure activities in speech events such as those seen in the recording. In Chachi society, the activity of clothes washing is almost exclusively performed by women, and other roles like hunting are performed primarily by men. The ethnography of communication is especially concerned with norms about speaking, so this approach would also ask whether gender norms influence speech styles. In Cha’palaa there are titles that are used only by wives to address their husbands (tsuriki) or by men addressing other men (llajcha). Using these terms incorrectly would not violate any grammatical norm, but it would certainly violate a social one, and documenting such norms is an important part of documenting such forms.

3.1.8. Genre

Most of the Cha’palaa corpus consists of “informal conversation” or “maximally informal social interaction” (Dingemanse and Floyd 2014), in which no special institutional factors structure how the interaction should proceed. Informal conversation is not a genre itself, strictly speaking, but it can certainly feature momentary performances of genres like storytelling, lullabies, and curing songs, or the jokes and teases seen above; flagging these in transcription is a good way to track such speech genres as they emerge in conversation. Conversational speech has sometimes been marginalized in language documentation in favor of more planned narratives and performances, so a corpus representing maximally informal interaction helps increase “ecological validity.” However, documentary coverage of speech genres should be broad, and it may be difficult to catch every local way of speaking in situ, so formal planning may be necessary. When I became aware that some speech styles were under-represented in Cha’palaa conversation, I arranged to record traditional stories and other genres to address this. To some extent this range of genres fits into the categories of current metadata formats, although including extra information about what constitutes these genres locally, a classic ethnography of communication topic, is always a good idea when possible.

Figure 16.5. A Cha’palaa storytelling session in a household context.

Cha’palaa traditional stories show some clear contrasts with more conversational recordings like the one described above. For example, a reportative evidential marker re-occurs regularly throughout, a common feature of storytelling in South American languages (Floyd 2005).

(3) CHSF2015_02_03S2_491683

    juntsa piwalaa shinbu tsa-i-we de-ti,
    that piwalala woman like.that-become-n.ego pl-say(rep)
    That piwalala woman is like this, they say.

    tsan-ki-ma-a de-ti-we chachi-lia-nu
    like.that-do-ag.nmlz-foc pl-say(rep) person-col-acc
    She does this to people, they say.

    kapuka ka-lare-’ fi-mu de-ti-we juntsa shinbu
    eye grab-caus-sr eat-ag.nmlz pl-say(rep)-n.ego that woman
    Taking out their eyes, she eats them, they say, that woman.

    kapuka ka-lare-’ fi-tu chachi-lia-nu
    eye grab-caus-sr eat-sr person-col-acc
    When she takes out the eyes of the people and eats them,

    tsaa kepe de-kas-ware-ki-ma-a de-ti-we
    like.that night cmpl-sleep-caus-do-ag.nmlz-foc say(rep)-n.ego
    she makes the people sleep at night, they say.

Here we complete the first example of the SPEAKING model applied to a specific language documentation project; the next case study provides another example in the context of a distinct language and project.

3.2. Imbabura Quechua

The Imbabura Quechua documentation project differs from the Cha’palaa project in many ways. Rather than collecting a long-term record of one community, I collected a survey of over a dozen communities. The Imbabura and Pichincha varieties of Quechua represent the northern section of a dialect continuum of varieties of Ecuadorian Highland Quechua, locally known as “Quichua.” While in some areas the language is being transmitted, in general the language shift situation is extreme despite the relatively high number of speakers. To approach documenting a dialect continuum, the project incorporated a team of native speakers from around the region who helped arrange recording sessions in different villages. This led to a different conception of the type of speech community the corpus represents compared to the Cha’palaa corpus. Applying the SPEAKING mnemonic to recordings from the Imbabura Quechua project helps to further compare cases.

3.2.1. Setting

Attempting to represent a language across many communities means fewer hours of recordings for each. In some cases I used a method similar to the one used with Cha’palaa, leaving the camera unattended. But in other cases, time was limited, so I had to work together with participants to determine the types of recordings to prioritize, meaning that the recordings tended to be more planned and controlled than in the Cha’palaa corpus. Rather than working long term with one group, I made appointments to meet community members for the first time, and held orientation discussions in Quechua before filming to familiarize participants with the project and to learn what kinds of recordings they were interested in (see Figure 16.6).


Figure 16.6. Meeting with a community organization before a recording session (photo E. Von Kreisler).

Sometimes participants preferred an ethnographic interview format, or in other cases they arranged to demonstrate practical skills such as handcrafts, agricultural tasks, or cooking, resulting in a broad range of settings. For an example, we can turn to a recording in which the speaker himself chose the time, place, and topic of the recording. He wished to record a narrative of the struggles of community activists for land reform against the hacienda owners in the 1970s. He had brought a notebook with notes of dates and events to make sure his account was thorough. I simply arrived and turned on the camera (see Figure 16.7).

3.2.2. Participants

The participant structure of this recording differs greatly from that of a conversational recording. Here there is one main speaker who engages in an extended narrative. I often suggest that at least one other participant be present so that there will be a native-speaker addressee as an audience, in this case the wife of the speaker.

3.2.3. Ends

While in an unplanned, conversational recording, participants more or less follow their own ends, in a planned recording of a narrative the participants and the fieldworker enter into a much closer collaboration. While more controlled, the advantage of a “staged” recording is that it allows the participants and the fieldworkers to pinpoint some important aspect for documentation. In contrast to salvage anthropology that primarily represented fieldworkers’ ideas about traditional culture, modern collaborative documentation methods seek to enable community members to control planning and decision making. This results in a better representation of local cultural contexts as dynamic and politically engaged, rather than just representing traditional “folklore.”

Figure 16.7. A planned session with the goal of recording a specific narrative.

3.2.4. Act sequence

Unlike the open-ended recordings in the Cha’palaa corpus, which usually do not have clear beginnings, middles, and ends, a planned recording has a more well-defined act sequence. For example, in this recording, the speaker began with an introduction, which in turn was followed by a narrative with distinct discourse features. Example (4) shows the initial part of the sequence.

(4) QUSF2016_01_20S3_5210

    ya, bueno ñuka, ñuka primeramente agradeci-ni,
    ok, well 1 1 firstly thank-1
    well, first I would like to thank you,

    kikin compadre shamu-sha shug visitata ru-ngapug.
    2 compadre come-sr one visit-acc do-purp
    that you compadre have come to make a visit,

    y . . . ñuka shuk asha historia-ta,
    and 1 one little history-acc
    and . . . I will (relate) a little history.

    Ali shamu-shka ka-chun ñuka kay inka wasi-ya-mun, ñuka comunidad-mun.
    good come-pfv be-purp 1 this 1 house-lim-to 1 community-to
    May you be welcome to my house, to my community.

After this initial phase, the speaker then goes on to begin the narrative by specifying the time frame and topic, shown in (5).

(5) QUSF2016_01_20S3_40420

    Ima wata-pi kallari-shka-ta ñuka parla-sha, cumpadre chay-ta
    what year-in begin-pfv-acc 1 speak-1fut compadre that-acc
    I will talk about the time it began, compadre, that one.

    kay mil novecientos sesenta y nuevi-pi-mi chay(-rik) ñukunchi-ka anuncio parlu ti-rka
    this 1969-loc-ev that(-ipfv) 1pl-foc announcement speech exist-past
    It was in nineteen sixty nine, there was an announcement for us,

    na yacha-shk-ani-chu nima-ta osea aqui, ima-sha juridicu ka-na
    no know-pfv-1-q nothing-acc um here what-sr juridical be-inf
    I had not known anything about it, how to legalize land,

    ni imasha comunata formana, na yachashkanchi.
    no what-sr comune-acc form-inf no know-pfv-1pl
    how to form a community, I did not know.

A multi-part narrative continued, relating the struggle to apply land reform laws to former hacienda lands. Time-aligned transcriptions are one way to keep track of such sequential elements in the progression of recorded discourse.

3.2.5. Key

In contrast with the Cha’palaa example, where humor played a role, here the topic, political struggles for land rights, is not to be taken lightly. Community members sometimes worry that video representations of their communities will show people being non-serious and informal, and it is important to make sure that public recordings give a dignified impression. This is one area where documentary linguists can benefit by paying attention to the “key.” While more spontaneous recordings risk catching things that the speaker might not wish others to see (for example, a messy house), previously planned recordings can be controlled to avoid this.


Figure 16.8. Women being interviewed by a native-speaker assistant.

3.2.6. Instrumentalities

The variety spoken in this video is generally referred to as Cayambe Quichua (Lewis et al. 2016), which is closely related to Imbabura Quechua. Because this corpus samples many communities over a large area, dialect diversity is one of the many issues of instrumentalities to consider. This affects transcription activities, because most of the transcription team are speakers of Imbabura Quechua, and although their variety is mutually intelligible with Cayambe Quechua, they sometimes hear unexpected pronunciations or unknown words. While metadata fields like “location” may be somewhat of a proxy for dialect variation, additional information about specific dialects as well as about different speech registers is important to include in metadata. One salient speech register in Imbabura Quechua is a politeness register known as “kinguy” or “curving speech,” which contrasts with less embellished “direct” speech. Kinguy often appears in contexts like greetings and requests; in the following excerpt, one of the team members prompts a woman to speak about a specific topic, using honorific and diminutive markers, two elements of kinguy speech (see Floyd 2006).

(6) CHSF2015_12_08S2_836534

    asha-gu-ta parla-y usha-nki-chu kan-kuna-chu Cashaloma-manta ka-pa-nki
    a.little-dim-acc speak-nmlz be.able-2-q 2-pl-q Cashaloma-from be-hon-2
    Please can you speak a little bit about being from Cashaloma?


Figure 16.9. A storytelling session.

Because my assistant is younger than the women in the recording, it is appropriate or “normative” that he formulate his request in this way. This leads to the next element of Hymes’s model: norms.

3.2.7. Norms

Planned recordings are not exactly “normal” events, but they do have normative aspects, like the need for polite speech described above. These norms are related to complex politeness norms of daily Quechua interaction that determine the appropriateness of behavior in greetings, leave-takings, announcing oneself at a threshold, offering and accepting food, and many other areas. To the extent that it is possible to register this kind of information in metadata, it can greatly enrich a corpus.

3.2.8. Genre

The main genre represented in this recording is an extended personal narrative. Personal narratives are similar to traditional stories in that one person speaks for an extended period. In many South American languages, including the Quechuan languages and Cha’palaa seen in example (3) above, one main difference is that stories are marked as second-hand, reported information (Floyd 2005); see example (7).

(7) CHSF2015_12_03S6_519560

    Wawa-kuna-ta japi-shpa-ka miku-dura ka-shka nin-ma shina.
    child-pl-acc grab-sr-foc eat-ag.nmlz be-pfv say(rep)-ev like.that
    They say she used to eat children like that.

The Imbabura corpus represents a good variety of different speech genres. One example of highly specialized speech is language for blessing and curing. In one instance, one of the documentation team members brought her son on a visit to a specialist at blessing children, and we documented the event. The transcript in example (8) shows part of the woman’s speech as she passed her hands over the child; she asks where he had been “frightened,” often considered a cause of sickness (see Figure 16.10).

(8) CHSF2016_04_28S1_56986

    sombra espiritus sombra espiritu sombra espiritu
    Shadow spirit shadow spirit, shadow spirit

    ashta lya-ri angelito-wan maypi manchari-shka, may yaku-pi?
    a.little entwine-refl angel-dim-with frightened-refl-pfv where water-loc
    Entwine with the little angel, where were you frightened, in the water?

    may urma-shka ni-shka
    where fall-pfv say-pfv
    Where do they say you have fallen?

    shungu shungu shungu shungu
    heart heart heart heart

Figure 16.10. Specialized speech for curing a child.

The voice quality used in the word shungu is notably distinct, with an extension of the initial fricative and emphatic, breathy pronunciation of the following vowel. The translation in English shows how esoteric the blessing is, and that it involves addressing spirits directly (prompting the question of whether the “participants” should include these as well), and is difficult to approach in terms of referential function. Community members such as this woman have a special role in documenting speech genres, because while any speaker of a language can contribute to the documentation of grammatical elements, only speakers with knowledge of special ways of speaking (“master speakers” in Haviland’s terms (2004)) are able to collaborate with linguists in documenting such rich verbal art traditions.

4. Discussion

This chapter began by pointing out that at present, both linguistic and cultural documentation are largely in the hands of documentary linguists, who are encouraged by funding bodies to document culture, but who are generally more prepared to document linguistic elements than cultural ones. I suggested that while we have a good idea of what constitutes well-rounded linguistic documentation, there is no equivalent model for culture. However, while “complete” cultural documentation is an impossible goal, it is possible to strive for more cultural diversity in a corpus, and more ethnographic depth in metadata about the speech events documented.

Since Dell Hymes’s proposal for an ethnography of communication takes the speech event as its primary analytical object, in some ways this approach seems tailor-made for linguistic and cultural documentation. Most anthropologists are not likely to make this connection, however, due to the problems of salvage ethnography and the “crises” of the discipline that cast suspicion on concepts like “culture” in the first place (see Abu-Lughod 1991). But documentary linguists do not have to be weighed down by this baggage; the adoption of more collaborative methods has helped mitigate potential conflicts between the goals of community members and linguists, and our concept of “culture” does not need to be theoretically sophisticated in order to be practically applied. Whatever culture is, speech events are saturated with it, and we are bound to catch some whenever we turn the camera on. What the ethnography of communication can provide is ways of being more systematic about this.

If anthropologists are not moving en masse to do language documentation, documentary linguists may go ahead and look to the ethnographic methods from anthropology and use them as needed. I described Dell Hymes’s SPEAKING model as a key tool from the ethnography of communication tradition. I cautioned that it does not necessarily represent a set of discrete finite categories as much as a lens to focus on parts of the speech event that are overlooked in grammatical analysis, covering a broad range of some of the most relevant sociocultural aspects of speaking and communication. Applying this model to two case studies helped to show how, at each step, Hymes’s categories helped to flesh out the speech events in the recordings in ways that, if incorporated into standard field practices and metadata, would increase the value of documentary materials as culturally representative documents. Not all of these areas will be equally relevant for all projects, so researchers should choose and modify the ones they find most helpful, rather than rigidly sticking to the model. But as documentary linguists are increasingly becoming responsible for documenting more than just grammar, we can use all the new tools we can get in order to go forward with this important task.

Abbreviations

ACC = accusative, AG.NMLZ = agentive nominalizer, AFF = affirmative, CAUS = causative, CMPL = completive, CNTR = counter-assertive, COL = collective, DECL = declarative, DIM = diminutive, EV = direct evidential, FOC = focus, FUT = future, HON = honorific, IMPER = imperative, INF = infinitive, INSTR = instrumental, MEM = remembering, N.EGO = non-egophoric, PFV = perfective, PL = plural, POSS = possessive, PURP = purpose, Q = interrogative, REFL = reflexive, REP = reportative, SR = same referent

References

Abu-Lughod, Lila. 1991. “Writing Against Culture.” In Recapturing Anthropology: Working in the Present, edited by Richard G. Fox, 137–62. Santa Fe, NM: School of American Research Press.
Bauman, Richard. 1975. “Verbal Art as Performance.” American Anthropologist 77(2): 290–311.
Bauman, Richard and Joel Sherzer, eds. 1974. Explorations in the Ethnography of Speaking. London and New York: Cambridge University Press.
Bauman, Richard and Joel Sherzer. 1975. “The Ethnography of Speaking.” Annual Review of Anthropology 4(1): 95–119. doi:10.1146/annurev.an.04.100175.000523.
Beck, David R. M. 2010. “Collecting Among the Menomini: Cultural Assault in Twentieth-Century Wisconsin.” The American Indian Quarterly 34(2): 157–193. doi:10.1353/aiq.0.0103.
Bunzl, Matti. 2005. “Anthropology Beyond Crisis.” Anthropology and Humanism 30(2): 187–195. doi:10.1525/anhu.2005.30.2.187.
Chelliah, Shobhana L. and Willem J. de Reuse. 2010. Handbook of Descriptive Linguistic Fieldwork. Heidelberg, Berlin: Springer Science & Business Media.
Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Clift, Rebecca. 2016. Conversation Analysis. Cambridge: Cambridge University Press.
DeBoer, Warren. 1996. Traces Behind the Esmeraldas Shore. Tuscaloosa: University of Alabama Press.
Dingemanse, Mark and Simeon Floyd. 2014. “Conversation Across Cultures.” In The Cambridge Handbook of Linguistic Anthropology, edited by N. J. Enfield, Paul Kockelman, and Jack Sidnell, 447–80. Cambridge: Cambridge University Press.
Dwyer, Arienne M. 2006. “Ethics and Practicalities of Cooperative Fieldwork and Analysis.” In Essentials of Language Documentation. Berlin and Boston: Mouton de Gruyter.
Eckert, Penelope and Sally McConnell-Ginet. 1992. “Communities of Practice: Where Language, Gender and Power All Live.” In Locating Power: Proceedings of the Second Berkeley Women and Language Conference, April 4 and 5, 1992, edited by Kira Hall, Mary Bucholtz, and Birch Moonwomon, 89–99. Berkeley Women and Language Group, University of California, Berkeley.
Edwards, Terra. 2015. “Bridging the Gap Between DeafBlind Minds: Interactional and Social Foundations of Intention Attribution in the Seattle DeafBlind Community.” Frontiers in Psychology 6: 1497. doi:10.3389/fpsyg.2015.01497.
Enfield, N. J. 2009. The Anatomy of Meaning: Speech, Gesture, and Composite Utterances, vol. 8. Cambridge: Cambridge University Press.
Floyd, Simeon. 2005. “The Poetics of Evidentiality in South American Storytelling.” In Proceedings from the Eighth Workshop on American Indigenous Languages, edited by Lea Harper and Carmen Jany, 16: 28–41. University of California, Santa Barbara (Santa Barbara Papers in Linguistics, 46).
Floyd, Simeon. 2006. “The Cash Value of Style in the Andean Market.” In SALSA 13: Texas Linguistic Forum, vol. 49, edited by Er-Xin Lee, Kris M. Markman, Vivian Newdick, and Tomoko Sakuma, 50–60. Austin: Texas Linguistics Forum. http://salsa.ling.utexas.edu/proceedings/2005/FloydSALSA13.pdf.
Floyd, Simeon. 2010. “Discourse Forms and Social Categorization in Cha’palaa.” PhD diss., University of Texas at Austin.
Floyd, Simeon, Giovanni Rossi, and Nick J. Enfield. 2018. “Introduction.” In Recruitments: A Typological Comparison of Pragmatic Agency. Berlin: Language Sciences Press. Manuscript in preparation.
Franchetto, Bruna. 2006. “Ethnography in Language Documentation.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus P. Himmelmann, and Ulrike Mosel, 183–212. Berlin and Boston: Mouton de Gruyter.
Gippert, Jost, Nikolaus P. Himmelmann, and Ulrike Mosel. 2006. Essentials of Language Documentation. Berlin and Boston: Mouton de Gruyter.
Goffman, Erving. 1981. Forms of Talk. Philadelphia: University of Pennsylvania Press.
Gumperz, John J. 1968. “The Speech Community.” In International Encyclopedia of the Social Sciences, 381–386. Macmillan.
Hale, Ken, Michael Krauss, Lucille J. Watahomigie, Akira Y. Yamamoto, Colette Craig, LaVerne Masayesva Jeanne, and Nora C. England. 1992. “Endangered Languages.” Language 68(1): 1–42. doi:10.2307/416368.
Harrison, K. 2005. “Ethnographically Informed Language Documentation.” Language Documentation and Description, vol. 3, edited by Peter K. Austin, 22–41. London: School of Oriental and African Studies.
Haviland, John B. 2004. “Mayan Master Speakers—the Archive of the Indigenous Languages of Chiapas.” Collegium Antropologicum 28(Suppl. 1): 229–239.
Hechter, Michael and Karl-Dieter Opp. 2001. Social Norms. New York, NY: Russell Sage Foundation.
Heritage, John. 2001. “Goffman, Garfinkel and Conversation Analysis.” In Discourse Theory and Practice: A Reader, edited by M. Wetherell, S. Taylor, and S. J. Yates, 47–56. London: Sage Publications.
Hester, James J. 1968. “Pioneer Methods in Salvage Anthropology.” Anthropological Quarterly 41(3): 132–146. doi:10.2307/3316788.
Hill, Jane H. 2006. “The Ethnography of Language and Language Documentation.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus P. Himmelmann, and Ulrike Mosel, 113–128. Berlin and Boston: Mouton de Gruyter.
Hymes, Dell H. 1962. “The Ethnography of Speaking.” In Anthropology and Human Behavior, edited by Thomas Gladwin and William C. Sturtevant, 13–53. Washington, DC: Anthropological Society of Washington.
Hymes, Dell H. 1964. “Introduction: Toward Ethnographies of Communication.” American Anthropologist (New Series) 66(6): 1–34.
Hymes, Dell H. 1972. “Models of the Interaction of Language and Social Life.” In Directions in Sociolinguistics: The Ethnography of Communication, edited by John Joseph Gumperz and Dell H. Hymes, 35–71. New York: Holt, Rinehart and Winston.
Hymes, Dell H. 1974. Foundations in Sociolinguistics: An Ethnographic Approach. Philadelphia: University of Pennsylvania Press.
Hymes, Dell H. 1987. “Tonkawa Poetics: John Rush Buffalo’s ‘Coyote and Eagle’s Daughter.’” In Native American Discourse: Poetics and Rhetoric, edited by Joel Sherzer and Anthony C. Woodbury, 33–88. Cambridge Cambridgeshire and New York: Cambridge University Press.
Hymes, Dell H. 1989. “Ways of Speaking.” In Explorations in the Ethnography of Speaking, 2nd ed., edited by Richard Bauman and Joel Sherzer, 443–451. Cambridge: Cambridge University Press.
Instituto Nacional de Estadística y Censos, Ecuador. 2010. “Censo Nacional de Población Y Vivienda.” http://www.ecuadorencifras.gob.ec/.
Jakobson, Roman. 1960. “Linguistics and Poetics.” In Style in Language, edited by Thomas A. Sebeok, 350–377. Cambridge, MA: MIT Press.
Jijón y Caamaño, Jacinto. 1914. Los Aborígenes de La Provincia de Imbabura. Los Cayapas En Imbabura. Madrid: Blass y Cía., Impresores.
Keating, Elizabeth. 2001. “The Ethnography of Communication.” In Handbook of Ethnography, edited by Paul Atkinson, Amanda Coffey, Sara Delamont, John Lofland & Lyn Lofland, 285–301. London: Sage Publications.
Kendon, Adam. 2004. Gesture: Visible Action as Utterance. Cambridge and New York: Cambridge University Press.
Kendrick, Kobin H. and Paul Drew. 2016. “Recruitment: Offers, Requests, and the Organization of Assistance in Interaction.” Research on Language and Social Interaction 49(1): 1–19. doi:10.1080/08351813.2016.1126436.
Levinson, Stephen C. 1988. “Putting Linguistics on a Proper Footing: Explorations in Goffman’s Participation Framework.” In Goffman: Exploring the Interaction Order, edited by Paul Drew and Antony Wootton, 161–227. Oxford: Polity Press.
Lewis, M. Paul, Gary F. Simons, and Charles D. Fennig, eds. 2016. Ethnologue: Languages of the World. 19th ed. Dallas, TX: SIL International. Online: http://www.ethnologue.com.
Lindström, Anna and Marja-Leena Sorjonen. 2012. “Affiliation in Conversation.” In The Handbook of Conversation Analysis, edited by Jack Sidnell and Tanya Stivers, 250–369. New York: John Wiley & Sons, Ltd.
McNeill, David. 1992. Hand and Mind. Chicago: University of Chicago Press.
McNeill, David. 2005. Gesture and Thought. Chicago: University of Chicago Press.
Mesch, Johanna. 2002. Tactile Sign Language. 1st ed. Hamburg: Gallaudet University Press.
Meyer, Julien. 2015. “The Diversity and Landscape Ecology of Whistled Languages.” In Whistled Languages: a Worldwide Inquiry on Human Whistled Speech, 29–50. New York: Springer.
Mosel, Ulrike. 2006. “Fieldwork and Community Language Work.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus P. Himmelmann, and Ulrike Mosel. Berlin and Boston: Mouton de Gruyter.
Nordhoff, Sebastian and Harald Hammarström. 2011. “Glottolog/Langdoc: Defining Dialects, Languages, and Language Families as Collections of Resources.” In Proceedings of the First International Workshop on Linked Science 2011, Bonn, Germany.
Ochs, Elinor, Emanuel A. Schegloff, and Sandra A. Thompson, eds. 1996. Interaction and Grammar (Studies in Interactional Sociolinguistics 13). Cambridge: Cambridge University Press.
Perley, Bernard C. 2012. “Zombie Linguistics: Experts, Endangered Languages and the Curse of Undead Voices.” Anthropological Forum 22(2): 133–149. doi:10.1080/00664677.2012.694170.
Preston, Dennis R. 2009. “Fifty Some-Odd Categories of Language Variation.” International Journal of the Sociology of Language 1986(57): 9–48. doi:10.1515/ijsl.1986.57.9.
Sammons, Kay and Joel Sherzer. 2000. Translating Native Latin American Verbal Art: Ethnopoetics and Ethnography of Speaking. Washington, DC: Smithsonian Institution Press.
Saville-Troike, Muriel. 2008. The Ethnography of Communication: An Introduction. New York: John Wiley & Sons.
Schegloff, Emanuel A. 2007. Sequence Organization in Interaction: Volume 1: A Primer in Conversation Analysis. Cambridge: Cambridge University Press.
Seifart, Frank. 2011. “Chapter 2. Competing Motivations for Documenting Endangered Languages.” In Documenting Endangered Languages: Achievements and Perspectives. Berlin and Boston: Mouton de Gruyter.
Selting, Margret and Elizabeth Couper-Kuhlen, eds. 2001. Studies in Interactional Linguistics (Studies in Discourse and Grammar 10). Amsterdam: John Benjamins.
Senft, Gunter. 2010. The Trobriand Islanders’ Ways of Speaking. 1st ed. Berlin and New York: Mouton de Gruyter.
Sherzer, Joel. 1987. “A Discourse-Centered Approach to Language and Culture.” American Anthropologist 89(2): 295–309.
Sherzer, Joel. 2002. Speech Play and Verbal Art. 1st ed. Austin: University of Texas Press.
Sicoli, Mark A. 2016. “Repair Organization in Chinantec Whistled Speech.” Language 92(2): 411–432.
Sidnell, Jack. 2010. Conversation Analysis: An Introduction. Chichester, UK: Wiley-Blackwell Publishing.
Steensig, Jakob. 2012. “Conversation Analysis and Affiliation and Alignment.” In The Encyclopedia of Applied Linguistics. London: Blackwell Publishing.
Stern, Theodore. 1957. “Drum and Whistle ‘Languages’: An Analysis of Speech Surrogates.” American Anthropologist 59(3): 487–506.
Tedlock, Dennis. 1983. The Spoken Word and the Work of Interpretation. Philadelphia: University of Pennsylvania Press.
Urban, Greg. 2000. A Discourse-Centered Approach to Culture. 2nd ed. Tucson, AZ: Hats Off Books.
Volkswagen Foundation. 2011. “Documentation of Endangered Languages – Dokumentation Bedrohter Sprachen (DobeS): Information for Applicants 67.”
Yamada, Racquel-Maria. 2007. “Collaborative Linguistic Fieldwork: Practical Application of the Empowerment Model.” Language Documentation & Conservation 1(2): 257–282.

Chapter 17

Language Documentation in Diaspora Communities

Daniel Kaufman and Ross Perlin

1. Introduction

Fieldwork with immigrant communities in urban centers has played an important historical role in linguistics despite scarce mention of this practice in the growing literature on language description and fieldwork. Bowern and Warner (2015, 63), in a rare exception, explicitly identify diaspora fieldwork as a distinct scenario among seven different possible relations between linguists and a language community:

Linguist works with a diaspora community. The language is spoken in an area of conflict or severe poverty where direct fieldwork would be irresponsible or impossible. The linguist works with members of a diaspora or refugee community in a local town, with work conducted at the university and a local community center. The linguist works mostly on theoretical work for articles or a dissertation, but provides advice to community members about educational materials for the language, and also college preparation advice for community children seeking to further their education.

While this accurately describes common activities of individual researchers in diaspora settings, we discuss here ways in which the range of activities can be expanded through a formal organization.1 We focus on concrete examples of collaborative work and what we believe to be the future potential of urban fieldwork, drawing in particular on the experiences of the Endangered Language Alliance, a non-profit organization based in New York City with which both authors are affiliated.2 We examine this topic first from the perspective of descriptive and documentary linguistics and finally assess the prospects for language maintenance and revitalization in diaspora. While work in diaspora communities can by no means replace traditional fieldwork, we argue that it has significant advantages of its own in terms of access, visibility, and particular kinds of collaborations that may only be possible in an urban center.

1 We use this term because we would like to include here organizations such as the Multilingual Manchester project and the (now defunct) Jakarta Field Station of the Max Planck Institute, which, while not narrowly focused on endangered languages, overlap to an extent in scope and potential.

2. Urbanization and the rise of hyperdiverse cities

The traditional notion of “fieldwork” bifurcates the world into a natural environment, i.e., the rural areas in which fieldwork is typically carried out, and various types of ostensibly less natural environments, i.e., the labs, offices, libraries, and other centers where academic work is carried out. Traveling to “the field” has been the dominant paradigm in descriptive linguistics for well over a century, and previous to these modern fieldwork-based studies, missionary linguists from Europe and the Americas had already been involved in a type of long-term fieldwork for centuries (see Chelliah and De Reuse 2011, chap. 3, for a good overview). The bifurcation between field and academic center in many ways continues the divide between colony and metropole and suffers from some of the same imbalances, especially with regard to the research agenda, the background of the researchers, and attribution of credit. While such imbalances seem destined to remain as long as the economic and social conditions that underlie them exist, the divide is now becoming less stark in both directions. On one hand, greater sensitivity to the traditional fieldwork power dynamic has resulted in efforts to bring training in language documentation and linguistics to indigenous peoples and developing countries (Jukes 2011). On the other hand, linguists are now able to do much more of their analysis and writing in the field, due to advances in computing power, storage, and portability, which have rendered the academic center, as a physical entity, far less central. Creating a “field station” for linguistic research anywhere in the world is also far more feasible now and this too offers more research opportunities to populations that have thus far only been the subject of research.
A staffed, bricks-and-mortar center can enable a local community to take a more active role independent of any individual researcher(s), with a place to work, access to equipment, an opportunity for training, and a chance to get paid. Both technology and technology transfer, combined with new ideologies of collaborative fieldwork, empowerment, and community engagement or even control,3 have thus blurred the line somewhat between the historical antecedent of field and metropole.4

2 For reasons of space, we must leave to future work a detailed discussion of the ethical issues specific to fieldwork in diaspora settings. Overall, the issues which have been discussed for traditional fieldwork collaborations (e.g., Rice 2011, 2012) apply to diaspora work but, as would be expected, many of the inter- and intra-community tensions commonly found on the village level do not exist on the same scale in urban diaspora communities. Nonetheless, certain other issues, such as those relating to immigration or refugee status, are uniquely relevant to diaspora contexts.

It is crucial both for reasons of social justice and scientific progress that linguists strengthen efforts to train community members in the documentation of their own languages. But in the attempt to redress the imbalances of the traditional research model, a significant set of collaborative opportunities has gone largely ignored. Living in any major metropolitan center in the world today, linguists are already in some sense in “the field.” Many of the same communities that linguists travel to from academic centers are already represented in centers of immigration. As of the last two decades, most metropolitan areas are home to a considerable number of threatened languages, many of which are underdescribed and even some which are undocumented.

The trend toward urbanization is only increasing over time. According to UN statistics, more than half of the world’s population now live in urban areas, up from just 30% in 1950 and set to increase to a full two-thirds of humanity by 2050. While the world’s rural population has stopped expanding, urban centers will add 1.5 billion residents over the next fifteen years and 3 billion by 2050.5 The large-scale population movements have created cities of unparalleled diversity, but the factors behind this demographic shift are precisely those that give rise to ever-increasing rates of language death and endangerment. In particular, environmental, political, and social forces conspire to make traditional rural livelihoods untenable. In many cases, there have been clear political culprits.
Henderson (2015, 241), discussing civil and interstate conflict in some of the most multilingual parts of Africa, argues that “language loss due to displacement has been grossly underestimated.” In another very different case, the North American Free Trade Agreement radically lowered income for already impoverished farmers in Mexico, including the most linguistically diverse states of Oaxaca and Guerrero.6 In the same region, the modernization of farming practices, imposed upon traditional farmers during the “green revolution” beginning in the mid-twentieth century, has created large agricultural dead zones (Sonnenfeld 1992). More generally, deforestation, industrial pollution, and land grabs throughout the world have robbed indigenous people of their self-sufficiency. Compounding the depletion of resources is an ever-increasing need for cash to cover the costs of mandatory education and other expenses of modern citizenhood. The difficulty of continuing traditional lifeways in the home territory has led to the forcible integration of these communities into the cash economy, leading in turn to the skyrocketing rates of urbanization cited above.

3 See Czaykowska-Higgins (2009), Dwyer (2006), and Wilkins (1992) for discussion.
4 This is in addition to other changes. See Grenoble (2010) for a summary.
5 See http://www.unfpa.org/urbanization#sthash.HD9AQ4Lr.dpuf.
6 Canby (2010) cites estimates in which almost half of the 500,000 undocumented Mexican immigrants who were crossing the border annually in the early 2000s were comprised of dispossessed farmers.

The linguistic and cultural gain of the metropolis has thus come at a great loss to indigenous communities all over the world. While diaspora communities can help keep their home communities afloat economically through remittances, the continual draining of fluent speakers and participants in cultural activities cannot be so easily remediated. Moreover, the draining of fluent speakers is just one way in which linguistic communities are negatively affected by migration. Perez-Baez (2009) investigates a transnational Zapotec community of Oaxaca with possibly up to half of its population in Los Angeles and concludes that even in cases where large-scale emigration does not deplete the pool of fluent speakers, returning immigrants can introduce the politically dominant language into domains that previously belonged exclusively to the local language. In such cases, it is precisely because of the strong connections between an in situ indigenous community and its counterpart diaspora community that language shift is accelerating. While the conditions are clearly complex, we may be able to speak of a “language drain” on par with the mass outmigration of skilled labor referred to as “brain drain.” We would also like to know whether there exists a countervailing effect, on par with what has been termed “brain gain” (Kapur 2010), a net benefit in terms of “human capital” that arguably accrues to countries that export skilled workers when they eventually send home not just monetary remittances but also cultural, business, and technical know-how. As discussed further below, one goal which urban language organizations appear well positioned to achieve is the collaborative creation of digital language material “for export” to audiences back home. Inasmuch as this can be further developed, urbanization and emigration need not be a completely negative experience for small language communities.
It is especially noteworthy that urbanization and emigration have hit some of the world’s most linguistically dense and delicate areas the hardest. A case in point, mentioned above, are the states of Guerrero and Oaxaca in Mexico, home to a set of highly diverse languages belonging to the Otomanguean, Mayan, Uto-Aztecan, Mixe-Zoquean, and Tequistlatecan families, in addition to Huave, a language isolate. Some local varieties of these languages now have more speakers in the diaspora than in their traditional territories due to the pressures discussed above. In this case, at least, a recognized “language hotspot” (Anderson 2011) now seems to be on the move.

3. Re-conceiving linguistics and the city

There are a number of reasons why urban immigrant populations have largely been ignored by linguists. Most importantly, linguists set out to document a language in the broadest range of contexts possible (Himmelmann 1998 inter alia); in the diaspora, these contexts are often radically narrowed. For instance, immigrant communities may lack appropriate contexts for ceremonial language use and may have less dialect diversity than in their places of origin. Many traditional activities, including livelihood and subsistence practices, may not be carried out in the diaspora. Furthermore, it is difficult to reconstruct the definitions of technical and taxonomical terms without their referents close at hand.7 Clearly, a linguist setting out to describe a language wants to be immersed in that language and culture to whatever extent is possible. Henderson (2015) refers to the “intentionally comical contrast” made by legendary field linguist Terry Crowley between “armchair” and “dirty feet” linguists, and an undesirable “kind of halfway house” between the two types. “Halfway house” linguists, wrote Crowley, “may have travelled no further than the outer suburbs of San Francisco or Manchester. . . . At most, this kind of fieldwork is useful if you are only interested in studying a particular feature of a language without intending to produce a coherent overall account” (Crowley 2007, 13).

These serious concerns account for the continued dependency of language documentation and description on traditional fieldwork. But there are also less justifiable reasons for ignoring diaspora communities. First, there is uncertainty as to how to go about locating urban populations, especially communities that are largely living under the radar. Second, there is a tacit distrust in the abilities of speakers living abroad for an extended period of time. Third, there is what Errington (2003) calls “localist rhetoric” in the language endangerment discourse such that indigenous languages are conceived of as inseparable from a traditional territory.
As an example, Errington cites Maffi (1999, 40), who refers explicitly to ex situ language documentation:

There is a very close parallel between [ex situ] language preservation and ex situ conservation in biology: while both serve an important function, in both cases the ecological context is ignored. Just as seed banks cannot preserve a plant’s biological ecology, ex situ linguistic documentation cannot preserve a language’s linguistic ecology.

7 Lahe-Deklin and Si (2014) discuss a successful ethnobiological study done ex situ at the Australian National University, countering the perhaps premature assumption made by Kaufman (2009) that environmental knowledge is impossible to collect in any detail outside the area under study. A lexicon can also develop independently, sometimes very quickly, in a diaspora context. Young Pohnpeians in Hawai’i, for example, are creating new vocabulary items not used by speakers on Pohnpei, who reject these diaspora words (when they learn their origin) for not being “real” Pohnpeian (Kenneth Rehg, personal communication, Jan. 2017).

The more static view is especially difficult to defend in the face of language communities with large diasporas and even more so for languages that have no easily demarcated territory (e.g., Yiddish and Roma). Several languages are also more widely spoken outside their place of origin, e.g., Vlashki, Yiddish, Juhuri, Lo-ke (Mustang). There are furthermore vast human resources in cities that can help advance documentation and revitalization efforts. This includes the presence of linguists and other academics as well as those with knowledge of film, audio, computer science, and publishing technology. The opportunities for developing long-term, equitable, working relationships with individual speakers and communities can in fact be better in this environment than in traditional fieldwork scenarios, where social and economic disparities can constitute formidable barriers. Note, however, that collaboration with urban populations, rather than precluding traditional fieldwork, has generally served as a gateway to in situ fieldwork. In our experience, urban populations have served as a link to their home communities and have been able to prepare students well for traditional fieldwork.

Finally, it is necessary to emphasize the obvious point that diaspora contexts are equally worthy of study in their own terms. Specifically, the ways in which a language is adapted (or not adapted) to new domains differs across communities and can shed light on the role of language ideology and other factors in language maintenance. There are also koine varieties that are emerging or expanding in large cities through dialect mixture (Thomason 2015, 23–24). One such example is Tibetan ramaluk (“neither goat nor sheep” speech), which had its beginnings in Nepal and India but which seems to have gained a life of its own in cities like New York and Toronto (Ghoso 2007). More generally, while multilingualism and language maintenance in urban settings have been studied extensively for larger languages (see Garcia and Fishman 2002 for a New York example), very little information exists for smaller language communities.

Taken together, we believe these points make a persuasive argument for the creation of urban centers for language documentation, description, and even revitalization, too. Note that linguists have been working intensively with speakers in ex situ contexts since the beginning of modern descriptive linguistics.
Bloomfield’s monumental Tagalog grammar and text collection (Bloomfield 1917) was written not in the Philippines but in Illinois through the help of a single speaker of the language, Alfredo Viola Santiago. What is still lacking, however, is a more systematic and long-term approach that involves building networks, not only with individuals but with community institutions. In the following, we discuss our experiences in this regard over the last several years at the Endangered Language Alliance (ELA) in New York and Toronto.

4. The experiences of an urban language organization

4.1. History

The Endangered Language Alliance was founded in 2010 as a non-profit organization with a mission to promote language documentation through collaboration with local immigrant communities and to educate the public about the causes and consequences of language death. At its inception, the organization’s modest goal was to bring together linguistics students with speakers of endangered languages for long-term collaborations (Kaufman 2009).8 As the network expanded through prominent articles in the press (Roberts 2010 inter alia), the activities expanded accordingly to include mapping language communities, initiating student-led documentation projects, and hosting classes in several indigenous and threatened languages, beginning with Nahuatl and ultimately extending to Breton, K’iche’, Kichwa, Quechua, and Hawaiian.

An important initial activity consisted of basic language surveys on the street with the aim of better understanding the range of minority languages present within well-recognized West African, Mexican, Nepali, and other communities across the city with high linguistic diversity. The US census is virtually silent on languages without national status because of problems inherent in the survey methods. It is impossible to offer a comprehensive list of languages in the paper census form but, more importantly, local languages are often deemed by their speakers to be irrelevant to the purposes of the census. Our surveys involved canvassing with a clipboard as well as distributing fliers with a telephone number to an answering service. The fliers offered short-term work for those who spoke relevant languages and were interested in participating. (The notion of “relevant language” was usually expressed on recruitment materials using either one or both of the terms indigenous and endangered.) The answering services were set up in four major lingua francas (English, Spanish, French, and Russian). This turned out to be an effective method, which led to several long-term collaborations. Weekly meetings with participants involved traditional descriptive activities with a documentary focus on recording narratives, stories, and other oral texts of value. In many cases, these encounters created the only high-quality online media for language communities that lack technological resources.
Recorded narratives and dialogues have been disseminated largely through the organization’s YouTube channel, as it accommodates time-aligned transcripts and is the most popular means of reaching a wide audience, with the ultimate goal of having all material properly archived in addition to being available on popular platforms.9 At the same time, survey activi ties brought volunteers and students in touch with many neglected and marginalized populations of the city and thus heightened their awareness not only of the linguistic di versity that we sought to document but also of the exceedingly difficult social conditions in which this diversity exists. One of the more interesting impacts of the revolution in digital and social media is the elevation of primary sources, which allows for the curation of original data but emphasizes maintaining transparent access to the original voices. Language attitudes surveys, for instance, were not only conducted by linguists but interpreted by them as

8 One inspiration for the model was the student-led Language Documentation Training Center at the University of Hawaiʻi at Mānoa, where students belonging to various departments were trained to make short descriptions of their own languages together with sample recordings.

9 Holton (2011) and Moriarty (2011) discuss some of the new domains for endangered languages introduced by recent technology. However, the role of video sharing, while lying at the heart of the Endangered Languages Project (www.endangeredlanguages.com), has yet to be subject to systematic investigation, as far as we are aware.

406 Daniel Kaufman and Ross Perlin well, and there was typically little access to the original interviews. One of our new tasks is to create platforms on which communities affected by language loss can speak out and be heard. A large portion of ELA’s work since its inception has been to facilitate the making of these videos, including their transcription, translation, publication, and circulation.

At present, the organization has become a hub for any type of activity around endangered languages and language documentation in New York City. The various elements, activities, and relations of the organization are detailed below with the hope that similar initiatives can benefit from the model in other cities around the world.

4.2. An ecosystem for urban language organizations

An urban organization focused on language documentation can bring together a variety of constituencies and actors invested in and concerned about linguistic diversity. Figure 17.1 below illustrates the various constituencies that make up this ecosystem and the kinds of collaboration that have taken place. We discuss each in turn.

Figure 17.1. An ecosystem for urban language organizations.


4.2.1. Threatened linguistic communities

The constituency which the organization seeks to serve first and foremost is made up of the relevant language communities. In our experience, most community organizations with a threatened heritage language are interested in documentation and revitalization initiatives but do not have the resources to take action alone on this front. Few of our collaborators who have made powerful statements about language preservation would have done so without facilitation from a third party. These are people who do not consider themselves activists but whose experiences and opinions regarding language endangerment and conservation are compelling. In some cases, they feel (or come to feel) a strong ideological motivation to work with linguists on further documenting their language; in other cases, it is simply something they enjoy doing from time to time.

Working with such organizations is ideal, as it widens the scope of both the input and the impact, even for a short-term project. In most cases, however, languages are represented by scattered individuals without a community organization. Making contact with such individuals can be facilitated by organizations that help settle refugees. In the case of New York City, sizable refugee populations have arrived from the Middle East, Sudan, and Myanmar, among other areas, over the last decade. While working with newly arrived refugees may prove particularly challenging, it can provide them with a small source of income (provided there is funding) and a valuable cultural exchange. Among refugee groups, Sudanese minorities and their languages are in a particularly precarious situation due to the protracted conflict in South Sudan, the Nuba hills, and Darfur. Sudanese languages are also especially diverse, endangered, and lacking in documentation.
Other linguistic initiatives with Sudanese refugees that we are aware of include one initiated by researchers at the University of Melbourne (Musgrave and Hajek 2015) and the Moro Language Project based at the University of California, San Diego (UCSD).10 Smaller cities now taking on disproportionately large numbers of refugees—such as Boise, Idaho; Charlottesville, Virginia; or Utica, New York—may be in just as good a position as large cities when it comes to direct work with refugees.

There is tremendous variation in terms of how formally or cohesively diaspora communities do or do not organize themselves. Religious institutions often form on an at least partially ethnolinguistic basis. Italian social clubs and Chinese benevolent associations based on specific localities are widespread, and Himalayan groups tend to have one organization per ethnolinguistic grouping, but indigenous Mexicans tend not to be organized by ethnolinguistic group, at least in New York City, although there may be loose village associations. On the other hand, smaller cities in the United States have attracted disproportionately large numbers of immigrants from particular Mexican language groups, based on a chain migration pattern. One such example can be found in Albany, New York, roughly a three-hour drive from New York City, which hosts a large population of Triqui speakers from the Mexican state of Oaxaca.11

10 http://moro.ucsd.edu/.

11 Working with this community, linguist George Aaron Broadwell has led the production of a dictionary (Albany Triqui Working Group, 2014) as well as other publications (Broadwell et al. 2009; Vidal-Lopez 2012). See the papers in Fox and Rivera-Salgado (2004) for more examples of indigenous Mexican communities that have been transplanted to other parts of the United States.

In some cases, a community-wide language documentation project can also be effectively led by a sufficiently motivated individual. An ELA project entitled Voices of the Himalayas focuses on documenting Tibeto-Burman (especially Tibetic) language varieties as spoken in New York City. On the initiative of Nawang Tsering Gurung, originally from Mustang, Nepal but now living in the Himalayan community in Queens, the project members have been recording oral histories in the style of short, popular online documentaries, including contextual footage taken in neighborhoods, homes, and community centers. Though Nawang is the founder of a community-focused non-profit and has worked for a Tibetan social service organization, it is really his personal role as a connector that has enabled interviews and ensured the popularity of the resulting videos. As in traditional fieldwork situations, network effects and community entry points are crucial, with one contact leading to another. In cities it also seems more likely for such connections to happen across languages because of the formation of “super-communities” like Himalayan Queens (or post-Soviet, “Russian-speaking” Brooklyn).12

Another collaborative activity with which ELA has experimented is internet radio. A space inside the office has been converted into a small studio for broadcasting in indigenous languages of the Americas, including Garifuna, Totonac, K’iche’, and others, as well as discussion of indigenous issues through the medium of Spanish. These broadcasts connect the homeland and the diaspora, representing marginalized language groups with low-cost, high-quality media. The benefits of recording this type of material for documentation purposes are evident. Through facilitating internet radio, an urban language organization can help strengthen bonds between the diaspora and homeland communities while collecting valuable conversational recordings.

The urban language organization model may work best with languages that have sizable communities of speakers in diaspora. If a critical mass of speakers is required for keeping a language vital in its homeland, this is even more the case in diaspora. Generally, however, the language communities ELA is involved with contain tens of thousands of speakers but are losing the battle of intergenerational transmission. In terms of our own prioritization, a linguistic minority voicing collective alarm regarding language shift holds just as much weight as vitality statistics reported by the standard sources. In many cases, statistics that could appear authoritative are in fact outdated estimates. In other cases, the standards of evaluation are applied unevenly.

4.2.2. Academic departments

An independent language organization can fruitfully complement the work of local linguistics departments. Though there are a few academic centers with a specialization in language documentation, in some cases researchers are interested individuals who find their “community of practice” at periodic conferences and workshops, rather than in the

12 The downside of this, from the point of view of language loss, is that cities like New York are sites of assimilation not just to English but to languages like Nepali, Tibetan, Russian, Spanish, and others.

city where they live. ELA has been able to serve as an open-door hub for language documentation and description work that can continue outside school terms and specific classes.

The organizing of field methods classes has been a particularly successful example of complementarity. Larger linguistics departments have a perennial need for native speakers of lesser-known languages to serve as consultants, so that students can get an initial sense of the “field” in the controlled environment of the classroom. Over the last seven years, ELA has attempted to bridge the gap between field methods classes and local language communities in two ways: (i) by connecting speakers of endangered and under-documented languages with field methods classes in surrounding universities (CUNY, NYU, and Columbia), and (ii) by providing space, knowledge, and funding for the documentation work started in field methods classes to continue and for the results to be disseminated publicly. In best-case scenarios, a field methods class, instead of being a self-contained, solely student-focused experience, can jump-start a longer-term project and set of relationships. As is well known, the attempt to combine the goals of education and documentation is not without potential pitfalls. Student goals (e.g., to work out basic aspects of the language that are already well known) may conflict with documentation goals, making unreasonable demands on speakers and communities eager for more professional help. Ensuring continuity and coordination between documentation work done in the class, at the organization, and in the traditional “field” can also be a significant challenge.

More broadly, a language organization can provide a kind of “second home” for undergraduate and graduate linguistics students (among others) with a particularly strong interest in documentation, revitalization, and community work.
At ELA, such students have formed the bedrock of our volunteer corps for the last seven years; for some who have not been able to enroll in a field methods class at their home institution for whatever reason, ELA has provided a kind of equivalent. For areal and language family specialists, too, ELA has served as a space for collaborating with other researchers, finding speakers and volunteers, and sharing work.

4.2.3. Municipal departments

As a result of the limitations of the census, municipal departments are largely in the dark when it comes to populations that do not speak the official languages of their country. In Manchester, England, the Multilingual Manchester project led by Yaron Matras has worked to map out which languages are spoken in the city and has conducted various types of surveys on the diversity and vitality of these languages within the city.13 With the most accurate information on the linguistic needs of Manchester’s residents, the project has become instrumental to the city’s efforts at providing multilingual services.

13 The varied activities of Multilingual Manchester are documented in excellent detail on: http://mlm.humanities.manchester.ac.uk/.

The project has also taken up a positive role as an institutional advocate for promoting multilingualism, heritage language maintenance, and expanded language services.

Likewise in New York, ELA has become a de facto provider of interpreters for Indigenous languages of Mexico and Guatemala to the local courts. One civil servant, in charge of finding interpreters for the Queens courts, stated that demand for interpreters of less common languages is rising sharply and encouraged ELA to formalize its role as a language translation agency in making referrals and connections. Municipal departments are either required, or prefer, to deal with institutions rather than individuals, both for references and sometimes for procedures like payment.

The difficult translation and orthography issues that open up beyond standardized, written, amply documented languages are largely invisible to city agencies. For example, a Department of Education specialist looking to reach Mixteco parents in East Harlem was astonished to learn that there are dozens of distinctive, mutually unintelligible Mixteco varieties, most of whose local speakers have never seen their language written, and thus one cannot straightforwardly plan to translate a document “into Mixteco.” In another instance, Department of Health specialists, themselves Mexican-American, requested that ELA personnel come to the department to present a briefing on indigenous Mexican languages; they had realized that Spanish materials were inadequate for communicating with one of the city’s most marginalized populations.14 The “long tail” of less common languages is invisible to citizens, policymakers, and those involved in delivering services. As city officials become increasingly aware of and hopefully sensitized to the depth of the new linguistic diversity, they are likely to turn to urban language organizations, where they exist, for answers.15

4.2.4. Filmmakers and film students

The rise of the field of language documentation in the last two decades has increasingly privileged the use of video. Several factors are at play in the increasing salience of video, including but not limited to community desires for richer, engaging media with visuals; the greater shareability of video online and via mobile phone; and the increasing importance accorded to the study of gesture and context.

The demands in creating high-quality language documentation are more than a single individual can live up to. Linguistic recordings aimed at the public or at a language community, as opposed to just specialists, are in competition with a glut of free, highly engaging video content on popular platforms. While linguists and communities cannot compete with Hollywood in terms of production value, we must recognize and adapt to higher production standards where possible, or risk being drowned out. In addition to linguistic analysis, we must also be proficient in the technical aspects of audio recording, video recording, editing, not to mention interviewing, database creation, and other

14 See http://www.nytimes.com/2014/07/11/nyregion/immigrants-who-speak-indigenous-mexican-languages-encounter-isolation.html.

15 Besides the courts, the Department of Education, and the Department of Health, ELA has collaborated with the Queens Public Library system and the Queens Museum.

skills. Regardless of how much time a linguist puts into video production, specialists will be able to produce better film.

ELA has thus made a point of collaborating with filmmakers, videographers, and film students wherever possible. These collaborations have yielded short, simple videos of higher quality as well as a full feature documentary (Language Matters, which aired on public television). The basic arrangement, which should be formalized, is usually that ELA can use and archive the raw footage while the filmmaker or film student, as desired, creates his or her own project. The approach is not without its challenges. In one representative case, a team of filmmakers, understandably focused on the visual and technical aspects of a shoot, repeatedly interrupted speakers. In other cases, the filmmakers and film students have not ultimately felt comfortable sharing all their footage, nor have they been professional about doing it; their involvement was contingent and superficial. As for working with experienced professionals, the top-down, visual-focused, multi-take method of filmmaking, centered on paid professional actors, is problematic for the purposes of language documentation. Likewise, linguists have to find a happy medium between working with sophisticated, quality equipment and not intimidating speakers with lights, cameras, and microphones.

4.2.5. Educational outreach

Where a university-based linguistics program might be assumed (rightly or wrongly) to be a self-contained academic unit built for interfacing with other such units, a language non-profit—located in a city, with an online presence—is almost by default assumed to be public-facing and practice-oriented. ELA is thus regularly contacted by educators, curators, and others interested in having an educational program around endangered languages or the languages of New York City. Despite the almost complete lack of linguistics education at the primary or secondary level, frequently lamented, ELA’s experience suggests that there is real interest from educators if they see a local organization doing language work that is at least partially intended for a non-specialist audience. Responding to these requests, ELA has made presentations at local middle schools, high schools, and colleges; created language record-a-thons at fairs; and launched a few experimental “language tours” of city neighborhoods. More traditional public events are also a mainstay: readings, performances, and lectures aimed at a general audience. As mentioned earlier, ELA also regularly hosts community language classes in less commonly taught languages (most recently Quechua and Hawaiian), typically attracting a mix of semi-speakers, heritage speakers, and members of the general public.

Another form of education, particularly for those with a deeper interest in language or a plan to study linguistics, can come through volunteering, another prerogative of a non-profit that a university is not typically in a position to support. The degree of volunteer interest in ELA has been consistently strong, and occasionally volunteers are themselves younger members of endangered-language speech communities. This presents the opportunity to include younger speakers or semi-speakers in the documentation process.
One such case was that of a speaker of Juhuri, an Iranic language of Azerbaijan now spoken mostly in New York and Israel by a small Jewish minority

(Authier 2012; Borjian and Kaufman 2015). While the volunteer was beginning her undergraduate degree in linguistics at a local university, she had no means to connect her studies with her heritage language. At ELA, she was able to offer invaluable assistance in the translation and analysis of an older speaker’s recordings that we had made previously in her own community in Brooklyn. The translation process furthermore required her to engage her parents in Juhuri to fill the gaps and thus facilitated an intergenerational connection around meaningful language work. It is worth emphasizing the great utility of any such collaboration in which new documentation can be created while simultaneously improving a young person’s control of the language. This can be seen, in embryo, as a digital, asynchronous version of the master-apprentice approach to revitalization (Hinton 1997). Unlike in the actual master-apprentice approach, the “digital apprentice” cannot start from scratch; the method is best suited for younger speakers or semi-speakers who want to improve their language skills via a thorough analysis of the speech of a more fluent speaker. Though difficult to implement on a wider scale, this kind of participation can have an impact on individuals and how they see their language.

Effectively harnessing (without exploiting) the skills of volunteers, while also teaching them new skills, is not an easy task. Regardless, volunteers form an important part of the loose “extended family” at ELA, an organizational shape shared by other small, grassroots-oriented non-profits.

Performances have constituted another type of educational outreach with positive side effects. Between 2013 and 2015, ELA produced an eight-part series of performances and presentations with each installment focused on the endangered languages of a different region or language family.
The speakers, depending on their interests and talents, read stories, sang songs, performed poems, or told riddles. Besides offering a unique introduction to languages and cultures that receive no public attention, the performances in many cases also had a considerable impact on the speakers themselves. A similar initiative entitled Treasure Language Storytelling is currently being developed by Steven Bird and colleagues, who are creating a manual for such public programs.

4.2.6. Artists, photographers, illustrators

A city is a dense concentrate of creative talent, allowing for a variety of approaches to an issue like language endangerment, which has attracted artists, composers, and writers to an unusual degree.16 Like journalists and filmmakers, they are seeking compelling material for their own creative and professional purposes, but done right this can enable new creative approaches to revitalization and publicizing language endangerment. Diaspora communities are points of intersection between “traditional” cultures of artistry and craftsmanship and the relatively more mass-market, globalized, and professionalized “culture industry.” The goal should be to make such intersections mutually beneficial and symbiotic, and not simply extractive and appropriative. In one good example of

16 For example, http://www.nytimes.com/2016/04/03/arts/music/vanishing-languages-reincarnated-as-music.html.

artistic collaboration, ELA came to work with photographer Yuri Marder, who created portraits of our collaborators supplemented by our descriptions and recordings. With the support of the National Endowment for the Arts (NEA), the work was exhibited in different forms at multiple locations and was able to introduce some of New York City’s endangered languages to a wider audience through a personal medium—individual speakers and their stories.

An exhibit at the Queens Museum featured the results of an ELA collaboration with several artists and authors. The initial impetus was the preparation of a language map of Queens, informed by ELA’s research, for an unusual atlas of New York City (Solnit and Jelly-Shapiro 2016). The map was in turn adapted by the artist Mariam Ghani for the creation of a large-scale mural in the main hall of the museum. In another case, ELA helped facilitate a musical collaboration between Breton and Garifuna musicians, including a concert and finally a CD. A willingness to collaborate on projects far outside the regular domain of linguistics has not only broadened our approach to language revitalization but has also created new and interesting material in the languages we aim to support.

4.2.7. Journalists

Cities, hyperdiverse “global cities” in particular, are media-saturated environments with dedicated corps of specialists devoted to crafting narratives, seeking out experts, and circulating information. An urban language organization can become a “go-to” resource for journalists researching stories on language endangerment generally, on specific language communities, and on the city itself and its linguistic landscapes. ELA receives an extraordinarily high volume of such requests from print, radio, and broadcast outlets of all kinds and responds as workloads allow to give more exposure to issues of language endangerment, linguistic diversity, and multilingualism. Perhaps the biggest and most unexpected boon is that speakers of other endangered and little-documented languages often get in touch after seeing media coverage, leading in turn to new partnerships.

Journalistic attention is not without its downsides, however. ELA’s seven-year experience suggests that reporters may repeatedly ask the same questions and employ frames that linguists and speech communities may find objectionable. Furthermore, journalists are often looking for a single, named central character for their story—an imagined Indiana Jones-style linguist, seen “saving languages”—and are unlikely to be willing to give adequate attention either to linguistic material or to the story of a whole community. While members of a big-city diaspora community are more likely to be familiar with journalistic practice, sensitive issues frequently come up and many journalists are unable or unwilling, following journalism ethics, to share pre-publication drafts for fact checking.

ELA’s experience suggests that the existence of a visible, responsive organization, especially in a high-profile media environment like New York City, can increase coverage of linguistic issues and of endangered-language issues in particular.
Journalists are looking not just for expert testimony but for entry points into stories (introductions, tips, events, facilitated situations, etc.) which their reporting depends on. Given how labor-intensive this process can be, researchers must manage these requests judiciously.

Making an extra effort around high-impact stories in major outlets is common sense, but special consideration should also be given to “ethnic media” that may have relevant speech community members among their readers.17 Advertising in such media may also be an effective way to reach endangered-language communities in the city, although we have not invested in this approach.

5. Greenhouse or graveyard?

Given the pace of language loss, there is reason to believe that peak urban linguistic diversity is occurring now. As intergenerational transmission weakens in small language communities around the world, newer immigrants will be less likely to speak an indigenous or minority language. Moreover, the rising cost of living in larger cities and an apparently growing prejudice against immigration seem likely to dim the prospects for urban linguistic diversity in places like New York and London, to take just two examples. In New York, the proportion of foreign-born residents (within the fast-growing total population) has remained stable at around 40% for the last fifteen years, after rising from around 20% in 1970, but the overall share of immigrants to America residing in New York City fell from 9% in 2000 to 7% in 2014. There is evidence that new arrivals are not staying as long in the city, but moving more quickly to cheaper areas, including suburbs.18

Peak urban diversity in New York City may be occurring now, but it may just be getting going in the emerging megacities of the developing world, which are closer to “hotspots” of language endangerment (Anderson 2011), and in the smaller, cheaper cities where large-scale immigration is just beginning. It is also possible that growing awareness of and interest in linguistic diversity will enable a more multilingual future for the long term, reversing a history of urban areas being “graveyards” for languages and turning them into “greenhouses.”

While cities may be highly effective places for the kinds of language initiatives outlined above, language maintenance in diaspora beyond two generations is rare without a continual influx of new speakers from the homeland. Likewise, the signal cases of language revitalization, such as Hebrew, Hawaiian, Welsh, and Wampanoag, have all to varying extents taken place on the terrain of a “homeland,” bound up with questions of sovereignty and autonomy.
Such is the power of English and the pull of assimilation that cases of multi-generation language maintenance in the United States have generally

17 See the useful website www.voicesofny.org for an English-language overview of New York’s massive and vitally important “ethnic media” world. Media is a language domain that in some cases flourishes more in diaspora—the first-ever Irish-language periodical was published in Brooklyn (An Gaodhal), while the center of Yiddish-language journalism has been in New York for over a century.

18 Adam Forman, Center for an Urban Future, personal communication, 2016. See also https://www.osc.state.ny.us/reports/immigration/NYC_Immigration_Rpt_8-2014.pdf.

been restricted to highly insular religious communities—whether the Amish in rural Pennsylvania or, more recently, Hasidic Jews in Brooklyn. Other “settler societies” composed primarily of immigrants (such as Canada or Australia) show a broadly similar pattern, and increasingly so do all contemporary nation-states, given their focus on national identity and language uniformity. Many of the ethnolinguistic groups now experiencing diaspora and global-scale migration are doing so for the first time. Arguably, self-conscious diaspora was a condition previously confined to a relatively small number of groups, which is now becoming nearly universal.19

Complicating the situation is what appears to be the shrinking gap between homelands and diasporas, primarily due to faster, cheaper travel and better communications technologies.20 For some groups, there is now the viable option to send children back to the homeland every summer; for others, even one trip remains formidably difficult. In a number of cases, ELA has been able to connect projects in New York City with fieldwork in the homeland by equipping a collaborator to make recordings during a trip home. The resulting recordings have been among the most important ELA has collected, reflecting the access and perspective of an insider (a member of the language community) paired with the technology and awareness of a diaspora situation.
With resources to pursue this strategy more rigorously, an urban language organization could help counteract “language drain” and anchor the “network of researchers and language speakers actively collaborating online” which Henderson (2015) aptly notes “should be a first step in any documentation endeavor for which the technology is available.” Community members who may have not been particularly motivated or empowered to document their own languages before emigrating can return, even for short trips, with the tools and skills to record valuable material. In addition to being archived, this material can optimally make a round trip through popular digital platforms so that high-quality recordings with translations can be accessed both back home and in the diaspora.

6. Conclusion

ELA’s seven-year experience suggests important advantages to having a non-profit language documentation center in an urban diaspora setting. While New York and Toronto are two outstanding examples of cities with extreme linguistic diversity, similar work has also been taking place in different forms in London, Manchester, Barcelona, Jakarta,

19 Note the discussion, reported in Rodger Kamenetz’s The Jew in the Lotus, of the Tibetan leadership’s interest in studying Jewish cultural and religious survival over the course of a 2,000-year diaspora—today’s massive Tibetan diaspora is less than sixty years old.

20 See the New York Talk Exchange study, an intriguing portrait of diaspora communication (among other things) that shows how the city’s different neighborhoods communicate with the rest of the world by phone: http://senseable.mit.edu/nyte/.

and elsewhere. We recommend and encourage the establishment of a network of urban endangered-language organizations in the world’s most hyperdiverse cities, as these cities will only become more important for documenting and maintaining linguistic diversity as the pace of urbanization increases.

We have not touched here on many theoretical aspects of urban fieldwork which deserve attention, for instance, whether or not we can speak of diasporic “language communities” in any type of traditional sense (Patrick 2003; Blommaert and Rampton 2012). In some cases, diaspora, dispersion, and intense language contact can reconfigure and atomize language communities beyond recognition, as is the case with speakers of indigenous Meso-American languages in New York City; in others, a rather traditional-looking “language community” can be substantially reconstituted, as with Hasidic Yiddish. Future work should also continue to explore the linguistic impact of diaspora communities on their places of origin (e.g., Perez-Baez 2009), as well as urban indigenous migrants in areas of high linguistic diversity (e.g., Shulist 2013).

Despite recent progress, we still know relatively little about endangered languages in global cities and smaller urban centers. The more we can improve our understanding, the more we will be able to facilitate urban collaborations for the sake of language documentation, maintenance, and revitalization.

References

Albany Triqui Working Group (G. A. Broadwell director; Román Vidal López principal language consultant). 2014. A Copala Triqui—Spanish—English dictionary. http://copalatriqui.webonary.org.
Anderson, Gregory D. S. 2011. “Language Hotspots: What (Applied) Linguistics and Education Should Do About Language Endangerment in the Twenty-First Century.” Language and Education 25: 273–289.
Authier, Gilles. 2012. Grammaire juhuri, ou judéo-tat, langue iranienne des Juifs du Caucase de l’est (Beiträge zur Iranistik 36—Bibliothèque iranienne 76). Wiesbaden: Reichert.
Blommaert, Jan and Ben Rampton. 2012. “Language and Superdiversity.” MMG Working Paper 12-05. Göttingen: MPI for the Study of Religious and Ethnic Diversity.
Bloomfield, Leonard. 1917. Tagalog Texts with Grammatical Analysis (University of Illinois Studies in Language and Literature, 3). Urbana: University of Illinois.
Borjian, Habib and Daniel Kaufman. 2015. “Juhuri: From the Caucasus to New York City.” International Journal of the Sociology of Language 237: 59–74.
Bowern, Claire and Natasha Warner. 2015. “‘Lone Wolves’ and Collaboration: A Reply to Crippen & Robinson (2013).” Language Documentation & Conservation 9: 59–85.
Broadwell, George A., Kosuke Matsukawa, Edgar Martín del Campo, Ruth Scipione, and Susan Perdomo, eds. 2009. The Origin of the Sun and Moon: A Copala Triqui Legend (Román Vidal López, narrator). Munich: Lincom Europa.
Canby, Peter. 2010. “Retreat to Subsistence.” The Nation, July 5, pp. 30–36.
Chelliah, Shobhana L. and Willem J. De Reuse. 2011. Handbook of Descriptive Fieldwork. Dordrecht, The Netherlands: Springer.
Crowley, Terry. 2007. Field Linguistics: A Beginner’s Guide. Oxford: Oxford University Press.

Czaykowska-Higgins, Ewa. 2009. “Research Models, Community Engagement, and Linguistic Fieldwork: Reflections on Working Within Canadian Indigenous Communities.” Language Documentation & Conservation 3: 15–50.
Dwyer, Arienne M. 2006. “Ethics and Practicalities of Cooperative Fieldwork and Analysis.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus P. Himmelmann and Ulrike Mosel, 31–66. Berlin: Mouton de Gruyter.
Errington, Joseph. 2003. “Getting Language Rights: The Rhetorics of Language Endangerment and Loss.” American Anthropologist 105: 723–732.
Fox, Jonathan and Gaspar Rivera-Salgado. 2004. Indigenous Mexican Migrants in the United States. Boulder, CO: Lynne Rienner Publishers.
Garcia, O. and Fishman, J. (eds). 2002. The Multilingual Apple: Languages in New York City. Berlin and New York: Mouton de Gruyter.
Ghoso, Dawa B. 2007. “Language Maintenance: A Sociolinguistic Study of Tibetan Immigrant Youths in Toronto, Canada.” MA thesis, University of British Columbia, Canada.
Grenoble, Lenore A. 2010. “Language Documentation and Field Linguistics: The State of the Field.” In Language Documentation: Practices and Values, edited by Lenore A. Grenoble and N. Louanna Furbee, 289–309. Amsterdam and Philadelphia: John Benjamins.
Henderson, Brent. 2015. “Out of Context: Documenting Languages in Immigrant and Refugee Communities.” In Language Documentation and Endangerment in Africa, edited by James Essegbey, Brent Henderson, and Fiona McLaughlin, 239–251. Amsterdam and Philadelphia: John Benjamins.
Himmelmann, Nikolaus. 1998. “Documentary and Descriptive Linguistics.” Linguistics 36: 161–196.
Hinton, Leanne. 1997. “Survival of Endangered Languages: The California Master-Apprentice Program.” International Journal of the Sociology of Language 123: 177–191.
Holton, Gary. 2011. “The Role of Information Technology in Supporting Minority and Endangered Languages.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 371–400. Cambridge: Cambridge University Press.
Jukes, Anthony. 2011. “Researcher Training and Capacity Development in Language Documentation.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 423–445. Cambridge: Cambridge University Press.
Kapur, Devesh. 2010. Diaspora, Democracy and Development. Princeton, NJ: Princeton University Press.
Kaufman, Daniel. 2009. “Ex-situ Language Documentation and the Urban Fieldstation for Linguistic Research.” Paper presented at International Conference on Language Documentation and Conservation (ICLDC). University of Hawai‘i at Mānoa.
Lahe-Deklin, Francesca and Aung Si. 2014. “Ex-situ Documentation of Ethnobiology.” Language Documentation & Conservation 8: 788–809.
Maffi, Luisa. 1999. “Language Maintenance and Revitalization.” In Cultural and Spiritual Values of Biodiversity, edited by Darryl Posey, 37–44. Nairobi: United Nations Environment Programme.
Moriarty, Máiréad. 2011. “New Roles for Endangered Languages.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 446–458. Cambridge: Cambridge University Press.
Musgrave, Simon and John Hajek. 2015. “Linguistic Diversity and Early Language Efforts in a Recent Migrant Community in Australia: Sudanese Languages, Their Speakers and the Challenge of Engagement.” In Challenging the Monolingual Mindset, edited by John Hajek and Yvette Slaughter, 113–130. Bristol, Buffalo, and Toronto: Multilingual Matters.

Patrick, Peter K. 2003. “Speech Community.” In Handbook of Language Variation and Change, edited by J. K. Chambers, Peter Trudgill, and Natalie Schilling-Estes. Oxford: Blackwell Publishing.
Perez-Baez, Gabriela. 2009. “Endangerment of a Transnational Language.” PhD diss., State University of New York at Buffalo.
Rice, Keren. 2011. “Documentary Linguistics and Community Relations.” Language Documentation & Conservation 5: 187–207.
Rice, Keren. 2012. “Ethical Issues in Linguistic Fieldwork.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 407–429. Oxford: Oxford University Press.
Roberts, Sam. 2010. “The Languages of New York, Lost and Found.” The New York Times, April 29.
Shulist, Sarah A. 2013. “In the House of Transformation: Language Revitalization, State Regulation, and Indigenous Identity in Urban Amazonia.” PhD diss., University of Western Ontario, Canada.
Solnit, Rebecca and Joshua Jelly-Schapiro. 2016. Nonstop Metropolis: A New York City Atlas. Oakland: University of California Press.
Sonnenfeld, David A. 1992. “Mexico’s ‘Green Revolution,’ 1940–1980: Towards an Environmental History.” Environmental History Review 16: 28–52.
Thomason, Sarah G. 2015. Endangered Languages: An Introduction. Cambridge: Cambridge University Press.
Vidal-Lopez, Román. 2012. Nana naguan’ rihaan nij síí chihaan’ | Consejos para la gente Triqui | [Words of counsel for the Triqui people], edited by George A. Broadwell, Ashley LaBoda, Sharone Horowit-Hendler, and Gabriela Aquino Dehesa (IMS Occasional Publication No. 16). Albany, NY: University at Albany.
Wilkins, David. 1992. “Linguistic Research Under Aboriginal Control: A Personal Account of Fieldwork in Central Australia.” Australian Journal of Linguistics 12: 171–200.

Chapter 18

Ethics in Language Documentation and Revitalization

Jeff Good

1. Ethical encounters in language documentation

The increasing emphasis that linguistics has placed on the documentation and revitalization of the world’s endangered languages has brought more and more scholars of language into contact with communities whose cultures, needs, and interests diverge greatly from their own. Moreover, in many countries, research involving so-called human subjects has become the target of heightened scrutiny, and the rise of cheap, digital means of information collection and exchange that has made contemporary documentary linguistics possible has foregrounded issues of rights and access to language resources. These contexts of “culture clash” have prompted serious considerations of ethical practices in documentation and revitalization.

In this chapter, I take as a given that essentially all scholars choosing to work on endangered languages seek to “do no harm” as well as to “do some good” (Dwyer 2006, 38–39) in the course of their research. The challenge, then, is not to change attitudes but to increase awareness of strategies of interaction that minimize the potential for harm and maximize the chances of positive benefit. However, there is an immediate obstacle to addressing issues of ethics in this domain in any general way. The primary concern

I am grateful to Lyle Campbell and Lise Dobrin for their comments on an earlier version of this chapter. Some of the work discussed here was supported by research funded by the U.S. National Science Foundation under Award BCS-1360763.

that gave rise to the emergence of contemporary documentary linguistic approaches, language endangerment at a global scale, emphasizes linguistic particularity, and scholars have further sought to engage speaker communities in diverse ways. The result is that there can be no general ethical “formula.” Instead, one must try to separate out different strands of ethical consideration in documentary and revitalization work so that the parameters of ethical engagement can be made clearer, even if the specifics must be worked out on a case-by-case basis.

This chapter has the advantage of building on significant previous work on ethics in documentation and revitalization, such as Dwyer (2006), Grinevald (2006), Rice (2006, 2012), Thieberger and Musgrave (2007), and Austin (2010). General fieldwork guides also discuss this topic (Crowley 2007, 23–56, Bowern 2015, 167–89, Chelliah and de Reuse 2011, 139–159, and Sakel and Everett 2012, 67–78). While not explicitly focused on ethics, Grenoble and Whaley (2005) contains relevant consideration of key issues. There is significant agreement in the content of this chapter and earlier works, though the advice given across them is not always the same, and, given the complexity of this topic, consulting multiple sources is ideal. This survey most notably differs from earlier work in its emphasis on the broader ideological and cultural context in which documentary and revitalization activities take place, as opposed to the enumeration of specific steps one might take in order to conduct more ethical fieldwork, since these points are covered well elsewhere.

Section 2 of this chapter begins by presenting a series of case studies that illustrate a range of issues in the ethics of documentation and revitalization. These are intended to provide context for a more general consideration of the importance of understanding the ideologies that underpin documentary and revitalization work in section 3. Section 4 then considers the complexities involved in maintaining the different relationships required for success in this area, section 5 covers ethical issues in the creation and handling of endangered language resources, and section 6 relates broader ethical concerns to systems of legal and ethical compliance that govern data collection and research. Section 7 offers brief concluding remarks.

2. Case studies in ethical interaction

In this section, five case studies of ethical issues raised over the course of work on language documentation and revitalization are reviewed. They are chosen both to illustrate a diverse range of concerns and to exemplify the kinds of ethical dilemmas that can arise in different parts of the world. Additional relevant case studies can be found in Warner, Luna, and Butler (2007), Holton (2009), Robinson (2010), and Brooks (2015), among others. The discussion below is mostly oriented toward summarizing the facts of these case studies. Their connection to broader issues is considered more directly in subsequent parts of the chapter.


2.1. Wilkins (1992): Research under community control

Wilkins (1992) is an early publication emphasizing the social contexts in which documentary research takes place. It is especially noteworthy for its description of the conduct of research which was, to a much greater extent than is usually the case, under the control of a specific community. The research took place in collaboration with the Yipirinya School Council, where a majority of the community members identified as Western Arrernte. However, Mparntwe Arrernte was chosen by the group to be the focus of the work as the language traditionally associated with Alice Springs, where the school is located.

Wilkins’s choice to conduct research under community control was in part driven by high-level ethical considerations embedded within an ideology placing importance on conducting linguistics in a more “responsible” manner (Wilkins 1992, 173). Having made that choice, many of the issues he dealt with were more logistical in nature than ethical. For instance, there was the question of how one could simultaneously satisfy the demands associated with earning an academic degree while also fulfilling obligations to the community (Wilkins 1992, 181–182). However, two other aspects of his work also intersected directly with ethical concerns.

The first relates to Australia’s status as a country dominated by a settler society, of which Wilkins is a member. While his work at the Yipirinya School eventually led to mutual appreciation between him and the community, the earliest stages of his interaction were governed by a lack of trust (Wilkins 1992, 176). This kind of tension is not unusual in language documentation projects, and linguists working in parts of the world dominated by settler societies (including the Americas) are generally very sensitive to this dynamic (see, e.g., Czaykowska-Higgins (2009) for discussion in a Canadian context).
More generally, Wilkins’s experiences are indicative of the extent to which the relationships that researchers must build in order to take part in documentation and revitalization will, at least in part, be governed not only by who they are as individuals but also by more general sociopolitical and sociocultural factors. This will be discussed further in section 4.

A second ethical issue raised by Wilkins (1992, 182–183) relates to the presence of a community outside Alice Springs where a number of other Arrernte speakers lived. Representatives of this community contacted him for assistance doing language research, and, from his perspective, offering such assistance would have been completely natural since linguistic work on a “language” need not be bound to any one group of individuals. However, the Yipirinya School Council saw things differently and barred him from working with any organization not based in Alice Springs. The central ethical question raised by this is what the social unit (i.e., the “community”) is that documentary linguists should be engaging with in their work. Wilkins’s assumption was that the community of interaction and responsibility should be defined by linguistic identity, but the actual individuals whose heritages were partly defined by language had a different view. Wilkins’s experience here is relevant both to how different ideologies held by actors in a documentation or revitalization project must be taken into account and to how problematic the notion of “community” is for ethical practice, topics to be considered in section 3 and section 4.

2.2. Debenport (2010a): Language ideologies and restricted access

Debenport (2010a; see also Debenport 2015) describes her research on the Tiwa language in New Mexico in terms that are broadly similar to those of Wilkins (1992). In particular, the community that she works with maintains tight control over her research, and even being able to access the community as a researcher was an unusual privilege (Debenport 2010a, 229–230). This desire for control is part of a larger set of patterns regulating “circulation of cultural knowledge” among the Tiwa (Debenport 2010b, 205). These restrictions extend to the publication of written representations of the language, requiring her to obscure key aspects of linguistic data, for instance, by blacking out any line representing Tiwa in an interlinear glossed text while allowing the glossing and English translation to remain visible (Debenport 2010a, 236).

These restrictions are not due to specific negative experiences working with linguists but, rather, emanate from a language ideology that is strongly conservative and purist in nature (see, e.g., Kroskrity 1992). They, thus, are part of a larger cultural complex in which the “ancestral code” (Woodbury 2011, 177) of these groups is embedded, and, presumably, this Tiwa language ideology should be understood as a significant target of documentation, even if documentary linguistics tends to emphasize the collection of lexical and textual data over data relevant to understanding ideological aspects of language use (Debenport 2010b, 236).

Most documentary or revitalization projects do not find themselves operating under restrictions of the sort that Debenport describes. However, her work would have been simply disallowed had she not been willing to respect the community’s wishes. So, there was no ethical “choice” in the matter, beyond, perhaps, the ever-present choice to simply not do research with a community that imposes such restrictions.1 Moreover, the fact that the community had the primary agency over the ways that she could present her research relieved her of some of the burden that is faced by linguists working in parts of the world where it is problematic to assume that community members will have a clear understanding of different modes of data presentation, in particular web-based dissemination (see, e.g., Robinson 2010).

However, three other ethical issues arose during her fieldwork, which merit consideration here. The first two of these relate to a change in the political situation within

1 There does not appear to be extensive consideration in the literature of the ethics involved in the choice of working with one community over another. For instance, Chelliah and De Reuse’s (2011, 79–92) chapter on “choosing a language” does not include discussion of ethical concerns. See, however, Grinevald (2003, 60–62).

the community. From 2003 through 2009, work was done under the guidance of the director of the local language program. However, in 2009 a tribal leader decided that the language should no longer be written, and language materials were confiscated (Debenport 2010a, 237–238). Debenport ultimately saw this as an opportunity to explore local language politics and ideologies, though examining the issue for research purposes would mean “mining the painful experiences of dear friends” (Debenport 2010a, 238). This raises the fraught question of what it means to “do harm.” Is it harmful to ask individuals to discuss something painful to them for the purposes of research? There can be no general answer to such a question beyond the fact that it depends on the wishes of all of those involved.

The change in attitude of one of the tribal leaders raises another set of concerns around the question of who gets to speak for a “community.” In the case Debenport discusses, it appears that those working on the language program accepted the fact that their own preferences could be overridden by tribal leadership (Debenport 2010a, 238). However, lines of control will often not be well-established among the group of individuals who have a heritage association with a language, making it difficult to know what the right response is in cases of disagreement. This issue was also considered above in section 2.1, and general concerns surrounding the notion of “community” will be discussed in section 3 and section 4.

A final ethical concern raised by Debenport (2010a) relates to how the work of scholars may impact the perception of “language” within a community. The work of the language program resulted in a kind of new institutionalization of the Tiwa language, and those closely involved came to be seen as the “experts” on the language when, previously, any fluent speaker was considered an expert. The knowledge of Debenport herself was even called upon when it came to writing the language (Debenport 2010a, 232–233). While the fact that the act of language documentation may “change” a language should not necessarily stop the work from being done (particularly if speakers are supportive), this example speaks to the importance of being aware of how documentary activities, as well as the ways that those activities are framed, may have unintended consequences for the linguistic culture of a community, a topic that will be returned to in section 3.

2.3. Dobrin (2008): Local notions of exchange in language documentation

Dobrin (2008) is a critical examination of commonly held attitudes about language documentation and development. Much of the discussion centers around her experiences working in a village in Papua New Guinea associated with the Arapesh language. These serve as a concrete example of how a broader Melanesian system of exchange relations can interact with well-intentioned approaches of Western scholars in ways that would be unanticipated without an understanding of “the culturally particular local systems of meaning in which language loss itself is taking place” (Dobrin 2008, 317).


Figure 18.1. Schematization of meaning of exchange relations in Melanesia (Dobrin 2008, 309).

Dobrin’s central observation rests on recognition of the importance of gift exchange in Melanesian cultures as the primary means through which relationships are developed. By contrast, talk is not a significant means of strengthening relationships since it “comes so cheaply” (Dobrin 2008, 308). Moreover, exchange relationships operating over a long distance are considered especially valuable as representations of one’s influence. Patterns of exchange with outsiders can, therefore, motivate collective action via a cycle through which anticipated exchange motivates community-level cooperation as depicted in Figure 18.1.

Since language documentation and revitalization always involve patterns of exchange of some kind, understanding how these are perceived in a given cultural context is clearly a prerequisite to ethical behavior. Dobrin (2008, 301) is especially concerned with how Western notions of “empowerment,” which emphasize autonomy and self-determination, interact with a Melanesian cultural system that values ongoing exchange. As an example of the way in which these two value systems can interact in unproductive ways, she describes the difficulties involved in maintaining activity around a local vernacular language preschool (Dobrin 2008, 310–315). Its establishment was seen by outsiders as a sign of the village’s “virtue” (Dobrin 2008, 311), but, because it was associated with political developments in Papua New Guinea emphasizing local autonomy, the community never received the anticipated exchange with outsiders that would have provided the social impetus for continued communal investment.2 More broadly, Dobrin’s observations have clear implications for how any documentary or

2 See McLaughlin and Sall (2001, 196–200) for another instance where noteworthy differences in the social meaning of exchange are considered in a documentary context.

revitalization project should construct the ways it chooses to “give something back” (Dwyer 2006, 57) in this part of the world. Dobrin’s study illustrates the ethical difficulties that arise due to cultural particularities of the groups that are brought together over the course of documentary work, as will be discussed further in section 3 and section 4.

2.4. Di Carlo and Good (2014): What should we be “preserving”?

The case study described in this section is based on Di Carlo and Good (2014), which examines the language dynamics of a rural region of Cameroon known as Lower Fungom. At issue is the interplay of local language ideologies and the “preservationist” ideology associated with most linguistic work on endangered languages.

Popular depictions of the loss associated with the extinction of a language often treat this as comparable to “dropping a bomb on the Louvre.”3 Such characterizations are clearly linked to essentialist notions of language that assume a close link between a “language” and a “culture” (see, e.g., Hymes 1968 and Foley 2005). However, while this association may hold in certain contexts, it does not appear to in Lower Fungom. There, languages are used to index affiliation with political units (e.g., villages) which are locally conceptualized as dynamic entities that come into and out of existence as sociopolitical circumstances warrant. Speaking a language is the primary overt signal of one’s membership in a given political unit in this part of Cameroon. This ideology is clearly expressed in this quotation from Buo Makpa Amos, who is from the village of Missong in Lower Fungom, which is associated with a variety of the Mungbam language (Di Carlo and Good 2014, 245).

When you people are cooperating you speak one language. If you speak one language, you cooperate. As a group of relatives moves, the brothers may decide to split, each choosing a different place to stay. This is what happened to us. We left the early place in Fang side as a whole and arrived in Abar. From here we scattered. Now, we Bambiam from Missong have relatives in Abar, in Buu, in Ngun. Each family attached itself to a village and therefore had to speak the general language used there. For example, we Bambiam attached ourselves to Bikwom and hence had to adopt their language; Bikwom people are attached to Bidjumbi and Biandzəm to form the village of Missong, and this is why they all had to use the same language, that is, Missong.

This speaker is very clear in not portraying the Missong variety as an integral part of his identity. Rather, it has a more utilitarian role as a tool to facilitate cooperation. By

3 This quotation is widely attributed to Ken Hale in publications dating back at least to the late 1990s, for instance in an article by Wade Davis in the August 1999 issue of National Geographic.

implication, if the families that together form the village of Missong were to choose to disband, the Missong “language” should also fall into disuse. This would not be due to the global sociopolitical dynamics leading to language loss generally but, rather, forces endogenous to the local linguistic culture. On the one hand, this suggests that it would be a foreign cultural imposition for an outside linguist to come to a region like Lower Fungom and promote the maintenance of each of its languages under the assumption that this is necessary to “save” a local culture. On the other hand, it points to the idea that the target of “preservation” in the region should not be any one specific lexicogrammatical code but, rather, the ecology that allows different languages to develop or dissipate as the political units that they are associated with rise and fall (see Di Carlo and Good 2014, 254–256).

This case study underscores the importance of understanding how ideologies of language and linguistics shape documentary and revitalization projects if one wants to achieve ethical outcomes (section 3).

2.5. Innes (2010): Context for archival materials

The final case study that will be considered here is focused on issues arising from the use of archival materials and derives from the work of Innes (2010) on the North American language Mvskoke, spoken by the Muskogee and Seminole Nations of Oklahoma and the Seminole Tribe of Florida. The Americanist linguist Mary Haas collected a significant number of texts in the language, and Innes began working with archived versions of these texts to prepare them for wider publication. She chose to examine an initial set of narratives on the basis of the amount of analysis that Haas had already done on them. However, her consultants identified some of them as inappropriate for certain audiences, though there had been no indication of this in the materials themselves. Two of them were not merely unsuitable for men who lacked the right background and all women but were also seen as being dangerous for those individuals to either read or hear. Work on them ended as soon as this concern had been identified.4

In contemporary terms, we might describe the ethical issue that arose from the work on these Mvskoke texts as a failure of metadata. That is, the materials did not contain sufficient information about the restrictions on their content for a later user to know how to work with them responsibly. Given the very different time in which they were collected, we should expect lapses in the collection of important contextual information in legacy materials of the sort that Innes was working with (see also O’Meara and Good 2010). However, as Innes (2010, 200) points out, even contemporary discussions of metadata do not place a strong emphasis on collecting

4 For a striking account of the dangers involved when not taking appropriate care when working with sensitive texts, see Toelken (1996).

ethnographic detail of the sort that is needed to avoid the situation that she found herself in.5 Existing work emphasizes the importance of archiving materials so that they can be accessed by appropriate audiences (see, e.g., Dwyer 2006, 40 and Thieberger and Musgrave 2007, 30). What Innes (2010) shows us is that archiving ethically requires us to not simply enter metadata in “checklist” form but to also consider sensitivities that may be culturally quite specific. This will be further discussed in section 5.
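The contrast between “checklist” metadata and culturally specific contextual information can be made concrete with a small sketch. The Python below is purely illustrative: it is not drawn from Innes (2010) or from any archive’s actual schema, and the class names, field names, and the `may_release` rule are all hypothetical. It assumes that restrictions are recorded as free-text entries supplied by consultants, and that nothing is released until some form of community review has taken place.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class AccessRestriction:
    """One restriction, described in free text rather than as a checkbox."""
    audience: str     # who is affected, in the community's own terms
    modes: Set[str]   # ways of engaging with the text, e.g. {"read", "hear"}
    rationale: str    # ethnographic context explaining the restriction
    source: str       # who reported the restriction (hypothetical field)


@dataclass
class TextRecord:
    """Hypothetical metadata record for one archived text."""
    title: str
    collector: str
    restrictions: List[AccessRestriction] = field(default_factory=list)
    reviewed_by_community: bool = False

    def may_release(self, audience: str, mode: str) -> bool:
        """Allow release only after review and only if no restriction applies."""
        if not self.reviewed_by_community:
            return False  # unreviewed legacy material is treated as restricted
        return not any(
            r.audience == audience and mode in r.modes
            for r in self.restrictions
        )
```

The design choice worth noting is the default: a record with no restrictions listed is not assumed to be unrestricted, since the absence of a note in legacy materials may simply reflect what was never asked.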

2.6. From case studies to generalizations

The most important general theme of these case studies is the extent to which, even if we agree on broad ethical principles like “do no harm,” the particularities of each situation make it difficult to know what steps are required to apply them. The next section tries to build on this observation by considering the ideologies that inform the actions of participants in documentary and revitalization projects. The logic behind this is that a better understanding of these ideologies is a crucial step in knowing how to concretely detect potential ethical problems and respond to them effectively.

3. Ideologies in language documentation

Most work on language ideologies has been done within the context of linguistic anthropology, though there has been increasing discussion of the topic in the linguistic literature on documentation and revitalization (see Austin and Sallabank 2014). Here, two topics will be considered: (i) ideologies held by linguists as they engage in documentary and revitalization work and (ii) ideologies held by speaker communities with respect to their languages. These are both large topics in their own right, and the discussion here is necessarily selective.

For those drawn to the study of language because of an interest in grammatical analysis, work on ideologies may seem somewhat foreign. Nevertheless, this body of literature can be enormously helpful in achieving more ethical outcomes because of the tools it offers to reveal the hidden assumptions that guide the actions and responses of the different stakeholders in documentation and revitalization projects. Becquelin, de Vienne, and Guirardello-Damian (2008) stands out in this context for its careful

5 Innes (2010) cites various papers discussing metadata in her rightful critique of existing sources of advice. To them, I might add a work of my own (Good 2011, 229–232), published after Innes (2010), but which suffers from the same issue.

elucidation of how complex the interaction can be between the ideologies held by researchers and those held by speaker communities.

3.1. Linguists’ ideologies

The topic of linguists’ ideologies was raised above in the case studies discussed in section 2.1 and section 2.4, specifically regarding the linkage between language and identity. As discussed in section 2.1, Wilkins (1992) discovered a dissociation between linguistic identity and other kinds of identity when one Arrernte community prevented him from collaborating with another. In section 2.4, Di Carlo and Good (2014) encountered a situation where languages were not considered by speakers to reflect “deep” aspects of identity, meaning that language loss is locally construed as a “natural” political event. Each of these cases is, in some sense, surprising due to at least two deeply embedded aspects of the ideology of contemporary linguistics.

The first is the conceptualization of languages as discrete and countable entities. While linguists are well aware of the difficulties surrounding any definition of “language” (see, e.g., Cysouw and Good 2013), the idea that we can even speak of a class of “endangered languages” presupposes that we can identify its members (see also Whaley 2011, 342–343). In a broader discussion of the rhetoric of endangered language linguistics, Hill (2002, 127–128) characterizes this under the heading “enumeration,” and Dobrin, Austin, and Nathan (2009) further elaborate on how documentary linguistics has inadvertently taken this process of enumeration one step further by reducing languages to the form of archivable objects, such as recordings and annotated texts.

The urge to “standardize” languages and language data is perfectly understandable in an academic and political environment where it is necessary to justify the allocation of scarce resources to documentary and revitalization activities. Ethical issues arise, however, when we accidentally carry over the assumptions embedded within these models of endangerment to work with speaker communities. For instance, we may privilege an idealized native speaker as the only “true” holder of knowledge of a “language” and, in so doing, fail to appreciate the diverse ways that people can be speakers in their local context (see, e.g., Evans 2001; Grinevald 2007; and Dobrin and Berson 2011, 191), or we may fail to engage the complexities involved with determining what the “speaker community” is of a language in regions characterized by pervasive multilingualism (see, e.g., Lüpke and Storch 2013, 13–47).

A second important feature of the ideology of contemporary linguistics related to notions of language and identity is the romantic association of each “endangered” language to the “unique local knowledge of the cultures and natural systems in the region in which it is spoken.”6 Once this association is made, an emphasis on the “preservation”

6 The quotation in this sentence is taken from the program solicitation for the National Science Foundation’s Documenting Endangered Languages program (http://www.nsf.gov/pubs/2016/nsf16576/nsf16576.htm).

Ethics in Language Documentation and Revitalization 429 of these languages naturally arises, both for the sake of universal human knowledge (see Hill 2002, 121–123) and as a means of ensuring that the groups associated with these lan guages can maintain their rights to a collective identity (see Errington 2003, 727–729).7 This ideological linkage is hardly unique to linguists—it is, after all, deeply connected to the “Herderian equation” (Foley 2005, 158) of language, culture, and nation that is a fundamental part of the language ideology of the European nation-state. However, it is by no means universal. In Lower Fungom, as discussed in section 2.4, languages are linked to units of “cooperation” which are not necessarily stable. Other kinds of linkages are also described in the literature, such as one between language and land described for Papua New Guinea (see, e.g., Slotta 2012, 5 and Dobrin 2014, 129–130). A lack of awareness of the ways that researcher ideologies can shape encounters with speaker communities has the potential to lead to unethical outcomes.8 For instance, it could result in the imposition of a Western ideology on a community lacking a strong language– culture linkage in a way that decreases the potential for cultural self-determination. Reiman (2010, 127) writes of his work on the endangered Kasanga language of Guinea-Bissau that speakers, “agree and appreciate it when told that their language and culture are valuable and worthwhile.” However, rather than telling speakers about the value of their language, it would be more in line with the current documentary ethos to try to discover what they value about their language and find ways to incorporate those values into the structure of documentary or revitalization projects. 
Similarly, as discussed in detail by Dobrin (2011, 191–194), preservationist ideologies lead to an emphasis on documenting a “pure” ancestral code—i.e., uses of language where instances of borrowing and code-switching are minimized. This will often run counter to the way a language is actually used and result in the creation of documentary records which do not clearly reflect the speech practices of a community. This may not be as obviously problematic as, say, recording someone without their consent (see section 6), but, it clearly runs counter to a core value of documentary linguistics that the records that we help pro duce of a language should record the actual practices of speakers as faithfully as possible (see also Childs, Good, and Mitchell 2014). A final ideological view increasingly articulated by linguists is the idea that work on endangered languages should involve significant collaboration with speaker communities. This topic is considered briefly in section 4.

7 These themes can already be found in Hale et al. (1992, 8–9).

8 Though not couched in terms of ethics or ideologies, Ladefoged’s (1992) response to Hale et al. (1992) partly anticipates the points made here.

3.2. Local language ideologies

Within the vast range of work that has been done on language ideologies (see Woolard 1998), the topic of most interest here is the way in which they intersect with efforts at language documentation and revitalization.9 The case studies considered in section 2 already illustrate the importance of understanding this in a number of ways. For instance, in section 2.2, Debenport (2010a) found that her work was inadvertently transforming local language ideologies, in particular creating a new class of language “experts” that had not previously existed. In section 2.5, Innes (2010) encountered a case where community conceptualizations of the power associated with certain texts made them inappropriate for her to work on.

There will be as much diversity of language ideologies as there is of languages and speaker communities, making this a difficult topic to consider in a general way here. However, some further examples can be given for the purposes of illustration. Terrill (2002), for instance, examines the desire of the Lavukaleve community of the Solomon Islands for a dictionary that would be of little practical value. What turned out to be of most importance for them was not for an outside linguist to produce resources which would directly support language maintenance, as might normally be expected, but, rather, to have an object which would add to the language’s prestige. In this case, local ideas about how to “support” a language differed from those typically adopted by linguists.

More generally, the reception of written materials in a community will necessarily be linked to local language ideologies. This means that, before embarking on what may seem to be a relatively neutral activity involving writing, such as the development of an orthography (an activity which is commonly proposed as a way to “give back” to a community), it is important to understand the potential ethical implications of such work.10 Rehg’s (2004) examination of language development projects in Micronesia is especially instructive in this regard. Among other things, he evaluates the problems that arise due to the standardizing “impulse” typically associated with orthography development, which, by implication, creates “non-standard” varieties when, previously, variation was conceptualized in different terms (Rehg 2004, 509). More strikingly, Rehg (2004, 512–515) suggests that attempts to develop Micronesian languages by making them more like “English” (i.e., by being associated with a written standard, dictionaries, etc.) may hasten their demise by not focusing on the actual concerns of speaker communities, an interesting contrast to what was found by Terrill (2002), who was also working in the Pacific.11

This discussion can only scratch the surface of the complexities that can arise when different ideologies of language come into contact. However, a general point emerging from it should hopefully be clear: greater understanding of the ideologies at play in documentation and revitalization projects can help researchers achieve more ethical outcomes.12

9 Meek (2010) and Nevins (2013) are recent detailed studies of this topic in a Native American context.

10 Seifart’s (2006) discussion of orthography development is of note here. While he covers important issues relevant to the ethics of orthography development with respect to, for instance, aligning orthographic choices with a community’s goals for the representation of their language in visual form (see, e.g., Seifart 2006, 284–285), he explicitly avoids the broader ethical issues involved in developing a writing system for a language in the first place (Seifart 2006, 275–276).

11 Lüpke’s (2011) discussion of orthography development is helpful in clarifying many of the complex social factors involved in working on the development of writing systems for languages lacking a significant written tradition.

12 The nature of the topic under consideration here results in an orientation towards potential problems that might arise. See, however, Austin (2014) for a case where language ideologies and existing documentation aligned in a way that resulted in a positive outcome.

4. The maintenance of relationships in documentary work

Standard presentations of language documentation emphasize a dichotomy between “outside researchers” and members of a “speech community,” as effectively presented in the various collaboration diagrams given in Leonard and Haynes (2010).13 A result of this is that extensive attention has been paid to the ethics of collaboration among these idealized groups in documentary and revitalization work (see, e.g., Czaykowska-Higgins (2009) for detailed development and Dobrin and Schwartz (2016) for a recent synthesis).14 Rather than further develop that topic here, the focus will be on the other sorts of relationships that one builds up in the field that have received less attention.

Presentations such as Dwyer (2006, 37) and Holton (2009, 169) are helpful in this regard by clarifying how many different groups may be involved in a successful project. Dwyer (2006) enumerates, for instance, different levels of government, funding organizations, research institutions, and archives, as well as the ultimate users of documentary products.15 Holton (2009) presents a more particularized set of groupings from a research context in Indonesia which makes clear the complexities lying behind the idea of understanding who can represent a “community” in a documentary context. The case studies in section 2.1 and section 2.2 also revealed the problems that can arise when groups that look like coherent “communities” to the researcher may, in fact, be characterized by significant internal divisions (see also Whaley 2011, 340–342). In a somewhat different vein, Grinevald’s (2007) discussion of the variety of different kinds of “speakers” one may encounter within a community also clarifies the variety of relationships required for successful documentary and revitalization work.

A strength of the existing literature is that it orients scholars away from a researcher-oriented perspective toward one that incorporates the needs and aspirations of the different community members and other stakeholders who play a central role in the work. At the same time, one area that has not been well developed, but which is central to ethical practice, is how the relationships that a researcher takes part in while in the “field” are perceived in the context of the norms of the communities that they work with. Dobrin (2008), discussed in section 2.3, is an exception here in its treatment of the outsider’s role in a local system of exchange (see Figure 18.1).

Other work on this topic can be found in the cultural anthropological literature. A useful notion is that of enclicage, as developed by Olivier de Sardan (1995, 81), which refers to a “double-edged” problem arising from fieldwork due to the necessity of developing personal relationships with specific members of a community. On the one hand, the researcher’s association with some individuals, but not others, will embed them within local social networks in ways that an outsider will have difficulty understanding, and, inevitably, result in other individuals feeling, and being, excluded. On the other hand, researchers will naturally tend to adopt the perspective of their closest associates, again to the exclusion of other points of view.

While ethnographically, rather than linguistically, oriented, Berreman (1962) presents a relevant cautionary tale based on research in India. At one point, he had to switch his interpreter from a member of the high-status Hindu Brahmin caste to one who was a Muslim (Berreman 1962, 15). This had an unexpected impact on his relationship with members of the community which was the focus of his research, largely due to the special social constraints governing interaction with and among Brahmins in this part of the world. Not only had Berreman become associated with this group when working with his first interpreter, the interpreter’s view of the local world also significantly structured that of the researcher.

It is also important to consider how one’s identity may cause them to quickly be placed into pre-existing “templates” of interaction. In some contexts, such as aboriginal Australia and native North America, outside researchers know from the outset that they will enter as individuals who need to dedicate substantial efforts to building up trust with community members (see, e.g., Leonard and Haynes 2010, 275–276, as well as the case studies in section 2.1 and section 2.2). In others, they may instead find themselves immediately treated as having a very high status which entails special obligations, as indicated in section 2.3 (see also McLaughlin and Sall 2001, 196–200). Aspects of identity beyond being a “researcher” will matter as well, of course, with gender, age, family status, and nationality standing out in particular. In addition, the insider status of community members who are also researchers makes their own role in a project quite distinctive from that of outside researchers (see Cruz and Woodbury 2014 for a work written from both insider and outsider perspectives).

13 See Good (2012) for consideration of other kinds of collaboration.

14 See also Crippen and Robinson (2013) and Bowern and Warner (2015) for debate around ethical issues emanating from the field’s current emphasis on collaborative models.

15 See the collection introduced by Dobrin (2009) for consideration of the role of one prominent institution, SIL International, in documentary work.


5. Responsible handling of language resources

Much of the discussion here has focused on relationships in documentary and revitalization work since this is where the greatest ethical pitfalls emerge. However, the goals of these efforts typically involve the creation of language resources of one kind or another. Dwyer (2006, 40–50), Thieberger and Musgrave (2007), and Chelliah and de Reuse (2011, 151) discuss the obligation of the researcher to pay careful attention to issues of rights and access of language resources, as well as to help community members fully understand the different kinds of access to materials that digital technologies provide. This includes ensuring community members are able to access collected materials in a way that is appropriate to their needs and interests.

Existing guides to data management and archiving such as Thieberger and Berez (2012) effectively present key steps in the responsible management of documentary data, and work done on documentation and archiving, such as Nathan (2010), considers ways that access restrictions required by stakeholders in documentary resources can be implemented (see also Christen 2008). On the whole, the field has a decent understanding of how to ethically process resources assuming there is a clear awareness of any sensitivities associated with them. More difficult is ensuring that all parties contributing to the creation of a resource have adequate knowledge of the cataloging and dissemination practices of contemporary documentation. Robinson (2010) considers the issue of “informed consent” (see section 6) for material dissemination when working with speakers with little awareness of modern technology. It is also hard to anticipate cases where there may be very different cultural understandings of linguistic products between researchers and community members, such as what was seen in section 2.5, where Innes (2010) inadvertently exposed her consultants to texts that they viewed as dangerous.

One proposed response to problems like these has been to expand the norms for collecting metadata for documentary resources to the level of a “meta-documentation” which would, among other things, “document the goals, processes, methods, and structures of language documentation projects” (Austin 2013, 14–15). Meta-documentation itself would not guarantee more ethical outcomes, but, by providing more context to documentary resources, it would facilitate the kind of evaluation required for their more ethical use. Henderson (2013) provides a case study of developing community-driven meta-documentation for legacy materials of the Noongar language of Australia which resulted in a more ethical protocol for use of the materials than could have been developed by researchers alone.


6. Ethics versus compliance in endangered language work

A final set of topics that are generally discussed under the broad heading of ethics are issues connected to the systems of institutional and legal compliance that researchers are subject to.16 Of these, regulations on research involving so-called human subjects as implemented by university ethics review boards, e.g., Institutional Review Boards (IRBs) in the United States, have received the most attention (see, e.g., Bowern 2010). Intellectual property laws and the application of copyright to documentary resources are also frequently addressed (Dwyer 2006, 45–48; Austin 2010, 41–46; Newman 2012).

Discussion of these topics has been backgrounded here because of their sometimes tenuous connection to ethics itself. At heart, the problem involves a tension between “ethics” and “compliance” (see Dobrin and Lederman 2012). The latter of these notions emphasizes the adoption of practices which adhere to standards set by outside authorities, rather than emanating from the dynamics of any specific project.

The nature of ethics review as conducted by IRBs in the United States exemplifies this tension. While the presence of a systematic means to ensure that research is conducted ethically has clear benefits (see, e.g., Bowern 2010, 901), the fact that this system of review is based on a medical model of research limits its efficacy at evaluating the ethics of much of what is involved in language documentation and revitalization. The result is that the review process can, on the one hand, accidentally guide the researcher toward unethical outcomes and, on the other hand, fail to consider where the real ethical tensions of certain research activities lie.

A well-known problem relates to the handling of data, which, in medical research, is rightfully treated as requiring strong protections to maintain the confidentiality of research participants. Anonymity is, therefore, adopted as a kind of default stance for data storage, while documentary work favors explicit recognition of everyone involved in the research (Bowern 2010, 903). A focus on harm to “human subjects” can further make these boards ineffective at evaluating what harm a researcher’s activities might pose to a community at large, a frequent concern in documentary and revitalization work. Finally, it should be emphasized that this system of review is simply not designed to monitor the sorts of “culture clashes” discussed above since the “controlled” nature of idealized medical research does not foreground these kinds of ethical problems.

What this means is that, while it is certainly too strong to accuse ethics protocols generally of “moral depravity” (van Driem 2016), one cannot assume that having institutional ethics approval means that one is necessarily conducting “ethical” linguistic research. The goal of documentary linguists should be to work with review boards and educate them about the nature of their research so that research protocols can be devised that are both compliant and ethical. Despite horror stories, such as one presented by van Driem (2016, 44) where a researcher was ostracized by a community because of culturally inappropriate requirements imposed by a review board, at least in the United States, these boards can generally be persuaded to deviate from their established norms when provided with reasonable justification. This could involve appealing to disciplinary ethics statements or providing examples of how similar research was approved at other institutions. Students, in particular, may benefit from reaching out to scholars with more experience in this area when trying to navigate the review process.

One key concept that work on ethics in documentary linguistics has adopted from the medical domain is the notion of informed consent (see, e.g., Dwyer 2006, 43–45; Grinevald 2006, 353–363; Thieberger and Musgrave 2007, 30–32; and Austin 2010, 39–40). If “informed consent” is understood broadly to mean that researchers engage in ongoing processes of discussion with other stakeholders to ensure that there is a mutual understanding of the nature and goals of a project, it is clear that this aligns well with the general ethical stance of contemporary documentary and revitalization work. However, informed consent is an area where ethics review boards often recommend the use of culturally inappropriate written forms whose content is organized to minimize legal liability rather than to be genuinely informative (see, e.g., Bowern 2010, 900). This is, then, an area where the distinction between ethics and compliance in research is especially clear, and where special attention is needed in order to ensure ethical outcomes (see also Robinson 2010).

Similar issues can be raised with respect to intellectual property rights, though they are generally only especially contentious when documentary materials have commercial value. Problems can be found, in particular, if a speaker community’s notions of intellectual property do not align with those of existing legal regimes in a way which could result in a loss to the community. This could happen, for instance, if documentary materials revealed Traditional Ecological Knowledge, which may be quite valuable, but is not subject to copyright (see Austin 2010, 46; Newman 2012, 448). In such a case, the information would have to be protected by access restrictions (see section 5) rather than by copyright law.

16 Included among these are codes adopted by relevant scholarly societies, such as the Ethics Statement of the Linguistic Society of America (http://www.linguisticsociety.org/sites/default/files/Ethics_Statement.pdf). The Australian Linguistic Society also has adopted a set of policies on ethical conduct (http://www.als.asn.au/activities.html). Of particular note in the present context is their explicit articulation of the linguistic rights of Aboriginal and Torres Strait Islander communities.

7. Achieving ethical relationships in diverse contexts

If there is one overriding theme that ties together the points made in this chapter, it is that the foundation for ethical language documentation and revitalization is ethical relationships, and these, in turn, are best developed in a context of mutual understanding. This can be difficult to achieve, however, because this kind of work so often involves a researcher working in an unfamiliar cultural context and because of the complicated ways that the culture of research may interact with local moral values. Realizing ethical outcomes, therefore, requires not only expertise in general aspects of documentary practice but also knowledge of the cultural assumptions that inform both one’s own approach to a given project and those of its other stakeholders.

In practical terms, there are various concrete ways of arriving at an improved understanding of the complexities involved. First, it is important that those engaging in documentation and revitalization appreciate the ideological assumptions about language, speakers, and communities that underpin the work. Second, especially when working in parts of the world with which one has had little previous cultural contact, much can be gained by turning to the existing ethnographic literature on the region, in particular work on systems of social organization, exchange, and, if available, language ideologies. Third, one can reach out to people with experience working in a given region and with the communities where the work will take place, whether these are people from the region or those who have learned about it in other ways. As with anything as complex as “culture,” the more viewpoints one can gather, the better.

Finally, it should be said that the nature of the subject matter of this chapter has caused it to accentuate ethical problems over ethical successes. This is largely because following the core ethical precept of “do no harm” requires us to consider just what harms can come about in the first place. However, it would be wrong to assume that ethical dilemmas are a defining feature of documentation and revitalization. Most scholars will gladly talk about the many positive relationships and beneficial outcomes that developed over the course of their work. Learning how to orient one’s behavior in more ethical directions is not only about “doing no harm” but also about making good relationships even stronger.

References

Austin, Peter K. 2010. “Communities, Ethics and Rights in Language Documentation.” In Language Documentation and Description, vol. 7, edited by Peter K. Austin, 34–54. London: Hans Rausing Endangered Languages Project.
Austin, Peter K. 2013. “Language Documentation and Meta-Documentation.” In Keeping Languages Alive: Documentation, Pedagogy, and Revitalization, edited by Mari C. Jones and Sarah Ogilvie, 3–15. Cambridge: Cambridge University Press.
Austin, Peter K. 2014. “Going, Going, Gone? The Ideologies and Politics of Gamilaraay-Yuwaalaray Endangerment and Revitalization.” In Endangered Languages: Beliefs and Ideologies in Language Documentation and Revitalization, edited by Peter K. Austin and Julia Sallabank, 109–124. Oxford: Oxford University Press.
Austin, Peter K. and Julia Sallabank, eds. 2014. Endangered Languages: Beliefs and Ideologies in Language Documentation and Revitalization. Oxford: Oxford University Press.
Becquelin, Aurore Monod, Emmanuel de Vienne, and Raquel Guirardello-Damian. 2008. “Working Together: The Interface Between Research and Native People: The Trumai Case.” In Lessons from Documented Endangered Languages, edited by K. David Harrison, David S. Rood, and Arienne M. Dwyer, 43–66. Amsterdam: John Benjamins.
Berreman, Gerald D. 1962. Behind Many Masks: Ethnography and Impression Management in a Himalayan Village. Ithaca, NY: Society for Applied Anthropology.
Bowern, Claire. 2010. “Fieldwork and the IRB: A Snapshot.” Language 86: 897–905.
Bowern, Claire. 2015. Linguistic Fieldwork: A Practical Guide. 2nd ed. Houndmills, Basingstoke, Hampshire, UK: Palgrave Macmillan.
Bowern, Claire and Natasha Warner. 2015. “‘Lone Wolves’ and Collaboration: A Reply to Crippen and Robinson (2013).” Language Documentation & Conservation 9: 59–85.
Brooks, Joseph D. 2015. “On Training in Language Documentation and Capacity Building in Papua New Guinea: A Response to Bird et al.” Language Documentation & Conservation 9: 1–9.
Chelliah, Shobhana L. and Willem J. de Reuse. 2011. Handbook of Descriptive Linguistic Fieldwork. Dordrecht, The Netherlands: Springer.
Childs, G. Tucker, Jeff Good, and Alice Mitchell. 2014. “Beyond the Ancestral Code: Towards a Model for Sociolinguistic Language Documentation.” Language Documentation & Conservation 8: 168–191.
Christen, Kimberly. 2008. “Archival Challenges and Digital Solutions in Aboriginal Australia.” SAA Archeological Recorder 8: 21–24.
Crippen, James A. and Laura C. Robinson. 2013. “In Defense of the Lone Wolf: Collaboration in Language Documentation.” Language Documentation & Conservation 7: 123–135.
Crowley, Terry. 2007. Field Linguistics: A Beginner’s Guide. Oxford: Oxford University Press.
Cruz, Emiliana and Anthony C. Woodbury. 2014. “Collaboration in the Context of Teaching, Scholarship, and Language Revitalization: Experience from the Chatino Language Documentation Project.” Language Documentation & Conservation 8: 262–286.
Cysouw, Michael and Jeff Good. 2013. “Languoid, Doculect and Glossonym: Formalizing the Notion ‘Language’.” Language Documentation & Conservation 7: 331–359.
Czaykowska-Higgins, Ewa. 2009. “Research Models, Community Engagement, and Linguistic Fieldwork: Reflections on Working Within Canadian Indigenous Communities.” Language Documentation & Conservation 3: 15–50.
Debenport, Erin. 2010a. “Comparative Accounts of Linguistic Fieldwork as Ethical Exercises.” International Journal of the Sociology of Language 206: 227–244.
Debenport, Erin. 2010b. “The Potential Complexity of ‘Universal Ownership’: Cultural Property, Textual Circulation, and Linguistic Fieldwork.” Language and Communication 30: 204–210.
Debenport, Erin. 2015. Fixing the Books: Secrecy, Literacy, and Perfectibility in Indigenous New Mexico. Santa Fe, NM: School for Advanced Research Press.
Di Carlo, Pierpaolo and Jeff Good. 2014. “What Are We Trying to Preserve? Diversity, Change, and Ideology at the Edge of the Cameroonian Grassfields.” In Endangered Languages: Beliefs and Ideologies in Language Documentation and Revitalization, edited by Peter K. Austin and Julia Sallabank, 229–262. Oxford: Oxford University Press.
Dobrin, Lise M. 2008. “From Linguistic Elicitation to Eliciting the Linguist: Lessons in Community Empowerment from Melanesia.” Language 84: 300–324.
Dobrin, Lise M. 2009. “SIL International and the Disciplinary Culture of Linguistics.” Language 85: 618–619.
Dobrin, Lise M. 2014. “Language Shift in an ‘Importing Culture’: The Cultural Logic of Arapesh Roads.” In Endangered Languages: Beliefs and Ideologies in Language Documentation and Revitalization, edited by Peter K. Austin and Julia Sallabank, 125–148. Oxford: Oxford University Press.
Dobrin, Lise M., Peter K. Austin, and David Nathan. 2009. “Dying to Be Counted: The Commodification of Endangered Languages in Documentary Linguistics.” In Language Documentation and Description, vol. 6, edited by Peter K. Austin, 37–52. London: Hans Rausing Endangered Languages Project.
Dobrin, Lise M. and Josh Berson. 2011. “Speakers and Language Documentation.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 188–211. Cambridge: Cambridge University Press.
Dobrin, Lise M. and Rena Lederman. 2012. “Imagine Ethics Without IRBs.” Anthropology News 53: 20.
Dobrin, Lise M. and Saul Schwartz. 2016. “Collaboration or Participant Observation? Rethinking Models of ‘Linguistic Social Work’.” Language Documentation & Conservation 10: 253–277.
Dwyer, Arienne M. 2006. “Ethics and Practicalities of Cooperative Fieldwork and Analysis.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus Himmelmann, and Ulrike Mosel, 31–66. Berlin: Mouton de Gruyter.
Errington, Joseph. 2003. “Getting Language Rights: The Rhetorics of Language Endangerment and Loss.” American Anthropologist 105: 723–732.
Evans, Nicholas. 2001. “The Last Speaker Is Dead—Long Live the Last Speaker!” In Linguistic Fieldwork, edited by Paul Newman and Martha Ratliff, 250–281. Cambridge: Cambridge University Press.
Foley, William A. 2005. “Personhood and Linguistic Identity, Purism and Variation.” In Language Documentation and Description, vol. 3, edited by Peter K. Austin, 157–180. London: Hans Rausing Endangered Languages Project.
Good, Jeff. 2011. “Data and Language Documentation.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 212–234. Cambridge: Cambridge University Press.
Good, Jeff. 2012. “‘Community’ Collaboration in Africa: Experiences from Northwest Cameroon.” In Language Documentation and Description, vol. 11, edited by Peter K. Austin and Stuart McGill, 28–58. London: School of Oriental and African Studies.
Grenoble, Lenore A. and Lindsay J. Whaley. 2005. “Review of Language Endangerment and Language Maintenance, ed. by David Bradley and Maya Bradley, and Language Death and Language Maintenance, ed. by Mark Janse and Sijmen Tol.” Language 81: 965–974.
Grinevald, Colette. 2003. “Speakers and Documentation of Endangered Languages.” In Language Documentation and Description, vol. 1, edited by Peter K. Austin, 52–72. London: Hans Rausing Endangered Languages Project.
Grinevald, Colette. 2006. “Worrying About Ethics and Wondering About ‘Informed Consent’: Fieldwork from an Americanist Perspective.” In Lesser-Known Languages of South Asia: Status and Policies, Case Studies and Applications of Information Technology, edited by Anju Saxena and Lars Borin, 339–370. Berlin: Mouton de Gruyter.
Grinevald, Colette. 2007. “Encounters at the Brink: Linguistic Fieldwork Among Speakers of Endangered Languages.” In The Vanishing Languages of the Pacific Rim, edited by Osamu Sakiyama, Osahito Miyaoka, and Michael E. Krauss, 35–76. Oxford: Oxford University Press.
Hale, Ken, Michael Krauss, Lucille J. Watahomigie, Akira Y. Yamamoto, Colette Craig, LaVerne Masayesva Jeanne, and Nora C. England. 1992. “Endangered Languages.” Language 68: 1–42.
Henderson, John. 2013. “Language Documentation and Community Interests.” In Keeping Languages Alive: Documentation, Pedagogy, and Revitalization, edited by Mari C. Jones and Sarah Ogilvie, 56–68. Cambridge: Cambridge University Press.
Hill, Jane H. 2002. “‘Expert Rhetorics’ in Advocacy for Endangered Languages: Who Is Listening, and What Do They Hear?” Journal of Linguistic Anthropology 12: 119–133.
Holton, Gary. 2009. “Relatively Ethical: A Comparison of Linguistic Research Paradigms in Alaska and Indonesia.” Language Documentation & Conservation 3: 161–175.
Hymes, Dell H. 1968. “Linguistic Problems in Defining the Concept of ‘Tribe’.” In Essays on the Problem of Tribe: Proceedings of the 1967 Annual Spring Meeting of the American Ethnological Society, edited by June Helm, 65–90. Seattle: University of Washington Press.
Innes, Pamela. 2010. “Ethical Problems in Archival Research: Beyond Accessibility.” Language and Communication 30: 198–203.
Kroskrity, Paul V. 1992. “Arizona Tewa kiva Speech as a Manifestation of Linguistic Ideology.” Pragmatics 2: 297–309.
Ladefoged, Peter. 1992. “Another View of Endangered Languages.” Language 68: 809–811.
Leonard, Wesley Y. and Erin Haynes. 2010. “Making ‘Collaboration’ Collaborative: An Examination of Perspectives That Frame Linguistic Field Research.” Language Documentation & Conservation 4: 268–293.
Lüpke, Friederike. 2011. “Orthography Development.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 312–336. Cambridge: Cambridge University Press.
Lüpke, Friederike and Anne Storch. 2013. Repertoires and Choices in African Languages. Berlin: De Gruyter Mouton.
McLaughlin, Fiona and Thierno Seydou Sall. 2001. “The Give and Take of Fieldwork: Noun Classes and Other Concerns in Fatick, Senegal.” In Linguistic Fieldwork, edited by Paul Newman and Martha Ratliff, 189–210. Cambridge: Cambridge University Press.
Meek, Barbra. 2010. We Are Our Language: An Ethnography of Language Revitalization in a Northern Athabascan Community. Tucson: University of Arizona Press.
Nathan, David. 2010. “Archives 2.0 for Endangered Languages: From Disk Space to MySpace.” International Journal of Humanities and Arts Computing 4: 111–124.
Nevins, M. Eleanor. 2013. Lessons from Fort Apache: Beyond Language Endangerment and Maintenance. Chichester, West Sussex, UK: Wiley.
Newman, Paul. 2012. “Copyright and Other Legal Concerns.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 430–456. Oxford: Oxford University Press.
Olivier de Sardan, Jean-Pierre. 1995. “La politique du terrain: sur la production des données en anthropologie.” Enquête: anthropologie, histoire, sociologie 1: 71–109.
O’Meara, Carolyn and Jeff Good. 2010. “Ethical Issues in Legacy Language Resources.” Language and Communication 30: 162–170.
Rehg, Kenneth L. 2004. “Linguists, Literacy, and the Law of Unintended Consequences.” Oceanic Linguistics 43: 498–518.
Reiman, D. Will. 2010. “Basic Oral Language Documentation.” Language Documentation & Conservation 4: 254–268.
Rice, Keren. 2006. “Ethical Issues in Linguistic Fieldwork: An Overview.” Journal of Academic Ethics 4: 123–155.

440 Jeff Good Rice, Keren. 2012. “Ethical Issues in Linguistics Fieldwork.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 407– 429. Oxford: Oxford University Press. Robinson, Laura C. 2010. “Informed Consent Among Analog People in a Digital World.” Language and Communication 30: 186–191. Sakel, Jeanette and Daniel L. Everett. 2012. Linguistic Fieldwork. Cambridge: Cambridge University Press. Seifart, Frank. 2006. “Orthography Development.” In Essentials of Language Documentation, edited by Jost Gippert, Nikolaus Himmelmann, and Ulrike Mosel, 275–299. Berlin: Mouton de Gruyter. Slotta, James. 2012. “Dialect, Trope, and Enregisterment in a Melanesian Speech Community.” Language and Communication 32: 1–13. Terrill, Angela. 2002. “Why Make Books for People Who Don’t Read? A Perspective on Documentation of an Endangered Language from Solomon Islands.” International Journal of the Sociology of Language 155/156: 205–219. Thieberger, Nicholas and Andrea Berez. 2012. “Linguistic Data Management.” In The Oxford Handbook of Linguistic Fieldwork, edited by Nicholas Thieberger, 90–118. Oxford: Oxford University Press. Thieberger, Nicholas and Simon Musgrave. 2007. “Documentary Linguistics and Ethical Issues.” In Language Documentation and Description, vol. 4, edited by Peter K. Austin, 26–37. London: Hans Rausing Endangered Languages Project. Toelken, Barre. 1996. “From Entertainment to Realization in Navajo Fieldwork.” In The World Observed: Reflections on the Fieldwork Process, edited by Bruce Jackson and Edward D. Ives, 1–17. Urbana and Chicago: University of Illinois Press. van Driem, George. 2016. “Endangered Language Research and the Moral Depravity of Ethics Protocols.” Language Documentation & Conservation 10: 243–252. Warner, Natasha, Quirina Luna, and Lynnika Butler. 2007. “Ethics and Revitalization of Dormant Languages: The Mutsun Language.” Language Documentation & Conservation 1: 58–76. Whaley, Lindsay J. 2011. 
“Some Ways to Endanger an Endangered Language Project.” Language and Education 25: 339–348. Wilkins, David P. 1992. “Linguistic Research Under Aboriginal Control: A Personal Account of Fieldwork in Central Australia.” Australian Journal of Linguistics 12: 171–200. Woodbury, Anthony C. 2011. “Language Documentation.” In The Cambridge Handbook of Endangered Languages, edited by Peter K. Austin and Julia Sallabank, 159– 186. Cambridge: Cambridge University Press. Woolard, Kathryn. 1998. “Language Ideology as a Field of Inquiry.” In Language Ideologies: Practice and Theory, edited by Bambi Schieffelin, Kathryn Woolard, and Paul Kroskrity, 3–49. Oxford: Oxford University Press.

Part III

Language Revitalization

Chapter 19

Approaches to and Strategies for Language Revitalization

Leanne Hinton

1. Introduction

Language revitalization is not an automatic response to language endangerment. In fact, endangerment itself is to a large extent driven by community-internal attitudes that the community’s language is inferior to the encroaching socially dominant language, and no longer useful. These internal feelings are in turn driven in part by external evaluations of the language as useless and as something that should be given up for the sake of assimilation to the larger society. This sense of inferiority of the language and culture is constantly reinforced through education and policies of the larger society. Furthermore, even those who love their language can easily feel a paralyzing sense of despair and hopelessness as they observe people ceasing to use it and children not learning it. A change of attitude needs to happen before language revitalization can occur (Bradley 2003; A. King, Chapter 23, this volume, and J. King, Chapter 26, this volume).

How this change can occur tends to relate to several factors from both outside the community (such as a change in language policy) and inside (such as economic improvement within the community). But in my experience with endangered languages, I have most often seen language revitalization develop steam primarily through the actions and inspiration of individuals within the community, and usually these are individuals of the generation that did not learn their language in the home, and feel the loss. There are many stories of people who have begun a language revolution through their own personal acts, either with the encouragement of their community or simply on their own (e.g., Fellman 1973; Baldwin et al. 2013; Grounds and Grounds 2013; little doe baird¹ 2013; see also Baldwin and Costa, Chapter 24, this volume). People outside the speech community (funders, researchers, consultants, teachers) can be helpful and even inspiring in the process, but it takes inspiration and commitment from within for language revitalization to begin and to make progress.

¹ little doe baird purposely spells her name in lower case.

I will focus on four main aspects of the revitalization of endangered and sleeping languages: child learning, adult learning, modernization, and language use. Child learning includes school and home as the main venues; adult learning can occur through university classes, community classes, Master-Apprentice approaches, or learning from documentation (all of which can shade into each other). Modernization includes new vocabulary development and other kinds of language engineering, and use of new writing systems. Language use is the ultimate goal of the other aspects, but for endangered languages, using the language has to begin as a consciously planned endeavor with its own approaches and strategies. Each of these four aspects of language revitalization functions as a strategy for the success of the other three aspects as well, as we shall see. In the best of all worlds, all four of these aspects synchronize for a strong program.

2. Child learning

A community begins to be aware that their language is endangered when it becomes obvious that many of their children are not learning their heritage language at home, or are ceasing its use at school age. But as mentioned above, the call to action is often later than that, when the parents themselves have grown up without their language and feel the significance of this loss. The home has faltered as a venue for learning the heritage language, and the school both historically and currently focuses on the mainstream language of the larger society, overtly or unconsciously discouraging the use of the heritage tongue. Yet children are the master language learners of humanity, so if they can be exposed to the local language early enough, thoroughly enough, and long enough, it can be hoped that the community’s shift away from the language can be reversed.

2.1. Language at home

It seems that the obvious place to begin language revitalization would be in the home itself. And it is often the case that individuals take the reins of language revitalization in just that way. The Miami language is one case in point (Baldwin and Costa, Chapter 24, this volume), where what has become a strong community-wide revitalization program began with a single family using their language at home. The inspiration for Hebrew language revitalization is often said to be Ben Yehuda’s decision to speak only Hebrew to his son (Fellman 1973). There are many examples of people choosing to use their endangered language at home with their children (Hinton 2013). This approach to language revitalization allows children to learn their language early and naturally, the way everyone learns their first language.

While there are many families that have committed to using their endangered language at home with their children, they are generally taking the plunge on their own. Community-based language revitalization rarely starts with family language support programs; community-based efforts have tended to ignore family-based possibilities or treat them as an afterthought. In fact, I have sometimes seen that even the teachers teaching the endangered language in schools fail to use it in their own home environment. But in recent years, a number of family-based programs have been developed, notably the Thousand Homes Māori language program in New Zealand (Kotahi Mano Kāika - 1000 Ngāi Tahu Homes, http://www.kmk.maori.nz/home/). An excellent resource for Māori parents is Kei Roto I te Whare: Māori language in the home (Te Kāwanatanga o Aotearoa 2008), which is downloadable from the web (https://www.tpk.govt.nz/en/a-matou-mohiotanga/language/kei-roto-i-te-whare-reprinted). In Scotland, Finlay MacLeoid used to run the Taic/CNSA Family Language Plan for young parents (MacLeoid 2013). In 2016, the Tolowa nation in California started a program for five families, led by Pyuwa and Ruby Bommelyn, the first family to use the Tolowa language at home with their children in this era of revitalization (see an interview with Ruby in Medina 2015). In almost all the families in these programs, the parents are second-language learners, sometimes learning along with their children. Also, as new generations of children grow up who have learned their language from second-language parents or from immersion schools, many of them are choosing to use the language at home with their own children. Thus language revitalization in the family is a growing phenomenon and a very hopeful result of the hard work people have been doing over decades of other approaches to language revitalization.

Family language programs first must make sure that parents have the opportunity to learn the kind of language they need at home with their children. Even parents who have become proficient through school or university programs find that once they are trying to use the language with their babies they lack the vocabulary they need for the intimate details of their interaction. So family language programs teach the language that parents would use with their children. The Māori (O’Regan 2013) and Scots Gaelic programs teach domains of language such as getting children dressed or changing an infant’s diapers, waking up in the morning, feeding children, getting into car seats, praise and endearments, and getting children to bed at night (MacLeoid 2013). Families learning on their own may seek elders who know the language to ask for instruction in these domains (Grounds 2013) and, beyond that, may seek to learn some traditional aspects of child-rearing. Parents who are learning the language may learn alongside their children in a family program, or may bring home what they learn in an adult program and teach it to their children. Parents learning their language as a second language may try to be consistent in using what they know even though they might have to use the mainstream language the rest of the time (Baldwin 2013).

Inserting the language into a household where English has been used is often tricky. It may feel unnatural to start interacting with one’s family in that language, and the tendency to slip back into the first language is hard to overcome. Reminding each other to stay in the language is important; parents can remind a child who has slipped into English that they know how to say it in their other language, or may even refuse to respond unless the child says it in the language. It can be fun for the children to turn the tables and remind the adult to speak the language. Children may feel uncomfortable with the shift into a different language at first. One strategy to get their cooperation is to start with games and other fun activities that use the language (Hinton 2013b). It is also critical to talk to your children about why using this different language is important and what its benefits are. Both the Māori and Scots Gaelic programs suggest making a family language plan in which all members of the family, including children, are involved in the planning. Families trying to make their heritage tongue a language of home also benefit from forming relations with other families with the same goals and getting together socially, which shows the usefulness of the language beyond the walls of the household.

Re-establishing the heritage language as a home language is language revitalization at its best. It recreates natural language transmission across generations and makes it a normal part of life. But language at home by itself does not guarantee its survival in the lives of the children. Without additional support from sources outside the nuclear family (the extended family, other families, community programs, and/or school programs), the family may not be able to raise children who are strong enough and dedicated enough as speakers to pass the language on to their own children. While the home may have been the last bastion of endangered languages while they were in decline, the language faltered there too in the end, because of lack of support outside the home. This could easily happen again to families using their language at home if there were nothing else to support it.

2.2. Language nests

With language loss in indigenous communities around the world at a critical point in the 1980s, it was clear that if people wanted to save their languages something drastic would have to be done. Various programs involving the schools were and are being developed (see section 2.3), but some people realized that if families were not using the language at home, it would be best to start bringing the language to the children at the earliest possible age by other means. In 1982, New Zealand, followed in 1984 by Hawai‘i, developed the system of “language nests” (Kōhanga Reo in Māori, Pūnana Leo in Hawaiian), and this has been replicated in many locations throughout the world. Ideally, language nests are locations where pre-school-age children can spend a large part of their day, and the grandparent generation, who (again ideally) know the language, will communicate with the children entirely in that language. Thus even if the home is using the mainstream language, the children can be exposed sufficiently to the local language to become near-native speakers.

There are now language nests all over the world. Besides New Zealand and Hawai‘i, there are language nests (and immersion preschools for endangered languages under other labels) in the mainland United States; for example, the Esther Martinez fund of the Administration for Native Americans funded proposals for nine ongoing or planned language nests and other pre-school immersion programs in 2015 (http://www.acf.hhs.gov/ana/resource/native-languages-immersion-esther-martinez-initiative). In British Columbia, the First Peoples’ Cultural Council was funding ten language nests in 2016 (Aliana Parker and Suzanne Gessner, personal communication, July 4, 2016). The Northwest Territories of Canada alone has over twenty language nests (https://www.ece.gov.nt.ca/early-childhood-and-school-services/early-childhood/language-nests). There are also language nests in Mexico (Meyer and Soberanes Bojórquez 2009, 2010). In Europe, both the Skolt Saami and Inari Saami have language nests (Latomaa and Sirkku 2005, 171; Olthuis, Kivela, and Skutnabb-Kangas 2013), along with Lower Sorbian (http://www.witaj-sprachzentrum.de/index.php/de).

In the ideal language nest, the mainstream language of the nation is never heard, and children are engaged at all times in activities where the heritage language is being used. But this ideal is sometimes not reached, since even the grandparents are often so used to speaking the mainstream societal language to their younger relatives that they may have a hard time maintaining their heritage tongue with the children. And as the decades pass, the number of elders who speak the language is declining, so that it is harder to staff the language nests. Strategies for making language nests function as they are supposed to are always evolving, including developing strong curriculum and training speakers in how to remain in the language, make themselves understood, and engage the children. As the number of adult native speakers declines, various means of producing new fluent adult second-language speakers are also being developed to fill the teacher gap (see section 3).

Many language revitalization programs begin with language nests. While the very best time to introduce the language to children is in infancy and even in the womb, the years before school age are still a very fine time for language learning. Since many countries do not require formal pre-school education, laws and policies are more lenient toward minority language immersion in language nests than they are in the primary and secondary schools, making it easier for communities to implement a language nest than a school program. It is a delight to everyone involved to hear a tiny child singing or chattering easily in the heritage language.

However, language nests have only temporary success in developing new speakers if they are the only input a child has in the heritage language. Without further programs or use at home, once a child has left the language nest there will be no way to continue learning and using the language. Thus language nests are ideally followed by bilingual education or immersion schooling in the language (see section 2.3.2).

2.3. The role of schools in language revitalization

Formal education is a requirement in virtually all countries, and although school has been and is even now one of the most important reasons why local languages are endangered, there are many reasons why people turn to the schools for language revitalization. First, it is where a community’s children are together for a large portion of their waking hours; if some or all of those hours can be in the heritage language, a whole generation of young speakers might be created. Second, a school that changes its orientation toward the language and culture of the local community can, it is hoped, stop being an agent of language loss. Beyond that, it can become a location for teaching traditional culture and values as well, knowledge that has faded in the wake of colonization and mainstream education. In Hawai‘i, for example, the school day at an immersion school begins with Hawaiian song and teachings that inculcate Hawaiian values. High school electives may include such options as Hawaiian chant, hula, sailing and canoe navigation, and traditional gardening (Wilson and Kamanā 2001).

School programs are often developed and mandated across a broad swath of languages by national governments. In Australia, for example, part of government policy since 2009 has been to support “language and culture nests” (using the term “language nest” for a program in public primary schools, a definition different from the Māori and Hawaiian one).

Aboriginal Language and Culture Nests are an initiative of OCHRE, the NSW Government’s community-focused plan for Aboriginal affairs. They support local communities with realising their visions and aspirations to revitalise, reclaim, and maintain their traditional languages through the teaching of Aboriginal languages in schools. (https://www.aboriginalaffairs.nsw.gov.au/policy-reform/language-and-culture/nests)

Similarly, Mexico has a nation-wide bilingual education program for indigenous languages, with centralized training of bilingual teachers. Government-run programs are often criticized, and are often ineffectual for language retention (e.g., see Hamel 2008). But on a more local scale, there are also ways for communities to be in control of their own language programs and their own schools, though this is perhaps easier in some locations than in others. A high level of local control tends to produce better results (Hamel 2008, 328).

2.3.1. Bilingual education

In the 1970s and 1980s, bilingual programs were set up in many countries. Governments supported bilingual education as a way to improve the educational prospects of students whose first language was not the mainstream language of the school, while at the same time making sure that the children were learning the mainstream language. There are two main models of bilingual education: the transitional model and the maintenance model. The transitional model holds that once the child is proficient in the school language, education in their home language can be dropped. In the maintenance model, education in both languages is continued throughout, so that the first language continues to be supported. (A comparison of the transitional and maintenance models can be found at http://www.idra.org/IDRA_Newsletter/April_2001_Self_Renewing_Schools_Early_Childhood/Boosting_Our_Understanding_of_Bilingual_Education/.)

But for endangered languages, many communities see bilingual education as something else: an opportunity to strengthen their languages, and to bring the local language into the school even if it is not the home language. Thus bilingual education became one approach to language revitalization. Children were coming to school knowing English or some other mainstream language and not knowing their local heritage language. Bilingual education became an avenue for learning the local language as well as using it for a good deal of the [instructional content] (Watahomigie and Yamamoto 1987).

2.3.2. Immersion schools

Immersion schooling is an intense response to the loss of the heritage language in home and community. Immersion schools go beyond bilingual education in that most or all education takes place in the heritage language. In the most intense form, the mainstream language will be introduced only as a “foreign” language, although the students generally know the mainstream language anyway, from exposure at home and in their daily lives outside of school.

Children may not know the heritage language at all when they arrive for their first days in the immersion school. But immersion learning at any age takes place through the medium of actions and activities which allow “comprehensible input” for language learning (Krashen 1983). A very first communication, for example, might be when the children come into the room and the teacher greets them and shows them where to put their jackets and other belongings, talking about it all at the same time in the target language. She may lead them to the rug in the center of the room and tell them to sit down, using understandable gestures as nonverbal cues. She will use pictures to tell stories or to explain various topics, or talk about items that are passed around the room. Weather may be a daily topic, with pictures that the teacher can point to. She can ask questions such as “Is it raining?” in the target language, and by the second day, if not the first, the students are likely to be able to say “yes” or “no” at least, and soon when she asks what the weather is like students will be able to say “It’s raining,” or another appropriate response. When it’s snack time, the foods will be named in normal conversation as they are given to the children.

These strategies for bringing students into the language come out of applied linguistics, where they were developed not with endangered languages in mind, but rather for foreign language teaching and teaching English as a second language (ESL). For example, a resource that has been used frequently in teacher-training for immersion schools is Asher (2000), by the developer of the popular TPR (Total Physical Response) method. Applied linguistics has much to offer to language learning of all kinds, including endangered languages (Cope and Penfield 2011).

Since an endangered language almost by definition exists within a society where a mainstream language surrounds it, it is usually deemed necessary to learn to use the mainstream language. Most activists in language revitalization desire that their children will grow up to be bilingual. If their children go on to higher education, it is likely to be at universities that teach in the mainstream language; and they will most likely get jobs that require it as well. Full-scale K-12 immersion schools must find a balance between the two languages in order to send their children forth in life as strong bilingual speakers. Immersion schools have found that it is important not to introduce the mainstream language too soon, since the endangered language is so vulnerable. Ke Kula Nawahiokalaniopu’u in Hilo, Hawai‘i, for example, introduces English as a “foreign language” in high school and occasionally uses English-language textbooks for some of its classes in the upper grades, although even in those classes classroom interaction is entirely in Hawaiian (Hinton, personal observation, 2014).

There are immersion schools all over the world, for both endangered and non-endangered languages. The language nests described above achieve their most effective outcomes in programs that also provide primary and secondary education in the language.

2.3.3. Minority language classes in majority-language schools

Many speech communities do not have the resources for a bilingual or immersion school. Even so, they may be able to develop language classes in the local schools. While a language class is rarely sufficient to create fluent speakers (as most of us know from our foreign-language classes in primary and secondary school), a language class can help students gain some level of knowledge of and appreciation for their heritage language that can serve as a foundation for further development. In the diverse Native American communities of California, for example, there are a number of small communities that teach language classes in the local schools. A few out of many possible examples:

• The Yocha Dehe Wintun Nation has founded their own school, the Yocha Dehe Wintun Academy, where language and culture are a strong component of their children’s education (http://www.yochadehe.org/tribal-government/yocha-dehe-wintun-academy). Through an imaginative language curriculum (pre-K through eighth grade) and a core of talented teachers, they have produced students with strong enough proficiency in their Patwin (Wintuan) language that at least one student has used it to pass her high school “foreign” language requirement (Leland Kinter, personal communication, May 21, 2015).

• The Yurok tribe has been teaching their language to tribal children and other interested students for many years now in five public schools in Humboldt County (Los Angeles Times, 2013).

• On the Round Valley Reservation, the talented teacher Cheryl Tuttle has been teaching the previously “extinct” (dormant) Wailaki language to fourteen excited high school students, partnering with University of California linguists and working with the extensive documentation on the language to develop her language lessons (North Coast Journal News Blog, June 9, 2015: http://www.northcoastjournal.com/NewsBlog/archives/2015/06/09/welcome-back-wailaki-an-extinct-native-language-rebounds).

In the public schools, teachers need a teaching certificate in order to run a classroom. This creates difficulties for teachers of endangered languages: there are very few people, especially in small speech communities such as those of Native California, who have both a teaching degree and strong speaking knowledge of their heritage language. To address this problem, pressure and planning by California Indians led to the passage of AB544 by the California State Legislature, which states that any federally recognized California Indian Tribe may develop and administer an assessment of a person’s fluency and recommend that person to receive a language teaching credential, which authorizes the holder to teach that language in California public schools and adult education courses.

3. Adult language learning

Even though it seems common sense to focus language revitalization on little children, who are such great language learners, adult speakers are critically necessary for language revitalization. Home-based language revitalization can’t happen without parents who can use the language. Language nests and immersion schools can’t function without teachers who speak the language. But it is typically the case that language revitalization is taking place during a time when adult speakers are diminishing in number, especially those of parental and professional age. A language revitalization program without a strong adult language-learning program will have great difficulty moving forward successfully. Thus adult language teaching and learning is an extremely important part of language revitalization.

Approaches to adult language learning are varied. I will examine a few of the dominant approaches: university and other institutional classes, the Master-Apprentice approach, and learning through linguistic documentation. Almost any program for adult language learning will have different results for different people, with some people gaining much higher proficiency than others. Much of the success in language learning depends on the learner’s degree of commitment and dedication, and their own ingenuity. Other factors are also important, such as the amount of available resources, previous exposure, and the degree of difference between the target language and the first language (http://www.languagetesting.com/how-long-does-it-take).

3.1. University and other institutional classes

All language-learning courses involve the teaching of vocabulary and grammar by some means. The most successful language classes at colleges and universities utilize immersion strategies to teach these, and go further to teach conversation and cultural aspects of language use as well. There are a growing number of successful language programs for endangered languages around the world. Māori, Hawaiian, Irish, and other endangered languages, especially those where a country or other large political unit has a single endangered language that is emblematic of that area, have strong language education at the university level. Universities in regions of great linguistic diversity have a harder time finding ways to assist in the learning of endangered languages, but there are some great models, including the University of Victoria in British Columbia. There, intensive course sequences on particular languages are rare, but the university has developed a major program in language revitalization itself, where a student can work toward a certificate or an MA in language revitalization, which also includes self-study of one’s heritage tongue through the Master-Apprentice approach (see section 3.2).

Excellent language learning programs have also been developed in college and university settings. For example, the aboriginal-owned and operated Six Nations Polytechnic on the Six Nations Reserve, in Ontario, Canada, teaches Mohawk using a special method of their own (the “Root-Word Method”) which also includes teaching in total immersion, always the most important factor in any method that produces proficient speakers (Hinton, personal observation, March 29, 2016). Another home-grown method is the Accelerated Second-Language Acquisition method (ASLA), developed by Arapaho speaker Stephen Neyooxet Greymorning, and taught at the University of Montana. This is an image-based method, also taught entirely through immersion. Greymorning has done trainings on ASLA in many places in North America, as well as Australia, and the method has become an important language-teaching tool for an increasing number of endangered languages.

3.2. The Master-Apprentice approach

The Master-Apprentice Language Learning Program was developed in California by the Advocates for Indigenous California Language Survival, a native-run organization founded in 1992 in response to the critical endangerment of all California Indian languages. California combines extreme linguistic diversity with very small populations, which makes it difficult or impossible for universities to develop effective language-learning programs. The Master-Apprentice approach is a bootstrap language-learning strategy based on the strategies that individuals have used to learn language on their own. The Advocates run workshops for one-on-one teams of a speaker and a learner (usually self-selected), and train them to immerse themselves together so that the learner can develop proficiency in the language. The “10 points for language learning” are the basis of the approach. The “10 points” are informal and have been restated in various ways; the original instantiation of the rules is below.

1. Leave English behind
2. Make yourself understood with non-verbal communication
3. Teach in full sentences
4. Aim for real communication in your language of heritage
5. Language is also culture
6. Focus on listening and speaking (not reading and writing)
7. Learn and teach the language through activities
8. Use audio and video recording
9. Be an active learner
10. Be sensitive to each other’s needs; be patient and proud of each other and yourselves!

Explanations and details of each of these points can be found in Hinton, Vera, and Steele (2002, 10–19). The explanations of these points for language learning and exercises to practice them take place over a two- to three-day workshop. From then on the teams are essentially on their own, although in Master-Apprentice programs there is usually regular mentoring by phone, occasional visits, and follow-up workshops at least once a year.

The Master-Apprentice approach (or Mentor-Apprentice approach, as it is alternatively named) has spread to many places, including Canada, Brazil, Australia, Scandinavia, and elsewhere in the United States. A recent list of Master-Apprentice programs has been developed by Ryan Henke, a graduate student in Linguistics studying language documentation and revitalization at the University of Hawai‘i at Mānoa. At the time of this writing, his list shows 123 languages around the world that are using, attempting, or planning to use the Master-Apprentice program (Ryan Henke, personal communications, June 18–August 11, 2016).

The Master-Apprentice approach is popular because it is relatively simple in concept, commonsense, and based on approaches that individuals have used throughout history to learn a language through immersion in a speech community. All one needs is a speaker and a person with a passion to learn the language. The difficulties come, though, from not actually having a location where there are lots of people using the language all day, every day. The team has to create their own speech community and their own immersion.

3.3. Learning from documentation

In this era of rapid decline of linguistic diversity, there are hundreds of languages that can no longer be labeled “endangered,” and are instead “extinct.” But even in these cases, so long as there is documentation, it may be possible for people to reconstruct, learn, and use the language. For this and other reasons, language activists prefer not to use the term “extinct” at all but instead use the terms “dormant” or “sleeping” (Hinton 2001).2 Linguist Wesley Leonard (2008, 23) writes that there is no such thing as an extinct language unless it has never been documented:

Miami was reclaimed from extinction within a proposed category of “sleeping languages,” which I define as those that are not currently known but that are documented, claimed as part of one’s heritage, and thus may be used again.

A documented language always has the possibility of revival among its people, and Leonard’s own tribe’s language, Myaamia (Miami), is a strong example (Leonard 2008; Baldwin 2013). Thus a documented language is not “extinct,” but rather it is dormant, or “sleeping.” The Ethnologue uses both the terms “extinct” and “dormant” now, defining the latter on the basis of community rather than documentation:

Although a dormant language is not used for daily life, there is an ethnic community that associates itself with a dormant language and views the language as a symbol of that community’s identity. Though a dormant language has no proficient users, it retains some social uses. In contrast, an extinct language is no longer claimed by any extant community as the language of their heritage identity. Extinct languages are lacking in both users and societal uses. (https://www.ethnologue.com/enterprise-faq/what-difference-between-dormant-language-and-extinct-language-0.)3

2 People sometimes react to the term “extinct” as somewhat of an insult. L. Frank, a Tongva artist and activist for her dormant language, often refers to this with humorous sarcasm when she introduces herself: “Hello, I’m L. Frank, and I’m extinct.”

The term “language revival” is sometimes used instead of “revitalization” for the reintroduction of a dormant language into modern use. Hebrew is the most famous (and perhaps the only) example of a fully revived language that was once dormant. The Cornish language (the Brittonic Celtic language indigenous to Cornwall) is another example of a dormant language that is in advanced stages of revival (Ferdinand 2013; Welsh Center for Language Planning 2015). In the United States, Miami (Baldwin 2013), Wampanoag (Makepeace 2011; littledoe 2013), and Chochenyo Ohlone are three examples of languages being revived; and in Australia, Kaurna is one of the best-known examples of language revival (Amery 2016).

A community without speakers needs to know how to access and utilize documentation for language revival purposes. Nicholas Thieberger has written a fine guide on this for Australian languages, based on a workshop held at the Australian Institute of Aboriginal and Torres Strait Islander Studies (AIATSIS) in 1993 (Thieberger 1995). Starting with the issue of how to identify one’s language, the guide gives instruction on how to find publications and materials on one’s language, how to understand the documenters’ writing systems and read the words, how to develop a consistent spelling system for one’s community, how to understand and use the grammar of one’s language, and how to use a computer to organize the data and create good learning resources.

In 1995, the Advocates for Indigenous California Language Survival developed the Breath of Life Language Restoration Workshop for California Languages, which has been held biennially ever since. Berkeley has a set of archives of field notes and recordings covering a century and a half of documentation on California Indian languages, and also even older documents from early scholars, and from the Mission era when California was part of Mexico.
Participants in the program come to learn how to find archival materials, how to read and analyze them, and how to use them for language learning and teaching, materials development, and actual use in their daily lives.

3 The Ethnologue also uses the term “awakening language” for a language in the process of revitalization.

Here are just a few recent examples of how Breath of Life has been useful in helping people do language revitalization from documentation:

• The comprehensive Mutsun dictionary co-authored by long-time BOL participant Quirina Geary and linguist Natasha Warner (now a professor at the University of Arizona), whom Geary first met at Breath of Life. This dictionary took over fifteen years of collaboration and includes every word ever recorded by the various linguists that worked with the last speakers. The dictionary is published by the online Journal of Language Documentation and Conservation (Warner, Butler and Geary 2016).
• The Wailaki language class being taught to high school students on the Round Valley reservation by the talented teacher Cheryl Tuttle, who attended Breath of Life in 2014, and has since worked regularly with her linguistic partners Justin Spence and Kayla Begay to develop the curriculum and language lessons that she delivers to her students. An article about Tuttle’s course can be seen at http://www.northcoastjournal.com/NewsBlog/archives/2015/06/09/welcome-back-wailaki-an-extinct-native-language-rebounds. A new participant from Round Valley attended Breath of Life in 2016, to study Yuki, the second of six languages in Round Valley, in order to start preparing to teach that language as well.
• The escalating career of Vincent Medina, Chochenyo, who first came to Breath of Life in 2012 and is now a very proficient speaker. He is an invited speaker at many events, giving welcomes in Chochenyo, telling stories, and advocating for language revival. He was one of the main editors at the office of the News from Native California magazine for some time, and while there conceived and implemented the regular column “In our words,” where in each issue there is a poem, a story, or other contribution in a California Indian language. He started the column off with an essay in Chochenyo that he composed himself (Medina 2014). In July 2016, he put together an exhibit on California Indian languages at the Maidu Museum in Roseville, California. Now deeply involved in a project of bringing California Indian foods to the public (see https://www.makamham.com/makamham-means-our-food/), Vince brings the vocabulary of food into the forefront, and still continues to teach Chochenyo to his family and use it with his friends.
• Louis Trevino, Rumsen, who was a talented undergraduate in Political Science at Berkeley, came to Breath of Life in 2014 and immediately declared a minor in Linguistics. He now runs a Facebook blog for Rumsen researchers and learners called “Learning and Using the Rumsen Language,” doing a sentence a day with context and analysis and adding a recording of the sentence; he generally sticks to a theme for several weeks, such as “Greetings,” “Emotions,” etc.
• A fairly large number of parents who have attended Breath of Life make it a priority to use their language at home with their children, as well as sharing songs and traditional activities learned or enhanced by research at the workshop, even when they are still learning the language themselves. This is perhaps the ultimate goal of language revitalization: transmitting the language and culture naturally again across generations.

Breath of Life by name and model has spread to other locations, including Oklahoma and Washington, DC (which also runs a biennial event, in the odd years between the Berkeley workshops). The Washington, DC Breath of Life is for all languages of the United States and Canada, and works with the vast archives at the National Anthropological Archives, the Library of Congress, and the National Museum of the American Indian. It has for the last several events been organized by Daryl Baldwin and the staff of his Myaamia Center at Miami University in Ohio. In 2015 a partnership was formed between the Myaamia Center and the Recovering Voices Project of the Smithsonian Institution, so that the two groups would work together on both the funding and the organizing of Breath of Life DC.

4. Language modernization

The problem with reviving languages with no speakers is that the documentation of a language is never complete. There may be many things missing in grammar and vocabulary, and even whole functions of conversation, such as conversational patterns or how to talk to children. A large part of language revival is actually language creation. Darrell Kipp describes it this way:

Our languages are adaptive, incorporating all we know since the beginning of our time. Think of how they describe our worlds; when our tribes first saw the horse, automobile and airplane. Think how our language stays with us no matter what inventions we encounter. It is only when we stop using them do they become inflexible and static. If we keep our language alive in our children, it will stay with them well past I-Pod, bio-fuel, MTV and the million other innovations coming towards them. Our languages can serve us to the end of time. . . . (Kipp 2009, 6–7).

4.1. New words

In an active language, new words are created and spread to others in all sorts of ways. Commercial companies hire people to do “product naming”; biologists seeking and finding new species of plants and animals have specific rules for developing scientific nomenclature. New items and concepts coming from other cultures may be named through borrowing from the language that created them. Or it can be an informal, organic process, such as the creative development and spread of slang. A language that has not been used on a daily basis for a while is behind the times in vocabulary development, and new topics and venues need new words in order for the language to be used again in daily life.

Depending on the size and intensity of a language revitalization program, approaches to vocabulary development may range from informal, impromptu use of descriptive phrases or borrowings in conversations between language learners trying to use their language, to full-on official language committees and new-word dictionaries. Some examples of dictionaries that focus on including new words are Māmaka Kaiao: A Modern Hawaiian Vocabulary (Hua’ōlelo 2003) and A Students’ Dictionary of Modern Cornish (Gendall 1991).

Here are some of the many possible strategies for developing new vocabulary:

• borrow a word from the language that the concept comes from, probably with phonological and (if relevant) spelling changes (for example, Koyukon kelaandas, “pencil,” from Russian karandásh (Denzer-King 2008); Cornish bytt, from English byte, in computer jargon (Gendall 1991)).
• adopt a word from a related language that is still in use (for example, Hawaiian pounamu “jade,” borrowed from Māori (Hua’ōlelo 2003)). While this would also count as an example of the strategy of borrowing a word (see above), the point here is that it is often considered more authentic to borrow from a closely related language.
• make a loan translation. A term in another language might consist of more than one word, such as a compound, whose components can be translated into one’s own language (for example, Kaurna wirltu yarlu “sea eagle” is a loan translation from English “sea eagle”: wirltu “eagle”, yarlu “sea” (Amery 2016)).
• expand or shift the meaning of a word that already exists in the language to a new meaning (for example, the Havasupai word tñudga “to write” is expanded from its original meaning “to make a design (e.g., on a basket)”) (Hinton et al. 1984).
• modify a word that already exists, using affixation or another grammatical process to signal a new meaning (for example, Umatilla Sahaptin pluuswit’awas, “computer,” from pluus, “brain” + wit, abstractive suffix + awas, instrumentative suffix (Denzer-King 2008)).
• create a phrase that describes the object or concept (for example, Havasupai “Bible”: tñud ñaa glab, literally “flat black book,” from tñud “book”, ñaa “black”, glab “flat” (Hinton et al. 1984)).

People at work reviving their languages can study the documentation to find the strategies that the former speakers used to coin new words. A small group of people trying to revive their sleeping language by using it together informally can simply apply these various strategies as they struggle to converse. But some kind of authoritative decision-making is essential in large groups that are doing immersion schooling, simply to teach the subject matter of the classroom. The Hawaiians and the Cherokees are among the many groups that have official language committees, to decide on systematic principles, choose between suggested alternatives, and keep their language developing in a single direction across multiple communities and schools.

4.2. Writing systems

For indigenous languages that do not have a long history of written literature, writing systems must be devised and/or chosen from available choices. This is an involved process that may take years or even generations to settle, as intellectual and social changes affect people’s choices (Grenoble and Whaley 2006; Hinton 2014).

There is often a strong history of documentation of otherwise unwritten languages by linguists and other people who have taken on the task of documentation. Some groups have decided to use these linguistic orthographies. Some First Nations programs in coastal British Columbia have done that; because their languages are famed for their large numbers of consonants with unusual points and manners of articulation, the special symbols of the International Phonetic Alphabet seem most appropriate to represent them. Using the same writing system in which their language may already have a large amount of documentation also gives people access to that documentation, which may be an important resource for language revitalization.

In other cases, it is deemed beneficial to design a new writing system, which is often modelled after existing writing systems for the mainstream language that community members already know. Thus we see that most revitalizing languages in colonized and post-colonial countries of Europe and the New World will have alphabetic writing systems, using the symbols that are already familiar to the local community and easily available on typewriters and computers. Sounds not in the mainstream language can be represented by digraphs, diacritics, or redefinitions of how a given letter is pronounced.

In some cases, writing systems were developed within a community while the language was still strong. These orthographies may be very different from that of the mainstream language, and may not even be alphabetic. In North America, two of these systems are in use today in speech communities: the Cherokee syllabary and the Canadian Aboriginal Syllabics, used by many languages, and taught in immersion schools. Writing systems such as these, which have a long history within communities, are seen as part of the cultural traditions to be revitalized.

It is not necessary for a community to settle on a writing system before language teaching and learning take place. But certain important venues, such as immersion schools, must use a writing system. Once established, a writing system has many important benefits for language revitalization, such as access to or community development of dictionaries, pedagogical grammars, reference grammars, children’s books and other kinds of creative writing in the language, and the ability to communicate on social media through writing. Writing also gives the opportunity for more public display of the language, on street signs and maps, newspapers, and flyers. Many communities make pocket-sized phrasebooks that can be distributed to community members, not just for possible use but also to increase awareness and interest in the language.

4.3. Reconstructing grammar

Documentation may also be incomplete with regard to the wide range of grammatical constructions that would have existed when the language was being used in daily communication. Further research on documentation, possibly from deep in the history of the language, and borrowing grammatical features from related living languages are two of the main ways that the grammar of a language in the process of reclamation can be expanded. As Ferdinand writes for Cornish:

An additional obstacle faced by Revived Cornish was the incompleteness of its syntax, semantics and lexicon. Since Cornish had been silent for about a century, there was no possibility of consulting with traditional speakers in order to fill in gaps or resolve inconsistencies. The issue of grammar and syntax was basically resolved by Nance and A. S. D. Smith between 1920 and 1940 using the works of Lhuyd (1707), Stokes (1872) and Breton grammar, the closest language to Cornish, as a comparative model. Although there were some mistakes in the reconstruction, these were rectified as soon as they came to light. (Ferdinand 2013, 213)

In some cases, there is simply no further documentation, and no related languages. This was the case for the California language Esselen, the first California language to go dormant (Golla 2011, 112). Linguist David Shaul has retrieved as much of the grammar as possible through analysis of the small corpus of words and phrases that exists (Shaul 1995) and has worked with members of the Esselen community on language revitalization at the Breath of Life workshops in Berkeley. Esselen may seem like a close-to-hopeless case. But despite the minimum of information on morphology and syntax, what has come out of the Esselen efforts of language revival is some lovely verbal art by people who identify with the Esselen language: storytelling audio recordings by Louise Ramirez, powerful poetry by Deborah Miranda (published in Miranda 2012), and Esselen raps by Melissa Leal (Indian Country Today 2013. http://indiancountrytodaymedianetwork.com/2013/02/28/california-educator-bridges-generation-gap-hip-hop-147707).

This brings up the question of “authenticity” in revitalizing languages. Linguists and language activists alike may wish that the language they are bringing back could replicate closely the language of their ancestors. However, the new versions of languages in revitalization are very likely to be quite different, and not just because of the necessity of adding new vocabulary. In the case of Hebrew, for example, it has been argued that modern Hebrew is a “hybrid” language, bringing in many elements of grammar from European languages (Zuckermann 2009). Some people have suggested not just tolerating the kinds of language change that occur in revitalization but even consciously teaching a new version. An interesting article by Gary Holton discusses the kinds of phonological and grammatical changes that occur in the speech of second-language learners of endangered languages, and suggests accepting these changes and simply teaching the language that way (Holton 2009).

5. Language use

The biggest hurdle for both native speakers and language learners is to actually start using the language on a daily basis. For endangered languages, this is a major challenge. Just as elders in a community that has undergone language shift cease to use the language they grew up with because most of the community doesn’t know it, so do second-language learners find themselves without interlocutors. Furthermore, there are various social and psychological factors that make people silent, even if they know the language well.

The main strategy for developing the habits of language use is to join or create groups, times, and physical spaces where the language can happen. Of course an immersion school is one such space; and the home is another possible space. Beyond those important venues, sometimes people have set up “language houses” where the rule is that the language must be used most or all of the time. The Yurok tribe of California set up a program of “language pods,” regular gatherings where people use the language together, with a facilitator to make sure it happens (http://www.yuroktribe.org/departments/education/Yurok_Tribe_Language_Program/documents/PodParticipants2011.pdf). The Karuks have developed their own language pod program as well.

Lack of fluency should not preclude the use of the language. Language use can and should take place as a part of language learning. One strategy for a learner is to replace English words and phrases with the target language as he learns them. Learning through conversation also helps beginners start using the language. The Master-Apprentice Language Learning Program described above is focused strongly on language learning through conversational practice. The team practices speech related to different domains of activities, usually related to daily routines.
Lushootseed language teacher Zalmai Zahir has given added structure to learning domain by domain, asking his students to consider their daily activities within a certain room of their house, such as the kitchen or bathroom, and to ask for utterances related to those activities, which Zahir then translates into Lushootseed. The students must utter these sentences every day as they practice those activities, and build on them each week. Note that this practice is not exactly conversational in nature; it commonly consists of sequences of phrases such as “I am taking the knife,” “I am getting an onion,” “I am cutting the onion,” what Zahir calls “self-narration” (originally a literary term about first-person genres of writing). The goal is that over a period of months the student will master enough domains within that particular room (for example, in the kitchen it would consist of various domains around cooking, eating, cleaning, etc.) that the kitchen can become a “language nest” for the language being learned, and English will no longer be spoken in that space (Zahir 2015).

There are many ways to use language besides conversation. Robert Amery has pointed out that for the Kaurna language of Australia the primary use of the language is in the public sphere, where memorized ceremonial speech and prepared talks may be used (Amery 2016). In many programs, prepared self-introductions are a very common genre that learners develop early. People might compose songs in their language, or poems. They might learn traditional tales or translate English versions of tales back into their language, and tell them at gatherings. Even the tiniest gesture toward language use can have symbolic importance for revitalization, such as a tribal council deciding to vote “yes” or “no” in their language instead of English. A family may decide to give a traditional name to their children or their pets. Other public uses of the language take place in writing: street names and other public signs can be a way to bring the language back into a community. The internet gives people other ways to practice using their language. Language learners and second-language speakers can and do email each other or post on Facebook or other social media.

6. The role of linguistics

Woven throughout this chapter and this volume as a whole are examples of how analytical and applied linguistics are useful to language revitalization. An understanding of the grammatical rules of a language is essential to using it. Children learn these rules naturally through long-term exposure and practice, but adults learning their endangered language for the first time may need to become conscious of the rules in order to override a tendency to use the grammar of their first language (e.g., English) in place of that of their heritage tongue. Analytical linguistics can help with figuring out the grammar of a language; applied linguistics can help with ways people can learn their language effectively. Linguistics is also necessary to understand linguistic documentation: how linguists write down languages that do not already have a writing system (and how and why they might use linguistic transcription even for those languages that do have a writing system); how to pronounce the words written in linguistic orthography; and how the grammatical rules are figured out through elicitation or the analysis of texts. Through applied linguistics, the community members can learn how to effectively teach others the language.

Linguists coming into a community to document the language may be utilized for revitalization projects the community is interested in. Communities may hire linguists to help create dictionaries, pedagogical grammars, materials of various sorts, and curriculum for the schools. Much of the content of the field of Linguistics is changing in response to community needs and demands for language maintenance and revitalization. Analytical and theoretical linguists are learning applied linguistics in order to be of use to the communities. Documentary linguists have a new understanding of domains that should be documented to be helpful to present or future language revitalization efforts.
A new ethic in linguistics has taken root, where members of speech communities are seen as partners in mutually beneficial endeavors, rather than the old colonialist view of speakers as subjects to benefit science (see Good, Chapter 11, this volume). But increasingly, it is the community members themselves who have decided to get the education they need to become linguists. Many of the linguists mentioned in this chapter are indigenous people who have gotten degrees in order to learn their language and benefit their communities.

462 Leanne Hinton

7. Other important factors

I have said little about language planning, which is dealt with elsewhere in this volume (see Cahill, Chapter 14, this volume; Wright, Chapter 28, this volume). Almost any speech community that is doing language revitalization will (either from the beginning or at some later point) develop councils or committees that plan strategies for advancing revitalization. Frequent review of the strategies for improving language learning, increasing language use, and increasing the public profile of the language is itself an important strategy for revitalization.

Language revitalization within a community is generally part of a constellation of efforts involving the revitalization of other aspects of its culture, reclaiming traditional lifeways in a broader sense. Communities may be reviving forms of ceremony and spiritual activities. They may be reclaiming aspects of knowledge of care and use of wild or domestic plants, hunting, butchering and food preparation, or traditional arts such as dancing, song, material arts such as basketry, weaving, and traditional clothing, and traditional child-rearing practices and social values. Language revitalization can also be preceded or accompanied by efforts to regain control of a land base, or claim increased autonomy with regard to important social and political aspects of a community, such as education and lawmaking. Language is part of all these different aspects of life, and can inspire and be inspired by efforts to strengthen any of them.

References

Amery, Rob. 2016. Warraparna Kaurna!: Reclaiming an Australian Language. Adelaide: University of Adelaide Press.
Asher, James J. 2000. Learning Another Language Through Actions. 6th ed. Los Gatos, CA: Sky Oaks Productions.
Baldwin, Daryl, Karen Baldwin, Jessie Baldwin, and Jarrid Baldwin. 2013. “Myaamiaataweenki oowaaha: ‘Miami Spoken Here.’” In Bringing Our Languages Home: Language Revitalization for Families, edited by Leanne Hinton, 3–18. Berkeley, CA: Heyday Books.
Bradley, David. 2003. “Language Attitudes: The Key Factor in Language Maintenance.” In Language Endangerment and Language Maintenance, edited by David Bradley and Maya Bradley, 1–10. London: Routledge Curzon.
Cope, Lida and Susan D. Penfield. 2011. “‘Applied Linguist Needed’: Cross-Disciplinary Networking for Revitalization and Education in Endangered Language Contexts.” Language and Education 25: 267–271.
Denzer-King, Ryan. 2008. “Neologisms in Indigenous Languages of North America.” In Proceedings from the Eleventh Workshop on American Indigenous Languages, edited by Joye Kiester and Verónica Muñoz-Leo, 25–39. Santa Barbara Papers in Linguistics 19: 25–39. http://www.linguistics.ucsb.edu/sites/secure.lsit.ucsb.edu.ling.d7/files/sitefiles/research/papers/19/Denzer-King_vol19.pdf. Accessed August 8, 2016.

Fellman, Jack. 1973. The Revival of a Classical Tongue: Eliezer Ben Yehuda and the Modern Hebrew Language. The Hague, The Netherlands: Mouton.
Ferdinand, Siari. 2013. “A Brief History of the Cornish Language, Its Revival and Its Current Status.” e-Keltoi: Journal of Interdisciplinary Celtic Studies 2: 199–227.
Gendall, Richard. 1991. A Students’ Dictionary of Modern Cornish. Menheniot: Teer ha Tavas.
Golla, Victor. 2011. California Indian Languages. Berkeley: University of California Press.
Grenoble, Lenore A. and Lindsay J. Whaley. 2006. Saving Languages: An Introduction to Language Revitalization. New York: Cambridge University Press.
Grounds, Richard A. and Renée T. Grounds. 2013. “Family Language Without a Language Family.” In Bringing Our Languages Home: Language Revitalization for Families, edited by Leanne Hinton, 41–58. Berkeley, CA: Heyday Books.
Hamel, Rainer E. 2008. “Bilingual Education for Indigenous Communities in Mexico.” In Encyclopedia of Language and Education, 2nd ed., Vol. 5: Bilingual Education, edited by Jim Cummins and Nancy H. Hornberger, 311–322. New York: Springer Science + Business Media LLC.
Hinton, Leanne. 2001. “Sleeping Languages: Can They Be Awakened?” In The Green Book of Language Revitalization in Practice, edited by Leanne Hinton and Ken Hale, 413–417. San Diego, CA: Academic Press.
Hinton, Leanne, ed. 2013a. Bringing Our Languages Home: Language Revitalization for Families. Berkeley, CA: Heyday Books.
Hinton, Leanne. 2013b. “Bringing Your Language into Your Own Home.” In Bringing Our Languages Home: Language Revitalization for Families, edited by Leanne Hinton, 225–255. Berkeley, CA: Heyday Books.
Hinton, Leanne. 2014. “Orthography Wars.” In Developing Orthographies for Unwritten Languages, edited by Michael Cahill and Keren Rice, 139–168. Dallas, TX: SIL International.
Hinton, Leanne, Matt Vera, and Nancy Steele. 2002. How to Keep Your Language Alive: A Commonsense Approach to One-on-One Language Learning. Berkeley, CA: Heyday Books.
Hinton, Leanne and Past and present staff members of the Bilingual Education Program and the Havasupai Community. 1984. A Dictionary of the Havasupai Language. Printed by the Havasupai Bilingual Education Program (ms). Supai, AZ.
Holton, Gary. 2009. “Relearning Athabascan Languages in Alaska: Creating Sustainable Language Communities Through Creolization.” In Speaking of Endangered Languages: Issues in Revitalization, edited by Anne Marie Goodfellow, 238–265. Newcastle upon Tyne: Cambridge Scholars Publishing.
Hua’ōlelo, Kōmike. 2003. Māmaka Kaiao: A Modern Hawaiian Vocabulary. Hilo: University of Hawai‘i Press.
Kipp, Daryl. 2009. “Encouragement, Guidance and Lessons Learned: 21 Years in the Trenches of Indigenous Language Revitalization.” In Indigenous Language Revitalization: Encouragement, Guidance & Lessons Learned, edited by Jon Reyhner and Louise Lockard, 1–9. Flagstaff: Northern Arizona University, College of Education.
Krashen, Stephen D. and Tracy D. Terrell. 1983. The Natural Approach: Language Acquisition in the Classroom. San Francisco: The Alemany Press.
Latomaa, Sirkku and Pirkko Nuolijärvi. 2005. “The Language Situation in Finland.” In Language Planning and Policy. Vol. 1: Europe: Hungary, Finland and Sweden, edited by Robert B. Kaplan and Richard B. Baldauf, 125–232. Clevedon, UK: Multilingual Matters.
Leonard, Wesley. 2008. “When Is an ‘Extinct Language’ Not Extinct? Miami, a Formerly Sleeping Language.” In Sustaining Linguistic Diversity: Endangered and Minority Languages and Language Varieties, edited by Kendall A. King, Natalie Schilling-Estes, Jia Jackie Lou, Lyn Fogle, and Barbara Soukup, 23–33. Washington, DC: Georgetown University Press.
little doe baird, jessie. 2013. “How Did This Happen to My Language?” In Bringing Our Languages Home: Language Revitalization for Families, edited by Leanne Hinton, 19–30. Berkeley, CA: Heyday Books.
Lhuyd, Edward. 1707. Archæologia Britannica. Oxford: Theater.
Macleoid, Finlay M. 2013. “Taic/CNSA and Scottish Gaelic.” In Bringing Our Languages Home: Language Revitalization for Families, edited by Leanne Hinton, 209–221. Berkeley, CA: Heyday Books.
Makepeace, A. 2011. We Still Live Here—As Nutayunean: A documentary on Native American Language Revival (film). Independent Lens, PBS.
Medina, Vincent. 2014. “In Our Language. Chochenyo.” News from Native California. Spring 2014.
Medina, Vincent. 2015. “Heartbeats of the Language: Home-schooling in Tolowa.” Interview with Ruby Tuttle. News from Native California Blog, Spring 2014, 6–7. http://newsfromnativecalifornia.com/blog/heartbeats-of-the-language-home-schooling-in-tolowa/. Accessed August 11, 2016.
Meyer, Lois and Fernando Soberanes Bojórquez. 2009, 2010. El Nido de Lengua: Orientación para sus guías. Oaxaca, MX: CMPIO, CNEII, CSEIIO. http://jaf.lenguasindigenas.mx/docs/el-nido-de-lengua.pdf. Accessed August 11, 2016.
Miranda, Deborah. 2012. Bad Indians: A Tribal Memoir. Berkeley, CA: Heyday Books.
O’Regan, Hana. 2013. “Māori: My Language Story.” In Bringing Our Languages Home: Language Revitalization for Families, edited by Leanne Hinton, 80–100. Berkeley, CA: Heyday Books.
Olthuis, Marja-Liisa, Suvi Kivela, and Tove Skutnabb-Kangas. 2013. Revitalizing Indigenous Languages: How to Recreate a Lost Generation. Clevedon, UK: Multilingual Matters.
Shaul, David Leedom. 1995. “The Huelel (Esselen) Language.” International Journal of American Linguistics 51(2): 191–239.
Stokes, Whitley. 1872. Beunans Meriasek: The Life of Saint Meriasek, Bishop and Confessor: A Cornish Drama. London: Trübner and Co.
Te Kāwanatanga o Aotearoa. 2008. Kei Roto i te Whare: Māori Language in the Home (Booklet). Auckland: Te Puni Kōkiri. https://www.tpk.govt.nz/en/a-matou-mohiotanga/language/kei-roto-i-te-whare-reprinted.
Thieberger, Nicholas, ed. 1995. Paper and Talk: A Manual for Reconstituting Materials in Australian Indigenous Languages from Historical Sources. Canberra: Aboriginal Studies Press.
Warner, Natasha, Lynnika Butler, and Quirina Geary. Mutsun-English English-Mutsun Dictionary, mutsun-inkiS inkiS-mutsun riica pappel. Mānoa, HI (Language Documentation & Conservation Special Publication 11). Honolulu: University of Hawai‘i Press. http://nflrc.hawaii.edu/ldc/?p=988. Accessed August 11, 2016.
Watahomigie, Lucille J., and Akira Y. Yamamoto. 1987. “Linguistics in Action: The Hualapai Bicultural Bilingual Education Program.” In Collaborative Research and Social Change: Applied Anthropology in Action, edited by D. D. Stull and J. J. Schensul, 77–98. Boulder, CO: Westview.
Welsh Centre for Language Planning. 2015. Cornish Language Strategy 2015-25. Evaluation and Development Report. Newcastle Emlyn, Carmarthenshire: Welsh Centre for Language Planning.
Wilson, William H. and Kauanoe Kamanā. 2001. “‘Mai Loko Mai O Ka ‘I’ni: Proceeding from a Dream’—The ‘Aha Pūnana Leo Connection in Hawaiian Language Revitalization.” In The Green Book of Language Revitalization in Practice, edited by Leanne Hinton and Ken Hale, 147–176. San Diego, CA: Academic Press.
Zahir, Zalmai. 2015. “Language Revitalization Lecture” (Lecture notes). Lecture given at the 2015 National Breath of Life Archival Institute for Indigenous Languages, June 8. http://nationalbreathoflife.org/wp-content/uploads/2015/06/Zalmai-Zahir_-Lang_revitalization-June8-15-BOL.pdf.
Zuckermann, Ghil’ad. 2009. “Hybridity versus Revivability: Multiple Causation, Forms and Patterns.” Journal of Language Contact 2: 40–67.

Chapter 20

Comparative Analysis in Language Revitalization Practices: Addressing the Challenge

Gabriela Pérez Báez, Rachel Vogel, and Eve Koller

1. Introduction

This chapter addresses language revitalization broadly and is intended to contribute to our understanding of approaches to prevention or reversal of language shift.1 We refer to these approaches as language revitalization efforts whether they are intended for language maintenance, development, revitalization, or reclamation of a language. There are relatively few case studies reported in the relevant literature to date, and in many instances, the same cases are cited repeatedly. Hence, the intent of this chapter is to expand knowledge of ongoing revitalization efforts beyond the already commonly cited cases. We endeavor to document the diversity of revitalization efforts worldwide and to analyze them comparatively based on the Global Survey of Language Revitalization Efforts (henceforth the Survey) carried out by the Recovering Voices initiative at the Smithsonian Institution’s National Museum of Natural History and the Linguistics Department at the University of Hawaiʻi at Mānoa. This Survey is, to our knowledge, the first attempt to analyze revitalization activities comparatively across a broad variety of cultural and geographic contexts, to develop methods for quantitative analysis, and to shed light on correlations that may inform future revitalization practices.

In this chapter, we report on the results of a pilot of the Survey. In section 2 we define basic notions critical to the chapter. We also provide an overview of the relevant literature, contextualizing the state of knowledge of current revitalization practices and the dearth of case studies and comparative analysis of them. Section 3 explains the research rationale for the Survey. Section 4 describes the design of the Survey, followed by a report on the pilot data in section 5. Plans and expectations for the full deployment of the Survey are discussed in section 6, followed by conclusions in section 7.

This research has been made possible by funding from the Recovering Voices initiative of the Smithsonian’s National Museum of Natural History. We are grateful to our partners at the University of Hawaiʻi at Mānoa: Lyle Campbell, William O’Grady, Ken Rehg, and Andrea Berez and their students and alumni: Bradley Rentz, Raina Heaton, Carolina Aragon, and Kaori Ueki. We thank the assistants who contributed to this research: Kate Murray, Jessica Nesbitt, Chris Pérez, Skyler Rachlin, and Carlos Cisneros, and colleagues Daryl Baldwin, Haley de Korne, and Rozenn Milin, who provided us with valuable feedback on the pilot design. Above all, we are grateful to the respondents of the surveys analyzed in the pilot for taking the time to share their experiences with us.

1 “Language shift” is defined as the process of one language replacing another in an increasing number of domains of use. “Reversing language shift” (RLS), then, is a transparent term referring to the efforts to reverse this process. Fishman (1991) discusses in detail how those undertaking language planning for RLS are faced with the need to address questions related to the prestige of dialects, competing views on standardization in RLS, and the role of developing corpora to assist in RLS.

2. The state of knowledge about revitalization

In order to understand what remains unknown about revitalization, it is important to review what has been reported to date. The literature on language revitalization has expanded in the last two decades, reflecting the increased interest in developing effective responses to language endangerment. The literature is diverse, with several subgenres, including but not limited to: general overviews of language revitalization (e.g., Hinton and Hale 2013), expositions on specific methods of revitalization (e.g., Hinton 2013f), pedagogical issues (e.g., Hinton 2013e; Reyhner 1997), areal/language-specific case studies (e.g., Balcazar 2009), and acquisition assessments (e.g., Peter, Hirata-Edds, and Montgomery-Anderson 2008; Housman et al. 2011). There are also publications on sociological issues related to language revitalization, in particular, identity and more recently the health and social benefits of language vitality in indigenous communities (e.g., Chandler and Lalonde 2008; Whalen, Moss, and Baldwin 2016).

2.1. Terminology

The diverse nature of the terms used in the literature reflects the diversity of efforts to sustain individual languages. In this section, we provide definitions of some of the relevant terms, beginning with “language revitalization” as defined in King (2001, 23):

“Language revitalization, as I define it, is the attempt to add new linguistic forms or social functions to an embattled minority language with the aim of increasing its uses or users. More specifically, language revitalization, as conceptualized here, encompasses efforts which might target the language structure, the uses of the language, as well as the users of the language.”

King (2001, 23) states further that “Language revitalization is thus the process of moving towards renewed vitality of the threatened language.” Regarding “language maintenance,” Fishman (1964) established that it must involve intergenerational transmission. Pauwels (2008, 719) echoes this view, defining it as follows:

“The term language maintenance is used to describe a situation in which a speaker, a group of speakers, or a speech community continue to use their language in some or all spheres of life despite competition with the dominant or majority language to become the main/sole language in these spheres.”

Thus, maintenance differs from revitalization in that the goal is to continue the current level of use of a language so as to ensure its continuity. Generally, the language in question requires support to sustain its speaker base, rather than the more energy-intensive efforts to recreate speakers and domains of language use.

The terms defined thus far can overlap to a degree with other terminology such as “Reversing Language Shift” (RLS), “language development,” “planning,” “revival,” and “awakening.” Language development is defined in Ferguson (1968) as consisting of three major components: (1) an orthography, (2) standardization, and (3) modernization. Modernization refers to enabling the practical use of a language in modern scientific, business, and other contexts. Rubin and Jernudd (1971) defined language planning as a part of language modernization, stating that language planning was “deliberate language change” (1971, xvi). Conversely, language modernization is used to refer to a part of language planning that involves making necessary additions to a lexicon as language policymakers work to use the language in new domains, by using borrowings or creating neologisms (Christian 1988). Fishman (1987, 49) defines language planning as:

“The authoritative allocation of resources to the attainment of language status and corpus goals, whether in connection with new functions that are aspired to, or in connection with old functions that need to be discharged more adequately.” (see also Danesi 2015, 101)

Lewis, Simons, and Fennig (2016) define language development as “the result of the series of ongoing planned actions that language communities take to ensure that they can effectively use their languages to achieve their social, cultural, political, economic, and spiritual goals.”2 Dorian (1994, 481) characterizes the distinction between language revitalization and language revival. In the former, “the language survives, but precariously”; in the latter, “the language is no longer spoken as a vernacular; it may have ceased to be spoken rather recently, or it may have been out of use as a vernacular for a long time.”3 In other words, according to Dorian, language revival primarily refers to the awakening of a sleeping language—one lacking first-language speakers—whereas revitalization refers to reversing language shift. Akin to language revival is the term “language reclamation,” defined as “the revival of a language that has no native speakers as in Hebrew” (Zuckermann and Walsh 2011, 119). For Hebrew, the reclamation effort resulted in the use of the language as a primary language with official status within a nation-state. The terms “dormant language,” “awakening language,” and “re-awakening language” are used in these contexts to avoid the implications of terms that refer to extinction (cf. Leonard 2011). Such is the case of Myaamia (Leonard 2007; Baldwin, Costa, and Troy 2016), Wôpanâak, and numerous languages researched in the context of Breath of Life (Baldwin, Hinton, and Pérez Báez 2018).

There is diversity in the types of approaches developed to sustain the use of languages, which corresponds to the diversity of sociolinguistic contexts. Correspondingly, a variety of terms are used to refer to them, often interchangeably and even in contradictory ways. In designing the Survey, we simplified the terminology and used the term “revitalization” to refer to all efforts intended to support the use of a language independently of its vitality. This allowed us to streamline wording in the Survey and to avoid imposing categories on the respondents. We follow the same practice in this chapter and unless otherwise noted, we use the term “revitalization” in its broadest meaning.

2 https://www.ethnologue.com/language-development.

3 Dorian acknowledges a “fossilized use of the language” as a possibility even during periods in which it is no longer spoken as a vernacular.

2.2. What we know about language revitalization

Some of the earliest known efforts to revitalize languages include those for Hebrew, Cornish, Breton, and Frisian, among others. Hebrew revitalization began formally in 1889 through the establishment of the Language Council (Zuckermann and Walsh 2011, 116). Revitalization of Cornish in Britain began in the early 1900s (Ferdinand 2013, 214) and continues to this day. Local athletes and celebrities have participated in activities to promote the language; meetings for informal Cornish conversations started being held at pubs; and an online language class, Kernewek dre Lyther (http://www.kesva.org/KDL), was created (Ferdinand 2013, 218). As a result of these efforts, between 1980 and 2000, the number of Cornish speakers increased 600%, to approximately 300 fluent speakers, “including some native speakers,” as well as another 3,000 individuals who could hold conversations in the language (Ferdinand 2013, 215–216). Efforts in support of Breton began when its speaker base started to decline in the early 1900s (Milin, personal communication September 2016). West Frisian (Frysk) revitalization in the Netherlands began in the nineteenth century (Vellinga, personal communication June 2016). Elsewhere, movements such as the revitalization of Māori began in New Zealand in the 1970s (Reyhner 1997; Hale 2013c; King 2013), serving as a model for other efforts, including those for Hawaiian (Hinton 2013a; Warner 2013; Wilson and Kamanā 2013) and Ojibway (Grunewald 2015).

Much of the literature has centered on specific geographic regions, namely, the Pacific (particularly Polynesia), North America, and Europe. This is evident in the general overviews on language revitalization often cited, such as Hinton and Hale (2013), Grenoble and Whaley (2006), and Tsunoda (2006). These often feature case studies, for example: Navajo (Hale 2013b) and Karuk (Hinton 2013c) in North America, Hawaiian (Hinton 2013a; Warner 2013; Wilson and Kamanā 2013) and Māori (Hale 2013c; King 2013) in the Pacific, and Irish (Cotter 2013; Hale 2013a) and Welsh (Hinton 2013b; Morgan 2013) in Europe. Recently, the literature has expanded to cover a greater number of case studies for an improved coverage of the global landscape. Some examples include Asia (Tang 2011; Cardoso 2014), Australia (Amery 2004, 2016; Hale 2013d; Lowe and Walsh 2004; Bowern and James 2010; Hobson et al. 2010), Europe (Morris and Jones 2008; Costa 2014), and Latin America (Benjamin, Pecos, and Romero 1996; King 2004; Balcazar 2009). A few works go a step beyond this. Hornberger (2008) analyzes the role of schools in revitalization in Māori, Saami, Hñähñö and Quechua; Rohloff and Henderson (2015) contrast language development and revitalization of Guatemalan Mayan languages with that of African languages; and Pérez Báez, Rogers, and Rosés Labrada (2016) analyze language documentation for revitalization in a variety of Latin American contexts. Nevertheless, systematic, comparative analysis across sets of globally representative case studies remains an area needing attention.
The extensive literature on methods includes Master-Apprentice initiatives (Hinton 2013d), language nests (Johnston and Johnson 2002; Reyhner and Lockard 2009; First Peoples’ Cultural Council 2014; Borgia 2014), immersion schools, technology, and family/home-based language revitalization (Leonard 2007; Hinton 2013f). Reyhner (1997, 2000, 2015), Reyhner et al. (1999, 2003, 2013, 2015), Reyhner and Burnaby (2002), and Reyhner, Gilbert, and Lockard (2011) address many of the pedagogical issues in language revitalization. Hinton and Hale (2013), Grenoble and Whaley (2006), Tsunoda (2006), and Hinton (2013f) also focus significantly on methods. Rice (2009) addresses the benefits of collaboration between linguists and language activists.

A more recent theme is that of language acquisition for revitalization. Peter, Hirata-Edds, and Montgomery-Anderson (2008) discuss the acquisition of verb forms in a Cherokee language initiative. Housman et al. (2011) tested for vocabulary and selected syntactic features in Hawaiian. Just as works have analyzed the structural changes that occur in languages undergoing shift (Campbell and Muntzel 1989; Palosaari and Campbell 2010), other works report on structural changes in languages undergoing revitalization (NeSmith 2009; Zuckermann and Walsh 2011).

More recent yet is the topic of the social impact of revitalization. Hallett, Chandler, and Lalonde (2007) and Chandler and Lalonde (2008) published groundbreaking work reporting that in First People’s bands in Canada where at least 50% of members spoke the band’s language, teen suicide rates were significantly lower. The correlation was statistically significant, even when accounting for other factors. Whalen, Moss, and Baldwin (2016) identify additional studies that report on other social, spiritual, and health benefits associated with indigenous communities in which the language is spoken, e.g., lower tobacco and drug use, reduced violence, higher graduation rates, etc.

To summarize, the literature on revitalization is vast and growing, and includes general overviews as well as detailed reports on methods, structural linguistic changes during revitalization, acquisition, assessment, and social benefits. However, the literature remains grounded in language-specific case studies, whose number is limited in comparison with the number of revitalization efforts that likely exist worldwide. Our Survey is motivated precisely by the recognition that there is much to learn about revitalization through more comparative analysis. Section 3 focuses on some “known unknowns” about revitalization, which were taken into consideration during the development of the Survey.

3. Research rationale

As introduced in section 2, the case studies available in the literature are limited, concentrated in certain regions of the world and cited frequently, which hampers the ability to engage in synthetic analysis. This is partly because language revitalization is multifactorial and closely dependent on the cultural contexts of individual language communities, i.e., effective strategies for revitalization in one context are not necessarily suitable for others. Nevertheless, without a method of comparative analysis, revitalization practitioners, advocates, and scholars are left with limited opportunities to learn from each other or from an aggregate of practices and outcomes to identify key factors that might be found to facilitate—or hamper—revitalization.

To begin, we do not know how many revitalization efforts exist worldwide. To our knowledge, no effort has been carried out to inventory the revitalization efforts reported in the relevant literature systematically, much less those that are not described in the literature. We therefore cannot answer basic questions such as whether certain regions of the world have higher or lower numbers of efforts, or to what extent the number of efforts in any given region correlates to the number of languages in need of revitalization.

The literature has devoted attention to details of various methods to recreate the transmission of a language; however, we cannot ascertain how broadly these methods are used around the world. We also do not know whether there are correlations between region or context and certain types of revitalization methods. This is not trivial. For example, while language nests are lauded for the benefits they have yielded in numerous cases (e.g., Hawaiian and Māori), we do not know how viable they are in other contexts and why. Take for instance the case of Mexican indigenous languages. The Catalogue of Endangered Languages (ELCat) classifies seventy Mexican languages at various levels of endangerment. These languages therefore require recreating their speaker bases, and the language nest method seems to represent a reasonable approach to do so. However, the only language nests in Mexico are likely those in the Mixteca region, and at the time of writing of this chapter, the sustainability of these initiatives was in question (Meyer and Soberanes Bojórquez 2009). Information about any other efforts is thus crucial for understanding whether language nests are as applicable in Mexico as in other parts of the world or what the conditions might be to make them possible.

We know that revitalization is extremely demanding, requiring extraordinary commitment and dedication, often over a lifetime. The learning curve is steep, and practitioners are often overwhelmed. It would therefore be of tremendous value if future practitioners could have access to information about how certain variables might correlate with outcomes. For instance, could we develop a way to predict to what extent language nests might be feasible for a particular endangered indigenous language of Mexico? We can only attempt to answer questions of this type if we carry out comparative analyses, both quantitative and qualitative.

There is much to learn if we ask basic questions such as what revitalization practitioners around the world identify as the top five challenges to revitalization. Are these the same when analyzed by region, or country, or any other relevant classifications? Without information on how linguistic diversity, endangerment, and revitalization play out in a broad sample of contexts, our understanding of these topics and our consistent use of associated terminology are also hampered.
An example is the lack of clarity in naming methods such as “language nests,” “mother-tongue education,” “bilingual education,” and “immersion schooling.” These terms are often used loosely and in context-dependent ways. For instance, should the term “immersion schooling” be used in African contexts where reference is more generally made to mother-tongue education? In the relevant African contexts, is mother-tongue education a method of language revitalization or an education policy? Do language nests constitute mother-tongue education or immersion schooling? Beyond developing better terminology for practices of revitalization, it is critical that practitioners have access to enough details across the broadest spectrum possible to develop an understanding of the diversity of practices and their outcomes.

Finally, while forums exist for revitalization practitioners to gather, these tend to be regional. One reason for this is that the lingua franca of an event might limit the gathering to practitioners from particular regions where the language is spoken. Consequently, the experiences shared at such gatherings are not as varied or extensive as what could be shared on a global scale. Further, groups sharing a lingua franca might come from comparable contexts and take similar approaches to revitalization. As such, challenges that could seem insurmountable to practitioners in one region may have been addressed effectively elsewhere. Yet, without exposure to a network of practitioners and to different approaches and methods, practitioners will not benefit from the knowledge gained by others. The Survey seeks to foster wider exchange of experiences for the benefit of revitalization practitioners.


4. A pilot of the Survey of Global Language Revitalization Efforts

The pilot project is built upon two efforts: the development of a Global Directory of Revitalization Initiatives, and the design of the Survey itself.

4.1. Global Directory of Revitalization Initiatives In preparation for the design of the Survey, we carried out an extensive review of the relevant literature in order to understand which variables in language revitalization may need to be distinguished for comparative purposes and to develop a typology of the efforts. The revitalization efforts reported in the works we reviewed were entered into an MS Access Database, intended for the creation of a Global Directory of Revitalization Initiatives (henceforth the Directory).4 Next, we searched for additional initiatives in conference proceedings from The Foundation for Endangered Languages (FEL) and in grant abstracts from the National Science Foundation (NSF), The Endangered Language Fund (ELF), Language Conservation and Documentation (LC&D), and The Endangered Languages Documentation Programme (ELDP). We also directly contacted the regional directors of ELCat and other individuals involved in revitalization. In particular, our partners at the Linguistics Department of the University of Hawaiʻi at Mānoa facilitated an outreach effort at the 2015 International Conference on Language Documentation and Conservation (ICLDC). Finally, we conducted searches online. For each database entry, we recorded the primary source from which the initiative was documented, secondary source(s) such as websites, and basic contact information including the initiative’s website, physical address, phone number, email address, and a contact person if available. A second task involved the collection of more detailed data on each initiative. This included the year the initiative was established, language or languages the initiative works on, the ISO code of each language, the vitality of each language according to both ELCat and the Ethnologue, the initiative’s specific activities, and a mission statement or description of the initiative. 
For consistency, every description included the initiative’s name, type of programming, purpose, and the population it serves. These last two fields allowed us to include more detailed information beyond that documented in other fields. During the coding process, we developed a set of twenty-one activity types to cover the range of activities and methods described in the literature as well as individual initiatives’ descriptions of their own work. The set of activities included bilingual education, capacity-building, classes (not in immersion school), community programming, community training, cultural programming, curriculum development, documentation, family programming, immersion school, language camps, language nests, Master-Apprentice, mother-tongue education, online education, online resources, preschool, presence in media, resource creation, teacher training, and education. These categories were devised only after we had a broad enough sample of existing activity types. We then assigned each initiative in our database to as many activity types as were applicable to its particular approach.

Finally, we coded the initiatives based on a standard set of four categories referring to broader goals: (1) language revitalization, (2) language maintenance, (3) language reclamation, and (4) language support. The first three categories are defined in section 2. The fourth, language support, is a broad category, which we used specifically to refer to the work of organizations that support revitalization but do not work directly with particular language communities. An example from our database is the Muurrbay Aboriginal Language and Culture Co-operative, which conducts research on six languages and creates resources for teachers and learners within the language communities. The Co-operative provides tools and support for leaders in the communities rather than carrying out hands-on revitalization activities in the communities themselves. Each documented initiative was coded in the database for one of these four categories.

⁴ We are grateful to Bradley Rentz at the University of Hawaiʻi at Mānoa who was instrumental in this process as the designer of the database.

474 Gabriela Pérez Báez, Rachel Vogel, and Eve Koller
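The coding scheme described in section 4.1 (many activity types per initiative, exactly one broader goal category) can be pictured as a small record type. The following is only a minimal sketch in Python, not the structure of the actual MS Access Database: the class name `Initiative`, its field names, and the activity assigned to the example are illustrative assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Goal(Enum):
    """The four broader goal categories named in section 4.1."""
    REVITALIZATION = "language revitalization"
    MAINTENANCE = "language maintenance"
    RECLAMATION = "language reclamation"
    SUPPORT = "language support"


# The twenty-one activity types listed in section 4.1.
ACTIVITY_TYPES = frozenset({
    "bilingual education", "capacity-building",
    "classes (not in immersion school)", "community programming",
    "community training", "cultural programming", "curriculum development",
    "documentation", "family programming", "immersion school",
    "language camps", "language nests", "Master-Apprentice",
    "mother-tongue education", "online education", "online resources",
    "preschool", "presence in media", "resource creation",
    "teacher training", "education",
})


@dataclass
class Initiative:
    """Hypothetical Directory record; field names are assumptions."""
    name: str
    goal: Goal                                   # exactly one goal category
    languages: List[str] = field(default_factory=list)
    activities: List[str] = field(default_factory=list)  # as many as apply
    year_established: Optional[int] = None

    def __post_init__(self) -> None:
        # Reject activity labels that are not among the 21 coded types.
        unknown = set(self.activities) - ACTIVITY_TYPES
        if unknown:
            raise ValueError(f"unknown activity types: {sorted(unknown)}")


# The activity shown here is an illustrative reading of the description in
# the text, not the Directory's actual coding of this organization.
muurrbay = Initiative(
    name="Muurrbay Aboriginal Language and Culture Co-operative",
    goal=Goal.SUPPORT,
    activities=["resource creation"],
)
```

Keeping the goal as a single enumerated value and the activities as an open-ended list mirrors the distinction drawn in the text between the one-per-initiative goal coding and the many-per-initiative activity coding.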

4.2. Pilot Survey design

The Directory research informed the design of the Survey, which was created in SurveyMonkey. This platform was selected to make the Survey as broadly accessible as possible through the internet and to yield data that could be easily entered into the MS Access Database. The Survey had twenty-five questions displayed over five screens and was intended to take no more than thirty minutes. The Survey was designed in English and translated into Spanish.⁵ A combination of questions designed to elicit qualitative and quantitative data was included. Qualitatively oriented questions preceded quantitatively oriented ones, to give participants the opportunity to comment freely on a particular issue prior to synthesizing their thoughts into responses for the multiple-choice questions. Even in questions formatted as multiple-choice, we offered blank textboxes in addition to the pre-set options so that respondents could add categories that might be more suitable to their particular circumstances. The categories and labels used in the multiple-choice questions were based on further analysis of Directory data. All Survey questions were optional; participants were not required to answer every question.

⁵ The full-scale Survey was produced in seven language versions to make it fully accessible online and was broadcast as a crowdsourcing project. However, the pilot efforts could only include two languages. In selecting pilot survey participants we considered whether English and Spanish were languages accessible to them.

Comparative Analysis in Language Revitalization Practices 475

The pilot Survey began with three questions intended to obtain basic information about the language. Question 1 asked for the name or names of the language, the ISO code if known, and an indication of where the language is spoken. The field for the third element was an open textbox to allow respondents to provide as much explanation as they might consider relevant or necessary. For instance, in diaspora situations, the textbox allowed for an explanation about the home and sister communities where the language is spoken. Questions 2 and 3 were intended to get a sense of the language’s vitality. These questions were based on the ELCat scale shown in Table 20.1 (Lee and Van Way 2016). As such, question 2 asked respondents to select from a list of options about the makeup of the population of speakers, and question 3 asked for an estimated number of speakers.

Questions 4 through 15 focused on details about the particular effort that the respondent is engaged in. Specifically, we sought to understand how each initiative articulates its objectives, how it evaluates them, whether the objectives are being met, and whether there are correlations between the initiative’s outcomes and the availability or lack of resources. Questions 4 and 5 asked about the origins of the efforts. Question 6 asked for a brief description of the efforts, and question 7 was a multiple-choice question that asked about leadership in the efforts. These questions were intended to provide insights into whether there are commonalities across initiatives in terms of how their efforts begin and evolve, whether there might be a critical mass necessary to start an effective effort, and what the nature of the leadership of revitalization efforts is. Question 8 provided five numbered textboxes for the respondents to articulate their main objectives in their own words. The open-endedness of this question was critical, as the objectives of the efforts may range from raising awareness to recreating a speech community, with myriad objectives in between. Questions 9 and 10 followed up to ask for an assessment of how each of the numbered objectives is met and the criteria for the assessment. Questions 11 to 14 were a mix of qualitatively and quantitatively oriented questions that asked what has worked well and what has represented a challenge. With these questions we sought to shed light on the assets an initiative needs to meet its objectives at each phase of its evolution. Question 15 provided a textbox for the respondent to add any additional relevant information.

Questions 16–22 consisted primarily of multiple-choice questions about the activities that the initiative carries out. Question 16 provided a textbox to allow respondents to articulate the activities in their own words. Question 17 provided a list of fourteen types of activities from which respondents could select as many as they needed to characterize the work carried out in their revitalization efforts. As previous multiple-choice questions did, question 17 also provided the option for respondents to describe any activities they engage in that were not represented in the choices offered. Question 18 requested the grades in which school-based initiatives are carried out. Questions 19–21 were quantitatively oriented questions intended to elicit more details about the activities and their scope, i.e., their intended audience and the number of people who benefit from them, as well as the frequency of the activities, the extent to which the target language is used in each of them, and the resources that the activities require. Question 22 was another multiple-choice question requesting an assessment of each activity. Question 23 provided space to elaborate on the assessment openly, and question 24 provided a textbox for respondents to add any additional information that was not covered elsewhere. Finally, question 25 asked for contact information since the full Survey intends to do follow-up interviews on a selection of cases.

Before we move on to the analysis of the data in section 5, it is important to include a note about our choice of terminology for the Survey and for discussion of the revitalization efforts and their outcomes. We use the term “revitalization efforts” to make reference to a broader set of activities in support of a language beyond what the term “revitalization program” might allow for. We seek to include emerging efforts in addition to more established programs. We also include initiatives structured within institutions in addition to efforts led by individuals. Additionally, we do not use the word “success” to refer to the outcomes of revitalization efforts, since it creates a dichotomy of success and failure without capturing the complexity of the evolution of revitalization efforts. Furthermore, the term can be damaging to the morale of those involved in the efforts. Next, having outlined the basics of the Survey design and word choice, we move on to the analysis of some of the pilot results.

Table 20.1 ELCat Endangered Language Scale

| Level of Endangerment | Inter-generational Transmission | Absolute Number of Speakers | Speaker Number Trends | Domains of Use of the Language |
| --- | --- | --- | --- | --- |
| 5. Critically Endangered | There are only a few elderly speakers. | 1–9 speakers | A small percentage of the community speaks the language, and speaker numbers are decreasing very rapidly. | Used only in a few very specific domains such as in ceremonies, songs, prayer, proverbs, or certain limited domestic activities. |
| 4. Severely Endangered | Many of the grandparent generation speak the language, but younger people generally do not. | 10–99 speakers | Less than half of the community speaks the language, and speaker numbers are decreasing at an accelerated pace. | Used mainly just in the home and/or with family and may not be the primary language of these domains for many community members. |
| 3. Endangered | Some adults in the community are speakers, but the language is not spoken by children. | 100–999 speakers | Only about half of community members speak the language. Speaker numbers are decreasing steadily, but not at an accelerated pace. | Used mainly just in the home and/or with family, but remains the primary language of these domains for many community members. |
| 2. Threatened | Most adults in the community are speakers, but children generally are not. | 1,000–9,999 speakers | A majority of community members speak the language. Speaker numbers are gradually decreasing. | Used in some non-official domains along with other languages, and remains the primary language used in the home for many community members. |
| 1. Vulnerable | Most adults and some children are speakers. | 10,000–99,999 speakers | Most members of the community or ethnic group speak the language. Speaker numbers may be decreasing, but very slowly. | Used in most domains except for official ones such as government, mass media, education, etc. |
| 0. Safe | All members of the community, including children, speak the language. | >100,000 speakers | Almost all community members or members of the ethnic group speak the language, and speaker numbers are stable or increasing. | Used in most domains, including official ones such as government, mass media, education, etc. |
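To make the “Absolute Number of Speakers” row of Table 20.1 concrete, the lookup from an estimated speaker count to a level can be sketched as below. This is a minimal illustration under stated assumptions rather than ELCat's own scoring procedure: the fourth and fifth bands are read as 1,000–9,999 and 10,000–99,999 (the ranges used in Table 20.2), a count of exactly 100,000 is treated as “Safe” (following the “100,000 +” label in Table 20.2), and zero speakers is returned as `None` because the scale has no band for dormant or awakening languages. The function name is hypothetical.

```python
from typing import Optional, Tuple

# (lower bound, upper bound, level, label) for the speaker-number factor only.
# Band limits for levels 2 and 1 follow the ranges in Table 20.2 (assumption).
SPEAKER_BANDS = [
    (1, 9, 5, "Critically Endangered"),
    (10, 99, 4, "Severely Endangered"),
    (100, 999, 3, "Endangered"),
    (1_000, 9_999, 2, "Threatened"),
    (10_000, 99_999, 1, "Vulnerable"),
]


def speaker_number_level(speakers: int) -> Optional[Tuple[int, str]]:
    """Map a speaker estimate to the level of the speaker-number factor."""
    if speakers < 0:
        raise ValueError("speaker count cannot be negative")
    if speakers == 0:
        # Not covered by the scale: dormant or awakening languages.
        return None
    for low, high, level, label in SPEAKER_BANDS:
        if low <= speakers <= high:
            return level, label
    # 100,000 or more (treating exactly 100,000 as Safe is an assumption).
    return 0, "Safe"


print(speaker_number_level(5))          # (5, 'Critically Endangered')
print(speaker_number_level(1_000_000))  # (0, 'Safe')
```

The sketch covers one of the four factors only; as the pilot responses show, a single number rarely captures a language's situation, which is why question 2 allowed several descriptors to be selected.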

5. Analysis of pilot Survey data

In this section we analyze some of the results of the pilot Survey, focusing on observable trends and on refinement of research hypotheses to test during the eventual deployment of a full-scale Survey. The pilot Survey was sent to some forty revitalization initiatives selected from the Directory (described above) and from among practitioners with whom we have direct communication. These initiatives were selected to provide broad geographic coverage, and to include a range of types of initiatives, from emerging ones to those with a longer history, and a range of vitality situations. We received a total of thirty complete responses to the pilot Survey for two African languages, four European languages, four Asian languages, two Australian languages, and thirteen languages of the Americas. Some languages were discussed by more than one respondent; in one case, two people involved in one revitalization effort responded separately, and in the other, there were two different documented efforts for a single language.

5.1. Language community

Table 20.2 provides a breakdown of the languages and the information supplied by respondents about the estimated size of the language community. To recall, the ranges in numbers of speakers that were used in the Survey are those used by ELCat. The languages surveyed range from two awakening languages with emerging language communities, Miami-Illinois and Šmuwič, to Balinese with over 1 million speakers but still considered, by those involved in its revitalization, to exhibit early signs of language shift under the pressure from Indonesian. Responses articulated by respondents are in quotation marks.

Table 20.2 Speaker baseᵃ

| Language | How many people speak the language? |
| --- | --- |
| Kalinago | “Zero. This is a dormant language” |
| Šmuwič | “We are starting to basically speak but I would say at an elementary conversation level, about 5 people.” |
| Shmuwich, Chumash | “ ‘speak’ as in fluent, then 1–5, but ‘speak’ as in know words and are actively learning then 1-50” |
| Coeur d'Alene | 1–9 |
| Iquito | 1–9 |
| Ngunawal | 1–9 |
| Kaurna | 1–9 |
| Tunica | 10–99 |
| Kumeyaay | 10–99 |
| Tembé | 10–99 |
| Miami-Illinois | “there are 100 + language users” |
| Cornish | 100–999 |
| Babanki | 10,000–99,999 |
| Diidxazá | 10,000–99,999 |
| Zapoteca (Diidxazá) | 10,000–99,999 |
| Huichol | 10,000–99,999 |
| Zapoteco de Macuiltianguis | 1,000–9,999 |
| Jejueo | 1,000–9,999 |
| Manx Gaelic | 1,000–9,999 |
| Kari'nja | 1,000–9,999 |
| Udi | 1,000–9,999 |
| Kaqchikel | 100,000 + |
| Diidxazá | 100,000 + |
| TjiKalanga | 100,000 + |
| Balinese | 100,000 + |
| Frysk | 100,000 + |
| Breton | 100,000 + |
| Truku | “Not sure the exact number of the speakers because the villagers live in a different regions.” |

ᵃ Quotes are presented exactly as provided by respondents independently of typographic errors and stylistic preferences.

One valuable reminder that emerges from the pilot Survey relates to the difficulty of assessing the vitality of a language and the nuances that are lost in attempts to categorize the situations of the languages of the world. The development of the Directory described in section 4.1 made it clear that the Survey ought to be designed in such a way as to avoid presenting categories in a mutually exclusive format. As such, question 2 was formatted to allow for multiple vitality descriptors to be selected, thereby providing a more comprehensive understanding of the vitality situation. Indeed, seven respondents selected more than one vitality descriptor. Further, we received supplementary information for five languages and a customized response for Miami-Illinois. The data from question 2 is presented in Table 20.3, with the language community shown on the left and the vitality situation shown on the right. Responses articulated by respondents are in quotation marks; all others come from the multiple-choice options provided.

Table 20.3 Reported language vitalityᵃ

| Language | What is the situation of the language? |
| --- | --- |
| Babanki | Many of the grandparent generation speak the language, but the younger people generally do not. Most adults in the community are speakers, but children generally are not. |
| Balinese | Most adults and some children are speakers. |
| Breton | Many of the grandparent generation speak the language, but the younger people generally do not. |
| Coeur d'Alene | There are a few elderly speakers. |
| Cornish | There are no more first-language speakers. Some adults in the community are speakers, but the language is not spoken by children. |
| Diidxazá | Most adults in the community are speakers, but children generally are not. |
| Diidxazá | Most adults in the community are speakers, but children generally are not. |
| Frysk | Most adults and some children are speakers. All members of the community, including children, speak the language, but we want to make sure this doesn’t change |
| Huichol | Many of the grandparent generation speak the language, but the younger people generally do not. Some adults in the community are speakers, but the language is not spoken by children. Most adults in the community are speakers, but children generally are not. |
| Iquito | There are a few elderly speakers. There are 30 or so partial or passive speakers in their 40s and 50s but they do not use the language for purposes of communication. There is a nominal “bilingual” component to education but it is political not pedagogical. More generally, heritage community members have a conflicted relationship with the language and with indigenous identity more generally, so lg revitalization efforts are fraught with competing agendas. |
| Jejueo | There are a few elderly speakers. Many of the grandparent generation speak the language, but the younger people generally do not. Some adults in the community are speakers, but the language is not spoken by children. |
| Kalinago | There are no more first-language speakers. |
| Kaqchikel | Many of the grandparent generation speak the language, but the younger people generally do not. Most adults in the community are speakers, but children generally are not. |
| Kari’nja | Many of the grandparent generation speak the language, but the younger people generally do not. Some adults in the community are speakers, but the language is not spoken by children. |
| Kaurna | There are no more first-language speakers. There are a number of emerging speakers of Kaurna, who have learned Kaurna as adults. There are also one or two children who are acquiring Kaurna as a semi-first language. |
| Kumeyaay | There are a few elderly speakers. Some adults in the community are speakers, but the language is not spoken by children. It depends on the community. |
| Manx Gaelic | There are no more first-language speakers. There are now a number of new-native speakers whom are the children of people who learnt the language as adults |
| Miami-Illinois | “There is a growing number of novice language users, some of whom have had language since birth and consider Myaamia as one of their first languages.” |
| Ngunawal | There are no more first-language speakers. There is a little language knowledge in the Aboriginal community but the language has not been in active use for many years. |
| Šmuwič | There are no more first-language speakers. |
| Shmuwich, Chumash | There are no more first-language speakers. Some adults in the community are speakers, but the language is not spoken by children. “Most learners are English first language and not many children are learning.” |
| Tembé | Mostly only a few older people speak the language. There is one lady, however, who founded one village (basically her extended family) where Tembé is spoken by all, including the children. There are a few scattered families where the children also speak the language (though it is the language of daily communication only in that lady’s village). So: Basically older people, almost no children, but not quite—a few children are learning it. |
| TjiKalanga | Most adults and some children are speakers. |
| Truku | Many of the grandparent generation speak the language, but the younger people generally do not. |
| Tunica | There are no more first-language speakers. |
| Udi | All members of the community, including children, speak the language, but we want to make sure this doesn't change. |
| Zapoteca (Diidxazá) | Most adults in the community are speakers, but children generally are not. |
| Zapoteco de Macuiltianguis | There are a few elderly speakers. Many of the grandparent generation speak the language, but the younger people generally do not. Some adults in the community are speakers, but the language is not spoken by children. Most adults in the community are speakers, but children generally are not. |

ᵃ Quotes are presented exactly as provided by respondents independently of typographic errors and stylistic preferences.
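Because question 2 allowed several vitality descriptors per response, the answers are better stored as a list per language than as a single category. The sketch below shows one way such multi-select data could be tallied. It uses only four rows taken from Table 20.3 (Babanki, Balinese, Breton, Huichol), so the counts it produces describe this subset and not the pilot as a whole; the variable names are illustrative assumptions.

```python
from collections import Counter

# Pre-set descriptors as worded in the Table 20.3 responses.
GRANDPARENTS = ("Many of the grandparent generation speak the language, "
                "but the younger people generally do not.")
SOME_ADULTS = ("Some adults in the community are speakers, "
               "but the language is not spoken by children.")
MOST_ADULTS = ("Most adults in the community are speakers, "
               "but children generally are not.")
ADULTS_SOME_CHILDREN = "Most adults and some children are speakers."

# Four rows copied from Table 20.3; each response is a list of descriptors.
responses = {
    "Babanki": [GRANDPARENTS, MOST_ADULTS],
    "Balinese": [ADULTS_SOME_CHILDREN],
    "Breton": [GRANDPARENTS],
    "Huichol": [GRANDPARENTS, SOME_ADULTS, MOST_ADULTS],
}

# Languages whose respondents selected more than one descriptor.
multi_select = [lang for lang, chosen in responses.items() if len(chosen) > 1]

# How often each descriptor was selected across this subset.
descriptor_counts = Counter(d for chosen in responses.values() for d in chosen)

print(multi_select)                     # ['Babanki', 'Huichol']
print(descriptor_counts[GRANDPARENTS])  # 3
```

Storing the selections as lists keeps the overlap visible: a language such as Huichol contributes to three descriptor counts at once, which a mutually exclusive coding would hide.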

5.2. Origins of the revitalization efforts

One notable trend that emerges from the data obtained from the basic information questions is that the bulk of the documented efforts began in this century. Efforts in support of seventeen of the thirty languages reported in the pilot began after 2000. A number of observations can be made based on this. First, given the geographic spread of the cases in the pilot Survey that began within this century, it is reasonable to hypothesize for the full-scale Survey that revitalization has received a boost in the last fifteen years or so. It would be interesting to attempt to identify the drivers behind this, such as the talk by Johannes Bechert at the 1987 International Congress of Linguists in Berlin (Bechert 1990), the 1991 symposium entitled “Endangered Languages and their Preservation” held at the 65th Annual Meeting of the Linguistic Society of America and the resulting publication calling for the linguistics discipline to attend to the problem of language endangerment (Hale et al. 1992), legislation at the national and international level, social and/or historic events affecting the language communities, and improvements in technology.

Another trend that we expect will emerge in the full-scale Survey is that the 1970s may have been another period during which revitalization was propelled. Among the pilot cases, efforts for Coeur d’Alene are reported to have originated in the 1970s, as are the contemporary efforts in support of Breton. As known from the literature, the same also applies to the efforts for Hawaiian and Māori, to name but a couple.

An unexpected but perhaps not surprising trend seen from the pilot Survey data is that in fifteen of the thirty cases, individuals are identified by name as being at the origin of the efforts. This trend, if borne out in the full-scale Survey, is not trivial: language revitalization can be overwhelming for those who want to develop an effort or are in the early stages of one.
But if a significant number of existing efforts have indeed begun with a group of people small enough that its members can be identified by name, this could show that it is indeed viable for one or a few individuals to start an effective long-term revitalization effort. In eleven of the fifteen cases, the people in question are members of the community. In six cases, a community-external actor is named. We expect that the full-scale Survey will show a large number of efforts originating from within the community, and will also elucidate the role of community-external actors.

5.3. Objectives, activities, assets, and needs

The data show that the most commonly stated objective, mentioned in fifteen different instances, relates to an interest in or need to raise awareness about the importance of keeping a language in use and to improve the attitudes toward the endangered language. The next most frequent objective, stated sixteen times, is to create new or better speakers of the target language. Just as frequent was the objective to develop materials to support the teaching of the language, such as textbooks and dictionaries. Other types of objectives mentioned include documentation, creation of new domains of use of the language, transmission of knowledge, development of a support network, teacher training, and literacy. Not surprisingly, goals focus much more on children than on adults: three respondents reported objectives focused on adults, compared to fifteen instances in which the objectives were reported as focusing on children and youth.

An interesting asset, not overtly stated but one that emerges from analysis of the qualitative data, is the involvement of retirees in the revitalization efforts. This was especially noticeable in the Spanish version of the pilot. The contribution of retirees could prove to be a critical asset if explored in a more concerted way in the full-scale Survey. Consistent with the top objective to raise awareness about the importance of the target language is the fact that twenty respondents stated that they engage in the organization of cultural events. Similarly, the common need for more language materials is reflected in the fact that seventeen respondents report involvement in the development of pedagogical materials and seventeen respondents also report involvement in language documentation.

One striking trend emerged: the intergenerational transmission of the language was all but absent as an objective from the pilot responses. Only the respondent for Balinese, the largest language in the sample, stated their top objective to be to “Encourage young parents to speak Balinese to their children.” This suggests that when intergenerational transmission of a language breaks down, opening a domain for the teaching of languages is the strategy of choice to ensure the acquisition of a language by children.
Indeed, the pilot reveals a reliance on schools for the transmission or teaching of the target language. Seventeen respondents indicated that their revitalization efforts are carried out in schools, especially in preschool (reported by nine respondents) and in the primary (elementary) grades (reported by fourteen respondents). It seems, then, that revitalization is heavily focused on the teaching of languages rather than on recreating a process of intergenerational transmission of the language. This brings up the critical question of how much linguistic input the schools actually provide to students, since the reliance on the school as the domain of language teaching entails a reliance on the amount and quality of the language input that the school-based programs can provide. The pilot Survey shows promising trends. Eight of the reported school-based programs meet for four- to six-hour sessions. Seven programs report using the language 50% of the session time; three programs report using it 75% of the time; and two programs report using the language 100% of the time.

The assets most frequently identified as critical to the revitalization efforts are the dedication and passion of those involved in the effort and community support. Funding was identified as an asset by some and as a need by others, showing that it is considered critical either way for revitalization efforts. Similarly, government and institutional support were cited as assets for some and as standing needs by others. There are also a number of different needs that seem to constitute a larger category, that of capacity-building. These include teacher training and staffing more broadly, technology and communications, and equipment of various types. Finally, but importantly, the opening of new domains for language use, especially outside the school environment, was identified as a standing need. A goal of the full-scale Survey is to take the responses to questions 11 to 14 beyond a basic isolated tally to identify correlations between the assets and needs identified by the respondents and their ability to meet the objectives of their efforts in a satisfactory way. The results of such an analysis may inform not only practitioners but also policy makers and funding entities.
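The question raised in this section about how much target-language input school-based programs provide comes down to simple arithmetic: session length multiplied by the share of the session conducted in the language. The sketch below works through the figures reported from the pilot (four- to six-hour sessions; 50%, 75%, and 100% use of the language). It assumes, for illustration only, that those shares apply to sessions of that length; the pilot data as reported do not state which programs combine which values, and the function name is hypothetical.

```python
def target_language_hours(session_hours: float, share: float) -> float:
    """Hours of target-language input in one session.

    `share` is the fraction of session time conducted in the target
    language, between 0.0 and 1.0.
    """
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0.0 and 1.0")
    return session_hours * share


SESSION_RANGE = (4.0, 6.0)    # four- to six-hour sessions
SHARES = (0.50, 0.75, 1.00)   # shares of session time reported in the pilot

for share in SHARES:
    low, high = (target_language_hours(h, share) for h in SESSION_RANGE)
    print(f"{share:.0%} of a 4-6 hour session: {low:.1f} to {high:.1f} hours")
```

Under these assumptions a program using the language half the time provides two to three hours of input per session, against four to six hours for a program using it throughout, which is the kind of difference the full-scale Survey would need to relate to objectives and outcomes.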

6. Discussion: toward a Survey of Global Language Revitalization Efforts

At the time of the writing of this chapter, the pilot has been completed and the data-gathering phase of the full-scale Survey has begun. The Survey will allow revitalization practitioners with first-hand knowledge about a language to be consulted on how its vitality should be characterized. With the flexibility to select a variety of vitality descriptors and to articulate a customized description, the Survey will provide the opportunity to develop vitality characterizations that are closer to the views of those doing hands-on work to revitalize a language, thereby better reflecting the variation across cases. The inclusion of Balinese in the pilot Survey sample underscores the value of diversity in our data set for understanding what language endangerment and revitalization mean in a variety of situations. As such, the Survey will need to capture any efforts for languages even if their communities go well beyond the 100,000 speakers mark.

We are especially interested in investigating how revitalization efforts are born and how they evolve, as we see great potential in sharing any clear trends with practitioners, especially those of emerging initiatives. We thus seek to understand how common it is for programs to start with only a small group of people, whether in fact they necessarily start with only a small core group, and what the diversity might be in revitalization efforts’ origins. It will also be important to investigate carefully the role of both community-internal and community-external actors. Given the intensity of debates over the last twenty years about the role of external actors, a large sample of experiences elucidating their role and impact could be particularly relevant.
Because the Survey offers the potential to record practitioners’ views on their objectives, it will be possible to establish correlations between certain objectives and the activities that practitioners engage in. These two variables combined with data on assets available to revitalization efforts and satisfaction with outcomes of the activities may provide insights into which approaches are more conducive to achieving the desired outcomes. The same two variables combined with data on needs can yield valuable information about what might need to be prioritized when developing revitalization initiatives.

The relative absence in the pilot Survey results of the recreation of the intergenerational transmission of a language as a primary objective raises a number of important issues. As mentioned earlier, the pilot data suggests that efforts are focusing greatly on language teaching as opposed to language reproduction, and it will be important to understand the ramifications of such a trend if it is borne out in the full-scale Survey. As stated in section 2, an emerging subfield in the literature is that of assessment of language acquisition in revitalization contexts. A strong focus on language teaching within revitalization, then, calls for increased attention to assessment of acquisition to inform the teaching strategies and determine whether objectives are being achieved. It will be critical to establish correlations between the vitality of a language, the objectives of a revitalization effort, and the amount of input in the target language offered. We stated some initial promising trends of high levels of language input in school-based initiatives. The pilot also showed, however, that none of the four school-based efforts in Mexico reports being able to provide target-language input for more than 50% of the school day. Thus, regional or topical trends could greatly help to identify shortcomings requiring concerted attention, thereby informing the strategic priorities of revitalization efforts both within the initiative itself and at the policy and funding level. If the Survey confirms a widespread reliance on schooling for the purposes of revitalizing a language, this would point to a need for an increased and concerted effort to bring the fields of pedagogy and language acquisition to the service of language revitalization efforts.

7. Conclusions

In this chapter, we have argued for the need to carry out a comparative analysis of revitalization efforts worldwide in order to increase our understanding of how revitalization is practiced in various regions and what factors may lead to the most desirable outcomes. We have put forth the Survey of Global Language Revitalization Efforts as a first attempt at a global comparative analysis and have reported on the results of a pilot study to show that the potential value of a full-scale Survey is considerable. In so doing, we have created two important resources: (1) the Global Directory of Revitalization Initiatives and (2) an extensive bibliography. Based on the pilot Survey results, we are enthusiastic about the research and learning prospects of a full-scale Survey. We remain cognizant of the diversity of language endangerment situations and of revitalization efforts, and of the risks inherent in attempting to make generalizations. Nevertheless, we are confident that what stands to be learned from a comparative analysis can be of great value to researchers interested in language revitalization. More importantly still, we expect that the full-scale Survey will provide knowledge grounded in data for revitalization practitioners worldwide to use as a resource when making decisions to improve the outcomes of their efforts.


Chapter 21

The Linguistics of Language Revitalization: Problems of Acquisition and Attrition

William O’Grady

1. Introduction

Complex phenomena often invite more than one type of explanation. During the 1830s, a smallpox contagion killed tens of thousands in England, in part because the Industrial Revolution had brought large numbers of people into crowded cities. In a sense, the epidemic can therefore be explained in social and economic terms. At another level, however, it has a medical explanation: a virus infected a large number of individuals who would otherwise have been healthy. There is nothing inherently right or wrong about either perspective; in fact, both are factually correct. Nonetheless, it is worth noting that, in the end, the medical perspective offered a more practical solution: smallpox was ultimately eradicated by a program of individual inoculation, not by abandoning industrialization or reversing migratory trends.

There may be a helpful lesson here for how we think about the study of endangered languages. On the one hand, language loss is most likely to occur under certain socioeconomic conditions: poverty, urbanization, industrialization, a lack of political autonomy, and so on (e.g., Nettle and Romaine 2000, 138ff; Fishman 2001, 2; Austin 2011; Grenoble 2011; Harbert 2011, among many others). On the other hand, at a linguistic level, language loss can be thought of as a cognitive phenomenon. Put simply, a language dies when the conditions necessary for its acquisition and maintenance are no longer met: caregivers do not speak it enough in front of their children, and children do not have sufficient opportunity to use it in the first decade of their lives.

For Kamele and Mālie. I thank the editors and Ryan Henke for their helpful comments.

There is at least one good reason for approaching language loss and revitalization primarily as a linguistic problem: there is no practical alternative. The social forces that contribute to language shift are generally difficult, if not impossible, to reverse; the only hope is to mitigate their consequences by focusing on the conditions that favor language transmission within families, schools and neighborhoods. But that can be done only if we can define the precise conditions under which language acquisition occurs. As Fishman aptly states (2001, 13), “It is of no help to tell a patient that he should attain health by getting better, or that he should get better by being healthier.” And perhaps more to the point, it is of little use to recommend a course of treatment that includes no information about how it should be implemented. “Take some medication” is not a particularly effective piece of advice unless it is accompanied by an actual prescription: take this much of this drug on this schedule.

We cannot yet offer a prescription to communities that want to undertake a program of language revitalization. True, we can say that it is desirable for infants to be exposed to their community’s language at home, that immersion programs are an effective option for young children, that it is a good idea to have language programs for adults, that there should be opportunities to use the language beyond home and school, and so on (Fishman 2001, 12ff.). However, these conclusions amount to little more than common sense. The hard questions, the ones whose answers will ultimately decide whether a language lives or dies, require more precise responses. How much exposure to a language is required for children to acquire it? How frequently does a language have to be used in order to be maintained? Do adults have a realistic prospect of success in language learning? And so on.
Much of what we know about these matters comes from research on language acquisition and maintenance in situations that are substantially different from those confronted by endangered-language communities. Nonetheless, there are several key points of consensus that could very well serve as a foundation for a more directed program of research focusing specifically on issues of language revitalization. I will concentrate in this chapter on two macro-issues: the question of how children acquire—and lose—language, and the question of the practicality of bilingualism as a strategy for language revitalization.

2. Children and language

As any parent knows, children are especially gifted for language learning. Herein, for many people, lies the great hope for language revitalization. Surely, they think, all will be well if only we have children spend a few hours a week interacting with their grandparents or other speakers of the community’s endangered language. Alas, things are not so simple. At least two factors, largely ignored in the revitalization literature, hinder and disrupt language acquisition even in childhood. I will consider each in turn.


2.1. The importance of ample input

The enterprise of language revitalization is fraught with disappointments. A language is used at home, but children fail to learn it. A community devotes its energy to a language immersion program, but the results fall below expectations. Enthusiastic elders agree to teach the language to young learners, but their efforts are for naught. The most likely explanation for these outcomes is easy to state: the children do not hear the language often enough over a sufficiently long period of time.

Children’s need for extensive exposure to whatever language they are learning was first documented in a landmark study conducted by two psychologists, Betty Hart and Todd Risley (Hart and Risley 1995, 1999). Their research team made monthly one-hour recordings of forty-two children growing up in monolingual English-speaking families in the United States. The recording sessions began when the children were 7 to 9 months old and continued for two and a half years. (Sampling techniques of this sort are common in research on child development, and are considered reliable, especially when they involve a large number of children and extend over a longer period of time, as the Hart–Risley study did.)

Hart and Risley’s investigation revealed vast differences in the amount of language to which individual children were exposed. At one extreme were children from more talkative families, who heard more than 7,000 utterances in a typical day, which amounts to about 2.5 million utterances in the course of a year; see Table 21.1. This may seem like a very large number, but other work has produced comparable estimates (e.g., Wells 1985; Van de Weijer 2002; Roy 2009). In contrast, as summarized in Table 21.2, children from the least talkative families heard only about a third as much speech. These differences matter.
At the age of 30 months, children from the most talkative families in Hart and Risley’s study had vocabularies more than twice the size of those of children from the least talkative families. Moreover, in the subsequent six months, the children from the highly talkative families went on to learn more than twice as many new words as their peers did; see Table 21.3. Vocabulary growth has long been recognized as a major marker of linguistic development, both in its own right (one cannot communicate effectively without an extensive vocabulary) and as a predictor of subsequent academic success. This point was recently further confirmed by Morgan et al. (2015), whose longitudinal study of 8,650 children living in the United States revealed that larger vocabularies at age 2 correlated with better achievement in reading and mathematics upon entry into kindergarten three years later. Vocabulary growth is also an indicator of how well other aspects of language have been acquired, including morphology and syntax (e.g., Bates, Bretherton, and Snyder 1988).

Table 21.1 Mean number of utterances per day and per year for children in more talkative families

Sentences/day    Sentences/year
7,250            2.5 million+

Table 21.2 Mean number of utterances per day and per year for children in the least talkative families

Sentences/day    Sentences/year
2,170            800,000

Table 21.3 Children’s linguistic attainment as it relates to language exposure

                          Vocabulary size    No. of words learned
                          at 30 months       in next 6 months
Talkative families        766                350
Non-talkative families    357                168

Fernald, Marchman, and Weisleder (2013) report an additional effect involving how quickly children are able to recognize words that they hear. (Speed was measured by means of a “looking-while-listening” task: as the children looked at a picture of two familiar objects, they heard the name for one or the other. The amount of time, in milliseconds, that it took for their eyes to fixate on the right object was then measured.) In Fernald et al.’s study of forty-eight English-speaking infants, input-related differences in vocabulary size were evident at age 18 months and were positively correlated six months later with word-recognition speed, a factor that has been linked in other studies to superior language and cognitive skills later in life.

A further interesting feature of Fernald et al.’s study was a correlation between vocabulary size and processing speed on the one hand and socioeconomic status (SES) on the other. Overall, children from lower-SES families did less well, but not because of their parents’ income. As Hart and Risley noted in their study, the decisive factor is how talkative children’s families are. Addressing this point in an interview,¹ Todd Risley noted: “some poor people talked a lot to their kids and their kids did really well [linguistically]. Some affluent business people talked very little to their kids and their kids did very poorly.” Risley went on to observe, “When you look at the amount of talking the parents are doing, nothing is left over relating to socioeconomic status. [The amount of talk] accounts for all the variance” in children’s linguistic development.

¹ The interview can be accessed at http://www.childrenofthecode.org/interviews/risley.htm.

This finding has been taken very seriously, both by scholars and by educators, and it has spawned a number of major projects devoted to filling the “word gap” that hinders linguistic development in children from less talkative families. One such project, dubbed “Providence Talks” (http://www.providencetalks.org/), uses biweekly visits and special technology to help parents keep track of how much they speak to their children. Early results suggest a significant increase in familial speech to children, with an average increment of over 4,000 words per day for children whose input was at or below the fiftieth percentile at the start of the project.² A program of periodic assessments is under way to document long-term results, including the possibility of improved performance in school.

The importance of input for linguistic development is not limited to English. Weisleder and Fernald (2013) investigated language development in a group of twenty-nine Spanish-speaking Latino children in the United States, all from low-SES families. Their results revealed “striking variability” in the amount of adult speech addressed to the children in samples collected when they were 19 months old. Some children heard as many as 29,000 words in the course of a day, and some fewer than 2,000. Crucially, the children to whom more speech had been directed had substantially larger vocabularies six months later and were quicker at recognizing words.

Do these findings extend to indigenous languages? Schneidman and Goldin-Meadow (2012) examined the issue of input and development in a Yucatec Maya community in Mexico.
Based on a study of fifteen families, they reported that the amount of speech directed to children at the age of 24 months was strongly correlated with the size of their vocabulary eleven months later. This is essentially the same finding that has been reported for English-speaking and Spanish-speaking children in the United States. It highlights the importance of input to language development in all linguistic communities, including those whose language is endangered.

Of course, quantity is not the only important factor; the quality of the input also matters, as various studies have shown (Huttenlocher et al. 2002, 2010; Hoff 2003; Rowe 2012; Ramírez-Esparza 2014). Children benefit from speech that is carefully articulated, from sentences that increase over time in complexity and sophistication, and from stories and conversations that capture their interest. Most important of all may well be the opportunity for one-to-one interactions. A number of recent studies have documented the value of speech that is directed specifically to the child, identifying it as a major predictor and facilitator of linguistic development (Schneidman et al. 2013 for English, Weisleder and Fernald 2013 for Spanish, and Schneidman and Goldin-Meadow 2012 for Yucatec Maya). In Weisleder and Fernald’s study, for example, vocabulary size at 24 months was linked to the amount of speech addressed to the child, not the amount of speech that she or he simply overheard.

² http://www.providencetalks.org/wp-content/uploads/2015/10/Providence-Talks-Pilot-Findings-Next-Steps.pdf.
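
The exposure figures reported in Tables 21.1 to 21.3 can be cross-checked with simple arithmetic. The Python sketch below is only an illustration and is not part of Hart and Risley’s analysis: it assumes that the “typical day” figure can be multiplied by 365 to obtain a yearly total, and all variable and function names are hypothetical.

```python
# Figures as reported in Tables 21.1-21.3 (Hart and Risley 1995, 1999).
# Assumption (not stated in the source): every day of the year is a "typical day".
DAYS_PER_YEAR = 365

utterances_per_day = {"talkative": 7_250, "least_talkative": 2_170}
vocabulary_at_30_months = {"talkative": 766, "non_talkative": 357}
words_learned_next_6_months = {"talkative": 350, "non_talkative": 168}


def yearly_total(per_day: int, days: int = DAYS_PER_YEAR) -> int:
    """Scale a daily utterance count to a yearly total."""
    return per_day * days


# Yearly exposure for each group of families.
yearly = {group: yearly_total(n) for group, n in utterances_per_day.items()}

# Share of input heard in the least talkative families relative to the more talkative ones.
exposure_ratio = utterances_per_day["least_talkative"] / utterances_per_day["talkative"]

# How many times larger the talkative-family outcomes are (Table 21.3).
vocabulary_ratio = vocabulary_at_30_months["talkative"] / vocabulary_at_30_months["non_talkative"]
growth_ratio = words_learned_next_6_months["talkative"] / words_learned_next_6_months["non_talkative"]

if __name__ == "__main__":
    for group, total in yearly.items():
        print(f"{group}: {total:,} utterances per year")
    print(f"exposure ratio (least talkative / talkative): {exposure_ratio:.2f}")
    print(f"vocabulary ratio at 30 months: {vocabulary_ratio:.2f}")
    print(f"vocabulary growth ratio over the next 6 months: {growth_ratio:.2f}")
```

Under the 365-day assumption, the yearly totals come to roughly 2.6 million and 0.8 million utterances, in line with the rounded values in the tables. The least talkative families supply about 30% of the input heard in the more talkative ones, which matches “about a third,” and both outcome ratios in Table 21.3 are slightly above 2.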

2.2. The danger of attrition

It is often suggested that children’s early success in language learning is due to a high degree of “cerebral plasticity,” which allows them to quickly acquire new words and morphosyntactic patterns. The downside of cerebral plasticity is that until those words and patterns are fully entrenched, they are highly susceptible to loss. Language attrition is far less studied than language acquisition, but what is known points toward a major peril for children whose exposure to their first language is dramatically reduced (or even ceases) after a few years, often at the point at which they start attending school.

One well-documented example of rapid language attrition in young learners comes from the study of children who have been adopted by families living in a different country and speaking a different language (Genesee and Delcenserie 2016). Based on a survey of 130 infants and toddlers who had been adopted prior to the age of thirty months, Glennen and Masters (2002, 427) report a quick loss of “existing ability in their birth language,” a conclusion that has also been confirmed by case studies of individual children (e.g., Nicoladis and Grabois 2002). Schmid (2012, 184–185) reports a similar result in older children who were adopted after they had begun school, and Isurin (2000) documents precipitous language loss in a nine-year-old adoptee whom she studied over a two-year period.³

³ By far the largest investigation of language loss in adoptees was carried out by Gindis (2009), who studied 800 adoptees, mostly from Eastern Europe, whose age at adoption ranged from 3;6 to 9;00. Gindis reports that the children lost the ability to understand their birth language within a matter of months, with an even faster decline in the ability to speak it. Unfortunately, he provides no information about his methodology, and his study apparently did not undergo peer review.

The speed of attrition in adoptees is alarmingly fast, and one cannot help but wonder whether it might be attributable to the wrenching experience of being sent to live with a new family in a foreign land. However, the rate of linguistic decline observed in adoptees is not out of line with what has been reported in studies of children who move to a new country with their birth families. In one such study, Berman (1979) reports on a 3 1/2-year-old Hebrew-speaking child who lost her ability to speak and understand Hebrew after just a few months in the United States with her family.

Sunyoung Lee and I have documented a somewhat similar case involving a young Korean girl, who spent several months in the United States with her bilingual mother. The child, who was 6;10 at the time of her displacement, quickly became immersed in a monolingual English environment: she attended English-language school and her mother spoke to her almost exclusively in English. During the course of her stay in the United States, the child participated in a regimen of testing that included a 120-item picture-naming task that was administered monthly. The results are summarized in Figure 21.1.

496 William O’Grady

Figure 21.1. Vocabulary loss in a 6-year-old Korean child (vertical axis: mean percentages, 0 to 100; horizontal axis: sessions #0 to #9; response categories: Valid KOR, Valid ENG, No response, Invalid)

As can be seen here, the child’s ability to access Korean vocabulary began to decline within the first month of her departure from Korea and her success rate fell to less than 50% after just two months in the United States. A dramatic decline was also observed in her ability to produce narratives and to carry on conversations in Korean; in fact, she quickly reached the point where she could no longer speak in Korean to her father, who had remained in Korea but spoke with her regularly by phone. It is not yet known to what extent lost linguistic skills can be recovered, but it is evident that age and the amount of time that elapses before re-exposure to the language are crucial (Köpke and Schmid 2004; Bylund 2009; Hyltenstam et al. 2009). Various case studies have documented recovery of a lost or weakened childhood language by pre- adolescent children. For example, the Korean child who Sunyoung Lee and I studied began speaking her first language again shortly after her return to Korea. Berman (1979) reports similar success for the young Hebrew-speaking child who she studied (see above), and Hubbell-Weinhold (2005) recounts the recovery of English by three sisters who were immersed in Swiss German for three years, beginning when they were between the ages of 8 and 11.4 In contrast, the prognosis for recovery of a lost childhood language by adults is poor. Pallier et al. (2003) report that adults in their 20s and early 30s who had been adopted between the ages of 3 and 8 were unable to distinguish sentences of their native Korean from sentences of Polish and Japanese. Along similar lines, Hyltenstam et al. (2009) found that even after two years or more of study, a group of twenty-one ethnic Korean

4 It is worth noting, however, that although the children in these studies ceased to speak their first language for a period of time, they continued to receive some exposure to it from family members.

The Linguistics of Language Revitalization 497 adults who had been adopted as children performed no better on Korean grammar tasks than did native speakers of Swedish who were studying Korean as a second language. There are perhaps two bright spots in this otherwise quite grim picture. The first involves auditory perception. Research on Korean adoptees suggests that adults may retain a sensitivity to at least some subtle phonetic contrasts in their lost first language and that this sensitivity can be enhanced through practice and exposure (Bowers, Mattys, and Gage 2009; Hyltenstam et al. 2009; Oh, Au, and Jun 2010; Park 2015). In Oh et al.’s study, for instance, twelve adults (age 18–33 yrs) who had been adopted from Korea prior to age 1 and who had minimal subsequent exposure to Korean, were tested on their ability to distinguish among tense, lax, and aspirated consonants—a staple of Korean phonology. While there was no overall advantage compared to a control group, the adoptees were better at distinguishing between lax and aspirated consonants. A second bright spot involves the finding that adults who have spoken their first language into adolescence are able to maintain a relatively high level of proficiency, even after many years without an opportunity to use it. When tested, they often show relatively minor deficits and can still use the language effectively for communicative purposes, although not of course at the same level as monolingual native speakers (e.g., Köpke 2004; Köpke and Schmid 2004; Tsimpli et al. 2004). A particularly striking example comes from Schmid’s (2012) case study of eleven Jewish children, who were rescued from Germany during World War II. Then 11 to 15 years old, they were placed with English-speaking families in the United Kingdom, and had no subsequent opportunity to speak German. 
Interviewed fifty years later, the by-then elderly subjects showed a remarkable ability to express themselves in German as they recounted their experiences before and after adoption. Köpke and Schmid (2004) present other case studies, along with a far-ranging discussion of language attrition and retention in adults.

2.3. Implications What can we make of findings such as these on language acquisition and attrition as we think about the revitalization of endangered languages? In my opinion, two points are especially worth highlighting. First, we see the importance of monitoring the quantity and quality of the language to which learners are exposed in revitalization programs. Hearing just a few dozen (or even a few hundred) utterances per week is unlikely to result in the acquisition of more than a few vocabulary items and fixed expressions. At the same time, exposure to an overly narrow range of speech, even in large quantities, can also create problems. An instructive example of this comes from Peter, Hirata-Edds, and Montgomery’s (2008) study of a first-grade Cherokee immersion classroom. Although the thirteen children in the class had already completed one to two years of preschool immersion, they all performed very poorly on verbal inflection, with success rates of less than 20%. (Cherokee has two basic verb classes, each of which agrees with its subject in person and number.) Upon closer inspection, Peter et al. found the likely reason: the teachers tended to interact

with the students through the use of commands (Sit down, Write on your paper, Read page 5), which provide no variation in the choice of subject and therefore little opportunity to observe the workings of subject-verb agreement. Second, signs of language breakdown, such as difficulty accessing vocabulary and a decrease in fluency, occur soon after children’s exposure to the first language ceases, often in a matter of months, if not weeks. A language cannot be considered secure unless it is used at least into adolescence. Children require continuous long-term exposure to the community’s language if they are to maintain what they learned so easily as infants.

3. Bilingualism No one has ever proposed that a community should become monolingual in order to save its language. Indeed, calls for monolingualism typically have the opposite goal in mind: the sacrifice of the indigenous language under the pretext that it is an inferior mode of communication or that its use undermines national unity. Bilingualism is the foundation of any reasonable plan for language revitalization. This leads us to two important questions. Is bilingualism a good thing? Is bilingualism a practical option? Let us consider each question in turn.

3.1. The effects of bilingualism The most obvious advantage of bilingualism is that it confers fluency in a second language, offering children and adults the benefits of maintaining their community’s traditional language, without sacrificing the opportunities that might come from also being proficient in a national or international language. This in turn creates a cascade of developmental, cognitive, and psychological effects. On the developmental side, it is clear that acquiring two languages cannot take less time or effort than acquiring one. It is widely recognized that children who simultaneously acquire two languages learn each somewhat more slowly than do monolingual learners (e.g., Hoff et al. 2012, 20–22). The exact extent of the lag reflects the relative amount of exposure to each language (see below). Eventually, though, bilingual children catch up to their monolingual peers and are able to enjoy the advantages of fluency in two languages rather than just one. It has also been established that even fluent bilinguals are slightly slower than monolinguals at retrieving words from their mental lexicon—a reflection of the “embarrassment of riches” (a dual vocabulary) conferred by bilingualism. However, the slowdown is too minute to have any effect on communication and therefore does not create a practical disadvantage of any sort. A wide range of cognitive benefits have been associated with bilingualism, ranging from protection against dementia (Schweizer et al. 2012) to being better able to take

the perspective of others (Liberman et al. 2017). Overall, the single most studied cognitive advantage of bilingualism involves apparent improvements in the “executive processing” needed to sustain attention, adjust to contextual demands, and avoid distracting information (Bialystok, Craik, and Luk 2012), all valuable skills. This claim is worth exploring in more detail, as it has recently become the subject of controversy. A typical piece of evidence for a bilingualism-related advantage in executive processing comes from performance on “Stroop tasks,” in which participants have to deal with distracting stimuli. In one experiment of this type, Hernández et al. (2010) presented young adult bilinguals and monolinguals with a series of numerals and asked them to indicate how many digits each contained. As illustrated below, there were two types of test items: congruent, in which the numerals themselves signaled the number of digits, and non-congruent, in which there was a mismatch.

Congruent (numerals = no. of digits): 22, 333
Non-congruent (numerals ≠ no. of digits): 11, 222

Bilinguals responded faster than monolinguals on both conditions, demonstrating a superior ability to use the extra clue on the congruent items and to suppress the distracting information on the non-congruent items. In another type of task, participants are asked to indicate the direction of an arrow that is flanked on each side by two other arrows. In the congruent pattern, all arrows point in the same direction; in the non-congruent pattern, the middle arrow points in a different direction.

Congruent: →→→→→
Non-congruent: ←←→←←
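The congruency rules for the two item types described above can be stated precisely in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical and are not taken from Hernández et al. (2010) or Costa et al. (2008), and the code encodes nothing beyond the item definitions given in the text (a numeral item is congruent when the repeated digit equals the number of digits; an arrow item is congruent when every arrow matches the middle one).

```python
def numeral_item_is_congruent(item: str) -> bool:
    """Return True when the repeated digit equals the number of digits.

    Hypothetical helper: '22' and '333' are congruent, '11' and '222' are not.
    """
    if not item.isdigit() or len(set(item)) != 1:
        raise ValueError("expected a string made of one repeated digit")
    return int(item[0]) == len(item)


def arrow_item_is_congruent(item: str) -> bool:
    """Return True when all five arrows point the same way as the middle arrow."""
    if len(item) != 5 or set(item) - {"←", "→"}:
        raise ValueError("expected exactly five arrows")
    target = item[2]  # the middle arrow is the one participants respond to
    return all(arrow == target for arrow in item)


if __name__ == "__main__":
    for numeral in ["22", "333", "11", "222"]:
        print(numeral, numeral_item_is_congruent(numeral))
    for arrows in ["→→→→→", "←←→←←"]:
        print(arrows, arrow_item_is_congruent(arrows))
```

In both tasks the quantity of interest is the difference in response speed between the two item types, which the sketch does not model.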

Costa, Hernández, and Sebastián-Gallés (2008) report that adult Spanish-Catalan bilinguals responded faster than monolinguals on both conditions and were less susceptible to interference from flanking arrows in the non-congruent test items. Many similar findings have been reported (see, e.g., Bialystok et al. 2012 for a review), but the jury is still out on whether the observed effects are consistent, whether they are restricted to certain conditions, and whether they are manifested in only some types of bilinguals. For various perspectives on these issues, see Costa et al. (2009), Paap and Greenberg (2013), Baum and Titone (2014a, 2014b), Duñabeitia et al. (2014), Bialystok et al. (2015), and Sekerina and Spradlin (2016).5

5 This issue has generated significant public interest in the media, including articles by Maria Konnikova (“Is Bilingualism Really an Advantage?”) in the January 22, 2015 issue of The New Yorker and by Ed Yong (“The Bitter Fight over the Benefits of Bilingualism”) in the February 10, 2016 issue of The Atlantic.

This controversy matters little for work on language endangerment and revitalization. If there are cognitive advantages of bilingualism, they should of course be reported to educators and caregivers. But it is important not to lose sight of the greater prize, which is the preservation of the community’s traditional language. Indeed, there is good reason to think that this accomplishment confers psychological benefits independent of those associated with cognitive function. For instance, it is by now a well-established fact that proficiency in one’s heritage language contributes to a sense of self-esteem and well-being (Fishman 1991, 7–8; McIvor 2005). A further correlate is better academic achievement. Based on a review of Navajo, Yup’ik, and Hawaiian language immersion schools, McCarty (2011) reports that indigenous language programs contribute not only to language maintenance but also to improved scholastic performance. For example, in addition to developing their Navajo language skills, students in the Rock Point Navajo immersion program consistently outperformed their peers in English-only programs on state tests, even in mathematics and English. This is not an isolated finding: drawing on data from a multi-year study of 700,000 students representing fifteen minority languages, Thomas and Collier (1997) report that instruction in the child’s native or heritage language is the single most powerful predictor of academic success in many cases. More dramatically, there is even evidence that retention of a heritage language can sometimes be a matter of life and death. Based on a study of more than 150 Aboriginal communities in British Columbia, Canada, Hallett, Chandler, and Lalonde (2007) report that youth suicide rates were correlated with the degree to which communities had maintained their traditional language.
Communities with low language retention rates had six times as many youth suicides as communities in which at least half the population had retained an ability to converse in their traditional language. Of course, it is difficult to determine the exact contribution of language retention itself to this result. Chandler and Lalonde (2008) suggest that the key overall factor is “cultural continuity,” of which language is a major component. Other relevant factors include self-government, legal title to traditional lands, control over education and community services, and promotion of traditional cultural practices—as well as the participation of women in local government and the provision of child care services.

3.2. The practicality of bilingualism Bilingualism is a natural cognitive state, commonplace throughout the world. No reliable worldwide statistics are available, but a 2012 survey coordinated by the European Commission Directorate-General for Communication reports that more than half of all Europeans speak at least one language in addition to their mother tongue, with rates of bilingualism or multilingualism at well over 90% in several countries, including Luxembourg, the Netherlands, Latvia, Lithuania, and Slovenia.6 According to 2007 US census estimates, approximately 20% of Americans age 5 and older speak a language other than English at home.7 Census figures for Canada, compiled in 2011, indicate that 17.5% of the population speaks two languages at home.8 Balanced bilingualism is rare: equal fluency in two languages is “the exception, not the norm” (Grosjean 1982, 235) and therefore not a practical goal for most individuals or revitalization programs. A more realistic goal is to seek a level of fluency in each language that will support effortless communication in whatever situations it is used. A key predictor of this sort of developmental success is the amount of exposure that children receive to each of their two languages. A study by Hoff et al. (2012) offers an illustrative example. Hoff et al. examined lexical and grammatical development in forty-seven bilingual Spanish-English children, ages 22 to 30 months. Their principal finding was straightforward: development is strongly correlated with the amount of home language input. Put simply, children who hear more Spanish than English have a better grasp of Spanish, in terms both of vocabulary and of the ability to produce complex sentences. Similarly, children who are exposed to more English than Spanish manifest an advantage in English. Thordardottir (2015) reports comparable contrasts in a study of fifty-six 3-year-olds and eighty-three 5-year-olds who were growing up in Montreal, speaking both French and English. Because the children were matched for socioeconomic status (SES) and for non-verbal cognitive skills and because both languages have high status in Montreal, Thordardottir’s study was able to control for many of the external factors that often obscure the role of input in language development. Her results were clear-cut: for vocabulary, utterance length, and the use of inflection, children who received unequal exposure to the two languages performed better in the more commonly heard language.

6 http://ec.europa.eu/public_opinion/archives/ebs/ebs_386_en.pdf.
This in turn leads to the question of what the minimum input for bilingual development might be. It is not currently possible to identify a precise cut-off point for bilingual development; however, a suggestive finding by Pearson et al. (1997, 56) has been influential. In a study of twenty-five Spanish-English bilingual children who received varying amounts of exposure to the two languages, Pearson and her colleagues noted that six of the seven children who had received less than 20% of their exposure to one of the languages were “very reluctant” to use that language and appeared to be “tuning it out” when it was used around them in laboratory play sessions; see also Hoff et al. (2012, 22). Genesee (2007) and Baker (2014, 38) recommend that neither language make up less than 30% of the speech to which children are exposed.9

7 http://www.census.gov/prod/2010pubs/acs-12.pdf.

8 http://www12.statcan.gc.ca/census-recensement/2011/as-sa/98-314-x/98-314-x2011001-eng.cfm.

9 Genesee’s recommendation is stated differently in two versions of the same article. In one version, available at Genesee’s website (http://www.psych.mcgill.ca/perpg/fac/genesee/A%20Short%20Guide%20to%20Raising%20Children%20Bilingually.pdf), he says “Our best guess at this time is that bilingual children must be exposed to a language during at least 30% of their total language exposure if their acquisition of that language is to proceed normally. Less exposure than this could result in incomplete acquisition of that language.” However, in the other version (available at http://www.multilingualliving.com/wordpress/wp-content/uploads/mag/sept07/multilinguallivingmagazine.pdf), Genesee explains that even “this 30% criterion may not be sufficient to ensure that a child acquires the language completely.” Grüter et al. (2014) show that “density” of exposure (how many utterances the child actually hears) is a better predictor of bilingual fluency than simple time of exposure.

The easiest path to bilingualism is no doubt built on exposure to two languages in childhood. Indeed, only a small percentage of bilinguals in the United States (around 16%) learned a language other than English in school, according to a study of census results by the Commission on Language Learning of the American Academy of Arts and Sciences.10 The vast majority (over 75%) became bilingual thanks to early opportunities to hear and use their family’s traditional language at home. Sadly, many parents are reluctant to use their own language with their children, in the belief that it is better to use only the language of school. However, based on a study of twenty-nine 5-year-olds and their parents, Place and Hoff (2011, 1847) suggest that children do not benefit from this practice and that it has the negative effect of denying them access to their heritage language and potentially impeding parent-child communication. Paradis (2011, 231) draws a similar conclusion based on a study of 169 children ages 4 to 7; see also Hammer, Davison et al. (2009).

10 The report is available at www.amacad.org/content/publications/publication.aspx?d=22429.

3.3. The prospects for acquired bilingualism in adults Language revitalization programs typically target all age groups. Children may be the best language learners, but their ability to assume leadership of the revitalization effort in their communities may be decades away. In the meantime, they need role models and the community needs teachers (e.g., NeSmith 2012 for Hawaiian, and Te Paepae Motuhake 2011 for Maori). Often, these responsibilities fall to adult second-language learners. Two important facts call for acknowledgment. First, non-immersion school-based programs for language teaching often fail to provide adults (or children) with enough input to develop fluency. The best estimates that we have (based on data from the US Foreign Service Institute) suggest that a daunting amount of classroom time is required to develop mid-intermediate or low-advanced proficiency: many hundreds of hours for a young adult of average aptitude (Rifkin 2003; Johnson 2016) and even more for languages with a complex morphology and syntax. Second, the ability to acquire language naturalistically declines with age. Indeed, with very rare exceptions (most notably Snow and Hoefnagel-Hohle 1985), comparative studies of age-related success in home settings and in immersion programs point toward a strong advantage in favor of young children.11 The earliest signs of a decline in the ability to learn a second language are manifested in the first year of life, perhaps as early as 6 months, as children begin to lose the ability to distinguish among new speech sounds (Werker et al. 1996; Yoshida et al. 2010; Kuhl 2011). By the age of 11 or 12 months, monolinguals are sensitive just to the phonetic contrasts of their one language, while bilinguals manifest a sensitivity to the sounds of both their languages (Ramírez, García-Sierra, and Kuhl 2016). The early loss of phonetic sensitivity can result in non-native pronunciation in a later-learned language. In Granena and Long’s (2012) study of sixty-five Chinese-speaking immigrants to Spain, for example, no one whose first exposure to Spanish took place after age 5 developed native-like pronunciation (as judged by a panel of twelve native speakers), no matter how long they had been in their new country (more than twenty years in some cases). The ability to acquire grammatical contrasts declines more slowly, and some studies (e.g., Schwartz 2004) suggest that children can acquire the grammar of a second language in much the same way as native speakers if they are exposed to it by age 4. Older learners, including adolescents and adults, typically fare less well, as documented in detailed studies by Granena and Long and by Abrahamsson and Hyltenstam (2009), among others. However, it is well established that differences in aptitude after childhood make language learning easier for some individuals than for others (e.g., DeKeyser 2000; DeKeyser, Alfi-Shabtay, and Ravid 2010), and that high motivation can facilitate progress toward proficiency (Ushioda and Dörnyei 2012; Dörnyei 2014).

11 There is reason to think that older children may be superior to their younger counterparts when it comes to instructed classroom learning (e.g., Jaekel et al. 2017, Pfenninger & Singleton 2016), an approach that is far less effective overall than immersion.

3.4. Implications Not every community can hope to restore full intergenerational transmission of its language. Nonetheless, regardless of the type of revitalization program that it chooses to pursue, the prospects for a positive outcome can be improved by an understanding of the conditions under which second languages are learned and maintained. I have focused here on two findings of broad relevance. First and most obviously, the prospects for success in acquiring a second language are best with very young learners (ideally infants or toddlers), whose skills as language learners have not yet been compromised by age-related decline. In short, all other things being equal, the earlier that children are given the opportunity to become bilingual, the better. Second, because language revitalization programs have bilingualism as their goal, care must be taken to ensure the quantity and quality of children’s exposure to both languages. Like their monolingual counterparts, bilingual children need to hear ample amounts of speech addressed to them and to have frequent opportunities to engage in conversation. In order to ensure genuine proficiency in both languages, a rough balance in the child’s linguistic experience is therefore important, with (it seems) neither language making up more than 75% of the total input. Moreover, the danger of attrition is ever-present. Children who learn two languages require ongoing opportunities to hear and use both, at least into the adolescent years.
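The input-balance guideline discussed in this section lends itself to a small worked calculation. The sketch below is illustrative only: the function names and the sample counts are hypothetical, and the 30% floor is simply the rough recommendation attributed above to Genesee (2007) and Baker (2014), not a validated cut-off. Following Grüter et al. (2014), exposure is counted in utterances heard rather than in hours.

```python
def exposure_shares(utterances_by_language: dict[str, int]) -> dict[str, float]:
    """Convert raw utterance counts into each language's share of total input."""
    total = sum(utterances_by_language.values())
    if total <= 0:
        raise ValueError("at least one utterance is needed to compute shares")
    return {lang: count / total for lang, count in utterances_by_language.items()}


def languages_below_floor(shares: dict[str, float], floor: float = 0.30) -> list[str]:
    """List the languages whose share of input falls under the assumed floor."""
    return sorted(lang for lang, share in shares.items() if share < floor)


if __name__ == "__main__":
    # Hypothetical weekly counts of utterances addressed to one child.
    week = {"heritage": 200, "majority": 800}
    shares = exposure_shares(week)
    print(shares)
    print(languages_below_floor(shares))
```

A tally of this kind could help a program notice when the heritage language is drifting toward the low-exposure range in which Pearson et al. (1997) observed children tuning a language out.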


4. Concluding remarks We all want languages to be saved, and we all know that this requires intergenerational transmission, a fact that is acknowledged in every measure of language vitality that has ever been proposed (see Lee and Van Way 2016 for a review). But we cannot allow ourselves to engage in magical thinking. The laws of cognition do not vary from place to place, any more than the laws of nature do. It doesn’t matter what the language is, where it is spoken, or how endangered it is. Languages can be saved only if they are acquired by each generation of children, and that can happen only if particular linguistic conditions are met. The study of those conditions is an ongoing matter, and there is a particular need for research that is directed specifically toward the acquisition and use of endangered languages. Nonetheless, we do have in hand certain fundamental findings that can and should guide ongoing efforts to preserve and revitalize threatened languages. Writing a quarter century ago, Fishman (1991, 1) made the sobering observation that “most efforts to reverse language shift are only indifferently successful, at best, and outright failures or even contra-indicated and harmful undertakings, at worst.” Fifteen years later, despite a flurry of activity in the name of language conservation, Grenoble and Whaley (2006, ix) express a similarly pessimistic view: “an honest evaluation of most language revitalization efforts to date will show that they have failed.” No revitalization program can be perfect, of course. Compromises have to be made, and some disappointments are inevitable. Nonetheless, we are now at least in a position to better see how our energies should be invested, to know what is realistic and reasonable, and to understand the benefits and perils both of action and of inaction.

References
Abrahamsson, Niclas and Kenneth Hyltenstam. 2009. “Age of Onset and Nativelikeness in a Second Language: Listener Perception Versus Linguistic Scrutiny.” Language Learning 59: 249–306.
Austin, Peter and Julia Sallabank. 2011. Introduction. In The Cambridge Handbook of Endangered Languages, edited by P. Austin and J. Sallabank, 1–24. Cambridge: Cambridge University Press.
Baker, Colin. 2014. A Parents’ and Teachers’ Guide to Bilingualism. 4th ed. Clarendon, UK: Multilingual Matters.
Bates, Elizabeth, Inge Bretherton, and Lynn Snyder. 1988. From First Words to Grammar: Individual Differences and Dissociable Mechanisms. New York: Cambridge University Press.
Baum, Shari and Debra Titone. 2014a. “Moving Toward a Neuroplasticity View of Bilingualism, Executive Control and Aging.” Applied Psycholinguistics 35: 857–894.
Baum, Shari and Debra Titone. 2014b. “The Future of Bilingualism Research: Insufferably Optimistic and Replete with New Questions.” Applied Psycholinguistics 35: 933–942.
Berman, Ruth. 1979. “The Re-emergence of a Bilingual: A Case Study of a Hebrew-English Speaking Child.” Working Papers on Bilingualism 19: 158–179.

Bialystok, Ellen, Fergus Craik, and Gigi Luk. 2012. “Bilingualism: Consequences for Mind and Brain.” Trends in Cognitive Sciences 16: 240–250.
Bialystok, Ellen, Judith Kroll, David Green, Brian MacWhinney, and Fergus Craik. 2015. “Publication Bias and the Validity of the Evidence: What’s the Connection?” Psychological Science 26: 944–946.
Bowers, Jeffrey, Sven Mattys, and Suzanne Gage. 2009. “Preserved Implicit Knowledge of a Forgotten Childhood Language.” Psychological Science 20: 1064–1069.
Bylund, Emanuel. 2009. “Maturational Constraints and First Language Attrition.” Language Learning 59: 687–715.
Chandler, Michael and Christopher Lalonde. 2008. “Cultural Continuity as a Moderator of Suicide Risk Among Canada’s First Nations.” In Healing Traditions: The Mental Health of Aboriginal Peoples in Canada, edited by L. Kirmayer and G. Valaskakis, 221–248. Vancouver: University of British Columbia Press.
Costa, Albert, Mireia Hernández, Jordi Costa-Faidella, and Núria Sebastián-Gallés. 2009. “On the Bilingual Advantage: Now You See It, Now You Don’t.” Cognition 113: 135–149.
Costa, Albert, Mireia Hernández, and Núria Sebastián-Gallés. 2008. “Bilingualism Aids Conflict Resolution: Evidence from the ANT Task.” Cognition 106: 59–86.
DeKeyser, Robert. 2000. “The Robustness of Critical Period Effects in Second Language Acquisition.” Studies in Second Language Acquisition 22: 499–533.
DeKeyser, Robert, Iris Alfi-Shabtay, and Dorit Ravid. 2010. “Cross-linguistic Evidence for the Nature of Age Effects in Second Language Acquisition.” Applied Psycholinguistics 31: 413–438.
Dörnyei, Zoltan. 2014. Motivation in Second Language Learning. In Teaching English as a Second or Foreign Language, 4th ed., edited by M. Celce-Murcia, D. M. Brinton, and M. A. Snow, 518–531. Boston: National Geographic Learning/Cengage Learning.
Duñabeitia, Jon, Juan Hernández, Eneko Antón, Pedro Macizo, Adelina Estévez, Luis Fuentes, and Manuel Carreira. 2014. “The Inhibitory Advantage in Bilingual Children Revisited.” Experimental Psychology 61: 234–251.
Fernald, Ann, Virginia Marchman, and Adriana Weisleder. 2013. “SES Differences in Language Processing Skill and Vocabulary Are Evident at 18 Months.” Developmental Science 16: 234–248.
Fishman, Joshua. 1991. Reversing Language Shift. Clevedon, UK: Multilingual Matters.
Fishman, Joshua. 2001. “Why Is It So Hard To Save a Threatened Language?” In Can Threatened Languages Be Saved? Reversing Language Shift, Revisited: A 21st Century Perspective, edited by J. Fishman, 1–22. Toronto: Multilingual Matters.
Genesee, Fred. 2007. “A Short Guide to Raising Children Bilingually.” Multilingual Living Magazine, September-October, 2: 18–21.
Genesee, Fred and Audrey Delcenserie, eds. 2016. Starting Over – The Language Development of Internationally-Adopted Children. Amsterdam: John Benjamins.
Gindis, Boris. 2009. “Abrupt Native Language Loss in International Adoptees.” ADVANCE – Emagazine for Speech-Language Pathologists and Audiologists, 18(51): 5. [available at: http://www.adoptionarticlesdirectory.com/ArticlesUser/articles95_list.php]
Glennen, Sharon and M. Gay Masters. 2002. “Typical and Atypical Development in Infants and Toddlers from Eastern Europe.” International Journal of Speech-Language Pathology 11: 417–433.
Granena, Gisela and Michael Long. 2012. “Age of Onset, Length of Residence, Language Aptitude, and Ultimate L2 Attainment in Three Linguistic Domains.” Second Language Research 29: 311–343.

Grenoble, Lenore. 2011. “Language Ecology and Endangerment.” In The Cambridge Handbook of Endangered Languages, edited by P. Austin and J. Sallabank, 27–44. Cambridge: Cambridge University Press.
Grenoble, Lenore and Lindsay Whaley. 2006. “Preface.” In Saving Languages: An Introduction to Language Revitalization, edited by L. Grenoble and L. Whaley, ix–xi. Cambridge: Cambridge University Press.
Grosjean, François. 1982. Life with Two Languages. Cambridge: Harvard University Press.
Grüter, Theres, Nereyda Hurtado, Virginia Marchman, and Ann Fernald. 2014. “Language Exposure and Online Processing in Bilingual Development: Relative Versus Absolute Measures.” In Input and Experience in Bilingual Development, edited by T. Grüter and J. Paradis, 15–36. Amsterdam: John Benjamins.
Hallett, Darcy, Michael Chandler, and Christopher Lalonde. 2007. “Aboriginal Language Knowledge and Youth Suicide.” Cognitive Development 22: 392–399.
Hammer, Carol, Megan Davison, Frank Lawrence, and Adele Miccio. 2009. “The Effect of Maternal Language on Bilingual Children’s Vocabulary and Emergent Literacy Development During Head Start and Kindergarten.” Scientific Studies of Reading 13: 99–121.
Harbert, Wayne. 2011. “Endangered Languages and Economic Development.” In The Cambridge Handbook of Endangered Languages, edited by P. Austin and J. Sallabank, 403–422. Cambridge: Cambridge University Press.
Hart, Betty and Todd Risley. 1995. Meaningful Differences in the Everyday Experience of Young American Children. Baltimore, MD: Paul H. Brookes.
Hart, Betty and Todd Risley. 1999. The Social World of Children Learning to Talk. Baltimore, MD: Paul H. Brookes.
Hernández, Mireia, Albert Costa, Luis Fuentes, Ana Vivas, and Núria Sebastián-Gallés. 2010. “The Impact of Bilingualism on the Executive Control and Orienting Networks of Attention.” Bilingualism: Language and Cognition 13: 315–325.
Hoff, Erika. 2003. “The Specificity of Environmental Influence: Socioeconomic Status Affects Early Development Via Maternal Speech.” Child Development 74: 1368–1378.
Hoff, Erika, Cynthia Core, Silvia Place, Rosario Rumiche, Melissa Señor, and Marisol Parra. 2012. “Dual Language Exposure and Early Bilingual Development.” Journal of Child Language 39: 1–27.
Hubbell-Weinhold, Juliet. 2005. “L1 Attrition and Recovery–A Case Study.” Proceedings of the 4th International Symposium on Bilingualism, edited by J. Cohen, K. McAlister, K. Rolstad, and J. MacSwan, 1045–1052. Somerville, MA: Cascadilla Press.
Huttenlocher, Janellen, Marina Vasilyeva, Elina Cymerman, and Susan Levine. 2002. “Language Input and Child Syntax.” Cognitive Psychology 45: 337–374.
Huttenlocher, Janellen, Heidi Waterfall, Marina Vasilyeva, Jack Vevea, and Larry Hedges. 2010. “Sources of Variability in Children’s Language Growth.” Cognitive Psychology 61: 343–365.
Hyltenstam, Kenneth, Emanuel Bylund, Niclas Abrahamsson, and Hyeon-Sook Park. 2009. “Dominant-Language Replacement: The Case of International Adoptees.” Bilingualism: Language and Cognition 12: 121–140.
Isurin, Ludmilla. 2000. “Deserted Island or a Child’s First Language Forgetting.” Bilingualism: Language and Cognition 3: 151–166.
Jaekel, Nils, Michael Schurig, Merle Florian, and Markus Ritter. 2017. “From early starters to late finishers: A longitudinal study of early foreign language learning in school.” Language Learning 67: 631–664.

The Linguistics of Language Revitalization 507 Johnson, Michele. 2016. “Ax toowú át wudikeen, my spirit soars: Tlingit direct acquisition and co-learning pilot project.” Language Documentation and Conservation 10: 306–336. Köpke, Barbara and Monika Schmid. 2004. “First Language Attrition: The Next Phase.” In First Language Attrition: Interdisciplinary Perspectives on Methodological Issues, edited by M. S. Schmid, B. Köpke, M. Keijzer, M. Weilemar, and L. Weilemar, 1–43. Amsterdam: John Benjamins. Köpke, Barbara. 2004. “Neurolinguistic Aspects of Attrition.” Journal of Neurolinguistics 17: 3–30. Kuhl, Patricia. 2011. “The Linguistic Genius of Babies.” TED talk. Online: https://www.youtube. com/watch?v=G2XBIkHW954. Lee, Nala and John Van Way. 2016. “Assessing Levels of Endangerment in the Catalogue of Endangered Languages (ELCat) Using the Language Endangerment Index (LEI).” Language in Society 45: 271–292. Liberman, Zoe, Amanda Woodward, Boaz Keysar, and Katherine Kinzler. 2017. “Exposure to Multiple Languages Enhances Communication Skills in Infancy.” Developmental Science 20: 1–11. doi:10.1111/desc.12420 McCarty, Teresa. 2011. “The Role of Native Languages and Cultures in American Indian, Alaska Native, and Native Hawaiian Student Achievement.” http://center-for-indian-education. asu.edu/sites/center-for-indian-e ducation.asu.edu/f iles/McCarty,%20Role%20of%20 Native%20Lgs%20and%20Cults%20in%20AI-AN-NH%20Student%20Achievement%20 [2]%20(071511).pdf. McIvor, Onowa. 2005. “The Contribution of Indigenous Heritage Language Immersion Programs to Healthy Early Childhood Development.” Research Connections Canada 12: 5–20. Morgan, Paul, George Farkas, Marianne Hillemeier, Carol Hammer, and Steve Maczuga. 2015. “24-Month-Old Children with Larger Oral Vocabularies Display Greater Academic and Behavioral Functioning at Kindergarten Entry.” Child Development 86: 1351–1370. NeSmith, Richard Keao. 2012. 
“The Teaching and Learning of Hawaiian in Mainstream Educational Context in Hawai’i: Time for a Change?” PhD diss., University of Waikato, Hawai’i. Nettle, Daniel and Suzanne Romaine. 2000. Vanishing Voices: The Extinction of the World’s Languages. Oxford: Oxford University Press. Nicoladis, Elana and Howard Grabois. 2002. “Learning English and Losing Chinese: A Case Study of a Child Adopted from China.” International Journal of Bilingualism 6: 441–454. Oh, Janet, Terry Au, and Sun-Ah Jun. 2010. “Early Childhood Language Memory in the Speech Perception of International Adoptees.” Journal of Child Language 37: 1123–1132. Paap, Kenneth and Zachary Greenberg. 2013. “There Is No Coherent Evidence of a Bilingual Advantage in Executive Processing.” Cognitive Psychology 66: 232–258. Pallier, Christophe, Stanislas Dehaene, Jean-Baptiste Poline, Denis Le Bihan, Anne-Marie Argenti, Emmanuel Dupoux, and Jacques Mehler. 2003. “Brain Imaging of Language Plasticity in Adopted Adults: Can a Second Language Replace the First?” Cerebral Cortex 13: 155–161. Paradis, Johanne. 2011. “Individual Differences in Child English Second Language Acquisition.” Linguistic Approaches to Bilingualism 1: 213–237. Park, Hyeon-Sook. 2015. “Korean Adoptees in Sweden: Have They Lost Their First Language Completely?” Applied Psycholinguistics 36: 773–797. Pearson, Barbara, Sylvia Fernández, Vanessa Lewedeg, and D. Kimbrough Oller. 1997. “The Relation of Input Factors to Lexical Learning by Bilingual Infants.” Applied Psycholinguistics 18: 41–58,

508 William O’Grady Peter, Lizette, Tracy Hirata- Edds, and Bradley Montgomery- Anderson. 2008. “Verb Development in the Cherokee Language Immersion Program, with Implications for Teaching.” International Journal of Applied Linguistics 18: 166–187. Pfenninger, Simone and David Singleton. 2016. “Affect Trumps Age: A Person-in-context Relational View of Age and Motivation in SLA.” Second Language Research 32: 311–345. Place, Silvia and Erika Hoff. 2011. “Properties of Dual Language Exposure That Influence 2- Year-Olds’ Bilingual Proficiency.” Child Development 82: 1834–1849. Ramírez, Naja, Rey Ramírez, Maggie Clarke, Samu Taulu, and Patricia Kuhl. 2016. “Speech Discrimination in 11- Month- Old Bilingual and Monolingual Infants: A Magnetoencephalography Study.” Developmental Science 19: 1–16. Ramírez- Esparza, Nairán, Adrián García- Sierra and Patricia Kuhl. 2014. “Look Who’s Talking: Speech Style and Social Context in Language Input to Infants Is Linked to Concurrent and Future Speech Development.” Developmental Science 17: 880–891. Rifkin, Benjamin. 2003. “Oral Proficiency Outcomes and Curricular Design.” Foreign Language Annals 36: 582–588. Rowe, Meredith. 2012. “A Longitudinal Investigation of the Role of Quantity and Quality of Child-Directed Speech in Vocabulary Development.” Child Development 83: 1762–1774. Roy, Deb. 2009. “New Horizons in the Study of Child Language Acquisition.” Proceedings of Interspeech 2009. Brighton, UK. Online: http://dkroy.media.mit.edu/publications/ Schmid, Monica. 2012. “The Impact of Age and Exposure on Bilingual Development in International Adoptees and Family Migrants: A Perspective from Holocaust Survivors.” Linguistic Approaches to Bilingualism 2: 177–208. Schneidman, Laura and Susan Goldin-Meadow. 2012. “Input and Acquisition in a Mayan Village: How Important Is Directed Speech?” Developmental Science 15: 659–673. Schneidman, Laura, Michelle Arroyo, Susan Levine, and Susan Goldin-Meadow. 2013. 
“What Counts as Effective Input for Word Learning?” Journal of Child Language 40: 672–686. Schwartz, Bonnie. 2004. “Why Child L2 Acquisition?” In Proceedings of GALA 2003, Vol. 1, edited by J. van Kampen and S. Baauw, 47–66. Utrecht: The Netherlands. Graduate School of Linguistics (LOT). Schweizer, Tom, Jenna Ware, Corrine Fischer, Fergus Craik, and Ellen Bialystok. 2012. “Bilingualism as a Contributor to Cognitive Reserve: Evidence from Brain Atrophy in Alzheimer’s Disease.” Cortex 48: 991–996. Sekerina, Irina and Lauren Spradlin. 2016. “Bilingualism and Executive Function: An Interdisciplinary Approach.” Linguistic Approaches to Bilingualism 6: 505–516. Snow, Catherine and Marian Hoefnagel-Höhle. 1985. “The Critical Period for Language Acquisition: Evidence from Second Language Learning.” Child Development 49: 1114–1128. Te Paepae Motuhake. 2011. Te Reo Mauriora: Report on the Review of the Maori Language Strategy and Sector. Online: http://www.tpk.govt.nz/en/consultation/reviewmlss/report/. Thomas, W. P. and V. Collier. 1997. School Effectiveness for Language Minority Students. Washington, DC: National Clearinghouse for Bilingual Education. Thordardottir, Elin. 2015. “The Relationship Between Bilingual Exposure and Morphosyntactic Development.” International Journal of Speech Language Pathology 17: 97–114. Tsimpli, Ianthi, Antonella Sorace, Caroline Heycock, and Francesca Filiaci. 2004. “First Language Attrition and Syntactic Subjects: A Study of Greek and Italian Near-Native Speakers of English.” International Journal of Bilingualism 8: 257–277. Ushioda, Ema and Zoltan Dörnyei. 2012. Motivation. In The Routledge Handbook of Second Language Acquisition, edited by S. Gass and A. Mackey, 396–409. New York: Routledge.

The Linguistics of Language Revitalization 509 Van de Weijer, Joost. 2002. How Much Does an Infant Hear in a Day? Online: http://person2. sol.lu.se/JoostVanDeWeijer/Texts/gala01.pdf Weisleder, Adriana and Anne Fernald. 2013. “Talking to Children Matters: Early Language Experience Strengthens Processing and Builds Vocabulary.” Psychological Science 24: 2143–2152. Wells, C. Gordon. 1985. Language Development in the Pre-School Years. Cambridge: Cambridge University Press. Werker, Janet, Valerie Lloyd, Judith Pegg, and Linda Polka. 1996. “Putting the Baby in the Bootstraps: Toward a More Complete Understanding of the Role of the Input in Infant Speech Processing.” In Signal to Syntax, edited by J. Morgan and K. Demuth, 427–447. Mahwah, NJ: Erlbaum. Yoshida, Katherine, Ferran Pons, Jessica Maye, and Janet Werker. 2010. “Distributional Phonetic Learning at 10 Months of Age.” Infancy 15: 420–433.

Chapter 22

New Media for Endangered Languages

Laura Buszard-Welcher

1. Introduction

Electronically mediated communication (EMC) using computers, the internet, and mobile devices is an increasingly important domain of language use in the modern world. Certainly oral, face-to-face language transmission remains the primary mode of language use for the vast majority of people, and the main one that they are born into, equipped to use. For endangered languages, this primary domain is the one that is eroded away during the loss of intergenerational language transmission. After oral transmission has ceased, a language may continue to be used in written form, but its functionality as a human language has been significantly reduced, and major effort would be required for it to become a spoken language again. For this reason, face-to-face communication and the restoration of healthy intergenerational oral language transmission are rightly the focus in language maintenance and revitalization efforts, especially in the context of critical language endangerment, when only a few elderly speakers remain.

Nevertheless, endangered languages can benefit in significant ways from use with EMC. For example, it can create new environments for language use, where the use of these languages may be severely curtailed in face-to-face and public contexts. It can bridge the divide of physical distance for diasporic speech communities, which endangered-language communities often are. It can bridge social divides where a single speaker or teacher can interact with many language learners, and where language learners can interact with each other in spaces that are more forgiving of learner errors and beginner speech. It can open up new domains of language use, where language could be used, for example, in electronic commerce, news dissemination, or entertainment. It can provide a context for speech community where there would otherwise be none.

The domain of EMC is also becoming increasingly accessible to people around the world. Internet access and connectivity via mobile devices have steadily increased in global penetration. However, this global spread has not come with equal linguistic access. Rather, for reasons I discuss below, EMC development has historically focused on just a few major world languages, with only more recent penetration into a few hundred languages (still, this is out of several thousand extant human languages). This doesn’t mean that speakers of endangered languages around the world cannot access and use EMC; rather, it means they must use another, more widely spoken language to do so. EMC then becomes another (potentially powerful) domain driving language shift.

Unfortunately, enabling EMC for endangered languages requires special effort and advocacy, often on the part of the speech community itself, which may lack the resources to bring it about. The good news is that the technologies that exist for multilingual support in EMC can also support endangered languages, and bootstrapping, where speech communities devote their own time, people, energy, and resources to achieving EMC support, is possible. With each of the language-enabling technologies I describe below, I also provide examples where small and endangered-language speech communities have successfully acquired them. I also describe resources (individuals, projects, and organizations) that exist to help speech communities with limited resources get started.1

1 The author is very grateful to Melinda Lyons, current Registrar at SIL for ISO 639-3, Deborah Anderson of the Script Encoding Initiative, and Daniel Linder of The Long Now Foundation for helpful and informative discussions and for pointing out illustrative examples for the use of various technologies discussed here. All representations (and any errors) here are the author’s own.

2. Language support in EMC

Within the last decade, EMC has spread dramatically around the world with the broad deployment of cellular telephone, internet, and wireless networks and the proliferation of inexpensive computer devices to access and communicate using them. As of this writing, World Internet Statistics reports that internet penetration has now reached roughly 50% of the global population (3.70 billion of a total population of 7.52 billion) (World Internet Usage and Population Statistics 2017a). Smartphones, which combine telephony with internet access, have also spread dramatically and now have an estimated 2 billion users worldwide (Statista). Access in developing nations lags behind major world economies but is nonetheless on the rise, and those with access are “voracious users,” especially of social media (Poushter 2016).

One might expect that with this global spread of EMC there would also be broad representation of the many different languages used around the world. Yet this is not the case. Instead, just a few major world languages dominate this domain, with ten languages accounting for nearly 80% of all content (World Internet Usage and Population Statistics 2017b). Which languages make up the remaining 20% of online content? This is a trickier question to answer, but we have some idea based on work by computer scientist and linguist Kevin Scannell. Scannell has built a set of tools that crawl the Web, blogs, and social media services like Twitter.2 Matching short strings of three characters (“3-grams”) that serve as “fingerprints” for individual languages online, he has discovered resources for over 1,000 different languages (Scannell 2011). In many cases these may only be a single Web page or Tweet, but it shows that speakers of hundreds of different languages are engaged in EMC and attempting to make use of it in their own languages.

In EMC, language representation very much parallels what is seen in the “offline” world: a “long tail” distribution where just a few large, powerful languages are dominant. The main difference is that EMC represents a much smaller subset of languages, currently only around 1,000–1,500 languages in total. This means that little to no online content exists for most of the world’s languages that have fewer than tens of thousands of speakers, and many of these are the most critically endangered.

Again, given the broad spread of EMC worldwide, why should this be the case? From the vantage point of the possible commercial markets for EMC devices and tools, any technology company that aims for coverage in about 100 of the most widely spoken languages has the capacity to capture about 80% of humanity as its users and/or audience, in at least one of their primary languages (Simons and Fennig 2017c). This Pareto Principle (“80/20 rule”) at play in the distribution of populations of speech communities has been a powerful incentive for development for big languages, but a disincentive for development of small languages, and is arguably what has driven the multilingual support we see in devices and tools today.
Looking to the near future and extrapolating a bit, we might guess that development of EMC technology will continue to push ahead to coverage for an additional few hundred of the world’s most widely spoken languages. These would include most languages with a million or more speakers, and expand the audience and user base for EMC technology to 95% of the human population (Simons and Fennig 2017c). These same forces are unlikely to drive development and support of the long tail of languages, simply because the economic motivation is not there. However, there is good news for those who want to take advantage of EMC for endangered languages: the same tools and infrastructure developed to support the larger, more widely spoken languages can also support the smaller, less widespread ones. The difference is that some extra work will be required, probably driven by interested speech communities themselves, to make sure their languages are supported. In the remainder of this chapter, I describe some of these key technologies, and how they can be adapted to accommodate any language.

2 For the web crawler “An Crúbadán” (Scannell 2017a), see http://crubadan.org/. For blogs (Scannell 2017b), see http://indigenoustweets.com/blogs/, and for Tweets (Scannell 2012), see http://indigenoustweets.com.
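
The 3-gram “fingerprint” idea mentioned in this section can be illustrated with a small Python sketch. It is not Scannell’s implementation: the function names, the scoring rule, and the tiny training samples (a few English words and some well-known Māori phrases) are all assumptions chosen for illustration, and a real system would need far larger samples for each language.

```python
from collections import Counter

def trigrams(text):
    """Split text into overlapping three-character sequences."""
    padded = f" {' '.join(text.lower().split())} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

def build_profile(sample):
    """Map each trigram in a sample text to its relative frequency."""
    counts = Counter(trigrams(sample))
    total = sum(counts.values())
    return {gram: n / total for gram, n in counts.items()}

def identify(text, profiles):
    """Return the language whose profile best matches the text's trigrams."""
    grams = trigrams(text)
    scores = {
        lang: sum(profile.get(gram, 0.0) for gram in grams)
        for lang, profile in profiles.items()
    }
    return max(scores, key=scores.get)

# Toy training samples (hypothetical; far too small for real use).
SAMPLES = {
    "English": "the quick brown fox jumps over the lazy dog and then "
               "the children went to the house by the river",
    "Māori": "ko te reo te mauri o te mana māori kia ora koutou katoa "
             "he aha te mea nui o te ao he tangata he tangata he tangata",
}
PROFILES = {lang: build_profile(text) for lang, text in SAMPLES.items()}

print(identify("the people went over to the river", PROFILES))
print(identify("he aha te reo o te tangata", PROFILES))
```

Even with samples this small, frequent sequences such as “the” or “te ” are enough to separate the two languages, which is why a single Web page or Tweet can be sufficient for detection.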


3. Language identifiers

A language identifier is a human- and computer-recognizable code for identifying a human language or set of languages. Language identifiers are widely used by EMC devices and software applications when interacting with text, such as with search, display, translation, and localization. They are used to set the default language interface for mobile devices, computers, and applications including Web browsers. Without a standard identifier, a language cannot be reliably supported in the growing domain of EMC.

For a demonstration that may be readily at hand, the reader may try the following: use a Web browser to examine the source code for any Web page in English. There should be a tag at the top of the page of code that looks something like this: <html lang="en">. The “lang” attribute with the value “en” specifies to the browser that the default language of the page is English. This attribute can be omitted, and the HTML still remains well-formed, but then the browser would have to guess at the language of the page, and might end up displaying it incorrectly. The W3C Internationalization Group recommends that all Web pages declare the content language of the page using language identifiers in this way (W3C 2014). Language identifiers are part of the fabric of the World Wide Web.

There are a number of proprietary and ad hoc sets of language identifiers, but the highest value set for EMC development (and therefore our purposes) is a set of language identifiers maintained by the International Organization for Standardization (ISO), known as ISO 639. ISO 639 was originally developed in 1967 as a terminology set (see ISO 2016). It was replaced in 2002 by ISO 639-1, a set of 136 two-letter language identifiers, as with the “en” value in the example above.
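
The “lang” attribute described above can also be read programmatically. The following is a minimal sketch using only Python’s standard library; the class and function names are hypothetical, and the sample pages are invented for illustration (the second uses “jje”, the ISO 639-3 code for Jejueo discussed later in this section).

```python
from html.parser import HTMLParser

class LangFinder(HTMLParser):
    """Record the lang attribute of the first <html> tag encountered."""

    def __init__(self):
        super().__init__()
        self.lang = None

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names before calling this.
        if tag == "html" and self.lang is None:
            self.lang = dict(attrs).get("lang")

def declared_language(html_text):
    """Return the declared page language, or None if none is declared."""
    finder = LangFinder()
    finder.feed(html_text)
    return finder.lang

english_page = '<!DOCTYPE html><html lang="en"><head></head><body></body></html>'
jejueo_page = '<!DOCTYPE html><html lang="jje"><head></head><body></body></html>'
undeclared_page = '<!DOCTYPE html><html><head></head><body></body></html>'

print(declared_language(english_page))     # en
print(declared_language(jejueo_page))      # jje
print(declared_language(undeclared_page))  # None
```

When the function returns None, software is in the position described above: it has to guess the language of the page.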
The ISO 639-1 set mostly consists of national languages, along with a few “macrolanguages,” like Arabic and Chinese, that make reference to a set of individual languages, and a few constructed languages, like Esperanto and Interlingua.3

3 For a particularly accessible and informative introduction to language identifiers, see Wright (2015).

In 1998, an additional set of identifiers, ISO 639-2, was added to the standard. This set contains 464 three-letter codes, and expands the basic ISO two-letter codes to include many more needed by librarians and information scientists to identify the language of cataloged resources. From the linguistic perspective, this set is a surprising hodgepodge of individual languages, as well as language families like “iro” for Iroquoian languages, and regional groupings like “paa” for Papuan languages, consisting of some 800 languages (!), and many language families. For librarians, this level of differentiation was sufficient for most needs.

ISO 639-3, added in 2007, is the first set of language identifiers in the standard that attempts to represent actual global linguistic diversity. SIL International is the Registration Authority that maintains the code set. They describe ISO 639-3 as follows: “Whereas ISO 639-1 and ISO 639-2 are intended to focus on the major languages of the world that are most frequently represented in the total body of the world’s literature, ISO 639-3 attempts to provide as complete an enumeration of languages as possible, including living, extinct, ancient and constructed languages, whether major or minor. As a result, ISO 639-3 lists a very large number of lesser-known languages” (see SIL 2017b).

Given the needs and economic motivations of large technology companies as outlined above, the adoption of such a large, comprehensive set of language codes as an international standard is perhaps surprising. But the confluence of a few different factors likely helped bring it about: a growing public awareness of global linguistic diversity and language endangerment, the clear economic need of large technology companies to move into the domain of less common languages, and a set of codes ready at hand that could serve such a purpose. The ISO 639-3 code set is based on three-letter codes developed and maintained by the Summer Institute of Linguistics (SIL), a missionary organization that developed and used the codes for many years in its publication The Ethnologue, a catalog of the world’s languages (see Simons and Fennig 2017a).

The adoption of the set of SIL Ethnologue codes as ISO 639-3 was further aided by the fact that the codes were a fixed three-letter length, valued for brevity and predictability by the IT industry. An alternative comprehensive scheme in concurrent development, the Linguasphere Register,4 was at a disadvantage in this respect, using codes of variable length to represent taxonomies of languages from the level of regional or language-family grouping, all the way down to individual languages and dialects. Likewise, the set of Glottolog codes,5 developed as an alternative to ISO 639-3 for scientific use, had fixed-length but longer codes, typically four letters followed by four digits.
These codes designate “languoids,” which are agnostic as to taxonomic level, and may identify language families, individual language varieties, or groupings in between these (for example, Indo-European, West Germanic, Anglian, Standard English, and Sri Lankan English all have codes). This tension between brevity and comprehensiveness is still playing out in the world of ISO 639, so it is worth paying attention to alternative schemes, as their role with respect to the standard may change in the future. Fortunately, all modern schemes have the goal of comprehensiveness at least as far as individual language representation is concerned, which is light years ahead of ISO 639-2 in representing actual global linguistic diversity. Whatever way the standard may develop in the future, it is likely to continue becoming more inclusive in its representation, rather than less.

4 See Dalby (2000), also available online at http://www.linguasphere.info/.

5 See Glottolog, available online at http://glottolog.org/.

The set of ISO 639-3 codes is becoming more settled and static over time; however, changes are often still needed. This is an important point for our purposes, and ISO 639-3 as a standard includes a formal mechanism for making changes (see SIL 2017c). Changes may involve adding or removing supplemental information about a code, such as changing a language name associated with the code (language names derived from historical use may be inaccurate, or occasionally disparaging exonyms, and therefore in need of amendment once better information is available). Changes to codes themselves are occasionally needed to split a code into more than one language, or to merge codes for languages that turn out to not be distinct (see SIL 2017a). If two codes are merged, then at least one will be retired. “Retired” in this context means deprecated from future use. Codes are maintained into the future, so once a code is in the standard set, it may be retired but not deleted. If there is any evidence that a language exists or existed, its code will be maintained. For example, there is an ISO 639-3 code, omc, for Mochica (a.k.a. Yunga, Chimu, Mochika), a language isolate that went extinct in the 1920s (see MultiTree 2017).

A change request is also needed to obtain a new code, if a language hasn’t been represented by one yet. New codes are sometimes needed due to political changes and the creation of new national boundaries, as happened with Serbian, Croatian, and Bosnian after the break-up of the former Yugoslavia (Library of Congress 2017). (A similar case could also be made for Montenegrin, which doesn’t yet have its own code.) Also, sometimes new language varieties emerge, as with the case of Sheng in Kenya (which also doesn’t have an ISO 639-3 code yet, but may, especially if the speech community’s need for EMC and localization grows) (Dean 2013). In other cases, new language codes are needed because our documentation and knowledge of the languages of the world has improved and new languages are identified that weren’t well-known before, which is more likely to be the case with endangered languages.

The formal change-making process of the standard allows anyone to submit a change request. This means that interested individuals and groups from within endangered language speech communities can organize, gather resources, and submit a proposal to get the process started.
The proposal process is not intended to be onerous, but making a good case is essential to success, and getting help from knowledgeable language experts and people familiar with proposal writing is advised. Also recommended is contacting the ISO 639-3 Registrar, who can offer advice and serve as a valuable guide.

Speakers of Jejueo, an endangered language of Jeju Island, South Korea, provide a good example of a recently completed successful change request for the creation of a new ISO 639-3 code (“jje”) (see SIL 2016). Jejueo is related to Korean but is mutually unintelligible with Standard Korean and Korean dialects. The language has between 5,000 and 10,000 speakers. Schools on Jeju Island have historically used and taught Standard Korean, which has also been the language used for writing. This use of Standard Korean in schools, along with a long-standing disparagement of Jejueo as inferior or “slang” by speakers of Standard Korean, led to a significant decline in the number of speakers over time such that most speakers today are elderly. The younger generation can often understand Jejueo but don’t speak it themselves.6

6 Ju (2014). See also the Catalogue of Endangered Languages (http://www.endangeredlanguages.com/lang/8409).

Jejueo was officially recognized as a critically endangered language by UNESCO in 2010, and since then there has been a strong and growing interest in its revitalization and a surge of new resources dedicated to its promotion. These include a Research Center for Jeju Studies, the development of a standard orthography (in 2013), the use of Jejueo in school curricula, and the creation of digital resources such as websites, online textbooks, and talking dictionaries. With the growing use of Jejueo in EMC, the speech community desired an ISO 639-3 code for the purpose of developing fonts, keyboards, and software localization.7

The Jejueo change request was originally submitted as a minimal application by a young person from Jeju Island, which seems indicative of interest (and attention to such things) among the younger generation of speakers who want to see their language represented by EMC. The final, successful proposal was modified significantly based on feedback from the Registrar, and involved a Professor of Linguistics and a Jejueo native speaker graduate student from the University of Hawai’i at Mānoa, and an Associate Professor of Education from Jeju National University. As suggested above, this illustrates that the ISO 639-3 process may be best navigated in partnership with the Registrar, and other advocates who are likely to be familiar with proposal writing and language documentation. The successful proposal is archived and available to the public on the ISO 639-3 website (see SIL 2014).

Proposals for change requests need to follow a particular format, documented on the Registrar’s website, and the format is somewhat different depending on whether a change to a code is requested, or whether a new code is needed (see SIL 2017c). Once a proposal is received and has all the needed information in it (which might require some back-and-forth communication between the person or body making the request and the Registrar), it becomes a “Proposed Change,” which then goes through a process of public review and discussion.
Based on the outcome of this discussion, if the request is still up for consideration, it will become a “Candidate Change” and be given at least three months for final public discussion and review. At the end of the review period, the change may be adopted in whole or in part, or rejected. All change requests that make it to the status of “Proposed Change” are then archived and remain publicly accessible online, whether or not they are adopted.
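
The identifier schemes compared in this section differ in shape, and that difference is easy to see in code. The sketch below is only a shape check written for illustration: the function name and patterns are assumptions, it cannot tell an ISO 639-2 code from an ISO 639-3 code (both are three letters), and it does not confirm that a code is actually assigned in any registry. The value “stan1293” is included purely as an example of the Glottolog-style shape.

```python
import re

# Shape of each scheme's codes as described in the text:
# two letters, three letters, or four characters plus four digits.
CODE_SHAPES = {
    "ISO 639-1": re.compile(r"[a-z]{2}"),
    "ISO 639-2 or 639-3": re.compile(r"[a-z]{3}"),
    "Glottocode": re.compile(r"[a-z0-9]{4}[0-9]{4}"),
}

def code_shape(code):
    """Return the scheme whose code shape matches, or None if none does."""
    for scheme, pattern in CODE_SHAPES.items():
        if pattern.fullmatch(code):
            return scheme
    return None

for example in ["en", "iro", "paa", "omc", "jje", "stan1293", "english"]:
    print(example, "->", code_shape(example))
```

The fixed two- and three-letter shapes are what made the ISO sets attractive for software, while the longer Glottolog-style shape leaves room for many more entries at every taxonomic level.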

7 There is now a page for Jejueo on ScriptSource that describes these needs; see http://scriptsource.org.

4. Unicode, fonts, and keyboards

Thanks to modern connectivity speeds and bandwidth, spoken language (represented in audio and video) is widely used in EMC, even outside telephony. Nevertheless, the use of EMC is still heavily dependent on the written representation of spoken language. And no matter what type of writing system is used for a language (abjad, alphabet, syllabary, logographs, or a combination of these), the set of written symbols must be mapped to a unique numeric sequence that a computer device can understand and implement. Computers fundamentally use two-state binary operations (1s and 0s), so ultimately, whatever numeric sequence is chosen for a given character, it will be stored by the computer as a binary sequence. These binary “bits” are organized into sequences of eight, called “bytes,” and conventionally the byte is the smallest unit that computer architectures and systems work with. But this is actually a lot, since in binary, one byte gives you 256 (2^8) possible combinations of bits.

One of the first EMC encodings of a writing system was ASCII, which allowed for the representation of the English upper- and lower-case alphabets, numbers 0–9, and punctuation. It was developed for use with teletype, which used a seven-bit binary sequence allowing for a total of 128 possible character-binary sequence mappings. With the advent of eight-bit computer architectures, it became possible to extend the set of possible character mappings to 256. ASCII could still be represented (the extra bit was sometimes used as a “parity bit” for error checking) and many of the world’s major languages could be accommodated by the extra mapping space. To deal with logographic systems like Chinese characters, where thousands of mappings are needed, variable-length encodings were devised that made use of combinations of single and multiple bytes.

So far, so good. However, over time, hundreds of varied implementations and practices came to be used across systems and devices (see Unicode 2016). As EMC technology spread, so did the desire to share text files across devices and platforms. Yet there was an ever-growing chance that an elegantly rendered script on one EMC device would show up as a gibberish of symbols on another.
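
The arithmetic of bits and bytes, and the way mismatched legacy encodings turn text into gibberish, can be sketched in a few lines of Python. This is an illustrative example only: the function names are hypothetical, and the Russian sample word and the choice of the cp1251 and cp1252 code pages are assumptions made to show one well-known mismatch.

```python
# Distinct values available in 7 bits (ASCII) and in 8 bits (one byte).
ASCII_SLOTS = 2 ** 7   # 128
BYTE_SLOTS = 2 ** 8    # 256

def show_bits(char):
    """Return the 8-bit binary string for a single-byte character."""
    return format(ord(char), "08b")

def misread(text, written_as, read_as):
    """Encode text under one legacy code page and decode it under another."""
    return text.encode(written_as).decode(read_as)

original = "Привет"  # Russian "hello", used here only as sample text
garbled = misread(original, "cp1251", "cp1252")       # wrong code page on reading
restored = garbled.encode("cp1252").decode("cp1251")  # undo the mismatch

print(ASCII_SLOTS, BYTE_SLOTS)  # 128 256
print(show_bits("A"))           # 01000001
print(garbled)                  # Ïðèâåò
print(restored == original)     # True
```

The bytes stored in the file never change; only the table used to interpret them does, which is why the same file can look correct on one device and unreadable on another.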
(A notorious case of this was Windows Code Pages, implemented in the 1980s and 1990s, where programs needed to reference a particular set of encodings for a file in order to display characters correctly. A missing or incorrect reference led to Mojibake 文字化け, Japanese for “character transformation,” and to headaches; see Wikipedia 2017a.)

Enter Unicode, which embraces a set of fundamental design principles to address these problems. Unicode assigns a unique, unambiguous value (or “code point”) to each symbol (“character”). Code points are written as sequences of hexadecimal digits: the numerals 0–9 and the letters A–F. Sequences of four hexadecimal digits provide 65,536 (2¹⁶) possible code points. The majority of characters found in the world’s scripts can be represented within a single set of 65,536 code points, known as the “Basic Multilingual Plane.” An additional sixteen sets of 65,536 code points (the sets are called “planes”) bring the total space to 1,114,112 possible code points, of which 128,172 are assigned to characters in The Unicode Standard Version 9.0. These code points are translated into several possible encodings using one or more bytes. The set of unique code points and their translation into encodings comprises an industry-wide standard. UTF-8 and UTF-16 are both variable-length encodings that use one or more units of 8 or 16 bits, respectively (see Unicode 2017b). UTF-32 is a fixed-length encoding in which each code point is translated into 32 bits. Each encoding has its own advantages and disadvantages, depending on the set of characters to be used in a given file or application. Taken together,

however, they provide a flexibility of implementation across a wide variety of character sets and hardware and software platforms.

Another fundamental design principle of Unicode is to design for universality from the outset; that is, Unicode aims to be inclusive of all symbols used in all writing systems worldwide, modern and historic, as well as non-linguistic symbols in common use (see Unicode 2016c). Over the years, scripts have been added to the Unicode Standard, so that the current Version 9.0 includes a total of 135 different scripts. As work progresses, the standard pushes into the territory of less common languages and scripts. This is often the territory occupied by minority and endangered languages. For example, newly included in Version 9.0 are characters and scripts for the following: “Osage, a Native American language, Nepal Bhasa, a language of Nepal, Fulani and other African languages, The Bravanese dialect of Swahili, used in Somalia, The Warsh orthography for Arabic, used in North and West Africa, [and] Tangut, a major historic script of China.”8

Why might a language not be represented by Unicode? One possibility is that the language may not have a written form. This is very common, as up to half of the world’s extant languages are not written (Simons and Fennig 2017b). Without a written form, EMC can still be used for telephony and spoken audio and video, but another written language would need to be employed for anything text-based, such as the ubiquitous Short Message Service (SMS). Using a different, written language may be the most straightforward and efficient path to a community’s use of EMC if the community is multilingual, but the strategy does assert a dominant, more widespread language as the language of EMC over the minority or endangered one.
Another, more elaborate option for an unwritten language could be to develop and deploy a writing system that uses characters already represented by Unicode, circumventing the need for Unicode to adopt new characters or scripts.

Sometimes, writing systems in established and even widespread use may simply not be represented in Unicode yet. This is more likely to be the case with endangered-language scripts. A good example of this is Cherokee, which was added only in Unicode 3.0, in 1999. The Cherokee writing system, a set of eighty-five syllabic characters, has been in vital use among Cherokee speakers since the early nineteenth century and continues in strong use today, including in adult and grade-school language immersion programs. This modern usage was made possible through the long-term, dedicated efforts of the Cherokee Language Technology Program, whose impressive goal is to create “innovative solutions for the Cherokee language on all digital platforms including smartphones, laptops, desktops, tablets, and social networks” (Cherokee Nation 2017).

For languages whose scripts are not represented by Unicode, there is a way to propose additions. The proposal process for additional characters and scripts is well documented, as is to be expected of a standard, and can be found on the Unicode website. However, this process isn’t simple or straightforward: proposals require significant research, and additions must meet tests of adequate documentation and regularity

8

List quoted from Unicode (2017a).

of use. The latter, in particular, may require a significant amount of community compromise and consensus around a final form of the characters to be encoded.

Minority- and endangered-language communities are also at a disadvantage here in having little political or economic power. Although the Unicode Consortium is a 501(c)(3) non-profit, the full voting membership (members of which currently each pay US$18,000 a year) generally consists of big technology companies with the set of interests one might expect of large for-profit stakeholders. Fortunately, the Script Encoding Initiative (SEI) (2016), run out of the University of California, Berkeley, maintains an institutional voting membership in Unicode and works to fund expert proposals for the encoding of minority and historical scripts. The close working relationship of SEI with the Unicode Consortium, and particularly the Technical Committee, ensures that the proposals it funds will meet Unicode requirements. To date, the SEI has helped over 70 scripts become encoded, and it estimates that 100 scripts remain to be encoded.

For use in EMC, scripts also require fonts and keyboards for graphical display and text entry. A computerized font is required as part of the publication of an approved script (Unicode 2016a). The creation of keyboards may take place later, but this can be a lengthy process. An Apple keyboard for Cherokee was created in 2003, four years after the inclusion of the syllabary in Unicode, and a Windows keyboard wasn’t created until several years later (Waddell 2016). The SEI expects that much of the work that lies ahead will be in helping support the development of both fonts and keyboards.

There are a couple of new initiatives in the development of free, open fonts to support scripts in Unicode. One is Unifont, a single bitmap font for all Unicode scripts, released with a GNU General Public License in 2013 (see Unifoundry 2017a).
A primary advantage of Unifont is its comprehensiveness, with coverage for the Unicode Basic Multilingual Plane. Another initiative is the suite of Noto Fonts developed by Google (2017). Unlike Unifont, which is a single font, Noto Fonts are a suite of over 100 different fonts, all licensed under the SIL Open Font License.9 An advantage of Noto Fonts is their elegance of design, with “a harmonious look and feel” that extends across all fonts in the suite. Noto Fonts prioritize the scripts they cover; however, their goal is comprehensiveness. So the most straightforward path for inclusion of a script in the Noto suite is to make sure it is included in Unicode.
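To make the discussion of code points, planes, and encodings in this section concrete, the following Python sketch reports where a character sits in the Unicode code space and how many bytes it occupies in UTF-8, UTF-16, and UTF-32. The three sample characters are our own choice for illustration: a Latin letter, the first letter of the Cherokee block, and the first letter of the Osage block. The helper name `describe` is hypothetical.

```python
# Code points, planes, and the byte cost of each Unicode encoding form.

PLANE_SIZE = 16 ** 4            # four hexadecimal digits give 65,536 values
TOTAL_PLANES = 17               # the Basic Multilingual Plane plus sixteen more
TOTAL_CODE_POINTS = PLANE_SIZE * TOTAL_PLANES


def describe(char: str) -> dict:
    """Report a character's code point, plane, and size in three encodings."""
    code_point = ord(char)
    return {
        "code_point": f"U+{code_point:04X}",
        "plane": code_point // PLANE_SIZE,   # 0 is the Basic Multilingual Plane
        "utf8_bytes": len(char.encode("utf-8")),
        "utf16_bytes": len(char.encode("utf-16-be")),   # "-be": no byte-order mark
        "utf32_bytes": len(char.encode("utf-32-be")),
    }


samples = {
    "Latin capital A": "\u0041",
    "Cherokee letter A": "\u13A0",       # Cherokee block
    "Osage capital A": "\U000104B0",     # Osage block, outside the BMP
}

for label, char in samples.items():
    print(label, describe(char))
```

The Osage letter lies in plane 1, so UTF-16 needs two 16-bit units for it, while the Cherokee letter fits in one. This is the sense in which each encoding is more or less efficient depending on the characters a file contains.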

5. Corpora and natural language processing

Every day, speakers of major world languages online are creating volumes of new textual content in their languages: news articles, blog posts, discussion threads, electronic mail, social media posts, and search queries. Technology companies like the big search

9

This is the same SIL that is also the ISO 639-3 Registration Authority.

engines use these queries and text documents to (among other things) figure out ways to display ads relevant to the user, because much of their revenue is ad-driven. In the process, though, they are amassing huge corpora for these languages and a set of sophisticated tools that can analyze them. In exchange for the benefit of these services, users add to these corpora every day, and as a result the services and the EMC environment for the use of major world languages get better and better.

It is debatable whether the cost of being continually subjected to online advertisement is worth the benefits these tools provide. However, a large and varied corpus is of undeniable value for natural language processing (NLP), which can be used to create powerful tools for the use of a language with EMC. Below is a sample of the tools that can be created for a language with a sizable and varied corpus (primarily text, but also audio and video):

• Spell-checking: producing normative or standard spellings for words given approximate matches.
• Grammar-checking: analyzing the grammatical structure of text in order to suggest normative or standard alternative usage.
• Predictive Text: automatically completing text entry for words and phrases so that less input from the user is needed. This is especially helpful for mobile devices, which typically have minimal physical interfaces for text entry.
• Search: broadly, this enables locating particular information within a larger corpus. This ranges from relatively simple tasks, like locating a match for a string in a text document or a dictionary look-up, to the sophisticated searches enabled by modern search engines like Google.
• Information Extraction: finding information in text that is relevant for another use, for example, extracting information about a planned meeting in email in order to automatically populate a calendar.
• Machine Translation: translating words, strings of words, or documents from one language to another, for example, automatic translation of government web pages.
• Optical Character Recognition (OCR): the ability to automatically convert images of text characters to encoded computer characters.
• Speech Recognition: automatically identifying human-produced speech, and using it as input to computer systems, for example, voice commands, automatic transcription, and speech-to-text input.

As technology evolves and human and societal dependence on computer devices increases, NLP-enabled language tools like those above become ever more critical in everyday life. Languages (and their speakers) enabled for sophisticated use in the digital domain will stay at the forefront of development and innovation. Those that are not will likely lag behind, or be left behind completely. A reasonable choice for economic opportunity might then be to simply use a different, better-enabled language in the digital domain, and many endangered-language communities choose to go this route. But again, such a choice creates another powerful domain of language use that excludes

endangered languages and may drive their further obsolescence, especially as younger speakers increasingly encounter and interact with EMC.

So how might an endangered-language speech community enable the development of sophisticated NLP tools for their language? A good place to start is developing a corpus. Other chapters in this volume provide a good overview of corpus-building activities for endangered-language speech communities, so we address the topic only briefly here as it intersects with EMC development for a language.

To begin with, it is useful to distinguish two broadly different types of language resources, annotated and unannotated. Examples of unannotated resources might be a written text, or an audio or video recording in a single language (like a document, or recorded speech). A set of written emails might then constitute a corpus. So might a collection of YouTube videos. Each type of media has its own advantage in a corpus. A large unannotated text corpus would be useful in developing NLP tools for text; similarly, collections of unannotated audio files would be useful in developing NLP tools for audible speech, including speech recognition. Video has an advantage over audio in that it includes audio but can also show context for the speech event, such as the environment the speech takes place in, what in the physical environment the speech may be about (for example, a demonstration), whether there are other speakers or auditors, and important and meaningful speech accompaniments such as gesture, stance, eye gaze, and facial expression.
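As a rough illustration of how even a small unannotated text corpus can feed two of the tools listed earlier, the sketch below builds a word-frequency list and uses it for spelling suggestions and predictive text. It is a toy under stated assumptions, not a description of how production tools work: the three-sentence corpus is invented, English stands in for the community language so that the behaviour is easy to check, and the function names are hypothetical. A real orthography may well need its own tokenizer.

```python
import difflib
import re
from collections import Counter
from typing import List

# Invented stand-in for a community's unannotated text corpus.
CORPUS = """
the river is high today
the river was low in the summer
we walk to the river every morning
"""


def tokenize(text: str) -> List[str]:
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


# Every attested word with the number of times it occurs.
word_counts = Counter(tokenize(CORPUS))


def suggest_spelling(word: str, limit: int = 3) -> List[str]:
    """Spell-checking: attested words that approximately match `word`."""
    return difflib.get_close_matches(
        word.lower(), list(word_counts), n=limit, cutoff=0.7
    )


def predict(prefix: str, limit: int = 3) -> List[str]:
    """Predictive text: the most frequent attested words starting with `prefix`."""
    candidates = [
        (count, w) for w, count in word_counts.items()
        if w.startswith(prefix.lower())
    ]
    # Most frequent first; ties broken alphabetically.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [w for _, w in candidates[:limit]]


print(suggest_spelling("rivr"))
print(predict("t"))
```

Both functions can only return words the corpus already contains, which is one practical reason corpus size and variety matter.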
In terms of content, anything and everything is potentially valuable, so long as it is (reasonably) well-formed, connected speech.10 Variety is also valuable, as it can provide examples of different styles of speech, registers, levels of formality, genres, gender- or age-based differences, how speakers engage in conversation versus monologue, the poetic use of language, and even singing and the incorporation of song into speech. Words, grammatical usages, or styles that are rare might not show up in a small corpus, so the bigger the corpus, the better the chances of capturing the full range of expressiveness in a language.

Annotated language resources have additional linguistic information associated with the primary connected speech. There are a variety of types of annotation, some of which are more easily produced by speakers themselves, and some of which require linguistic training and greater amounts of linguistic or computational expertise to produce. For the former, a common and useful type of annotation would be a translation of a language resource or set of resources into one or more different languages. A slightly more complex (but still broadly accessible) version of this would be a set of translations aligned at the sentence level. An even more complex version would be word- or morpheme-level translations that may also include grammatical information. This type of translation is usually best accomplished by those with some linguistic training, and is the type most commonly produced by linguists engaged in language documentation projects. Then, there are a variety of annotations that are particularly useful in NLP and more commonly produced by computational linguists, such as texts formally marked for word

10

Errors in natural speech are common, and perfectly fine to include in a corpus.

boundaries, parts of speech, syntactic structures, and semantic information.11 In terms of their representation in a corpus, unannotated resources typically far outnumber annotated ones. But the more elaborate annotation types provide the key to developing NLP tools; the unannotated remainder of the corpus can provide useful data to test and fine-tune them.

Unlike the language-enabling EMC technologies of ISO 639 and Unicode, natural language processing is a substantial field with many and varied researchers and practitioners, from academics to technologists to NGO workers. There are also many reasons for those working in the field of NLP to develop capacity for languages that have few if any available language resources or tools. These include enabling “low-density” languages (as they are sometimes termed in the literature) for EMC for commercial reasons, but also for military purposes, for interventions in war zones or humanitarian crises, or for rapid emergency response in natural disasters. Endangered languages are hardly alone in having few language resources to deploy for the development of NLP. In fact, most of the languages of the world, even languages with millions of speakers, are in this situation.12

Haitian Creole provides an interesting case study as a language with millions of speakers that was quickly mobilized for broader use with EMC in a crisis. On January 12, 2010, a 7.0 magnitude earthquake struck near Port-au-Prince, Haiti. Damage from the quake and many strong aftershocks was widespread and severe, and there were well over 100,000 reported deaths. Although public communication systems were disrupted, many people living in Haiti, including Port-au-Prince, were able to send out SMS messages and social media updates about local conditions and emergency needs.
Soon, friends outside Haiti started geolocating Tweets and SMS messages on the Ushahidi platform, an online service specifically developed for crisis mapping (Meier 2012). As the humanitarian crisis mounted, a toll-free SMS number, 4636, was set up and publicized by radio stations in Haiti (Cutler 2010). Soon thousands of text messages were pouring in, in Haitian Creole. Haitian Creole was a low-density language at the time, without widely available machine translation. So to mobilize international aid, human volunteers (including diaspora Haitian Creole speakers located via Facebook) were organized to provide translation and geolocation of the SMS messages. The effort was impressive for its rapid mobilization of resources, but highlighted the severe lack of automated translation tools for Haitian Creole, especially when such resources existed and were being broadly deployed for more widely spoken languages (for example, Google Translate, which launched in 2006).

Shortly afterward, Google (Google Operating System 2010) and Microsoft (Microsoft Translator 2010) each announced alpha-version machine translation for Haitian Creole in order to provide support for relief efforts. It is clear that existing corpora developed

11

Maxwell and Hughes (2006) provide a good overview of the types of annotated resources that are useful for NLP.

12

According to Maxwell and Hughes (2006) there are only 20 to 30 languages of the world that have enough machine-readable data to be reasonably well-enabled for NLP.

for research were brought to bear on the problem; for example, the Microsoft Research translator was made possible by translation corpora provided by Carnegie Mellon University, as well as others (see Language Technologies Institute of Carnegie Mellon University’s School of Computer Science 2016).

The rapid deployment of Haitian Creole machine translation services by technology companies to help in a humanitarian crisis arguably “primed the pump” for other exploratory efforts aimed at mobilizing resources for low-resource languages. In 2012, in honor of International Mother Language Day (February 21), Microsoft Research announced the availability of its Bing translator for the Hmong Daw language (Wozniacka 2012). In the 1970s, a large group of Hmong entered the United States as refugees. There are around 260,000 Hmong residing in the United States today, and over 30,000 in Fresno alone. As with most immigrant communities in the United States, the younger generation was showing signs of language shift to English, with some children not learning to speak Hmong. The project served to bridge a gap between the older and younger generations of Hmong Daw, with the older generation benefiting from the automated translation of English web pages, and the younger generation having access to a translator for Hmong, for which there are few other available resources.

The Hmong Daw translator was developed through a close collaboration between Microsoft Research and the Hmong community in Fresno, California. In particular, the project came together through the efforts of Will Lewis, a computational linguist at Microsoft Research, and Phong Yang, a teacher of Hmong at Fresno State University who served to organize and mobilize the Fresno-based Hmong community. Collectively, the project trained the Bing translator using hundreds of pages of parallel English-Hmong translations, as well as entries from a Hmong-English dictionary.
Speakers also provided feedback on the intelligibility of translations so that the translation engine could be modified and retrained with additional material. The result was the rapid deployment of a machine translation tool that performed well enough to enable Hmong-dominant speakers to glean important information from English-language web pages. As of the date of this writing, the Hmong Daw Bing translator is still available for public use.13

In the absence of such dramatic events and interventions, though, how might a low-density language community go about the creation of a corpus that might be suitable for NLP? A good place to start might be an existing language documentation project. Although such projects aren’t typically created to develop a corpus for NLP or to enable a language in the EMC domain, they certainly produce language resources that are amenable to these outcomes. Another reason to start with an existing language documentation project is that such projects typically produce the more complex type of annotated language resources, such as word- and morpheme-level translations, especially if the project involves people with linguistic training or expertise. Then expanding on such a project by crowdsourcing the development of unannotated language resources using EMC devices and tools would be a good

13

Available online at https://www.bing.com/translator.

strategy, if devices are readily available and in widespread use, or could be deployed for this purpose.

For such efforts it is worth noting that the development of certain types of corpora may depend on the other language-enabling technologies discussed above. For example, in order for many speakers to contribute to the building of an online text corpus, a language needs a writing system; to use that writing system reliably in EMC, the characters and script need to be represented in Unicode; and there need to be fonts and a means of entering text into a device through a keyboard or other interface. A desirable text corpus for many minority languages is a language-specific Wikipedia, and to date there are 295 of these. But to have a language-specific Wikipedia, a language needs not only to be written and represented by Unicode but also to have an ISO 639-3 code.

While NLP still revolves heavily around text, audio and video corpora can be used to produce NLP tools, and have an advantage over text in that they work for both written and unwritten languages and are not encumbered by the need to enter or display text (although this may be desirable for captions or translations). They are also easy to produce with common recording software on computers and mobile devices if adequate file storage is available. Inexpensive or free online services such as SoundCloud and YouTube make it very easy to upload and share audio and video online. Some types of basic annotation, such as audio transcriptions and translations, can also be produced and shared this way. The more complex types of annotation, such as word- and sentence-level translations, are difficult to crowdsource with the more common tools and services. However, innovative applications for EMC devices are increasingly making this possible.14

In the building of a corpus, it is important to consider where it will be housed and served.
By paying attention to this at the outset, it is possible to build a valuable, lasting corpus of language resources that can serve many different purposes: community use and reference, language learning and teaching, and NLP development. Three considerations in particular are important in deciding where to house and serve language resources: preservation, discovery, and access.
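Before turning to those three considerations, it may help to see how the annotation tiers discussed in this section (a free translation, sentence-level alignment, and word- or morpheme-level glossing) might sit together in one record. The Python sketch below is one possible layout under our own assumptions, not a format used by any particular archive or tool: the class and field names are hypothetical, and Spanish is used purely as a checkable stand-in for a documented language.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Morpheme:
    form: str    # the morpheme as written
    gloss: str   # its meaning or grammatical label


@dataclass
class Sentence:
    text: str                  # primary connected speech (unannotated tier)
    language: str              # ISO 639-3 identifier
    translation: str = ""      # free translation aligned to this sentence
    words: List[List[Morpheme]] = field(default_factory=list)  # one list per word

    def annotation_level(self) -> str:
        """Name the richest annotation tier present for this sentence."""
        if self.words:
            return "morpheme-level"
        if self.translation:
            return "sentence-aligned translation"
        return "unannotated"

    def interlinear(self) -> str:
        """Render the form and gloss tiers as two hyphen-segmented lines."""
        forms = ["-".join(m.form for m in word) for word in self.words]
        glosses = ["-".join(m.gloss for m in word) for word in self.words]
        return " ".join(forms) + "\n" + " ".join(glosses)


example = Sentence(
    text="los gatos duermen",
    language="spa",
    translation="the cats sleep",
    words=[
        [Morpheme("los", "the.M.PL")],
        [Morpheme("gato", "cat"), Morpheme("s", "PL")],
        [Morpheme("duerm", "sleep"), Morpheme("en", "3PL")],
    ],
)

print(example.annotation_level())
print(example.interlinear())
```

Keeping the unannotated text, the translation, and the glosses in one record means a resource can start out unannotated and gain tiers later, as speakers or linguists contribute them.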

5.1. Preservation

Increasingly, the “long tail” of language resources online exists in hundreds of individual web pages, social media posts, audio, and video posted to popular sharing sites.15 These sites may have tremendous value for online speech communities, and may even be where endangered languages are finding a foothold with language learners in EMC. However, the average life of a web page is not long,16 and while sites like Facebook and YouTube are widely popular, they make no long-term commitment to archiving users’

14

For examples, see the applications described in Taff et al., Chapter 39, this volume.

15

See Kevin Scannell’s work at http://crubadan.org/.

16

The average lifespan of a web page is ninety-two days according to Brewster Kahle of the Internet Archive. See Brown (2017).

contributions. For this reason, it is a good idea to consider placing digital endangered-language resources in an archive committed to their long-term preservation as language resources. Any of the organizations and archives that participate in the DELAMAN network would be suitable and can advise on this subject (see DELAMAN 2017).

5.2. Discovery

How will online language resources be discovered? A valuable principle in long-term archiving is “use it or lose it”; that is, digital resources that can be discovered and used have a much greater chance of long-term preservation than those that are obscure. Beyond this, discovery should be a goal in its own right if a community wishes to engage the interest and help of NLP practitioners. For resources housed in language archives, archivists will make sure language resources have appropriate metadata associated with them (including language identifiers) that allow for their discovery as being in a particular language, and/or part of a particular corpus. Archives in DELAMAN use language-appropriate metadata schemes like OLAC (Open Language Archives Community), which also enable resources to be discovered in searches on the OLAC site that operate across the community of participating archives (see OLAC 2017). Sites like SoundCloud and YouTube have the advantage of high-volume traffic, which means that a public SoundCloud audio file or YouTube video has a much better chance of discovery than one served from the average blog or home page. Unlike language archives, however, mainstream services like SoundCloud and YouTube provide no special means of discovery for language resources. So for resources housed and served this way, consider adding an ISO 639-3 code as a tag to make them more discoverable and identifiable.
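The tagging suggestion above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `iso639-3:` tag format is an invented convention rather than one defined by OLAC or by any sharing service, the record fields and titles are hypothetical, and the check confirms only that a code has the three-lowercase-letter shape of an ISO 639-3 identifier, not that it is registered.

```python
import re

# ISO 639-3 identifiers are three lowercase ASCII letters. This checks the
# shape only; it does not confirm that a code is actually registered.
ISO_639_3_SHAPE = re.compile(r"[a-z]{3}")


def language_tag(code: str) -> str:
    """Build a tag such as 'iso639-3:jje' (an invented tag convention)."""
    if not ISO_639_3_SHAPE.fullmatch(code):
        raise ValueError(f"not shaped like an ISO 639-3 code: {code!r}")
    return f"iso639-3:{code}"


def describe_upload(title: str, code: str, extra_tags=()) -> dict:
    """Assemble a hypothetical metadata record for a shared recording."""
    return {
        "title": title,
        "language": code,
        "tags": sorted({language_tag(code), *extra_tags}),
    }


def find_by_language(records, code):
    """Return the records whose tags identify them as being in `code`."""
    wanted = language_tag(code)
    return [record for record in records if wanted in record["tags"]]


records = [
    describe_upload("Story recording", "jje", ["storytelling"]),  # jje: Jejueo
    describe_upload("Word list", "chr"),                          # chr: Cherokee
]

print(find_by_language(records, "jje"))
```

A consistent tag of this kind lets anyone filter a large collection by language with a plain string match, which is roughly what a tag search on a mainstream sharing site amounts to.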

5.3. Access

The third consideration is the access that users have to online language resources. The default on many sharing services is public access, but many also have some privacy options that limit the ability of resources to be accessed and shared. Many archives are equipped to take in material with access restrictions, so in the case where a resource may be sensitive for one reason or another, it can still be part of the archived corpus, and be preserved for the future. While some speech communities want strict control over access to their online language resources, and reasonably so, there is a good argument for making some set of language resources publicly available. The reason is that computational linguists and NLP practitioners can use open archived corpora to provide free resources to a speech community. If a corpus is locked down, they may never discover it, and if they do, it is likely they will bypass it to work on other languages and corpora that are more readily

accessible. Think of open language resources as a jar of honey to attract computational linguists to a corpus. In return, they can produce valuable tools and resources.

6. Summary

In this chapter, we’ve presented three technologies that are essential to enabling any language in the digital domain: language identifiers (ISO 639-3), Unicode (including fonts and keyboards), and the building of corpora to enable natural language processing. Just a few major languages of the world are well-enabled for use with electronically mediated communication. Another few hundred languages are arguably on their way to being well-enabled, if for market reasons alone. For all of the remaining languages of the world, inclusion in the digital domain remains a distant possibility, and one that likely requires sustained interest, attention, and resources on the part of the language community itself. The good news is that the same technologies that enable the more widespread languages can also enable the less widespread, and even endangered ones, and bootstrapping is possible for all of them. The examples and resources described in this chapter will hopefully serve as inspiration and provide some guidance in getting started. The ultimate goal is full participation in EMC, which can open up an exciting new domain of very modern language use for endangered-language speech communities.

7. Coda: For linguists

While this chapter has focused on key language technologies that enable languages online, and has largely addressed non-linguists, there is an important message here for those providing linguistic support to endangered-language communities. Linguists can be valuable allies in the process of acquiring ISO codes and helping communities get their scripts represented in Unicode, if only by knowing the people, organizations, and resources that can be brought to bear. They can also be helpful in preparing proposals, since these require close familiarity with the language, community, and existing language documentation. And, finally, they can help by knowing that the digital resources they may be creating as part of a language documentation project might also enable the development of NLP tools, and being sure to archive them in a way that allows for their preservation, as well as discovery and access. And while computational linguistics is not required training for fieldwork, it is nonetheless valuable as a skill, and at the very least documentary linguists should consider a working partnership with a computational linguist who can help guide the development of resources and tools for NLP. In this way, linguists can work with endangered-language communities to help pave the way for their inclusion in EMC.


References

Brown, Jeffrey. 2017. Internet History Is Fragile. This Archive Is Making Sure It Doesn’t Disappear. PBS Newshour, Jan. 2, 2017. Archived at: https://web.archive.org/web/20170326011025/http://www.pbs.org/newshour/bb/internet-history-fragile-archive-making-sure-doesnt-disappear/. Accessed March 26, 2017.

Cherokee Nation. 2017. Cherokee Language Technology. Archived at: https://web.archive.org/web/20170319034825/http://www.cherokee.org/languagetech. Accessed March 19, 2017.

Cutler, Kim-Mai. 2010. “How a Tweet Brought Makeshift 911 Services to Life in Haiti.” Venture Beat, January 28, 2010. Archived at: https://web.archive.org/web/20160829063504/http://venturebeat.com/2010/01/28/team-4636/. Accessed August 29, 2016.

Dalby, David. 2000. The Linguasphere Register. Hebron, Wales, UK: Linguasphere Press.

Dean, Laura. 2013. “Street Talk: How the Urban Slang of Nairobi Slums Is Becoming the Language of the People.” Slate, November 1, 2013. Archived at: https://web.archive.org/web/20161024101333/http://www.slate.com/articles/news_and_politics/roads/2013/11/sheng_is_becoming_a_kenyan_language_how_the_urban_slang_of_nairobi_slums.htmll. Accessed October 24, 2016.

DELAMAN. 2017. Digital Endangered Languages and Musics Archives Network. Online: http://www.delaman.org/. Archived at: https://web.archive.org/web/20161209081519/http://www.delaman.org/. Accessed March 14, 2017.

Google. 2017. Google Noto Fonts. Archived at: https://web.archive.org/web/20170320141409/https://www.google.com/get/noto/. Accessed March 20, 2017.

Google Operating System. 2010. Google Translate Adds Haitian Creole. Archived at: https://web.archive.org/web/20160623071701/http://googlesystem.blogspot.com/2010/02/google-translate-adds-haitian-creole.html. Accessed June 23, 2016.

ISO. 2016. ISO/TC 37. “Terminology and Other Language and Content Resources.” Archived at: https://web.archive.org/web/20160322223817/http://www.iso.org/iso/home/standards_development/list_of_iso_technical_committees/iso_technical_committee.htm?commid=48104. Accessed March 22, 2017.

Ju, Suyon. 2014. Jeju Island Dialect (Student Advocates for Language Preservation). Archived at: https://web.archive.org/web/20151012003945/http://www.studentlanguagepreservation.org/jeju-island-dialect.html. Accessed October 12, 2015.

Language Technologies Institute of Carnegie Mellon University’s School of Computer Science. 2016. “Public Release of Haitian Creole Language Data by Carnegie Mellon.” Archived at: https://web.archive.org/web/20160913191612/http://www.speech.cs.cmu.edu/haitian/. Accessed September 13, 2016.

Library of Congress. 2017. “Codes for the Representation of Names of Languages.” Archived at: https://web.archive.org/web/20170118031324/https://www.loc.gov/standards/iso639-2/php/code_changes.php. Accessed January 18, 2017.

Maxwell, Michael and Baden Hughes. 2006. “Frontiers in Linguistic Annotation for Lower-Density Languages.” In Proceedings of the Workshop on Frontiers in Linguistically Annotated Corpora, 29–37. Association for Computational Linguistics.

Meier, Patrick. 2012. “How Crisis Mapping Saved Lives in Haiti.” National Geographic Voices, July 2, 2012. Archived at: https://web.archive.org/web/20170211043039/http://voices.nationalgeographic.com/2012/07/02/crisis-mapping-haiti/. Accessed February 11, 2017.

Microsoft Translator. 2010. Announcement: Haitian Creole Support in Bing Translator and Other Microsoft Translator Powered Services. Archived at: https://web.archive.org/web/20170327015354/https://blogs.msdn.microsoft.com/translation/2010/01/24/announcement-haitian-creole-support-in-bing-translator-and-other-microsoft-translator-powered-services/. Accessed March 27, 2017.

Microsoft Translator. 2017. (Includes Hmong Daw.) Available at https://www.bing.com/translator. Accessed March 26, 2017.

MultiTree. 2017. Mochica. Available at: http://multitree.org/codes/omc.html. (No archive.org URL available due to robots.txt.) Accessed March 26, 2017.

OLAC. 2017. OLAC: Open Language Archives Community. Available online at http://www.language-archives.org/. Archived at: https://web.archive.org/web/20170317110256/http://www.language-archives.org/. Accessed March 17, 2017.

Poushter, Jacob. 2016. Smartphone Ownership and Internet Usage Continues to Rise in Emerging Economies. Pew Research Center. Archived at: http://web.archive.org/web/20170309230121/http://www.pewglobal.org/2016/02/22/smartphone-ownership-and-internet-usage-continues-to-climb-in-emerging-economies/. Accessed March 10, 2017.

Scannell, Kevin. 2011. “1,000 Languages on the Web.” Indigenous Tweets. Archived at: http://web.archive.org/web/20170310230515/http://indigenoustweets.blogspot.com/2011/12/1000-languages-on-web.html. Accessed March 10, 2017.

Scannell, Kevin. 2012. Indigenous Tweets (indigenoustweets.com). Archived at: http://web.archive.org/web/20120127231247/http://web.archive.org/screenshot/http://indigenoustweets.com/.

Scannell, Kevin. 2017a. An Crúbadán (www.crubadan.org). Archived at: http://web.archive.org/web/20170324061958/http://crubadan.org/. Accessed March 24, 2017.

Scannell, Kevin. 2017b. Indigenous Blogs (http://indigenoustweets.com/blogs/). Archived at: http://web.archive.org/web/20170214231947/http://indigenoustweets.com/blogs/. Accessed February 14, 2017.

Script Encoding Initiative. 2016. Archived at: https://web.archive.org/web/20160530172021/http://linguistics.berkeley.edu/sei/index.html. Accessed May 30, 2016.

ScriptSource. Language Jejueo jje. Archived at: https://web.archive.org/save/_embed/http://scriptsource.org/cms/scripts/page.php?item_id=language_detail&key=jje&_sc=1. Accessed March 26, 2017.

Simons, Gary F. and Charles D. Fennig, eds. 2017a. Ethnologue: Languages of the World. 20th ed. Dallas, TX: SIL International. Online: http://www.ethnologue.com.

Simons, Gary F. and Charles D. Fennig. 2017b. “How Many Languages in the World Are Unwritten?” In Ethnologue: Languages of the World, 20th ed., edited by Gary F. Simons and Charles D. Fennig. Dallas, TX: SIL International. Archived at: https://web.archive.org/web/20170314064001/https://www.ethnologue.com/enterprise-faq/how-many-languages-world-are-unwritten-0. Accessed March 14, 2017.

Simons, Gary F. and Charles D. Fennig. 2017c. “Summary by Language Size.” In Ethnologue: Languages of the World, 20th ed., edited by Gary F. Simons and Charles D. Fennig. Dallas, TX: SIL International. Archived at: https://web.archive.org/web/20170223192832/http://glottolog.org/. Accessed February 23, 2017.

Statista. 2018. Number of Smartphone Users Worldwide from 2014 to 2020 (in Billions). Archived at: http://web.archive.org/web/20170207031628/https://www.statista.com/statistics/330695/number-of-smartphone-users-worldwide/. Accessed February 7, 2017.

New Media for Endangered Languages 529 SIL. 2014. Request for New Language Code Element in ISO 639-3. Archived at: https://web.archive.org/web/20160305073318/http://www-01.sil.org/iso639-3/cr_files/2014-004_jje.pdf. Accessed March 5, 2016. SIL. 2016. Documentation for ISO identifier: jje. Archived at: https://web.archive.org/web/ 20160609223355/http://www-01.sil.org/iso639-3/documentation.asp?id=jje. Accessed June 9, 2016. SIL. 2017a. 639-3 Index of All Past Change Requests. Archived at: https://web.archive.org/ web/ 2 0170327001612/ http:// w ww- 0 1.sil.org/ i so639- 3 / c hg_ requests.asp?order=CR_ Number&chg_status=past. Accessed March 26, 2017. SIL. 2017b. Relationship Between ISO 639-3 and the Other Parts of ISO 639. Archived at: https:// web.archive.org/web/20170312052808/http://w ww-01.sil.org/iso639-3/relationship.asp. Accessed March 12, 2017. SIL. 2017c. Submitting ISO 639-3 Change Requests. Archived at: https://web.archive.org/web/ 20170303105737/http://www-01.sil.org/iso639-3/submit_changes.asp. Accessed March 3, 2017. Unicode. 2016a. Submitting Character Proposals. Archived at: https://web.archive.org/ web/20161106230238/http://unicode.org/pending/proposals.html. Accessed November 6, 2016. Unicode. 2016b. Unicode: A Sea Change. Archived at: https://web.archive.org/web/ 20161205055823/http://www.unicode.org/press/seachange.html. Accessed December 5, 2016. Unicode. 2016c. The Unicode Standard, Version 9.0 Chapter 2 General Structure. Archived at: https://w eb.archive.org/w eb/2 0161011190302/h ttp://w ww.unicode.org/v ersions/ Unicode9.0.0/ch02.pdf. Accessed October 11, 2016. Unicode. 2017a. Unicode 9.0.0. Archived at: https://web.archive.org/web/20170225004819/ http://unicode.org/versions/Unicode9.0.0/. Accessed February 25, 2017. Unicode. 2017b. UTF-8, UTF-16, UTF-32 & BOM. Archived at: https://web.archive.org/web/ 20170312085749/http://www.unicode.org/faq/utf_bom.html Accessed March 12, 2017. Unifoundry. 2017. Unicode Resources. 
Archived at: https://web.archive.org/web/ 20170320042623/http://unifoundry.com/. Accessed March 20, 2017. Waddell, Kaveh. 2016. The Alphabet That Will Save a People from Disappearing. The Atlantic, November 26, 2016. Archived at: https://web.archive.org/web/20170128044724/https:// www.theatlantic.com/technology/archive/2016/11/the-alphabet-that-will-save-a-people- from-disappearing/506987/. Accessed January 28, 2017. W3C. 2014. Declaring a Language in HTML. 29 May 2014. Archived at: https://web.archive. org/web/20170126015527/http://www.w3.org/International/questions/qa-html-language- declarations. Accessed January 26, 2017. Wikipedia. 2017. Windows Code Page, Problems Arising from the Use of Code Pages. Archived at: https://web.archive.org/web/20170307051402/https://en.wikipedia.org/wiki/Windows_ code_page. Accessed March 7, 2017. World Internet Usage and Population Statistics. 2017a. The Internet Big Picture. Archived at: http://web.archive.org/web/20170319013935/http://www.internetworldstats.com/stats.htm. Accessed March 19, 2017. World Internet Usage and Population Statistics. 2017b. Top Ten Languages in the Internet. Archived at: http://web.archive.org/web/20170320070520/http://www.internetworldstats. com/stats7.htm. Accessed March 20, 2017.

530 Laura Buszard-Welcher Wozniacka, Gosia. 2012. “Calif. Hmong community launches online translator.” The San Diego Union Tribune, March 25, 2012. http://www.sandiegouniontribune.com/sdut-calif-hmong- community-launches-online-translator-2012mar25-story.html (no Archive.org URL available). Accessed March 26, 2017. Wright, Sue Ellen. 2015. “Language Codes and Language Tags.” The Routledge Encyclopedia of Translation Technology, edited by Chan Sin-wai, 536–549. London: Routledge.

Chapter 23

Language Recovery Paradigms

Alan R. King

In memory of Txomin Aizagirre and Paula López

1. Introduction

This chapter will review two experiences of revitalization of endangered languages which exemplify some common principles that I will claim are shared by successful processes of language recovery (LR).1 I will maintain the view that, for an LR process to achieve success, it must periodically question some of its assumptions and undergo conceptual transitions to reach the next “level” (i.e., stage). Identifiable major stages are arranged on a five-stage, directional Language Recovery Sequence (LRS)2 ranging from complete absence of any LR awareness to the completion of recovery. Evidence will be presented for transitions in both case studies between two stages called I and II. This transition in Basque LR commenced after the middle of the twentieth century; in Nawat LR, a comparable transition is now taking place.

The paradigms3 associated with each successive stage consist of distinctive clusters of doctrines (beliefs and value judgments), strategies (actions considered necessary), or focuses (emphasis on certain dimensions of the issue). At any given time, however, an LR process may be in a state of flux, where the main tension is normally between paradigms pertaining to two adjacent stages on the LRS. The following stages, each associated with a characteristic paradigm, are assumed: 0 (Pre-LR), I (Ineffective LR), II (Effective LR), III (Mainstream LR), IV (Post-LR).4 Therefore the transitions expected are 0/I, I/II, and so on. A short description of each stage now follows.

At Stage 0 (Pre-LR) there is no effective social awareness of the need for language recovery. This is followed by an incipient stage of LR proper, Stage I, when a society becomes aware that its language is at risk, recognizes its importance, and becomes concerned about the survival of the language in the future.

Stage I (Ineffective LR) begins with a social debate in which some members of the community will adopt a Stage 0 (pre- or anti-LR) position opposed to LR activity, perhaps denying that the language is at risk or arguing the language has no value and is not worth saving, while others adopt a Stage I (pro-LR) position. At this phase, LR success requires a “win” for arguments favoring steps toward language revitalization, so that a significant part of society adopts the Stage I paradigm (see section 4 below), recognizing that something should be done to save the language. Although this step is necessary it does not imply that effective ways to achieve LR have yet been found or implemented.

For successful LR, Stage I must be followed by a second paradigmatic transition, in which part of the LR movement abandons some of the earlier premises and moves on to a Stage II (Effective LR) paradigm. This is the I/II transition, the phase with which this chapter will be chiefly concerned.

Stage III (Mainstream LR) is assumed to be a stage at which the society at large takes on board the LR goal, declaring its support for the effort needed to recover the endangered language. This change may be manifested in a notable growth and expansion of institutions dedicated to supporting LR or in which LR goals are incorporated, effective legislation and assignment of significant resources in support of LR goals, official recognition of the language, official status and explicit declarations of the rights of members of the language community, as well as the flourishing of mass media in the target language, and so forth.

The last stage, Stage IV (Post-LR), is the attainment of a situation where full recovery will have been achieved and consequently the LR process is at an end.

In this chapter we will look at the recovery movements of two endangered languages with which I have had prolonged and intensive involvement: Basque and Nawat. Although neither process is in any sense complete, both have achieved a certain degree of partial success within their own contexts. Section 2 of the chapter will venture a working definition of LR, after which section 3 sketches the progress of Basque LR in the twentieth century. In the light of the Basque experience, section 4 examines a set of conventional assumptions about endangered languages and language recovery which will then be challenged through counterarguments, drawing attention to the main themes of the I/II transition. This section echoes a major debate which took place within Basque LR during the second half of the twentieth century. Section 5 offers a sketch of the situation of the Nawat language in the period prior to the recently begun language recovery process. Section 6 outlines early steps in Nawat LR in the first years of the present century. Section 7 explains new developments in the second decade which challenge some conventional ideas. Section 8 re-examines this narrative, interpreting it in terms of the I/II transition on the LRS. The chapter ends with a brief summary in section 9.

1 This chapter is based on a seminar talk at the University of Hawai’i at Manoa in October 2014. An expanded transcription of the original talk was posted in my blog Kia Weli! (http://kiaweli.blogspot.com/2014/11/pushing-paradigm-of-language-recovery.html) and a revised version was posted on the Tushik website (http://tushik.org/pushing-the-paradigm/) in 2016, under the title “Pushing the paradigm of language recovery: the cases of Basque and Nawat.” The theoretical dimension in particular has been considerably reformulated in the present chapter. I wish to express my gratitude to all who responded to my earlier versions, engaged with my ideas on the subject, and gave me much needed feedback, including Lyle Campbell. Jan Morrow is to be thanked for his constant support of my work on Nawat language recovery and that of other indigenous languages, and for his practical assistance over the years. I am also grateful for the opportunities to work with them that I have received over the years from many members of Basque, Nawat, and other language communities, in representation of whom I shall single out my first Basque-speaking friend Txomin Aizagirre and my Nawat-speaking friend Paula López, to whose memories this study is respectfully dedicated. I will talk about Paula later. I met Txomin as a young man who spoke Basque at home to his aging parents but rarely outside the home, who was influenced by me to learn Basque language literacy; his progress was rapid, and led to a new career choice, becoming a teacher of his native language and a lifelong activist in the language movement. Txomin was not a special case but one of countless other language militants who collectively, through strong determination and focused action, brought Basque back from the brink of endangerment; and it is precisely in this capacity of one among many that I honor Txomin’s priceless contribution and that of many like him. This chapter is built around a reflection upon what has taken place to bring about this remarkable achievement.

2 Called the “Language Recovery Scale” in earlier versions of this chapter. An anonymous reviewer has pointed out to me that the word “scale” could give rise to confusion given the existence of tools and constructs in the language revitalization discipline that are also referred to as scales, e.g., GIDS (Fishman 1991) and its expansion, EGIDS (Lewis and Simons 2009). These were proposed for diagnosing the state of health (degree of vitality) in which a language may find itself at any given time between the two extremes of international and extinct (0 and 10, respectively, on EGIDS). LRS, on the contrary, is an ordered series of stages through which language recovery processes have been observed to progress on the journey from the Pre-Language-Recovery stage (0) to the Post-Language-Recovery stage (IV). These stages are called “paradigms” since each is primarily characterized by beliefs and attitudes both in society at large and among LR practitioners about both the language itself and the concept of language recovery. The central purpose of this chapter is to discuss one paradigm change that constitutes a critical turning point for LR movements, that represented as the juncture between stages I and II in the LRS.

3 As noted (previous note), “paradigm” is used in the sense of a characteristic cluster of beliefs and attitudes which determine approaches to language recovery. It will be suggested that, historically, different LR paradigms may be prevalent in different language movements at certain times, that these characterize sequentially ordered stages of language recovery, and that in the course of development of a given language movement, at any one period it is likely that a transition between two principal paradigms is being played out in the form of ideological confrontations and differences of opinion about the language and the objectives of the language movement, and hence also about appropriate measures, methods, and expectations. Thus “paradigm” is primarily used, within the proposed theoretical framework, as a descriptor of a particular stage in the evolution of a language recovery process which can be observed from the outside, so to speak, by LR practitioners analyzing or diagnosing the state of a specific language movement. Nevertheless, it is also usually the case that these practitioners are themselves immersed within a particular paradigm which they bring to bear in their role in a language’s recovery efforts by adopting certain beliefs and attitudes themselves, and in this respect the issue of paradigms discussed in this chapter is intended to raise questions about assumptions often made in the LR field, some of which might need to be reexamined.

4 It is to be understood that this sequence is found in successful LR processes (or ones that may be considered successful up to the point they have reached). For empirical purposes it would be of the greatest interest to attempt to compare these with unsuccessful processes, in order to discover how their development differs.

2. Retreat, renewal, or normalization?

Language recovery is a reversal of language loss, undertaken by a group at risk of losing its language. At Stage I, “reversal” here is sometimes understood as a return to a real or supposed former stage of language health. At Stage II, however, it is understood that the object of language recovery is not to turn back the clock (an impossibility) but to change the vector of change in the language’s fortunes. Recovery is not a retreat to the past but a renewal and a forward movement toward a new stage for the language, different from both the present stage of language attrition and that of any earlier historical period. Stage I strategies tend to emphasize efforts to slow down the language’s decline, whereas Stage II is powered by a growing awareness that the only way to save the old is by making it new. The change of perspective means switching mind-sets from one which sees the language as a fragile link to the past, to a new mentality which dares to visualize the old language, revived, as a powerful new key to a different future (though rooted in the past). The language, while still endangered, becomes empowered, imbued with new social and cultural meaning, as a tool of renewed cultural and perhaps political identity. Stage II involves reconstructing the language as a vigorous, evolving medium of genuine communication and the enabler of new realities; a common possession of the whole language community (including those members who may have previously lost it, one of the meanings of recovery), and a valid instrument for doing all the things any of its members wish to do; a malleable and versatile instrument which may be adapted to diverse and changing media, channels, genres, styles, uses, domains, settings, fashions, technologies, registers, functions, and discourses.
For the meaning of language recovery at Stage III, we may turn to current Basque LR discourse where a much used concept is normalization, expressing the idea that for language recovery to be complete one needs to go beyond mere precarious survival. The place of the language in society, both de jure and de facto, should resemble that assumed to apply to “normal” languages. The rationale for this demand is that any language that lacks a normalized situation will always remain vulnerable and potentially endangered. That, however, is a far cry from the situation in which Basque found itself only 100 years ago! How did it get here?

3. Basque in the twentieth century

Concern about the decline of Basque5 had already been expressed before the twentieth century; so had the opinion that it was a useless language not worth keeping alive.6 Although still spoken in the countryside, where traditional Basque culture had been preserved best, the Basque language was already being replaced by Spanish or French in urban centers. Language shift was particularly drastic in districts of southern Euskal Herria where industrialization led to influxes of workers from poor regions of Spain at various times in the nineteenth and twentieth centuries.

Early initiatives to turn the tide on language loss date back to the start of the century, but these were thwarted by the Spanish Civil War (1936–1939), won by Francisco Franco’s side and followed by a lengthy dictatorship (1939–1975). Founded on the premise of a politically, culturally, and linguistically unified and uniform state, the Franco regime systematically persecuted expressions of Basque nationalist sentiment in the part of the Basque Country it ruled.7 Already in decline, the Basque language was virtually outlawed throughout most of this period, and Basque people’s reactions alternated between shunning the language as a defense mechanism and discreet defiance of prohibition.

The Spanish regime’s attempt to stamp out the Basque language backfired in the long run, however. The Basques, inheriting a strong and deep-rooted sense of ethnic and cultural identity, first reacted with despair and outrage. Then Basque society, pulling itself together, came to the collective realization that their ancestral language was fast losing ground and their identity as a distinct people was in danger, and perceived that a decisive moment had now been reached when either something would have to be done or they would disappear as a people. They saw their language as a key to survival: as long as they still spoke Basque they would not be assimilated. But how could they keep Basque alive?

This realization was felt first, and most intensely, not in rural areas where the language continued to be spoken but in the places where urbanization had advanced, contact with non-Basque-speakers was frequent, and Spanish had already become or was fast becoming dominant. Thanks to the urban environment and the social effects of industrialization, these also happened to be the areas where the enthusiasts of the language movement were best equipped materially, ideologically, and practically to take effective action in support of the language.

The change that occurred at this point, partway through the era of the Franco dictatorship, also played out as a generational conflict. Many young people were critical of their elders for their meek silence (as they saw it) and lack of action in response to authoritarian, anti-Basque oppression. Although still at risk under a belligerent police state, a radical underground political and cultural movement crystallized, attracting many adherents in different cities and regions. The new movement was not limited to the young and reckless, however; families and people of all ages and walks of life, sharing the frustration and concern about the future, joined in and offered their support.

Thus a vigorous new drive for language recovery got under way in the midst of harsh times, bent on action rather than talk. Many Basques devoted their energy to establishing secret schools where children could study in Basque8; Basque language classes for non-Basque-speaking adults and Basque literacy classes for Basque-speaking adults who had received their education in Spanish9; a clandestine press,10 lively literary, musical, and cultural movements; and a redefined ideology of national liberation. It was a time of mobilization and perseverance, creativity, and imagination. The Basque language was one of the main emblems of this multifaceted movement.

The Spanish monarchy was restored after Franco’s death in 1975, and following constitutional changes the 1980s saw the creation of a new system of autonomous regions with self-governing powers. Two “autonomous communities” were set up in distinct regions of the southern Basque Country, whose parliaments eventually established and implemented language policies. Owing to different political alignments, those of the Basque Autonomous Community (BAC) were highly supportive of Basque LR, while those of the Community of Navarre were less so. France has yet to recognize constitutionally the specific identity of the part of the Basque Country in its territory as a political entity.

In its recent history, Basque LR has undergone three important transitions: a 0/I phase up to around 1960 (although there had been earlier initiatives defending Basque before the civil war), a I/II phase preceding the creation of the BAC, and a II/III phase that is now under way.

In the 0/I phase, the language movement defended the importance of keeping the Basque language alive in the face of an opposing discourse (supported by the Spanish establishment along with certain conservative factions in the Basque Country and part of the Spanish immigrant community) which made every effort to deny the heritage language’s value, importance, and future.

From the 1960s onward the language movement moved on to the I/II phase focusing on a different debate between the Stage I paradigm now supported by the pro-Basque old guard, which saw Basque as the language of rural Basques tied to traditional customs and values which should be kept alive if possible, and a (at the time, radical) vision of Basque as a modern language, the use of which could be equally valid in the city or in the country, by young people as well as by old, among new speakers as well as traditional ones, in writing as well as in speech, in all domains, settings, and functions. Adult language and literacy schools mushroomed, Basque language militancy spread, and language standardization was also adopted as a flag, linked to new roles for the old language.11

Following this period came another in which the democratic transition in Spain, which permitted self-governing communities in the Basque Country and elsewhere, led to the removal of some (though not all) legal and political obstacles to Basque, the creation of new institutions and opportunities to pursue language recovery, and a sharp increase in available financial resources. These developments gave the language movement a needed boost, consolidated the legitimacy of the drive for LR, and resulted in a further considerable increase in the numbers of people learning and using Basque. They led to the current II/III phase where the main focus is on normalization.

5 Basque is an isolate of great antiquity spoken in a small region of western Europe now forming part of Spain and France. The history of Basque language recovery is relatively well-known and is the subject, wholly or in part, of an ever-increasing literature which I shall make no attempt to review here. For a general introduction in a cross-language perspective, see, for example, the documents presented by the Garabide NGO on its website: http://www.garabide.eus/. For resources of the government of the Basque Autonomous Community concerning the Basque language see its site http://www.euskara.euskadi.eus/. The critical viewpoint of a non-governmental entity that works in the area of Basque language rights may be found on Kontseilua’s site: http://kontseilua.eus/. The Academy of the Basque Language (Euskaltzaindia)’s website is at http://www.euskaltzaindia.eus/.

6 The Basque-born Spanish-language author Miguel de Unamuno famously said in 1901: “Pero en el caso concreto del vascuence estoy profundamente convencido de que se pierde, y que se pierde de pronto y sin remedio, y por su índole misma, por ser un idioma inapto para la cultura moderna” [But in the specific case of Basque I am deeply convinced that its days are numbered, and it will die out soon and irremediably, because it is a language not apt for modern culture] (quoted in Torrealdai 1998).

7 Conditions were and still are different in the northeastern region of the Basque Country ruled by France. Interestingly, persecution of Basque was less direct, yet the decline of the language has been more precipitous in recent years here than anywhere else. However, it would be too simplistic to posit a direct link between overt persecution and language recovery; other variables that need to be factored in include the socioeconomic: northern Euskal Herria is poor and predominantly rural except for the coast where French tourism and services dominate. In any case, it would be a mistake to think that French-language policies have been benign; on the contrary, minority languages still struggle under Parisian centralism.

8 I.e., the ikastola movement, http://www.ikastola.eus/, a large country-wide network of non-governmental parent-owned Basque-medium schools. After the creation of the autonomous communities, Basque-medium education was also introduced in varying degrees into the public school system. The ikastolak, however, were the precursors and still lead the way in some respects.

9 First known as gaueskolak [evening schools] because most adults only had time after work to attend such classes, at a later time they came to be known as euskaltegiak. Having started out as local grassroots initiatives, gaueskolak around the country were restructured as a coordinated group called AEK (Alfabetatzeko eta Euskalduntzeko Koordinakundea [coordinating body for literacy and Basque language teaching]). Originally under the auspices of Euskaltzaindia (the Basque language academy), AEK later broke away and became an independent organisation, http://www.aek.eus/.

10 At first mainly through Basque-language magazines, but after the Spanish regime change reinstated publishing freedoms, Basque nationalist newspapers appeared, bilingual at first (with Spanish predominating), and later also a completely Basque-language daily newspaper, Euskaldunon Egunkaria. These initiatives, especially Egunkaria, still had to grapple with challenges from the Spanish courts which eventually closed the newspaper down on trumped-up charges which Spain’s own supreme court eventually overturned after many years of litigation, by which time a new Basque-language newspaper, Berria, had taken its place. Today there are also several Basque-language television channels and radio stations.

11 Most people in Basque towns who could not speak Basque, which was by now a minority language in many places, self-identified as Basques (and still do). Therefore, except for a minority who wished to deny their ethnic heritage, it was seen as natural for them personally to wish to recover their ancestral language, often expressing the sentiment that until they did so they would be “incomplete Basques.” In this sense, a contrast may be noted with the traditional view in a country like El Salvador where, on the contrary, partially assimilated people of indigenous origin, who may have a minimal admixture of European blood in their veins but who do not maintain the language and traditional rural lifestyle, are regarded as Ladino (whose original meaning was “foreign”); the Nawat word for this is ejkuni literally meaning “one who arrives [from abroad].” These terms stand in contraposition to “native” or “autochthonous.” Such people would not expect or be expected to speak Nawat. However, it would seem that this traditional schema is no longer being accepted by the enlightened youth of today of the kind who become interested in studying Nawat, which they usually view as their ancestral language.

4. Two paradigms of language recovery

Here I will list some contrasting ideas about language recovery, current at one time or another in the history of the Basque LR movement (and also others), which characteristically pertain to two distinct ideological frameworks or paradigms which we can associate with Stages I and II respectively. The replacement of one set of premises by the other in different periods as the dominant ideas guiding LR efforts is an example of what I refer to as paradigm shift.

The following ideas form a standard doctrine that was rarely challenged in Stage I. In the Basque case, this paradigm was prevalent up to around 1960 (i.e., up to the first part of Franco’s dictatorship). Each premise is labeled by a letter and a mnemonic key word:

a) SPOKEN: “Our language has always been a spoken language and should remain so.”
b) PURE: “Our language, which is starting to be corrupted by foreign influence, should be kept pure.”
c) NATIVE: “Our language belongs to the native speakers, especially the elders, and they are its sole guardians and authorities on what is allowed or correct.”
d) DESCRIBED: “Use of the language by native speakers should be described without prescribing what is correct or suggesting innovations.”12
e) CHILDREN: “Language recovery depends on learning from the elderly native speakers and teaching children their language.”
f) RURAL: “The stronghold of our language is in the rural areas, so that is where language recovery should be focused.”
g) VALUES: “Our language is the vehicle of an old, traditional culture whose values and way of thinking are to be tied to the survival of the language.”

12

In historical perspective this is a simplification. In fact, early in the twentieth century a concerted attempt was made in circles close to the newborn political movement of modern Basque nationalism to reform the language, particularly by replacing all words of foreign (e.g., Spanish) origin with neologisms. This resulted in an artificial kind of literary Basque which had too little in common with the ordinary language spontaneously spoken by native speakers, was difficult to understand, and met criticism in the second half of the century in particular from a post-war generation of writers who favored a new attempt at standardization of the written language based on due consideration for the full range of Basque dialects, historical literary precedent, and the common linguistic system underlying all its modern manifestations, while distancing themselves from the premises of excessive purism (particularly the dogmatic rejection of all lexical input from surrounding languages) and attempting to eliminate the consequences of capricious neologizing that had cluttered the literary language with “words” lacking any basis in traditional Basque. Ironically, defenders of the older artificial literary medium who opposed this new trend (which culminated in Euskara Batua, the universally accepted present-day standard language) accused the latter’s proponents of creating an artificial language. See Zuazo 1988 on the history of Basque language standardization. Similar phenomena, where there is an evident misfit between controversial claims and linguistic realities, users of highly unnatural, obscure, and incorrect forms accusing the most active pursuers of language recovery of promoting something other than the “real language,” can be observed in the histories of other LR processes including that of Nawat. Although this is frustrating and disheartening at times, common sense has a way of winning out at the end of the day.

The next stage, however, saw growing criticism of this paradigm, especially among the more advanced sectors in the third quarter of the century, on account of what was increasingly perceived as its narrowness and conservative limitations. Following the same list of premises, here is a summary of some of the counterarguments adduced in support of their replacement by a new (Stage II) paradigm.

4.1. a) SPOKEN: “Our language has always been a spoken language and should remain so.” Counterargument: LR strategies may involve the creation of a written language (unless it already exists), for two important reasons: (1) In LR a language’s use needs to expand into new functions and domains, in some of which writing is appropriate and necessary. (2) A written form of the language is necessary or useful for the implementation of new channels of language transmission such as language schools, immersion schools, teaching materials, language documentation, etc. Where a writing tradition already exists, the controversy may now focus on written language standardization. In the Basque case, agreement on a new written standard went hand in hand with the development of a multitude of unprecedented written uses, both developments being essential ingredients of the LR roadmap and a requisite for subsequent normalization in Stage III.

4.2. b) PURE: “Our language, which is starting to be corrupted by foreign influence, should be kept pure.” Counterargument: All living languages must grow, and growth is a form of change; if all change is viewed as “corruption” and “impurity,” the language’s development will be hampered. Excessively purist views on Basque in the first half of the twentieth century eventually gave way to a contrary reaction. The outcome was a redefinition of the balance between tradition and innovation (see Zuazo 1988).

4.3. c) NATIVE: “Our language belongs to the native speakers, especially the elders, and they are its sole guardians and authorities on what is allowed or correct.” Counterargument: This idea of native speakers is correct up to a point and in a certain kind of context, and also resonates with the descriptivist axioms of modern structural linguistics about the priority of the spoken language as the spontaneous oral production of native speakers. But in an LR context the dictum that native speakers are the

language’s guardians and sole authority is ingenuous and should not be adopted as a simplistic dogma. In a healthy language, to be sure, native speakers are the language’s main transmitters and users, but in a dying language, quite frankly, many native speakers may have failed to perform these functions adequately, leaving their real authority open to question. On the other hand, countless Basque language activists who were new speakers filled the ranks of the language movement in a crucial period, some playing prominent roles, providing linguists, language teachers, writers, producers of textbooks and learning materials, school staff, university professors and teacher trainers, students and transmitters of cultural traditions, publishers, producers and contributors of the Basque-language press, media, and so on, musicians and artists, organizers, and activists of support groups of the language movement. As new speakers, their command of the language was sometimes imperfect and this caused concern in some quarters, but when all is said and done their impact was mostly positive. Glorification of the native speaker as the be-all and end-all of the language’s universe can degenerate into false arguments and may have a negative effect which a successful LR movement must learn to sidestep. Moreover, the ideological and social outlook of a progressive LR movement should aim at inclusion.

4.4. d) DESCRIBED: “Use of the language by native speakers should be described without prescribing what is correct or suggesting innovations.” Counterargument: This position is again supported by the tenets of twentieth-century structural linguistics. For LR, however, documentation is not an end in itself but a means to an end: the recovery of the language as a living, growing, vigorous medium of cohesion, expression, and progress. The purpose of documentation is to collect and generate the information needed for the language to carry on as an expression of a growing culture and serve the needs of a new generation living in its own world. Without losing sight of the distinction between description and prescription, responsible linguists committed to LR should not be afraid to propose new norms in appropriate contexts while taking descriptive knowledge as their ultimate basis for doing so.

4.5. e) CHILDREN: “Language recovery depends on learning from the elderly native speakers and teaching children their language.” Counterargument: Old people and children cannot bring about language recovery! The brunt of this burden must be borne by adults able to work hard, fight the fight, push for change, make things happen, and lead the way forward. A large enough proportion of the adult population has to be brought into the LR movement for it to become viable.

Therefore, emphasis must be placed on adult second-language learning; furthermore, in many cases such as that of Basque in the middle of the last century, the adult generation is the weakest link in spontaneous language transmission, where knowledge of the endangered language is at its most frail because of a break in transmission in their parents’ generation (that of today’s elderly native speakers) and the pressures exerted by modern life in a time of widespread economic and political hardship. Adults are also of vital importance for another reason: as parents and active members of society, they are the primary example setters and role models for the young, who even if they revere the aged, learn more from their immediate elders. As parents, it is only useful to tell our children to speak Basque if we are capable of telling them in Basque!

4.6. f) RURAL: “The stronghold of our language is the rural area, so that is where language recovery should be centered.” Counterargument: Typically LR movements do not initially take hold in the heart of the traditional language community, especially if this is located in the remote countryside. More commonly, the functioning nucleus of the LR movement is found in urban settings, in places where the language is little heard but conditions can sustain the language movement and provide resources to fuel it. This is not illogical, since effective LR is about the language expanding into new domains. Often an endangered language is only heard in certain places and certain functions, having become ghettoized and stereotyped; even in these conditions it continues to shrink (hence we call it endangered). Strategies to reverse language loss need to look at ways to counteract this by altering the relationship between the language and its possible settings, creating new opportunities for use in novel domains and places. Stereotypes should therefore be defied and denied, setting the language free and conveying a new message which says: “You don’t have to be old and poor to speak L, nor do you need to live far away from civilization; you don’t need to belong to the lower social classes, and lack education; look, L can even be spoken in the city by middle-class urbanized young folk.” Indeed, speaking L in the city is a strategy to stop it from dying in the countryside.

4.7. g) VALUES: “Our language is the vehicle of an old, traditional culture whose values and way of thinking are to be tied to the survival of the language.” Counterargument: It is not true that people must subscribe to a certain philosophy, adhere to a particular way of life, belong to a certain religion or uphold particular values

in order to speak a given language and be part of a language community which is made up of people who share the same language but not necessarily the same thoughts! Tying language choice to ideology or life choices runs counter to normalization. Language functions as both the ideal cement for continuity and the perfect vehicle of change. This is not an exclusive property of non-indigenous languages.

5. The Nawat language in the twentieth century Nawat was once the most widely spoken language in the territory constituting the small Central American nation of El Salvador, which occupies an area of a similar size to the Basque Country.13 The language was brought there by the Pipils when they colonized the west and central regions of El Salvador, having migrated from what is now Mexico in several waves which, although they cannot be dated precisely, are believed to have taken place at least 1,000 years ago (see Campbell 1985, 6f). Where the Pipils settled they became neighbors of various other ethnic communities speaking unrelated languages, which included Lenca, Xinka, and Cacaopera in eastern El Salvador.14 The language of the Pipils is called Nawat in the language itself, náhuat—formerly also nahuate—in Spanish, but in academic publications in [English] and other languages it is widely referred to as “Pipil.” The southernmost language of the Uto-Aztecan language family, Nawat is a member of the Nahua subgroup of Uto-Aztecan, closely related to “Nahuatl” of Mexico. More distantly related languages are or were spoken in what is now Mexico or the United States. Nawat was already on the decline by the 1920s when serious documentation of the language began. Then it was crippled by a massacre of tens of thousands of Pipils in 1932 by soldiers under the orders of a military government in an episode referred to in history books as La Matanza (i.e., The Slaughter). The pretext was a peasant revolt triggered by unbearable conditions. The authorities treated all indigenous people as culprits (Ward 2002, 77). Thousands of male members of the native population were dragged away and 13 Both are roughly comparable in area to Israel or Wales. According to figures posted on English Wikipedia, the respective areas (in km2) are: Basque Country 20,947; El Salvador 21,041; Israel 20,770; Wales 20,779. 
14 Some evidence suggests that Nawat might have been used as a lingua franca in a wider area prior to the arrival of the Spanish. This supposition is supported by the presence of loans from Nawat into neighboring indigenous languages, such as Lenca (Lehmann 1920, 668–722; King 2016). For example, in Chilanga (Salvadoran Lenca) which was spoken until the 1970s (Campbell 1976), some words appear to be loans of Nawat origin, such as kotan “country, woods” < N[awat] kojtan; matz’ati “pineapple” < N matza(j)[*-ti]; mistu “cat” < N mistun; shikal “jar” < N shikal “gourd bowl”; shikit “basket” < N chikiwit; su(w)at “hat” < N suyat “palm”; taku “half ” < N tajku; tetunte “hearthstone” < N tetunti; wat “sugarcane” < N u(w)at.

shot. According to an infamous edict, the illiterate “enemies of the country” were to be recognized through two incriminating signs: traditional indigenous dress, and speaking an indigenous language. Understandably, survivors saved themselves from the firing squad by adopting non-indigenous dress and refraining from speaking Nawat. The remnant of the Pipil nation was publicly invisible, politically ignored, economically deprived, socially marginalized, and psychologically still scarred and wary of the outside world. Descendants were still aware of their identity, but few people wanted to talk about it and only a small minority retained much knowledge of their ancestral language, now in danger of lapsing into permanent silence, as all the other indigenous languages once spoken in El Salvador had done already. Documentation of Nawat, mostly by foreigners, began early in the twentieth century but was limited and sporadic. Within the academic literature, the most important study for its coverage and quality is Campbell (1985).15 Besides direct documentation and the remaining speakers and semi-speakers (most of whom are now in the municipality of Santo Domingo de Guzmán, in the department of Sonsonate), which together provide the main basis of our knowledge of Nawat, insight about the language can also be aided by judicious use of comparison with related speech varieties in Mexico, collectively known as Nahuatl.

6. Nawat language recovery: first steps At the time of my arrival in El Salvador in 2002, documentation had not advanced substantially beyond what had been done in the 1970s. There was no serious, well-informed project under way to promote revitalization, and reliable knowledge about the language was generally unavailable.16 In order to do something useful I therefore needed to begin by undertaking some basic groundwork to collect information about the language, contact members of the language community, and talk to anyone who could offer help and

15 Half a century earlier, Leonhard Schultze-Jena published a voluminous work in German (Schultze-Jena 1935). In spite of its importance, the publication essentially had no impact in practical terms; for Pipils it was of course both unavailable and inaccessible. The lack of access to the textual corpus which makes up the second half of the study is particularly unfortunate. The linguistic description which makes up the first half is of some interest but clearly inferior to Campbell’s account in coherence and perspicacity.
16 The last years of the twentieth century were not completely devoid of publications. However, with the exception of Campbell there was a steady decrease in the level of scholarship and originality in comparison to the best earlier achievements, which apart from Campbell include Schultze-Jena (1935). This is not the place for an exhaustive bibliography, the most important items of which are referenced on the Tushik website, while a bibliography covering the period up to its date of publication is to be found at the end of Campbell (1985).

544 Alan R. King guidance.17 Until my departure in 2005 I participated in, and often initiated, a variety of projects and activities in support of Nawat, and since my return to Europe I have found ways to continue to do so thanks to the advances of modern technology, in collaboration with colleagues with whom I had worked in El Salvador and other components of the developing language movement that have since emerged.18 In 2003, following discussions with Jorge Lemus and Monica Ward, a project was started with backing from the Universidad Don Bosco which promised to introduce the effective teaching of Nawat to elementary school children in certain schools through a five-year language program with the objective of developing basic communicative skills in Nawat, as reported in King 2004b (some of the planned materials appeared as King 2005 and Universidad Don Bosco 2009).19 In a separate development, on my suggestion the IRIN association was founded in 2003.20 This was intended to be an umbrella organization to promote, coordinate, assist, or perform a range of activities, projects, and programs supporting the general objective of Nawat language recovery. Two basic tenets of the association were a specific agenda of Nawat language recovery and institutional independence. This was a grassroots initiative started by local citizens committed to the goal of Nawat language recovery, some of whom were native speakers of Nawat. The creation of an organization with such characteristics was new and unprecedented. The peak of IRIN’s activity came a few years later after it undertook a project to collect Nawat language audio and video documentation with participation of Nawat speakers in a number of capacities.21 An attempt to obtain legal registration of IRIN as 17

I acknowledge with gratitude the help I received from many individuals, each in their own way, particularly Lyle Campbell, Werner Hernández, Jorge Lemus, Paula López, Cecilia de Méndez, Genaro Ramírez, Gaio Tiberio, and Monica Ward.
18 Although many of the items are not formal publications, a substantial literature has been generated and circulated through websites and social media over the past fifteen years. Much of the relevant information and materials is indexed on and can be downloaded from the Tushik website, a portal that was created by myself for Nawat and Lenca language resources: http://tushik.org/.
19 Changes later made in the program have limited its scope and effect, which have fallen short of the original goal (Werner Hernández, personal communication November 25, 2015).
20 IRIN stands for “Iniciativa para la Recuperación del Idioma Náhuat” [Initiative for Nawat Language Recovery]. The Nawat words Te Miki Tay Tupal [What Is Ours Shall Not Die] were later appended to the association’s original name. The shorter version “IRIN” is used here for convenience. IRIN was founded at a meeting of Nawat enthusiasts in September 2003 held at the Casa de la Cultura in Santo Domingo de Guzmán, Sonsonate. Its first president was a well-known native Nawat speaker, the late Genaro Ramírez Vásquez. Most of IRIN’s activity was coordinated from Izalco under the supervision of the secretary Cecilia de Méndez Martínez and an assessor, Nardi Gómez Sampedro. One of IRIN’s most illustrious and active members was the late Paula López, an enthusiastic native speaker who played a key role in the language documentation project and in IRIN’s work generally.
21 The IRIN Nawat documentation project produced over twenty video and audio recorded interviews between thirty and sixty minutes among Nawat speakers, some of which have been transcribed while the transcription of others is still in progress.
Translations and subtitled editions of some recordings have also been produced. Much of the production and post-production process was carried out by Nawat speakers and assistants with links to the local communities, with practical training as necessary. The project was promoted by Lyle Campbell and sponsored and funded by the University of Utah and

an association fell through because of inadequate support. Private printing and hand-to-hand distribution at cost price were resorted to in order to put into informal circulation various language materials and to provide a minimal flow of funds through the group to facilitate continued activities. General developments which degraded the quality of life and the feasibility of group activities forced IRIN to discontinue its work subsequently. Nevertheless, this initiative made a lasting impact and cannot be considered a failure as a precedent and a pioneering effort, for it opened up the way toward language revival, helped to raise public awareness, and increased the visibility of the Nawat language and its speakers. Furthermore, it generated materials and experiences which laid the ground for a new kind of recovery movement to be discussed in section 7. One text produced and distributed by IRIN is a set of booklets titled Shimumachti Nawat! [Learn Nawat!], which constituted a basic elementary language course on modern principles suitable for self-study by adults (King 2004d). Its lessons would later provide the basis for the most successful Nawat textbook for adult learners to date, Timumachtikan! [Let’s learn] (King 2011). Other IRIN “publications” included a basic vocabulary (King 2004a), a brief grammar (King 2004c), and a continuous text by a native speaker (Ramírez 2004). A number of changes in circumstances following my departure from the scene ended up contributing in ways that could not have been foreseen to an interesting realignment of forces, a methodological reformulation, and a surprising growth in the Nawat language movement beyond what seemed possible at the beginning of the century. Developments such as my departure from the physical scene and the decline of IRIN required strategic changes. Just in time, however, new options started to make their appearance. 
Before the end of the decade, a new way to distribute Nawat language materials without selling printed copies became feasible, by means of PDFs distributed free of charge through social media. There is now a special website, Tushik, which facilitates information on all available resources and serves as a distribution hub.22 A perusal of Tushik shows how Nawat language materials have multiplied greatly, to include the elementary language course Timumachtikan (King 2011); a practical dictionary (Hernández 2016b); a Nawat grammar (King 2014a); listening materials (Mukaki!); a YouTube channel presenting well-made didactic clips teaching aspects of Nawat (Náhuat El Salvador); a variety of readings in Nawat (Masin et al. 2012; King 2013a, 2013b) and a textual and lexical corpus for researchers and advanced students (King 2014b, 2014c).

the National Science Foundation. Audio recordings produced through the project are currently being released gradually on the Tushik website. 22 Currently Tushik (http://tushik.org/) provides for the online resource needs of several indigenous language movements (Nawat, Salvadorean Lenca, and Honduran Lenca). In addition to portals and subportals targeting people interested in these LR processes, the site also houses a document archive containing several hundred published and unpublished relevant items which is freely available to specialists and scholars: the Tushik Library. Interested readers may contact the author for details on how to access the library.


7. New conditions and new directions The first years of the twenty-first century everywhere have witnessed increasing availability of computers, phones, and the internet, along with the growth of social media, facilitating access to a wider range of knowledge and ideas in places where there was formerly none. In El Salvador, this period has coincided with the emergence of a new generation of young middle-class city dwellers with a university education who are benefiting from the new resources to become more discerning and socially active citizens than earlier generations. Some are taking an interest in their historical roots, eager to learn about the ethnic make-up of their country and curious about the cultural and linguistic heritage of the indigenous community in their midst. An awakening has begun, as intellectually capable and socially aware young adults start asking questions and discovering their capacity for effective action and exercising choices. A decade ago somebody created a Facebook group called Salvemos el Idioma Náhuat.23 This group has now attracted over 6,000 men and women, in a country with a total population of 6 million, yet it is no longer even the largest Facebook group in the language movement,24 for there are now perhaps a dozen groups about the subject disseminating information, providing forums for discussion and forging a new kind of “language community.” The existence of such groups has stimulated an increasing demand for opportunities to study and practice Nawat through language classes, language groups, activities, events, and excursions to visit the Pipil areas where native Nawat speakers live. New on-the-ground groups dedicated to the Nawat language have flowered, such as Colectivo Tzunhejekat. 
Given the uneven distribution of wealth and the continued marginalization of indigenous people in the country, however, there remains a gigantic gap between the new groups of young, urban, economically relatively comfortable, educated new speakers and the elderly, rural, economically impoverished, uneducated traditional speakers. This poses a new challenge to find ways to overcome the risk of a “disconnect” between the two communities, but given their social awareness and enlarged resources, it is a challenge that the new Nawat groups are attempting to address. Links are being forged through visits, events, activities, and various forms of interaction and mutual assistance. This may trigger social changes benefiting indigenous communities by opening up new, empowering opportunities of mutual interest for both groups, removing time-honored prejudices and altering established patterns of class and racial segregation. These developments imply new ideas and novel strategies for Nawat language recovery. Traditional premises are being challenged; Stage II has arrived! Again, the new

23 I.e., “Let’s save the Nawat language,” https://www.facebook.com/groups/33974937500/, created by Hector Castaneda. 24 At the time of writing, the largest Facebook group for Nawat is that of the Colectivo Tzunhejekat, https://www.facebook.com/Tzunejekat/.

strategy is to broaden the scope of the endangered language, allowing it to be shared with new speakers while giving it a new lease on life for traditional ones. Rural Pipil population centers have already begun to benefit from this, with Nawat speakers experiencing a new self-respect and empowering sense of pride and optimism, as well as giving Nawat greater visibility within the society at large. It may be another symptom of this change that representatives of official institutions have begun to venture out of their city offices, obviously eager to share the limelight when opportunities to celebrate Nawat in its own territory arise. A case in point was the sad occasion of the passing in 2016 of an important figure for Nawat, the native speaker, singer, and renowned language enthusiast Paula López. Representatives of official institutions as well as a variety of other entities vied for the chance to participate in the funeral and memorial ceremony,25 implying a degree of interest, visibility, and popularity for Nawat which certainly did not exist in Salvadorean society before the current drive for Nawat language recovery commenced.

8. Nawat between paradigms The rapid change that has begun to take place in the Nawat LR process can be interpreted as an instance of a I/II transition, since it implies a paradigm shift comparable to that described in section 4.

8.1. a) SPOKEN? Nawat is increasingly being used as a written language serving as both a medium of communication and a tool in language teaching. Consensus-based development of a standard orthography was a necessary prior step (as in the Basque case).

8.2. b) PURE? Nawat is still endangered, yet it is nevertheless growing, as it comes to be adapted to new functions and domains. As a result of this growth, Nawat is no longer a language only used by a few elderly people in remote rural communities for a diminishing number of purposes. With new uses comes the need for new linguistic forms, adaptation, and innovation. This challenges the idea of a pure language resisting change from a fictitious pristine state.

25 See Hernández (2016a), and an obituary for Paula López in Miranda (2016).


8.3. c) NATIVE? The number of speakers of Nawat is growing as a result of its being learned as a second language. In consequence, native speakers and semi-speakers no longer constitute the whole language community. It is now important to forge an alliance between old and new speakers to work together to preserve and reactivate the Nawat linguistic inheritance.

8.4. d) DESCRIBED? Nawat is now being studied more thoroughly and intensively by more people than ever before. Linguistic consciousness among those who speak Nawat is reaching a new high. What is known has been described, but there is currently a need to go further by codifying the rules and patterns of Nawat. Lines will need to be drawn between correct and incorrect, good and bad Nawat; an anything-goes approach is not useful here. A balance between description and prescription is needed. Just because a native speaker said something, it is not always good Nawat—as another native speaker will sometimes be able to point out.

8.5. e) CHILDREN? At present, placing all the emphasis on the teaching of children does not favor recovery. Adult learning is a very important activity, in order to pave the way for robust transmission to children in the future. At present there are not enough adults who can speak Nawat to provide an adequate workforce of effective Nawat teachers.26

8.6. f) RURAL? The center of gravity of Nawat LR has moved from the remote countryside to the city; the uneducated poor are being joined, as its protagonists, by members of the educated middle class; Nawat’s domain of use now spreads all the way from the kojtan (forest) and the mil (cornfield) to the weytechan (big city), matapan (internet), ishkalamat (Facebook), and tepustanutza (smartphone).

26 Some Nawat teaching is currently taking place in selected primary schools. Unfortunately, this is being attempted with limited materials and mostly by teachers whose knowledge of Nawat is very limited.


8.7. g) VALUES? Nawat will inherit the culture of the people through whom it has survived until now, but it will also participate in the progressive redefinition of this culture (like all cultures) and build it anew in each generation.

9. Conclusions I have posited a Language Recovery Sequence which differentiates several stages of development through which successful language recovery processes may cycle, each of which is associated with a characteristic paradigm or set of ideas and recovery strategies based on those ideas. The thinking typical of Stage I in this sequence is: Our language is a spoken language. We need to keep it pure and free from foreign influence. It belongs to the old native speakers, and it is their usage that documentation should describe uncritically. Our language will survive if we can just teach children the language of the last elders. The process should focus on the traditional rural stronghold. Survival of the language should be bound up with maintenance of the ways of old. In the Basque LR movement a growing number of language activists broke away from this credo to embrace a different paradigm from the 1960s onwards. These are the new ideas: We need to develop both spoken and written forms of our language. Purity is relative: all languages change. Successful LR crucially involves action by adult second-language speakers. Documentation must serve continued development of the language, which is not achieved by treating the flawed utterances of the last semi-speakers as museum exhibits of a dead language. Adaptation of the language to new uses will allow it to grow and become stronger. It is legitimate to practise modern cultural options through the old language, which should not be ghettoized. This chapter has suggested how to interpret internal debates within LR processes as transitions between paradigms of successive stages. 
“The 0/I transition phase” is dominated by a debate between those who see no need for LR, who perceive no threat to their language or who do not value their language enough to care about its fate (Stage 0) and those who think it important to make an effort to keep the language alive, though there may well be confusion about how to achieve this (Stage I). “The I/II transition phase” is one where the main debate is between those in the community who emphasize that we must preserve the language as it is, in its present state as spoken by remaining speakers in a traditional rural setting, together with old ways and ideas (Stage I), and others in the community who defend a new agenda which attaches importance to writing and standardizing the language, acknowledging linguistic innovation, transcending the limited knowledge of the remaining speakers, prioritizing adult

language learning, transplanting the language to new environments, and promoting an inclusive concept of the extended language community which admits different social ideas and lifestyle choices (Stage II). To conclude I would like to suggest three areas where these insights may be found useful by those involved in language recovery work: project design, awareness, and training.

9.1. Project design At the beginning of any LR project, the key initial questions should be asked about current resources, needs, and problems; existing programs and materials; strategic priorities; prior requisites; the feasibility of implementation, etc. It seems it should not be necessary to point out the importance of these considerations, but in practice it sometimes is, perhaps because of a lack of sufficient preparation or training of would-be LR practitioners in these areas. There should also be a self-critical examination of background assumptions, and this is where the concept of stages and paradigm transitions is most pertinent. This chapter suggests we need to ask a set of questions about beliefs and attitudes aimed at diagnosing the historical background and present status of the LR scenario for the language community in question. Sometimes mistakes have been made in LR project design because of the wholesale adoption of a ready-made formula without analyzing the current scenario, relying instead on faulty assumptions. Suppose for example it is decided to create a program of language nests (cf. Te Kōhanga Reo National Trust; ʻAha Pūnana Leo; E Ola Ka ʻŌlelo Hawaiʻi 1997) for a language like Nawat in a rural small town at a point in time when there are a few elderly (native) speakers, but none young enough or sufficiently well-trained to run a vigorous program. An LRS Stage I mind-set will assume that the most important thing to focus on is putting little children in the care of the native speakers, although they lack educational expertise. The error is obvious from the perspective of a Stage II paradigm: such a program takes a lot of work and training, which a few elderly speakers cannot provide well, but which might be achieved in conjunction with a team of younger educated participants in the program. 
Of course, the latter should also be Nawat speakers, which means, in the context, new speakers rather than native ones, and perhaps people from a less traditional social background. This is a “Stage II solution.” It also presupposes the required technical and linguistic training of the younger teachers and so may entail a later start date and greater initial expense. If these requirements cannot be met, then perhaps the language-nest model is not the right strategy.

9.2. Awareness LR cannot take place without the support of the community, which can only be driven by an argument founded on beliefs which motivate the LR movement by ascribing to it

ethical, historical and logical reasons, i.e., a driving ideology. A society must strive to recover its language because of what it wants and believes. Thus ideology is an important dimension of LR movements, but it is important that the ideas should be the right ones which respond to the current experience of members of the community and are conducive to effective kinds of action and attitudes, with a discourse, a narrative, and a script that are understood and espoused by the community. A language-recovering community on its way to success is one that is LR-aware, and the policies pursued by LR practitioners should be coherent with that public awareness.

9.3. Training The debates which are played out in language-recovering communities between competing paradigms should be understood, studied, and discussed at appropriate points in places where training for LR work is provided. In particular, care should be taken not to transmit to future practitioners the assumptions typically associated with the ineffectual Stage I of the LRS without pointing out their weaknesses and fallacies. Otherwise we will risk sending LR workers into the field who are poorly equipped to make good strategic decisions and may even exert a negative influence on the way the community thinks about LR. It would therefore be helpful if further studies could be performed to test the accuracy and wider applicability of these premises and document their impact in LR processes (especially successful ones).

References

Note: Items noted as “Preprint, IRIN” have been printed and distributed by IRIN without legal registration. Other items noted as “Preprint” are in electronic form, generally PDFs, listed and available through the Tushik website at the URLs indicated. (Tushik is a dedicated resource portal serving the Nawat and Lenca language recovery communities, http://tushik.org/.)

ʻAha Pūnana Leo. n.d. Website of the Hawaiian language nest organization. http://www.ahapunanaleo.org/. Accessed January 8, 2017.
Campbell, Lyle. 1976. “The Last Lenca.” International Journal of American Linguistics 42: 73–78.
Campbell, Lyle. 1985. The Pipil Language of El Salvador. Berlin: Mouton de Gruyter.
E Ola Ka ʻŌlelo Hawaiʻi. 1997. Video narrating the history of the Hawaiian language nest movement produced by ʻAha Pūnana Leo. Posted on YouTube. https://youtu.be/ITMlt8dqKlc. Accessed January 8, 2017.
Fishman, J. A. 1991. Reversing Language Shift: Theoretical and Empirical Foundations of Assistance to Threatened Languages. Clevedon, UK: Multilingual Matters.
Hernández, Werner. 2016a. “Chuka tuyulu.” Tushik website. Last modified April 17. http://tushik.org/chuka-tuyulu/.
Hernández, Werner. 2016b. Nawat mujmusta. Revised edition. Preprint, Tzunhejekat. http://tushik.org/wp-content/uploads/HER-mujmusta.pdf.
King, Alan R. 2004a. ¡Conozcamos el náhuat! Preprint, IRIN.
King, Alan R. 2004b. “El náhuat y su recuperación.” Científica Year 4, nº 5: 51–70. El Salvador: Universidad Don Bosco.
King, Alan R. 2004c. Gramática elemental del náhuat. Preprint, IRIN.
King, Alan R. 2004d. Shimumachti Nawat! Curso de lengua náhuat para adultos. Parts 1, 2 and 3. Preprint, IRIN.
King, Alan R. 2005. Ne Nawat, tutaketzalis! Amachti 1. San Salvador: Editorial Universidad Don Bosco.
King, Alan R. 2011. Timumachtikan! Curso de lengua náhuat para principiantes. Preprint. http://tushik.org/wp-content/uploads/timumachtikan-pdf-texto.pdf.
King, Alan R., trans. 2013a. Ne Yankwik Sentaketzat (El Nuevo Testamento en náhuat, lengua de los pipiles de El Salvador). Revised ed. Preprint, Ne Bibliaj Tik Nawat. http://nebibliaj.org/.
King, Alan R. 2013b. Panuk Tik Ijtzalku (Sejse cuentoj tik Nawat te uij). Preprint. http://tushik.org/panuk-tik-ijtzalku/.
King, Alan R. 2014a. Curso de gramática náhuat basado en el texto del Yankwik Sentaketzat. Preprint. http://tushik.org/curso-de-gramatica-nahuat/.
King, Alan R. 2014b. NawaCoLex 2.1. Software package. http://tushik.org/nawacolex/.
King, Alan R. 2014c. Nawat Corpus & Lexicon Database. NawaCoLex Version 2.1. Tutorial. Preprint. http://tushik.org/nawacolex/.
King, Alan R. 2016. Conozcamos el Lenca, una lengua de El Salvador. Preprint. http://tushik.org/conozcamos-el-lenca-sai/.
Lehmann, Walter. 1920. Zentral-Amerika. Teil I. Die Sprachen Zentral-Amerikas in ihrer Beziehung zueinander sowie zu Süd-Amerika und Mexiko. Berlin: Dietrich Reimer.
Lewis, P. M. and Simons, G. F. 2009. “Assessing Endangerment: Expanding Fishman’s GIDS.” Revue Roumaine de Linguistique 55(2): 103–120.
Masin, Ynés et al. 2012. Tajtaketza pal Ijtzalku. Preprint. http://tushik.org/tajtaketza-pal-ijtzalku/.
Miranda, Jazz. 2016. “Paula López, voz del Río de Espinas.” La Zebra, May 1. https://lazebra.net/2016/05/01/jazz-miranda-paula-lopez-voz-del-rio-de-espinas-cronica/.
Mukaki! 2016. Tushik. http://tushik.org/mukaki-el-nahuat-se-oye/. Accessed February 19, 2016.
Náhuat El Salvador. n.d. YouTube channel. By Alej Andro (Alejandro López Mendoza). https://www.youtube.com/channel/UCbYqsaNZAzdRq94OfRO4odQ. Accessed June 30, 2016.
Ramírez Vásquez, Genaro. 2004. Naja ni Genaro. Preprint, IRIN.
Schultze-Jena, Leonhard. 1935. Indiana II: Mythen in der Muttersprache der Pipil von Izalco in El Salvador. Jena: Gustav Fischer.
Te Kōhanga Reo National Trust. n.d. Website of the New Zealand Maori language nest organization. http://www.kohanga.ac.nz/. Accessed January 8, 2017.
Torrealdai, Joan Mari. 1998. El libro negro del euskera. Ttartalo. https://escueladesara.wordpress.com/tag/libro-negro-del-euskera/.
Universidad Don Bosco. 2009. Ne Nawat, tutaketzalis! Amachti 2. San Salvador: Editorial Universidad Don Bosco.
Ward, Monica. 2002. “A Template for CALL Programs for Endangered Languages.” MSc thesis, Dublin City University. Chapter 5, Nawat. http://www.computing.dcu.ie/~mward/mthesis.html.
Zuazo Zelaieta, Koldo. 1988. Euskararen batasuna. La unificación de la lengua vasca. L’unification de la langue basque. Bilbo: Euskaltzaindia.

Chapter 24

Myaamiaataweenki: Revitalization of a Sleeping Language

Daryl Baldwin and David J. Costa

1. Introduction Languages have long been labeled “dead” or “extinct” when their last speakers die. These labels largely reflect a belief that when the last L1 speakers of a language pass, the future viability of the language is unlikely. The documentation of these so-called extinct languages, provided there is any, is thus thought to be solely of linguistic interest. This chapter challenges the notion that dead languages automatically cease to play an important role within a living community, and that their language records are only of interest to linguists. The efforts of the Miami Tribe of Oklahoma (MTO) in reclaiming their heritage language from documentation in the absence of L1 speakers are well known. This chapter describes their multifaceted effort and raises many questions regarding the role and function of a reclaimed language in community education, issues of proficiency, and other authenticity measures, and provides a perspective into a wide range of capacity-building activities that are necessary to fulfill community language goals. The Myaamia are a central Algonquian-speaking people whose historic homelands included what are now the states of Indiana, Illinois, western Ohio, and portions of lower Michigan and Wisconsin. Until the mid-nineteenth century, semi-permanent Myaamia villages were centered along the upper Wabash River between what are today the cities of Fort Wayne and Lafayette, Indiana. The Myaamia signed thirteen treaties that were ratified by the US Government, leading to the loss of much of their homelands (Ironstrack 2009). By the terms of the 1840 Treaty of the Forks of the Wabash they were required to cede to the United States their remaining 500,000-acre national reserve in north central Indiana in exchange for a reservation of equal size in the Unorganized Indian Territory, which later became Kansas. 
In October of 1846, after the Myaamia had spent years attempting to avoid removal, a military escort was dispatched to forcibly relocate the Myaamia Nation and any citizens who had not been exempted from the removal.

Approximately 300 citizens and the nation proper were relocated to their new home, arriving on the edge of the western plains at the start of winter. By the terms of the subsequent 1867 treaty, the Myaamia Nation was forced to relinquish its lands in Kansas and relocate a second time to Indian Territory, which later became Oklahoma. Families from Kansas continued to move to Indian Territory and settle on allotments until the 1880s. As in the first relocation from Indiana, some families were able to gain legal exemption to remain behind in Kansas, causing additional fragmentation of the community. It is for this reason the Miami Tribe of Oklahoma recognizes three population centers today (Oklahoma, Indiana, Kansas), reflecting a history marred by relocations. These removals resulted in the Miami Tribe of Oklahoma’s government seat residing in the northeast corner of Oklahoma today, and from this location they maintain sovereign responsibilities to their people, including language and cultural education. The Miami Tribe retained its self-governing rights through the treaty process and is today a federally recognized tribal nation. At least here in the United States, tribal sovereignty has played a critical role in protecting and supporting tribal interests in language and cultural revitalization.

The Myaamia community exists in diaspora today as a result of their tribal history of forced relocations, economic pressures, marriage patterns, and other sociological and economic factors. Tribal citizens live in nearly every state of the union, with population centers in Oklahoma, Kansas, and Indiana. The Miami Tribe of Oklahoma maintains a citizenship roll of approximately 5,000 individuals as of mid-2016. However, there are approximately 10,000 individuals today who can claim Myaamia or Illinois heritage. It should be noted that the Myaamia people share a language with the Peoria Tribe of Oklahoma. 
The contemporary Peoria tribe consists of descendants from various Miami and Illinois bands which confederated in 1854, becoming the Peoria Tribe of Oklahoma (Valley and Lembcke 1991). The Peoria tribe maintains a separate citizenship roll from the Miami tribe. “Miami-Illinois” is the linguistic cover term now used to subsume the very similar dialects of what is a single Algonquian language, i.e., Miami, Peoria, Wea, Kaskaskia, Piankashaw, and Illinois. In turn, “Illinois” is used to designate the language of the Miami-Illinois materials recorded by French Jesuit missionaries in the late seventeenth and early eighteenth centuries, in what is now the state of Illinois. This history is significant in understanding the social, geographical, and political landscape that surrounds the current Myaamia language revitalization effort. This context, as challenging as it may appear, has not stopped community members and leaders from moving forward with the revitalization of their language. The story of Myaamia language reconstruction and revitalization begins with the personal stories of the authors of this publication, Daryl Baldwin and David Costa. Costa’s reconstruction effort began in 1988 as a graduate student in Linguistics at the University of California, Berkeley. Costa had finished his Master’s orals, advanced to candidacy, and was looking into possible dissertation topics. His graduate advisor, Richard Rhodes (a specialist in Ojibwe), informed him that there was a language named “Miami” which had long been a mystery within Algonquian studies. It was known that some written records on the language survived, some dating to the 1700s, but very little had been

published and no one knew whether there might be more such materials in various archives. Moreover, no one knew whether there were any living speakers of Miami or its closely related dialect Peoria in either Oklahoma or in Indiana. Costa was assured that however much data there turned out to be, this was essentially untouched territory that no one had seriously pursued before, and there would certainly be enough data for a dissertation research program.

Starting out, Costa’s work took two directions: gathering and analyzing written materials on the language and trying to find speakers. In a first effort to determine if there were still speakers of either Miami or Peoria, Costa visited Miami, Oklahoma in 1989, and followed up with two trips to Peru, Indiana in the early 1990s. From these trips it became apparent that no known speakers, or even semi-speakers, of either Miami or Peoria were still alive in either state. A reputed speaker of Miami passed away in Oklahoma about two months before Costa’s May 1989 visit, though since this speaker was never recorded, it is unknown whether he actually spoke the language or merely remembered words. No living speakers of either Miami or the Peoria dialect could be found in Oklahoma. A similar story played out in Indiana. Costa was able to interview Miami elders who had remembered the Miami language being spoken in their youth (often by their parents or older siblings), and a great deal of local lore and tribal history was still remembered; however, it was clear that little knowledge of the Miami language was retained in Indiana by the early 1990s. 
The Miami-Illinois language seems to have lasted significantly longer in Indiana than in Oklahoma; it was long popularly believed that the last people known to have spoken the Miami language to at least some extent passed away in Indiana in the early 1960s, though subsequent research has revealed that a few people with at least semi-speaker ability lived on in that state until the early 1980s. Unfortunately, these speakers were never properly interviewed to ascertain their level of language knowledge. Even more regrettably, no significant sound recordings of native Miami-Illinois language speakers appear to have ever been made.

As Costa discovered that no speakers or sound recordings of Miami-Illinois could be consulted, it became clear that all further research on the language would depend on analysis of older written materials. During 1989–1990, Costa found that although little had been published about Miami-Illinois, an astounding amount of unpublished archival materials on the language survived. Starting with three French Jesuit dictionaries of Illinois written in the late seventeenth and early eighteenth centuries, continuing up to the late nineteenth/early twentieth century materials at the National Anthropological Archives and the Indiana State Library, Costa was able to amass copies of a very large amount of written records on the language. These very extensive materials were not especially well organized, so Costa began creating his own informal dictionary files on the language, bringing some order to the chaos and making data easier to find. He still uses the direct descendants of these files to this day.

Costa completed his dissertation in 1994, the first grammatical analysis of Miami-Illinois. This was later expanded into a book, published in 2003. However, Costa’s study of Miami-Illinois grammar continues to the present day, analyzing aspects of the language that were either overlooked in Costa (2003) or simply not understood fifteen

years ago. Some of the major aspects of Miami-Illinois grammar that have yet to be thoroughly described are word order, syntax, obviation, sentence particles, and stem derivation. The largest sources on Miami-Illinois that have not yet been fully examined are the three Illinois dictionaries from the French Jesuit period; as these manuscripts continue to be entered into a customized database, their data will become more and more accessible, allowing them to fulfill their potential as a rich source of insights on the Miami-Illinois language.

During one of Costa’s trips to Indiana in the early 1990s, he met Daryl Baldwin, who was attempting to revitalize the Myaamia language in the home with his family using family documents he had found (Baldwin 2013). Motivated initially by an interest in reinforcing a cultural identity in a deeper way, the Baldwin family’s effort began in 1991 after the birth of his second child. Early efforts were very rudimentary. Armed with lists of words and phrases from old documents, the Baldwins attempted to reproduce the language by speaking to their two children as they were beginning to talk. As the children grew toward school age, the Baldwins decided to homeschool for the purpose of incorporating Myaamia language and culture into their daily learning environment. With the commitment of Daryl’s wife Karen, they began their homeschooling effort in 1994, continuing up through 2012 when the last two Baldwin children transitioned to public high school. Evolving strategies and more skilled practice during those twenty-plus years allowed for collective growth and learning together as a Myaamia family, which significantly enhanced Myaamia language and culture as part of the family’s identity. A significant aspect of this personal journey was Baldwin’s education in linguistics through an MA program at the University of Montana, which began in the fall of 1997. 
It cannot be overstated how important linguistics training was to the Baldwins’ self-directed home learning effort. Baldwin credits his linguistic training with helping him better understand the phonology, morphology, and syntax of his language and being able to work with Costa on a different level during the reconstruction process.

The impact of language learning in the home has great potential to extend out into the community. In 2013 the Baldwins’ eldest son, due to his level of proficiency in the language, was hired by the Miami Tribe of Oklahoma as a language instructor after finishing a degree in Anthropology at Miami University. The Baldwins’ eldest daughter was hired as a Summer Program Assistant to the Miami Tribe’s Eewansaapita youth language program and also works with the content and methods development of the tribe’s Saakaciweeta summer day care program. Her specialized skills, acquired from a degree in Early Childhood Education from Miami University and training in Montessori methods, have aided in the development of these community youth programs, especially in regard to the inclusion of language content and activities. These examples show how long-term dedicated family efforts can aid community efforts by providing needed human resources at critical stages of program development. Today, these programs (Eewansaapita and Saakaciweeta) have developed into important community-wide efforts toward youth language and cultural development, due to the work of many dedicated staff of both the Miami Tribe Cultural Resources Office and the Myaamia Center at Miami University.


2. The sources Miami-Illinois may be unique among native North American languages, because despite not having been natively spoken for about half a century, it nevertheless has extremely extensive written documentation, spanning two and a half centuries. However, most of this documentation has never been published, and exists solely as manuscripts in archives and libraries. Purely in terms of written records, Miami-Illinois is one of the most extensively documented Algonquian languages, being far more thoroughly recorded than many other Native American languages which still have speakers. The earliest documentation of Miami-Illinois consists of three Illinois dictionary manuscripts compiled by French Jesuit missionaries from the 1690s through the 1720s. The earliest of these is the French-Illinois dictionary (Pinet n.d.) written by Pierre-François Pinet, a French missionary born in 1660. Pinet appears to have begun his work after arriving at the Mission de l’Ange Gardien at Chicago in 1696, though when this mission closed in 1700 he relocated to southern Illinois, where he spent the next two years serving among the Tamaroas and Kaskaskias, until his death in 1702. The fact that Pinet gathered data from different missions explains the fairly extensive dialect mixture seen in the document (Costa 2005). This manuscript was discovered and identified in Quebec by Michael McCafferty in 1999, and so represents the most recently found data source of importance on Miami-Illinois. The Pinet dictionary consists of about 674 pages, but of these about 97 are blank, for a total of about 577 pages with data. 
The next Miami-Illinois manuscript from the missionary period is the massive Illinois-French dictionary long attributed to the Reverend Jacques Gravier, though it has since been discovered that its handwriting is in fact that of Jacques Largillier, an assistant to missionaries and later a Jesuit brother himself, who spent nearly forty years in Illinois country before passing away in 1714 (Largillier n.d.; see also McCafferty 2011). It is not known when exactly the Largillier dictionary was written out, though some time around the beginning of the eighteenth century would be a reasonable estimate. This manuscript is the only dictionary organized according to Illinois words rather than French keywords. It consists of 586 pages, with an average of thirty-eight lines per page, for a total of approximately 22,000 Illinois words (Pilling 1891, 211).

The last of the Jesuit Illinois dictionaries, and smaller than Largillier’s Illinois-to-French dictionary, is the 185-page French-to-Illinois dictionary written by Father Jean-Antoine Robert LeBoullenger, who was a missionary in southern Illinois from 1719 to 1744. The main body of this work consists of about 3,330 French keywords with Illinois translations listed beneath them. The manuscript also contains forty-two pages of untranslated religious texts in Illinois, including a thirteen-page, thirty-five-chapter translation of the Book of Genesis. However, perhaps most valuable of all are the three pages of verb paradigms near the beginning of the manuscript. These paradigms have been extremely useful in the ongoing analysis of Miami-Illinois morphology; almost all possible combinations of verb inflections are given, and all possible subject/object combinations. Even though the LeBoullenger dictionary is significantly smaller than the Largillier

manuscript, the main body of the dictionary contains far more example sentences than either of the other two Illinois manuscripts. These have been tremendously helpful in trying to discern the native Algonquian sentence patterns of Illinois when European influences on the syntax of the language would have been at a minimum. All in all, the Illinois dictionaries are an invaluable source of data, in containing tens of thousands of words and example sentences, including a huge amount of precious cultural vocabulary, collected at a time when the language was in daily use by large, monolingual communities still living in a traditional manner.

After the Jesuit period, the Miami-Illinois language went undocumented for the bulk of the eighteenth century. Starting in the early American colonial period, from the 1790s through the 1860s, the Miami dialect was recorded in several vocabularies, the most extensive of which are the Miami notes of Charles Trowbridge. Trowbridge was assistant secretary for the Indian Department of the Great Lakes area, and did extensive ethnological and linguistic work among the Shawnees, Delawares, Menominees, and Miamis. Trowbridge’s work among the Miamis was undertaken during the winter of 1824–1825. His materials, which have been only partially published (Trowbridge 1938), include numerous words, sentences, paradigms, and extensive ethnographic notes. His ethnographic materials constitute the only in-depth materials that exist on the Miamis from a time when much of their traditional culture was still intact. His linguistic notes are especially valuable for containing a large number of example sentences which have been very useful for reconstructing everyday conversational dialogue in Miami.

Starting in the 1890s, Miami-Illinois received its first really extensive documentation in the modern period, through the work of Bureau of American Ethnology linguist Albert Gatschet. 
Gatschet obtained a huge amount of data on Peoria, Miami, and Wea from several different speakers, mainly in Oklahoma. Gatschet appears to have first begun work on Miami-Illinois in 1895, continuing with it off and on until around 1902. For the Peoria dialect, he worked at least with George Finley, Frank Beaver, and John Charley, and obtained texts from the first two of these people. For Miami, he worked with Elizabeth Valley, a Miami speaker born in Ohio living among the Peorias, and the Indiana Miami speakers Gabriel Godfroy and William Peconga. He also worked extensively with Sarah Wadsworth, a Wea speaker born and raised in Indiana, who was also living in Oklahoma by the time of her work with Gatschet. Gatschet’s materials (none of which were published) include two large field notebooks and thousands of filecards, though perhaps his greatest contribution is that he was the first person to collect native texts in Miami-Illinois: in his work he elicited about sixteen texts, from six different speakers of Peoria, Miami, and Wea. These include traditional narratives, animal stories, how-to stories, and historical and autobiographical narratives (see Costa 2010). These texts are absolutely priceless, not only for the information they contain about Miami-Illinois syntax and discourse structure, but also for their insights into the oral literature and culture of the Miami-Illinois-speaking peoples.

Soon after Gatschet’s work, the Miami-Illinois language was extensively documented for more than ten years by the Indianapolis lawyer and avocational linguist Jacob P. Dunn. Dunn re-elicited most of Gatschet’s texts (though not always from the same

speakers), as well as collecting several new texts, and also collected a huge amount of new vocabulary from speakers in both Indiana and Oklahoma. Dunn’s records of Miami-Illinois are rather similar in quality to Albert Gatschet’s (he worked with several of the same speakers with whom Gatschet worked), with the qualification that Albert Gatschet was considerably better than Dunn in hearing the phonemic contrasts of Miami-Illinois, whereas Dunn was considerably more insightful about the semantics and grammar of the language. Perhaps the main distinguishing accomplishment of Dunn’s work on Miami-Illinois was that he worked far more extensively with the Indiana Miami elder Gabriel Godfroy than did Albert Gatschet; for unknown reasons Gatschet worked very little with Godfroy, whereas Dunn obtained from him eleven new texts, several verb paradigms, and a large amount of vocabulary. The bulk of Gatschet’s Miami-Illinois notes are from speakers in Oklahoma, and thus Dunn’s materials provide us with our only detailed look at the speech of Indiana Miami in its last generation of fluent speakers. Dunn’s Miami-Illinois data are preserved in several notebooks and several hundred filecards. A portion of Dunn’s filecards were carelessly redacted and published by linguist Carl Voegelin in the late 1930s (Voegelin 1938–1940), but the bulk of his data has never been published, and remains at the Indiana State Library and the National Anthropological Archives to this day.

The last substantial documentation of Miami-Illinois was undertaken by the Bureau of American Ethnology linguist Truman Michelson, who, in one week’s worth of fieldwork with two Peoria speakers in Oklahoma in 1916 (mostly George Finley, who also worked with Gatschet and Dunn), collected three texts, a full schedule of kinship terms, numerous verb paradigms, and a fair amount of vocabulary. 
Truman Michelson was the only trained Algonquianist who ever did Miami-Illinois fieldwork with fluent speakers, so his records are probably the highest-quality data that exists on the language, though nowhere near as extensive as one might wish. None of Michelson’s notes were ever published, and they too are now preserved at the National Anthropological Archives.

From the 1930s through the early 1960s, the last generation of Miami-Illinois speakers and semi-speakers were documented in a handful of brief vocabularies, some documented by linguists but most by amateurs. The most valuable of these vocabularies was gathered by linguist Charles Hockett during two days of fieldwork in 1938 with Myaamia and Peoria semi-speakers living in Oklahoma. Hockett’s materials on Miami-Illinois are the most phonetically accurate notes on the language ever documented, though their reliability is diminished by the fact that Hockett did not spend enough time on the language to consistently learn its phonemic contrasts, as well as the fact that his speakers were no longer entirely fluent. Nevertheless, Hockett’s Miami and Peoria notes are valuable for giving us a better glimpse of the phonetic details of Miami-Illinois pronunciation, not limited by the Bureau of American Ethnology-influenced orthographies that Gatschet, Michelson, and Dunn used.

After Hockett’s work, the Miami-Illinois language was documented in a scattering of small word lists, the importance of which is greatly diminished by the lack of any kind of linguistic training by the people who collected them as well as by the greatly decreased

560 Daryl Baldwin and David J. Costa

fluency of the semi-speakers and rememberers who were still alive in the 1940s and thereafter. Although people with ancestral native knowledge of Miami-Illinois survived until the early 1980s in Indiana and to a lesser extent in Oklahoma, no trained linguists worked with these people.

Despite having copious documentation covering more than two centuries, all the records of Miami-Illinois are problematic, as none of the data was accurately recorded phonetically: no source consistently marks all the distinctive sounds of the language. Data from the most fluent speakers was usually written down by the less skilled transcribers, while the more accurate later records of the language, transcribed by the first generation of trained linguists, are from a time when speakers were less fluent. Thus, even though there is a massive amount of data on Miami-Illinois, little of it was simultaneously documented competently and from fluent speakers. As a result, none of the recorded corpus of Miami-Illinois can be taken at face value, and all data must undergo careful philological analysis before it can be used (see Costa 2015).

The two crucial phonological features of Miami-Illinois missed by most pre-modern recorders are vowel length and pre-aspiration of obstruents. Both vowel length and pre-aspiration are fully contrastive and carry a high functional load,1 yet they are seldom indicated by most recorders of the language. In the French Jesuit records, vowel length is never marked and pre-aspiration is only erratically marked. In the late-nineteenth- and early-twentieth-century records of the language, both vowel length and, especially, pre-aspiration are marked somewhat more often, though still far from dependably. Since no source marks both features consistently, these features must be filled in and all the data phonemicized in order to create accurate materials usable for either linguistic or pedagogical purposes.
This is done by two methods: by comparing all the varying original transcriptions for the words, and by comparing the Miami-Illinois words with cognate forms from its closely related sister languages. Both vowel length and pre-aspiration are found in essentially the same places in Miami-Illinois as in neighboring Algonquian languages such as Kickapoo, Meskwaki, Ojibwe, and Shawnee, and so comparing Miami-Illinois transcriptions to cognate words from these languages is crucial in correctly phonemicizing Miami-Illinois words.
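The cost of an unmarked vowel length can be illustrated with the pair cited in footnote 1, meenaani (“I drink”) versus meenani (“you drink”). The Python sketch below is an illustration only, not the authors’ procedure: it assumes that long vowels are written as doubled letters, as in the forms cited in this chapter, and the function names are hypothetical. It simulates a transcription that ignores length and checks whether two distinct forms collapse into one.

```python
import re

# Assumption: a long vowel is spelled as a doubled vowel letter (aa, ee, ii, oo).
LONG_VOWEL = re.compile(r"([aeio])\1")


def without_length(word: str) -> str:
    """Collapse doubled vowels, imitating a recorder who never marks vowel length."""
    return LONG_VOWEL.sub(r"\1", word)


def differ_only_in_length(a: str, b: str) -> bool:
    """True if two distinct forms become identical once vowel length is ignored."""
    return a != b and without_length(a) == without_length(b)


# The pair from footnote 1 of this chapter.
first_person = "meenaani"   # "I drink"
second_person = "meenani"   # "you drink"

print(without_length(first_person), without_length(second_person))
print(differ_only_in_length(first_person, second_person))
```

Under this assumption both forms reduce to the same string, which is why a source that omits vowel length cannot by itself tell the first person from the second, and why the original transcriptions have to be compared with one another and with cognates in related languages.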

1 For example, in many verbs, the only phonological feature distinguishing the first person singular and the second person singular is vowel length: for example, compare meenaani (“I drink”) versus meenani (“you drink”).

3. Community language development

Language learning in the home is much easier if there is consensus and whole-family commitment. Language learning on the community level is much more complicated and involves external factors that are not easily controlled. For this reason, community-based language revitalization entails much more than just teaching language. It involves a wide range of capacity-building, planning, working with various ideologies, and understanding the local nuances of social interactions that are needed to support a broader effort. Therefore, community language revitalization is more of a social movement than just a language teaching and learning effort. The community and cultural context necessary to support language revitalization on a broader level varies vastly among tribal communities and for this primary reason no single approach is going to work broadly across different communities. For the Myaamia people a great deal of capacity-building is necessary before a greater outcome can be realized.

Sleeping languages also require a significant amount of grammatical reconstruction from documentation due to the absence of speakers. What can be expressed effortlessly by fluent speakers must be laboriously reconstructed by trained linguists and then contextualized in community and culture learning environments, in order to achieve some usable representation of the language for revitalization purposes. The usability of this reconstruction work hinges significantly on the quantity and quality of language sources, linguistic competency to analyze the materials, and the means to collect, store, and search large amounts of data efficiently. This collective work requires a large amount of human and financial resources and can take many years, maybe even generations, to produce a productive outcome where the community begins to feel success.

The Myaamia language revitalization effort supports three main organizational entities that have become keystone components of the current effort. They include the Miami Tribe of Oklahoma’s Cultural Resources Office (CRO), Myaamia Education Office (MEO), and Miami University’s Myaamia Center (MC).
Each plays significantly different roles, with the CRO and MEO located at the Miami Tribe headquarters in Miami, Oklahoma and the MC located at Miami University (MU) in Oxford, Ohio within the Miami tribe’s historic homeland. The oldest of these three entities is the CRO. First created in 1999 as the Cultural Preservation Office, its initial purpose was to guide the development of language and cultural programming at the community level. Tribal member Julie Olds was instrumental in working with tribal leaders to create this new office during the mid-1990s. At that time, Olds was serving as the Secretary Treasurer of the Miami Tribe of Oklahoma, an elected position. As an elected official, she had a significant role in helping other tribal leaders understand why tribal resources should be used for the support of language and cultural development. For many years, cultural learning was left to families to pass on and not seen as a responsibility of tribal leadership. Since the language had not been spoken for many years, a cognitive disconnect had developed in understanding the crucial link between language maintenance and cultural preservation. Olds helped close that gap among leadership, helping them to understand that language and cultural educational development must become a responsibility of tribal leadership. This ideological transition in leadership perspective took nearly twenty years to solidify, to the point where today tribal leaders play a very critical role in moving language revitalization to a new and expanded level, including their direct participation in the learning process. The Cultural Preservation Office (CPO) emerged following the conclusion of an Administration for Native Americans (ANA) Language grant awarded to the Miami

Tribe of Oklahoma in 1996. This grant was the first formal attempt to begin working with language revitalization at the community level, drawing on the early work and assistance of Costa and Baldwin, along with Julie Olds serving as the grant’s “language clerk.” The creation of the CPO was important in that it was the first non-grant funded office developed for language and cultural preservation, and it was fully supported by tribal resources for its long-term sustainability. Early attempts at community language programs developed under the CPO included adult workshops, weekend youth programs, and printed learning aids to meet the self-directed learning needs of interested community members.

For several years following the development of the CPO, community programmers, language learners, and tribal leadership struggled to figure out what to do next. It became evident early on that the resources of the Miami Tribe were not sufficient to move this effort forward in the way it needed to develop. Recognizing the need for continued linguistic research and material development, Miami Tribe leadership approached their allies at Miami University in early 2001 to seek assistance. After a short period of planning and discussion, Miami University agreed to support the development of the Myaamia Project on the Oxford campus, which later evolved into the current Myaamia Center (MC). This was a significant step in that it allowed for Myaamia language and cultural research to move forward in a supportive academic environment, as well as allowing the revitalization work on campus to directly connect with the growing number of tribal students attending Miami University under the Myaamia Heritage Award Program, which began in 1991.
Shortly after the establishment of the Myaamia Project, Myaamia tribal students began meeting weekly to help develop a series of courses that allowed them to deepen their knowledge of language and culture while attending Miami University. These classes eventually developed into required courses for all tribal students who enter the Myaamia Heritage Award Program. Today, all Myaamia tribal students who enter Miami University are required to take three years of one-credit courses taught by Myaamia Center staff and supported through Miami University’s Department of Educational Leadership. These courses focus on topics such as ecological perspectives, tribal history, language, and culture, as well as modern issues such as tribal sovereignty and self-determination. Myaamia language is incorporated into all of these courses and through other on-campus initiatives, such as a biweekly language table and other social events designed to create important language use domains for tribal students on campus. The current goal for this group of students is to strengthen kinship bonds and their cultural and tribal knowledge, and to begin developing a growing base of novice language users. As of Fall 2016, thirty-two Myaamia tribal students attended Miami University, the largest group ever. Embedding language and cultural education and reinforcing community kinship bonds as part of their collegiate experience is beginning to demonstrate positive outcomes, most apparent in the increased graduation and retention rate of these students (Mosley-Howard et al. 2016).

During 2004 the CPO transitioned into the Cultural Resources Office (CRO) in order to more broadly capture the growing requirements of the Miami Tribe in cultural

resources management. By 2012 the Myaamia Education Office was created to assist the CRO in directing community language programs and begin designing a long-term model for tribal education. In 2013 the Myaamia Project at Miami University transitioned into the Myaamia Center, forcing an internal reorganization of this entity in order to prepare for longer-term growth and development. Capacity-building is at the heart of this effort. The organic development of these three offices reflects growth based on experience, needs, and a developing understanding of what is possible for this particular community.

4. The Myaamia Center

The work and direction of the Myaamia Center are unique in that it is a tribally directed research center located within an academic setting, within the tribe’s historic homeland. It is within this tribe-university framework of support that the MC is directed to assist the Miami Tribe in language and cultural educational development specifically for the Miami Tribe community. For many years the MC (called the Myaamia Project from 2001 to 2012) operated under the supervision of the Vice President for Student Affairs and the Miami Tribe Cultural Resources Officer. Together, these two positions approved and supported the direction and work of the MC. The first Memorandum of Agreement (MOA), signed in 2008, further solidified the institutional commitment. When the Myaamia Project became the Myaamia Center in 2013, a new three-year MOA was created and signed, which was updated in 2016. Four of the MOA articles cover research and education, funding and employment, assessment and reporting, and, specifically in article IV, intellectual property (Memorandum of Agreement 2016).

There are five offices within the MC that reflect the evolving needs of this research and development effort. The Office of Miami Tribe Relations maintains the relationship pathways created between MU and the Miami Tribe of Oklahoma (MTO). This office is also directly responsible for overseeing the Miami Tribe student experience and heavily mentors tribal students who come to MU under the Myaamia Heritage Award Program. This office also handles any interest in the Miami Tribe that emerges from faculty or staff. The Office of Education and Outreach is directly responsible for supporting the language and cultural education needs of the tribal community and directly collaborates with the MEO.
This office develops curriculum for tribal youth programs, trains community educators, carries out research in tribal history and educational pedagogy, and is responsible for utilizing the American Council on the Teaching of Foreign Languages (ACTFL) standards in all programs and curricula. The Office of Language Research is responsible for carrying forward the work of researching, transcribing, and translating the massive collection of Miami-Illinois language documentation, and assisting in the development of database tools that can be utilized by tribal educators. This office also responds to daily language requests from tribal programmers, educators, leaders, etc. The Office of Communications and Publications is responsible for overseeing all technological needs

that develop within the MC, including the digital creation of any in-house software, phone apps, and publications and maintaining a web presence. The Office of Cultural Ecology, yet to be fully staffed, is largely responsible for developing a teachable Myaamia ecological perspective. As with the cultures of many indigenous groups, Myaamia culture is historically rooted in ecological experiences, and therefore ecological interactions are at the core of many current educational initiatives. The outdoor classroom is not only the best place for initial language learning but also provides many opportunities to revitalize Myaamia ecological knowledge. Through this office a great deal of research has been conducted in ethnobotanical and agricultural knowledge, as well as defining healthy living in regard to foods and other ecological interaction.

The Myaamia Center’s work covers a wide array of research topics. This is driven by the realization that the Myaamia knowledge system, which the language most efficiently reflects, requires an interdisciplinary approach, whereby research produces a wide range of diverse and culturally relevant topics for programs and curricular content. Some of the more current research foci include ethnobotany, storytelling, plant and animal interactions, dietary information, astrological knowledge, various historical research, youth identity formation, culturally defined knowledge of youth development, traditional games, and so on. Development also includes the delivery of this knowledge, and so MC staff are also working to construct a culturally appropriate educational model with pedagogical approaches that will best serve the current population. Framed within the context of myaamia neepwaantiinki (“Myaamia education”), staff are developing culturally appropriate curricula and teaching approaches that reflect a knowledge system rooted in Myaamia culture and ways of knowing.
In 2013, due to increasing pressure to report on educational outcomes, a team of tribal educators, psychologists, and cultural specialists came together to create the first formal assessment team to measure tribal student language and cultural educational experiences in relation to an evolving understanding of nahi meehtohseeniwinki (“living well”). Using ethnographic and qualitative research approaches, assessors are examining the Miami Tribe student experience, including both its positive health and community benefits as well as the role of language and culture in youth identity formation (Mosley-Howard & Strass 2016). Through the experiences of the next generation, we will see knowledge, language, community, and culture merge and show themselves in a way that will define what it means to be Myaamia for a future generation. As important as language teaching is, it must also be realized that it is only a single part of revitalization as a whole community concept.

Among the other important developments occurring through the Myaamia Center are the technological tools that must be created to carry out its work. One significant, and more recent, development was the creation of the Miami-Illinois Digital Archive (MIDA). This online archive was created with the support of the National Endowment for the Humanities (Inokaatawaakani 2012). Its purpose is to serve as a digital home for all known Miami-Illinois language data. Aside from just being an archive, this tool allows researchers and educators to search all known content fields within and among

the available sources. Easy access to archives, and the content of those archives, has long been a struggle within the Myaamia revitalization effort. The MC is beginning to solve some of these access issues through an array of technological tools developed through the center.

Aside from research tools such as MIDA, the Myaamia Center actively develops online learning tools and publications. One example is the online Myaamia Dictionary, a web-based dictionary with an accompanying iOS app for use with hand-held devices. This language learning tool is easy to use and comes with audio files. Much of the content currently added to the online dictionary comes through requests by users. The user-driven participation ensures the dictionary is being populated with language useful to learners.

The MC also promotes a wide range of cultural activities and related projects, such as the recent peepankišaapiikahkia eehkwaatamenki (“ribbonwork”) project. Ribbonwork is a traditional craft deriving from the trade era, when strips of silk ribbons of various colors were traded with Myaamia people and geometric patterns were sewn onto special-occasion clothing. This project, supported by a grant through the National Endowment for the Arts, is another example of how culturally based projects serve as important learning pathways for language. Myaamia language isn’t generally taught in the form of grammar exercises but rather as whole language intertwined with almost all activities, projects, and learning environments. Contextualizing language with cultural or community activity has proven over the years to be more effective for the learner, even if the learner only initially develops a novice level of proficiency in the language. Greater language skills generally come later after the individual is more fully integrated into other community learning activities.
It is in the context of these culturally specific programs that language “immersive moments” begin to emerge. For instance, traditional games have long been taught as a “language activity” and for many of the more advanced learners entire game activities can be played in the language complete with joking in the language.

Current work within the Myaamia Center is providing much-needed support for training qualified teachers needed to work in these educational programs. Teachers are essential to this effort and must possess specific training in child development, pedagogy, second-language acquisition, and culturally specific Myaamia knowledge. Educators in all tribal programs are typically licensed educators and trained to approach teaching in a way that allows tribal youth to experience their heritage language and culture in the context of myaamia neepwaayoni (“a myaamia knowledge system”). This tribal knowledge system is unique in content and linguistic expression, but never presented as oppositional to what is termed the “global knowledge system.” It is important to help students develop bicultural behaviors and skills for living in a multicultural world. Myaamia youth today possess flexible identities that differ significantly from previous generations. Myaamia language and cultural efforts must meet the needs of these flexible identities in a way that is nurturing and supporting in order to keep youth engaged in carrying forward their nation’s language and culture.


5. Current tribal program

The Miami Tribe of Oklahoma supports four specific areas of language and cultural learning and development. These include larger tribally supported programs, age-specific youth programs, language learning opportunities for tribal students at Miami University, and continued promotion of self-directed learning, often in the home, by anyone interested in the language, including non-tribal members.

Due to community diaspora, tribally supported programs are developed wherever there exists a large enough population of tribal members interested in learning Myaamia language and culture. These include annual weekend workshops in a variety of locations such as Kansas, Indiana, Texas, Washington State, and Ohio. There are also monthly language learning gatherings at the tribal complex in Miami, Oklahoma during the fall and winter months. Additionally, language sessions in Oklahoma regularly occur during the Miami Tribe’s annual national gathering week at the end of June and during the annual mid-winter Stomp Dance weekend at the end of January.

One of the more developed tribally supported programs is the Eewansaapita Summer Youth Educational Experience. The program takes its name, eewansaapita (“sunrise”), from the Myaamia language. This metaphoric expression captures the rebirth and renewal of the Myaamia nation through its youth. Coined by tribal elder Sammye Darling in response to her observations in the 1990s of the language and cultural revitalization movement, which she described as an “awakening,” this program is well developed with curriculum and trained teachers. The mission of the Eewansaapita program is to teach Myaamia-specific language and culture to tribal youth ages 10 through 16. Eewansaapita emphasizes Myaamia methods of learning and cultural values and focuses on connecting Myaamia youth to each other in Myaamia places.
The core educational values that drive Eewansaapita and other programs include: neepwaahkaayankwi (“we are knowledgeable, have wisdom”), eeyaakwamisiyankwi (“we strive to achieve”), eeweentiiyankwi (“we are related to each other”), peehkinaakosiyankwi (“we are generous”), aahkohkeelintiiyankwi (“we care for each other”), neehweeyankwi (“we speak well”), paahpilweeyankwi (“we are humorous”), aahkwaapawaayankwi (“we dream”). The Eewansaapita program began in 2005 in Oklahoma and is currently held in two locations, Fort Wayne, Indiana and Miami, Oklahoma. Each summer approximately twenty tribal youth attend this program for one week in each location for a total of about forty tribal youth served annually. The program is driven by six themes, one theme taught each summer, and takes six years to complete all the themes. Many if not most students return every year allowing for the majority to experience all the themes. The Eewansaapita themes include Weekihkaanki Meehkintiinki (“Games”), Kiikinaana (“Our Myaamia Homes”), Weecinaakiiyankwi Weecikaayankwi (“Song and Dance”), Meehtohseeniwinki Ašiihkionki (“Living on the Land”), Eeweentiiyankwi (“Family”), and Ašiihkiwi neehi Kiišikwi (“Earth and Sky”).

Eewansaapita counselors are typically Miami Tribe students from Miami University who utilize knowledge they learn from their experiences through the Myaamia Heritage Award Program on campus. They attend three consecutive years of coursework in language and culture, history and ecological perspectives, and modern issues such as tribal sovereignty and self-determination. In their fourth year, students may pick a topic of interest for their independent study, preferably something that gives back to the community. Throughout their experience at MU students are exposed to the Myaamia language on a near-daily basis, with multiple opportunities to interact with language instructors, attend the biweekly “language table,” and gain direct access to Myaamia Center staff and a variety of research topics. Due to these various campus experiences, these students are very capable of serving as role models for the youth in the Eewansaapita program.

The Myaamia Center and the Miami Tribe’s Cultural Resources Office have become a hub for requests by tribal members interested in making use of Myaamia language for things like naming, wedding ceremonies, homeschooling, or other personal interests. It is nearly impossible to gauge at this stage the full use of language across the community due to available resources created for community use. It has always been the goal of this effort that language learning and, more important, daily language use would eventually take on a life of its own.

6. Conclusion

Reclaiming a language from documentation and its reintroduction into a community takes a great amount of resources and strategic planning, and lots of community capacity-building. This work also requires a level of humility around idealized goals such as community-wide fluency and the maintenance of language purity. For many communities with living speakers who are working diligently to “save” their languages, time is of the essence. For the Myaamia, time is irrelevant in many cases when considering the amount of foundational work that needs to be done in order to support a more productive and sustainable effort with a future generation in mind.

Myaamia language and cultural organizers have long used the garden as a metaphor to describe their work (McCarty et al. 2013). Any garden site needs to be prepared and unwanted plants removed before the planted seeds of nourishment can take place. After seeds begin to germinate, they need to be tended and nurtured to grow to a point of harvest. It’s a cyclical process that requires consciousness and awareness in order to maintain it. Language revitalization requires many of the same steps but in a broader context focusing more on the overall well-being of people rather than language fluency alone. People priority is important in this work because language is not a thing but a pathway to leading people toward a healthier community cultural context. This is ultimately a community health and healing issue, and the work of language and cultural revitalization forces communities like the Myaamia to re-examine self, discuss community values, and dream for a different future.

As indicated above, research and development play a significant role in this effort. What used to be transferred naturally in a village context now has to be transmitted consciously in an educationally constructed context. A new additive form of Myaamia education must be constructed for Myaamia knowledge transfer to take place. This requires training, research, and development. The relationship between the Miami Tribe of Oklahoma and Miami University (quite unique in the United States in the nature of its work), the level of respect between entities, and the direct tribal and university benefits provide a location for this research and developmental work to occur. The Myaamia language revitalization effort would not be where it is today without this supporting relationship.

And, finally, the overall community health and well-being impacts of this work cannot be overstated (Whalen, Moss, and Baldwin 2016). From the healthy foods served at language programs to the four-to-one ratio of student to mentor/teacher, revitalization work is empowering, positive, and nurturing. Simply put, youth need to feel safe, supported, and able to explore their cultural knowledge in a contemporary way so that most aspects of learning and development have meaning to their lives. As is often said, the Myaamia are a living people with a past, not a people from the past. All tribal programs embrace this concept. These positive attributes that create a climate of learning, sharing, and helping each other are not easily achieved in other learning environments. From the time youth begin participating in community language and culture programs (around the age of 9), they begin to learn the core values, listed above, of Myaamia education. Looking at what has been achieved up to this point, Myaamia educators are realizing the broader benefits of this effort in a way that could not have been experienced years ago.
For instance, graduation and retention rates among Myaamia tribe students at Miami University during the 1990s were 44%. After the implementation of language and culture courses as part of the Myaamia Heritage Award Program of the 2000s, graduation and retention rates shot up to a current rate of 77% (Mosley-Howard & Strass 2016). Of course there are many other factors that contribute to this increased number, but language and cultural learning are generally recognized as important factors. Additionally, through the longitudinal work of the Myaamia assessment team, four areas of personal growth are being observed from the Myaamia student experience: increased desire toward intergenerational transmission of culture, deepening sense of self, commitment to tribal engagement, and, as already stated, academic accomplishment. Language proficiency remains for most of these students at the novice level on the ACTFL scale, but even this small measurable is a large step in the context of what has been achieved overall in the last few years. An important lesson here is that it doesn’t require fluency to create positive change in a community, but change is important in developing increased proficiency toward fluency over time. Languages stop being used because communities change, and for many indigenous communities, some changes occurred under oppressive circumstances. Language revitalization is itself a process of community change that goes beyond simple language teaching. Change, no matter how great or small, has to come first. Creating

change must therefore be the goal and education is the most direct way of creating this change. Language proficiency levels, at the community level, are therefore viewed as an outcome of a multi-pronged approach of creating positive changes through a tribal educational effort.

References

Baldwin, Daryl, Karen Baldwin, Jessie Baldwin, and Jarrid Baldwin. 2013. “Myaamiaataweenki oowaaha: Miami Spoken Here.” In Bringing Our Languages Home: Language Revitalization for Families, edited by Leanne Hinton. Berkeley, CA: Heyday Books.
Costa, David J. 2003. The Miami-Illinois Language. Lincoln: University of Nebraska Press.
Costa, David J. 2005. “The St-Jérôme Dictionary of Miami-Illinois.” In Papers of the 36th Algonquian Conference, edited by H. C. Wolfart, 107–133. Winnipeg: University of Manitoba Press.
Costa, David J. 2010. Myaamia neehi peewaalia aacimoona neehi aalhsoohkaana. Myaamia and Peoria Narratives and Winter Stories. Oxford, OH: Myaamia Project.
Costa, David J. 2015. “Redacting Premodern Texts Without Speakers: The Peoria Story of Wiihsakacaakwa.” In New Voices for Old Words: Algonquian Oral Literatures, edited by David J. Costa, 34–89. Lincoln: University of Nebraska Press.
Inokaatawaakani. 2012. Illinois Dictionary Project (# PD-50017-12). Washington, DC: National Endowment for the Humanities.
Ironstrack, George. 2009. “Myaamiaki neehi Myaamionki: The Miami People and Their Homelands.” In Miami University 1809–2009 Bicentennial Perspectives, edited by Curtis W. Ellison, 1–5. Athens: Ohio University Press.
Largillier, Jacques S. J. c. 1700. Illinois-French Dictionary. Manuscript at Watkinson Library, Trinity College, Hartford, Connecticut.
LeBoullenger, Antoine-Robert, S. J. c. 1725. French and Miami-Illinois Dictionary. Manuscript at the John Carter Brown Library, Brown University, Providence, Rhode Island.
McCafferty, Michael. 2011. “Jacques Largillier: French Trader, Jesuit Brother, and Jesuit Scribe Par Excellence.” Journal of the Illinois State Historical Society 104: 188–198.
McCarty, Teresa L., Daryl Baldwin, George M. Ironstrack, and Julie Olds. 2013. “Neetawaapantamaanki iilinwiaanki meehkamaanki niiyoonaani: Searching for Our Talk and Finding Ourselves.” In Language Planning and Policy in Native America: History, Theory, Praxis, edited by Teresa L. McCarty, 92–106. Bristol, Buffalo, Toronto: Multilingual Matters.
Memorandum of Agreement for the Myaamia Center signed between Miami University and the Miami Tribe of Oklahoma. 2016.
Mosley-Howard, G. Susan, Daryl Baldwin, George Ironstrack, Kate Rousmaniere, and Bobbe Burke. 2016. “Niila Myaamia (I Am Miami): Identity and Retention of Miami Tribe College Students.” Journal of College Student Retention: Research, Theory & Practice 17: 437–461.
Mosley-Howard, G. Susan and Haley Strass. 2016. “Validating the Impact of Picking up the Threads of Knowledge.” Paper presented at the 7th Biennial Myaamiaki Conference, Miami University, April 2.
Pilling, James C. 1891. Bibliography of the Algonquian Languages. Washington, DC: Bureau of American Ethnology Bulletin No. 13.
Pinet, Pierre-François. c. 1702. French-Miami-Illinois dictionary. Manuscript at the Archives des jésuites au Canada, Montréal, Québec.
Trowbridge, Charles C. 1938. Meearmeear Traditions (Vernon Kinietz, ed.). (Occasional Contributions 7). Ann Arbor: University of Michigan Museum of Anthropology.
Valley, Dorris and Mary M. Lembcke. 1991. The Peorias: A History of the Peoria Indian Tribe of Oklahoma. Miami, OK: Peoria Indian Tribe of Oklahoma.
Voegelin, Carl F. 1938–1940. “Shawnee Stems and the Jacob P. Dunn Miami Dictionary.” Indiana Historical Society Prehistory Research Series 1: 63–108, 135–167, 289–323, 345–406, 409–478.
Whalen, D. H., M. Moss, and D. Baldwin. 2016. “Healing Through Language: Positive Physical Health Effects of Indigenous Language Use.” F1000Research 2016 5: 852. doi:10.12688/f1000research.8656.1.

Chapter 25

Language Revitalization in Kindergarten: A Case Study of Truku Seediq Language Immersion

Apay Ai-yu Tang

1. Introduction

In 2014, thirty indigenous language immersion kindergarten programs were implemented in four regions of Taiwan.¹ The project, which will run until 2019, is an initiative of Taiwan’s Council of Indigenous People (CIP) and is administered by the government. Can a government-based language immersion program contribute to stemming indigenous language erosion and reversing a critical shift toward dominant languages? Can preschoolers be the agents of language revitalization?

¹ I am grateful to all the participants, the indigenous language teachers and other leaders at the kindergartens, and my colleagues in this project for their participation and discussion. This chapter presents preliminary results obtained with the support of the CIP (grant no. 104088). I am solely responsible for any remaining errors.

A micro-language planning approach at the grassroots level has often been employed to address social problems that involve language. However, more recent approaches view macro-language planning at the government level as essential in a multilingual and multicultural context such as Taiwan (Jang 2007; Chen 2011). In fact, Hornberger (2008) claims that educational institutions such as schools have a powerful role to play in indigenous language revitalization and the empowerment of indigenous communities. Such approaches are applicable in linguistic situations such as that in Taiwan, where the dominant language (Mandarin Chinese) is mandatory in all domains, most grassroots efforts toward language revitalization are transient, family members who are willing and able to speak and transmit indigenous languages (ILs) to the younger generations may not be available, and indigenous people may be ambivalent about their language, identity, and ethnicity (Chen 2006). Macro-level language planning considers government involvement to be necessary in indigenous language maintenance. More and more researchers agree that heritage language immersion programs are the most effective strategy for revitalizing a language, and that immersion teaching that begins in preschool is most likely to be successful in an endangered-language setting (Hinton and Hale 2001; Hornberger 2008; Chou 2011, among others).

Having grown up as a Truku Seediq (hereafter, Truku) in multilingual and multicultural Taiwan, and having witnessed firsthand the ongoing decline of the indigenous language in all domains, I have felt a compelling need for effective strategies to help stem further weakening of the endangered languages on this island.² I have been involved in diverse projects to assess Truku’s status, to document the language, and to revitalize its use in Truku communities since 2012. My own experience urges me to ask how Taiwan’s indigenous people can have access to social, economic, and educational opportunities through Mandarin and English yet still maintain their indigenous language and ethnolinguistic identity. By extension, I ask what type of government-based initiative would be appropriate and effective to reverse the critical shift from the indigenous languages to Mandarin. This chapter contributes to research on essential issues of language endangerment and revitalization with its study of three Truku immersion kindergartens that are part of the government-based indigenous language revitalization project initiated by the CIP.

² My previous research clearly indicates that Truku is undergoing an intergenerational shift to Mandarin. In other words, there is an overt decline in language use between and within generations (Tang 2011).

Since November 2015, I have been serving as the kindergarten program’s regional collaborator in the Hualien/northeast region.³ I believe that an ideal indigenous preschool program would strengthen the children’s linguistic and cultural identities through the immersion experience, so that they will be empowered to take the risks necessary to support their language choices and imagined future. Conducted midway through the time period allocated to the experimental kindergarten immersion project, this study reports on the state of the program at these three kindergartens to identify problems that can still be addressed to make this program, as well as future projects, more effective. It is urgent to take a multi-component view of the ongoing language shift and language attrition. If Truku and other Formosan languages are to survive another generation, we cannot rely solely on bottom-up grassroots efforts that require a long-term struggle to overcome a lack of resources, including financial support, qualified teachers, participants, and pedagogical materials. It is therefore absolutely necessary to explore the potential of national language planning that integrates ethnolinguistic, national, and global identities in a multilingual context.

³ The previous regional collaborators of the Hualien/northeast region from September 2014 to August 2015 moved to another area of the island and resigned.

I briefly explain the motivation of the study in section 2. In section 3, immersion teaching models and the key elements of an immersion program in language revitalization are briefly discussed. I describe the research design and methods of the study in section 4. The findings are presented in section 5, followed by a discussion in section 6. Section 7 provides brief concluding remarks.

2. Motivation

The motivation of this study is fourfold. First, little scholarly attention has been given to the assessment of specific implementations of indigenous language revitalization projects emerging from government-level policy and planning in Taiwan. Previous studies provide overall reviews of the national language policy (Chen 2010), discuss governmental indigenous language teachers’ training programs in the past ten years (Huang 2011), and describe governmental language revitalization strategies in the past, at present, and for the future (Huang 2014). However, as the CIP (2013) points out, continuous examination and evaluation of each specific governmental project for indigenous language revitalization is indispensable in Taiwan, where the multilingual and multicultural context makes each community’s language situation unique.

Second, relatively little attention has been paid to assessing the implementation of indigenous language teaching programs in Taiwan, especially the current immersion program in the kindergartens. Some past research on endangered indigenous language teaching has focused on elementary and/or junior high schools. Tan (2008) examined the indigenous teaching methods, environment, and pedagogical materials at a Paiwan elementary school in the southern part of Taiwan, and provided constructive suggestions for those who teach local languages including Hakka, Southern Min, and indigenous languages. To gain a better understanding of effective language teaching, Zhang (2009) explored current learning adaptations and native language learning attitudes of indigenous junior high school students. The National Academy for Educational Research (2011) examined indigenous language teaching in both elementary and junior high schools. The Academy reported problems in all of the aspects it considered, including administrative support, teachers’ qualifications, pedagogical materials, and parents’ participation, and provided some suggestions for the adjustment of current policies and practices in indigenous language teaching. While all of these studies are generally beneficial for improving our understanding of effective indigenous language revitalization strategies, more focused assessment of individual projects is necessary, as the effects of specific revitalization efforts vary greatly, according to the First Peoples’ Cultural Council (FPCC 2014). In particular, there has been little assessment of Taiwan’s current kindergarten immersion language program at this point, except for one study by Chou (2015), which recorded and analyzed the processes and outcomes of an experimental class at a Paiwan tribe’s kindergarten.

Third, endeavors that position preschoolers as the key change agents in (reversing) indigenous language shift, as suggested by McIvor (2006, 4), have mainly been limited to French Canada, New Zealand, and Hawai‘i. Nevertheless, early childhood, as Fishman (1996) and Hinton (2001) have pointed out, has long been considered the best time for language learning. Similarly, the Royal Commission on Aboriginal Peoples (1996, 447) stated that “young children absorb information at a greater rate than at any other stage of life.” Immersion language teaching from a young age has the most successful record of contributing to language preservation (Kamana and Wilson 1996; Hinton and Hale 2001). At the stage of language shift that Taiwan is currently experiencing, the key agents must be young learners. In reality, however, the lack of consensus among language policymakers, kindergartens, and family members regarding government-based efforts to revitalize languages has been a major challenge among indigenous speech communities in Taiwan. Therefore, it is worthwhile to investigate the role of preschoolers in indigenous language revitalization efforts at the governmental level.

Last but not least, the baseline results of the current study can serve as a point of comparison for further assessing Truku language skills, and as a starting point for developing immersion language programs in the future. Furthermore, the results lead to several suggestions for addressing problems that are hindering the effectiveness of the ongoing immersion language program in the indigenous kindergartens.

3. Theoretical orientation

I organize the theoretical basis for language revitalization in kindergarten around three perspectives: endangered-language shift and revitalization, macro-language planning and language-in-education policy, and language immersion programs.

3.1. Endangered-language shift and revitalization

Language shift involves loss of functional aspects of a given language (i.e., change in language use), and can occur at the macro or community level. In Taiwan, language shift to Mandarin leading to indigenous language death is currently rampant among the indigenous speech communities. In other words, indigenous languages have been in a subordinate position while Mandarin has had privileged status for the past century. This is mainly due to the early sinicization of the indigenous groups, especially those who lived in lowland areas; governmental policy imposing Mandarin Chinese as the only official language; lack of intergenerational transmission in communities where the indigenous languages are still spoken; and the emigration of younger villagers to neighboring towns (Zeitoun, Yu, and Weng 2003).⁴ In addition, indigenous languages are not used as the medium of instruction in schools, so many people feel that learning indigenous languages brings little or no future benefit. These ongoing changes have brought about a decrease in the domains of use of indigenous languages like Truku, a reduction in the number of speakers, and the interruption of intergenerational transmission, that is, the weakening of three factors that are crucial for a language to survive (Tang 2014). Language revitalization attempts to stem further endangered-language erosion and shift toward the dominant language; it refers to a process of re-awakening language. It involves language documentation as well (Penfield and Tucker 2011, 292), because documentation can provide communities access to language data for the purpose of language revitalization.

⁴ This paragraph draws on my previous research (Tang 2011).

3.2. Macro-language planning and language-in-education policy

Language planning (LP) refers to “deliberate efforts to influence the behavior of others with respect to the acquisition, structure, or functional allocation of their language code” (Cooper 1989, 45). There are two levels of LP: macro and micro. The former describes language planning taking place at the governmental level, and the latter describes planning that occurs at the local level or in interpersonal communication (Kaplan and Baldauf 1997; Ricento 2006). This study emphasizes macro-LP, which includes issues related to indigenous language policy development, the orientation of future indigenous education, the implementation of language policies, and formal language learning practices, because this is the level of LP that can potentially relieve the multifaceted pressures on the endangered languages of Taiwan at this crucial juncture. It is important to note, however, that to achieve the integration of ethnolinguistic, national, and global identities in multilingual and multicultural contexts, it is essential both to engage in macro-level language planning and to ensure that it is realized in micro-level practices.

Many scholars draw attention to the role of educational institutions in the implementation of macro-level planning and the resulting language education policy (LEP; e.g., Auerbach 2000; Spolsky and Shohamy 2001, among others). Several offer critical views. For example, Shohamy (2006) considered LEP a form of imposition and manipulation used by those in centralized educational systems. She pointed out that language policy is an attempt to create a set of authoritative principles regarding language behaviors, and that LEP functions as a mechanism to create de facto language practices in educational institutions. LEP is a powerful tool, as it can create and impose language behaviors in a system in which participation is compulsory for all children; it can determine the priority of certain languages in society and how these languages should be used, taught, and learned. Shohamy (2006) further observed that LEP may also include a decision to teach or not teach foreign/second languages that are used as heritage, community, immigrant, and indigenous languages. Indigenous language revitalization, as Hornberger (2008) also emphasized, is subject to the vagaries of policy, politics, and power, as well as to the economics of the linguistic marketplace. Nevertheless, schools do have an inevitable and important role to play throughout the process. LEP in macro-level planning ensures that educational institutions in general play a significant role in long-term language maintenance. In addition, grassroots efforts alone cannot match the potential capacity of macro-level LEP, as an agent in a centralized educational system, to offer ways to relieve the multifaceted pressures on a language and to provide opportunities to boost speakers’ use and confidence.

3.3. Language immersion programs

Hornberger (2006) and McCarty (2006) argued that the activation of indigenous voices through the use of indigenous languages in schools can be a vital force for enhancing indigenous children’s learning and promoting the maintenance and revitalization of their languages. Both authors claimed that schools have a powerful role to play in indigenous language revitalization and the empowerment of indigenous communities. The question raised by such claims is how schools can be most effective in this role. The first and foremost model for young children’s immersion schools is “Te Kōhanga Reo,” the Māori “language nests” that have achieved considerable success (Hornberger 2008). The FPCC (2014) defines a language nest as a program for children from birth to 5 years old where they are immersed in their First Nation language. The following subsections discuss the definition, characteristics, types, and key elements of language immersion according to the FPCC.

3.3.1. Definition, characteristics, and types

Language immersion is a method of teaching language, usually a second language (L2), in which the target language is both curriculum content and medium of instruction. A successful language immersion program shares at least three common characteristics: the use of full immersion to pass the language on to the children, the incorporation of the culture into all aspects of the program, and the involvement of family members and elders in the day-to-day activities. There are three main types of language immersion: (1) total immersion, in which almost 100% of the school day is spent in the L2, meaning that almost all subjects will be taught in the L2; (2) partial immersion, in which only some (usually around half) of the class time is spent in the target language; and (3) two-way immersion, which is designed to “integrate language minority students and language majority students in the same classroom with the goal of academic excellence and bilingual proficiency for both student groups” (Christian 1997, 9).

It has been suggested that to rejuvenate an indigenous language, the preschool-age members of the language community must experience language immersion. In other words, from the time the children first enter school, at least 50% of class instruction and communication must take place through the medium of the indigenous language, and the students should learn the school subjects via their mother tongue (Freeman 2004). In the case of the current kindergarten programs, however, this goal would be difficult to achieve, mainly because of the lack of qualified teachers proficient in the language and/or their anxiety about students’ being unable to understand the content. Thus, the Truku kindergartens in this study do not yet meet the standards of even partial immersion.

3.3.2. Key elements of language immersion teaching

Elements necessary in a language immersion (LI) program include qualified teachers, appropriate pedagogical materials, immersion pedagogy, a sufficient amount of immersion, administrative support, and family/community participation.

First, the existence of teachers who are qualified to teach in the target language is one of the key factors in successful LI. The lack of teachers who are both bilingually proficient and trained in appropriate teaching strategies is widely considered one of the main challenges in starting an LI program (FPCC 2014). As May, Hill, and Tiakiwai (2004) discussed, a widespread problem is the lack of proper pre-service and in-service training for teachers of indigenous languages. For example, Hirvonen (2004) described a new Saami school in which neither school leaders nor teachers had the systematic training needed to fulfill the school’s educational objectives. According to Pease-Pretty on Top (2003), a Māori teacher in the Te Kōhanga Reo needs to be trained in LI teaching, child development, class management, community collaboration, and so forth for about three years. Such teacher training is beyond the reach of most indigenous language revitalization efforts.

Second, pedagogical materials are indispensable in an immersion classroom. Chou (2011) emphasized that indigenous-culture-based curricula and materials not only enable learners to learn their own ethnic tradition but also increase their understanding of their personal identity and build their confidence.

Another important element in an LI classroom is the utilization of multiple teaching methods (Chou 2015). Livaccari (2012) summarized the top five immersion teaching skills as follows: (1) use visuals, gestures, body language, expressions, modeling, and movement to complement verbal cues; (2) motivate students to stay in the target language; (3) ask open-ended questions; (4) regularly assess students’ comprehension and skills development; (5) think strategically about the various types of student interactions and how to vary them, promoting a dynamic learning environment.

Next, to enable children’s acquisition and development of conversational competence, at least 500 learning hours are needed in the target language (Hinton 1994; Eaton 2010). Greymorning (1997) reported a correlation between the length of immersion exposure and improvement of language proficiency. Nevertheless, it is not necessarily the case that effective communication skills can be acquired merely by long exposure to immersion teaching.

Finally, both administrative support and family/community participation are crucial factors in an LI program. The administrative side includes financial accounting; parent-school communication; reporting; cleaning, organization, and planning of physical space/equipment; and child-care duties (FPCC 2014). In addition, it is widely acknowledged that family and community participation should be not merely peripheral to an LI project but intrinsic elements of the LI curriculum (Hornberger 2008).

4. Research design and methods

Before I discuss the three Truku kindergartens in detail, I briefly describe the Truku speech community in section 4.1. The current government-based indigenous language immersion program and the three kindergartens are presented in section 4.2. The current study’s research methods are explained in section 4.3.

4.1. The Truku speech community

Seediq is an Austronesian language spoken by the indigenous groups who live in the northeastern part of Taiwan.⁵ There are three major dialects, Teuda, Tkdaya, and Truku. The Truku population is around 29,410, but not all of these people are fluent speakers, and the youngsters do not speak Truku. According to Krauss’s (2007) classification of degrees of language endangerment, Truku is a definitely endangered language, because it is spoken only by the parental generation and older. Similarly, on Fishman’s (1991) Graded Intergenerational Disruption Scale (GIDS), Truku is between stages 7 and 8, indicating that parents are not passing the language on to their children, and the only remaining fluent speakers of the language are members of the grandparent generation. Nowadays, the majority of the young adults work outside the village, and many villagers tend to use Mandarin in various domains, which helps them to better access socioeconomic resources. However, these changes lead to frequent contact with non-Truku speaking communities and limit the domains of Truku use.

⁵ This section draws on my previous research (Tang 2014).

4.2. Government-based indigenous language immersion programs

The current government-based indigenous language revitalization project initiated by the CIP runs from December 2013 to 2019. The indigenous language immersion program (ILIP) discussed in this chapter, mainly implemented by Kun Shan University in Tainan, in the southern part of Taiwan, is one of the five main projects of this phase. The first year of the ILIP, which was initially implemented at thirty kindergartens, began in September 2014 and ran until August 2015. At the time of this writing, the second school year of the ILIP is in progress; it began in November 2015 and will end in September 2016. It was implemented at twenty-three kindergartens in four different regions (northern, Hualien/northeast, eastern, and southern), with twenty-three trained teachers to run the kindergartens’ experimental classes.⁶ Three of these are Truku kindergartens in Hualien, in the northeast. Their names are An-de, Bsuring, and Miharasi (see section 4.3.1). The goals and activities of the ILIP are outlined in section 4.2.1.

4.2.1. Goals and activities of the ILIP

The five goals of the government-based ILIP are: (1) to form an advisory group that visits and assists the ILIP schools regularly; (2) to help negotiate or solve problems between teachers and their kindergartens; (3) to assist the kindergartens to develop the pedagogical materials that best suit the learners; (4) to empower the language teachers with skills in teaching; and (5) to hold an annual evaluation to ensure the quality of the ILIP kindergartens. To work toward these goals, the project has focused on five major realms of activity: (1) indigenous epistemological and culture-based language classes (see section 4.2.1.1), (2) bimonthly teachers’ empowerment workshops (see section 4.2.1.2), (3) documentation of teaching processes and activities online (see section 4.2.1.3), (4) advisory visits and evaluations (see section 4.2.1.4), and (5) development of pedagogical materials (see section 4.2.1.5).

⁶ These language revitalization projects are sponsored by the Education and Culture Department of the CIP (http://www.apc.gov.tw/portal/docDetail.html?CID=F6F47C22D1435F95&DID=0C3331F0EBD318C25CFD1AF05A8D00DF). This specific ILIP is implemented by a team led by Dr. Hsuan-Chen Chou with two administrative assistants at Kun Shan University. There are four collaborators, one in each region; the author is the collaborator in the Hualien/northeast region.

4.2.1.1. Indigenous epistemological and culture-based language classes

As noted earlier, the process of teaching preschoolers their own ethnic traditions including language can have positive effects on their identity and confidence. The program’s hope is that the learners will be exposed to their native tongues for at least two hours a day, and that the theme of the courses will incorporate their real-life experiences and their own traditional cultures. To this end, the program requires the immersion language teacher (ILT) in each kindergarten to (1) employ different teaching strategies that are diverse and congruent with traditional culture such as learning by doing (i.e., learning language through chanting, planting, observing traditional practices, and so forth); (2) use pedagogical materials that are based on indigenous epistemology and designed by ILTs and advisors⁷; (3) collaborate with one associate teacher who understands the significance of indigenous language immersion teaching and gives support by teaching the preschoolers basic cognitive concepts in Chinese beforehand to ease the pressure of learning their indigenous language (Chou 2015); and (4) create a culture-based environment and identity-oriented atmosphere through decoration and interaction in the classroom. For instance, the teachers could decorate their classrooms with Truku symbols from domains such as hunting, weaving, traditional clothing, and so forth to enhance the learners’ identification with their ethnic group.

4.2.1.2. Bimonthly teachers’ empowerment workshop

To offer professional knowledge that sharpens the ILTs’ skills in their own indigenous language and to help them to develop their pedagogical materials and teaching strategies, each region’s leader holds bimonthly teachers’ empowerment workshops that the ILTs are required to attend. The topics of the workshops include (1) developing pedagogical materials and improving communication and collaboration between the ILTs and their associate teachers, (2) evaluating the previous pedagogical materials and creating ways of mobilizing family and community participation, (3) designing better models for listening and speaking skills as well as indigenous language teaching strategies, (4) discussing effective ways of assessing preschoolers’ language, and (5) discussing the themes and pedagogical materials that can be used for the next semester. There are five workshops annually.

4.2.1.3. Documenting teaching processes and activities online

ILTs are required to upload documentation of their teaching processes and related activities online to build their own “archive.” The contents include three items. The first is recent news about the kindergarten; all recent news about the ILIP and notifications from the main supervision center/Kun Shan University can be uploaded as an album or document on the main ILIP website.⁸ The ILT in the kindergarten needs to upload at least two pieces of related news each month, which are reviewed monthly by the regional collaborative team. The second is an introduction to the kindergarten including its historical development, characteristics of the school, the ethnic group, its organization, students’ and parents’ backgrounds, community volunteers, and cultural resources in the community. The third is a teaching log or record, which mainly contains weekly teaching activities, mobilizing team meetings, and suggestions or advice from the advisory team. In addition, other files to be uploaded in this category include family visitation sheets, students’ learning progress sheets, teachers’ files, and facility or property lists, as well as documents that record any collaborative activities of the school and community.

⁷ The ILTs are recruited by the CIP about eight or nine months prior to each yearly term, and they need to have an officially recognized qualification as a care educator. Before the new semester starts, all the language teachers from the twenty-three indigenous kindergartens are gathered and assisted to design their new themes and course for the whole semester with the project leaders, who either have knowledge of immersion teaching or experience in ILIP.

⁸ The collective site for all ILIPs and all records can be seen at http://kindergarten.klokah.tw/; however, part of the information can only be accessed by the ILTs and related administrators.

4.2.1.4. Advisory visits and evaluations

To keep the immersion teaching and administration processes operating well, each kindergarten/ILT is assigned one advisor who has adequate capacity in the specific indigenous language, curriculum design, and negotiating administrative work.⁹ The advisor pays a visit to each designated kindergarten every two months to observe the following items: (1) overall teaching, paying special attention to the hours of using the indigenous language and interaction between teachers and students as well as between the teachers; (2) the teaching logs mentioned above; (3) collaboration among ILTs, associate teachers, principals of the kindergartens, and mobile teams; and (4) related administrative work. The advisor can then offer advice or assistance on the spot in accordance with his or her observations and report back to the main supervision center. In addition, the advisor is encouraged to report any problems or negotiate unresolved issues at any time.

4.2.1.5. Developing pedagogical materials

As noted earlier, creating and developing culture-based pedagogical materials with indigenous epistemology in the ILIP curriculum is desirable. Therefore, the supervision team and all ILTs work collaboratively to develop and produce pedagogical materials that are suitable for each kindergarten. To this end, four tasks are to be completed each year: (1) determining the themes of the semester and writing culture-based lesson plans; (2) designing indigenous language learning sheets, which are compiled in a book, in accordance with the themes of the course; (3) designing at least five different learning tools that relate to the themes and can be employed in the classroom; and (4) designing a book that can be used at home by the preschoolers and their parents such as a storybook, interactive activity book, or collaborative learning pamphlet.10

4.3. Methods

The following collaborative methods of collecting data allow the general evaluation of teaching as well as learning proficiency, pedagogical materials, and administration work. The kindergartens and participants are briefly described in section 4.3.1, and the design and procedure are stated in section 4.3.2.

9 There are two advisors for the three Truku kindergartens; both are principals at the public elementary schools where two of the kindergartens are located.

10 These materials are examined before they are allowed to be uploaded to the collective website mentioned above, and they are submitted to the CIP subsequently.


4.3.1. Kindergartens and participants

The three Truku kindergartens are located in three different villages: An-de Kindergarten is in Qowgan Village, Bsuring Kindergarten is in Hsiulin Village in the northern part of Hualien County, and Miharasi Kindergarten is in Miharasi Village in the southern part of Hualien County. An-de is a private kindergarten, while Bsuring and Miharasi are public and are physically embedded in public elementary schools. At An-de Kindergarten, fifteen of fifty-five preschoolers are in the ILIP; fourteen are Truku and one is Amis. The teachers involved in this project include the principal, one indigenous language teacher, and her associate teacher, who is Han Chinese.11 At Bsuring Kindergarten, nineteen of thirty-three preschoolers take part in the ILIP: thirteen Truku, four Amis, and two Atayal learners. One principal, three Han teachers, and one indigenous language teacher are involved in the ILIP at Bsuring.12 At Miharasi Kindergarten, seventeen Truku preschoolers participate in the ILIP. The teachers involved in the ILIP include the principal of the kindergarten (Han Chinese), one associate teacher (Han Chinese), and one indigenous language teacher (Truku).13 Most of the preschoolers live and have grown up in the village where their school is located.

4.3.2. Design and procedure

Collaborative methods of collecting data, including focus group interviews, participatory observation, and advisory visits, were employed. The students’ proficiency test results were also collected at the beginning and end of the program. Except for the proficiency tests, which were held twice in total for each student participant, the data were collected at the three kindergartens (An-de, Bsuring, and Miharasi) in an eight-month period from October 2015 to June 2016 by the following means: (1) conducting four focus group interviews, which took place at the teachers’ empowerment workshops (see section 5.1); (2) examining the work logs that the teachers posted online (see section 5.2); and (3) making four advisory visits to each of the classrooms (see section 5.3). In addition, (4) the students’ proficiency test results from the beginning and end of one school year were collected (see section 5.4).

11 An-de Kindergarten: http://kindergarten.klokah.tw/about/index.php?kid=17.

12 Bsuring Kindergarten: http://kindergarten.klokah.tw/about/index.php?kid=19.

13 Miharasi Kindergarten: http://kindergarten.klokah.tw/about/index.php?kid=18.

5. Findings

To some extent, the current government-based immersion language revitalization initiative seems to be contributing to the goal of stemming further indigenous language erosion. The students show significant improvement in their language proficiency. Yet several obstacles hinder the programs from being as effective as they could be. The findings presented here are drawn from the focus group interviews with the ILTs, the observations of the collaborator/author, the advisory visits from the regional advisors, and the students’ proficiency tests, each of which is discussed in the following sections.

5.1. Focus group interviews

The bimonthly teachers’ empowerment workshop has already been held four times. Each one lasts about eight hours. A total of ten to twelve participants, including the principals, ILTs, and associate teachers from the three kindergartens, were asked to give feedback and to discuss and exchange ideas toward the end of every workshop. The main purpose of these focus group interviews was to gain a better understanding of their teaching experience in the ILIP. They were asked to report any responses to the main tasks they had implemented over the past two months, including culture-based language classes, documenting teaching processes and activities online, advisory visits, and developing pedagogical materials. All participants’ responses were recorded and subsequently reported to the supervision center. Two topics were mentioned by almost every ILT in the three kindergartens: insufficient assistance for developing pedagogical materials, and the heavy workload of administration. In addition, participants revealed a nuanced view toward the ILIP as a whole. One ILT found it harder to implement the ILIP fully if the kindergarten had been adopting other educational approaches like Montessori or if the leaders did not really stress indigenous language immersion teaching. Another hesitated to use the indigenous language throughout the designated classes, and said “the students could not understand the contents of what they have been taught if I spoke Truku all the time.” Still another expressed the difficulty of collaborating with parents or the community on collective activities related to indigenous language learning.

5.2. Observations

I focus on four main observations regarding the ILIP at these three kindergartens.14 First, although it is expected that the hours of immersive indigenous language teaching be gradually increased, each ILT used less than two hours of Truku a day, either because of limited proficiency or because of anxiety that the students would be unable to understand the content, as demonstrated by the exchange in (1), which took place after a classroom observation.

(1) Advisor: It seems you did not use too much Truku during the class time just now.
    ILT: (in a tone of intense anxiety) Well, if I use Truku all the time, I might lose the kids’ motivation, and they might not be able to listen to me.

14 The observations reported in this section are derived from the author’s conversations, participant observation, meeting notes, or interviews from October 2015 to the present.

In addition, this two-hour “immersive teaching” was divided into two periods of class time; one was IL teaching of lexical items related to a cultural theme, and the other was called “cultural corner time,” in which the preschoolers could choose a spot in the classroom to learn about a specific cultural topic, such as clothing or weaving, by touching the real traditional clothes or machines, playing related games, or using visual and audio learning tools provided by the CIP called “Point-Read-Pen.”15 This arrangement suggests that actual indigenous language exposure at this ILIP is still insufficient. Truku accounts for only about 20% to 30% of classroom language use, against about 70% for Mandarin Chinese.

The next observation is relevant to pedagogical material development. Although all ILTs took part in the three-day workshop and had been assisted to develop culture-based pedagogical materials for the whole semester, two ILTs expressed that making a teaching plan for the whole semester at one time was rather a difficult task for them, as shown in the interaction in (2), which took place between the regional collaborator and the two ILTs during a teachers’ empowerment workshop.

(2) Collaborator: Is there anything you’d like to share about the pedagogical material development?
    ILT 1: Honestly, I am not used to doing my teaching plan for the whole semester.
    ILT 2: Yeah, it is difficult for me to do so too. Why don’t we plan our lessons little by little? It takes much time and effort to plan them all in one shot.

This example seems to reflect that developing culture-based pedagogical materials in the ILIP is desirable but not an easy task; collecting information via fieldwork, reliable websites, or other resources to have a deeper understanding of a certain culture requires much effort and support.

Another observation had to do with the ILTs’ teaching strategies. Although teaching by immersion is desirable, many second-language teaching strategies such as Total Physical Response were employed by the ILTs throughout the classes. Furthermore, three ILTs used Truku as the main medium of instruction to teach lexical items in games, pictures, songs, or dances, but conversations were rarely heard among the learners. The written learning tools like word cards were usually presented in both Truku and Chinese.

15 This learning tool was published, and it started to be distributed to the kindergartens from 2015.


5.3. Advisory visits

Two advisors have visited their respective kindergartens two to four times. Advisor 1 has visited An-de Kindergarten twice and Bsuring Kindergarten four times; Advisor 2 has visited Miharasi twice, and has asked to visit Ihunang Kindergarten for the remaining two visits.16 According to their bimonthly reports, the main observations from these two advisors during their visits can be summarized as follows. First, the heavy load of administrative work influences the ILTs’ quality of teaching. Second, the cultures of other indigenous groups nearby should be included in the themes of the curriculum so the students learn about the differences among indigenous groups in Taiwan. Next, the children’s exposure to immersion teaching needs to be increased, and the other members involved in the ILIP should learn basic Truku oral skills such as greetings or instructions in the classroom and be encouraged to participate in workshops that highlight multicultural education or culturally responsive teaching. Last, the advisors suggested that the ILTs’ native language skills could be strengthened, and the ILTs could be encouraged to create an atmosphere that would raise children’s motivation to speak in Truku. In addition to the teaching, both advisors emphasized the importance of effective communication among the CIP, the supervision center, the regional collaborators, the ILTs, and the other teammates within these three kindergartens.

5.4. Proficiency tests

A proficiency test designed to assess the effectiveness of the ILIP is administered to the preschoolers at the beginning and end of the program each school year. This section’s discussion is based on the tests given at the beginning (the pre-test) and the end (the post-test) of the 2014–2015 school year. The materials, procedures, and preliminary results of both tests are briefly described here. With regard to the materials, both tests were designed by the supervision center of the ILIP together with the ILTs, and were based on the Thousand Word List provided by the CIP, with additional phrases and sentences commonly used in daily life. However, there are some differences between the tests. The pre-test has 100 items; the post-test has 184 items, including the 100 items on the pre-test, 68 items with picture examinations, 7 items with situational pictures, and 9 dialogues from daily life.17

16 The number of visitations is different for the different kindergartens due to job transfers among the ILTs. The ILT at Miharasi passed the examination as a public preschool educator, so she was required to leave Miharasi Kindergarten to serve at another Truku indigenous preschool near her hometown, called Ihunang Kindergarten, in February 2016. Subsequently, the ILT at An-de was required to transfer to Miharasi, which was near her hometown.

17 Each ILT made his or her own 175 picture cards for the final assessment, covering every item except the last nine questions in the post-test (daily life dialogues to test listening skill). All the IL words on the testing cards were examined and revised by elders proficient in the IL and capable in orthography before being used in the tests.

Two skills, listening and speaking, were tested in both pre- and post-tests. In the post-test, listening skill was tested for all 184 items, but speaking skill was tested for only 168 of the 184 items, with the situational pictures and dialogues excluded. Another difference between the tests is that the post-test materials were carefully examined by linguists who specialize in ILs in Taiwan. The listening test was given first, one week before the speaking test. With respect to procedure, the ILT sat side by side with one student at a time, encouraged him or her in Mandarin Chinese, explained and practiced test items, and chose cards. Each card had an item to which the student either listened (listening test) or responded by producing a word (speaking test). When the test was completed, the ILT praised the child and gave him/her a gift. The tests are supposed to take less than twenty-five minutes per student. Finally, the ILT computed the scores for each child’s test, and the scores were sent to the supervision center for further statistical analysis. The supervision center’s analysis of the results from the pre- and post-tests showed that the preschoolers’ listening and speaking skills both improved after a year of ILIP at all three kindergartens (see Tables 25.1 and 25.2).

Table 25.1 Truku listening test: Percentages of passing rates at three kindergartens (a)

| Classes | An-de # | An-de Pre- | An-de # | An-de Post- | Bsuring # | Bsuring Pre- | Bsuring # | Bsuring Post- | Miharasi # | Miharasi Pre- | Miharasi # | Miharasi Post- |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Junior | 2 | 0.27 | 2 | 0.91 | 9 | 0.65 | 11 | 1.32 | 4 | 0.38 | 4 | 0.38 |
| Middle | 3 | 0.18 | 3 | 0.96 | − | − | − | − | 3 | 0.45 | 4 | 1.17 |
| Senior | 11 | 0.37 | 11 | 1.26 | − | − | − | − | 9 | 0.43 | 9 | 2.3 |

(a) Adapted from the supervision center in 2016. “Pre-” stands for “pre-test”; “Post-” stands for “post-test.”

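Table 25.2 Truku speaking test: Percentages of passing rates at three kindergartens (a)

| Classes | An-de # | An-de Pre- | An-de # | An-de Post- | Bsuring # | Bsuring Pre- | Bsuring # | Bsuring Post- | Miharasi # | Miharasi Pre- | Miharasi # | Miharasi Post- |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Junior | 2 | 0.23 | 2 | 2.4 | 9 | 0.39 | 11 | 1.08 | 4 | 0.4 | 4 | 1 |
| Senior | 11 | 0.37 | 11 | 1.08 | − | − | − | − | 9 | 0.43 | 9 | 1.9 |

(a) Adapted from the supervision center in 2016. “Pre-” stands for “pre-test”; “Post-” stands for “post-test.”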


6. Discussion

This study’s preliminary results point to six main obstacles to the effectiveness of this government-based immersion program in kindergartens for language revitalization. The first is the lack of qualified teachers. Although the ILTs at these kindergartens are enthusiastic about transmitting Truku, they are still striving to become proficient both at speaking bilingually and at teaching effectively in the immersion classroom. The second issue relates to the lack of Truku immersion exposure in general. Less than two hours of Truku exposure a day, together with its limited use (i.e., mostly as individual words), cannot enable children’s acquisition and development of conversational competence, which is one of the key factors in transmitting a language. Both of these problems could be partly addressed by doing more to include the community’s elders as collaborators in ILIP, which could help to strengthen the ILTs’ proficiency in the IL, help provide the children with more opportunities to hear and take part in conversation, and furthermore, contribute to transmitting traditional Truku values and culture. The third point is related to the ILTs’ utilization of multiple teaching methods. Although they strive to use multimedia, help students to stay in Truku, and think strategically in their teaching, they find it challenging to maintain a supportive and stable language learning environment with the resources currently available to them. Fourth, the feedback from the advisory visits and the responses from the focus group interviews suggest that the current administrative workload is quite heavy, to the extent that it affects teaching quality. The data from observations also suggest that communication among the team is not yet fully open, and one of the kindergartens does not seem to have the full administrative support of its school.
The fifth obstacle to the current ILIP’s full effectiveness is the need to continuously develop effective pedagogical materials and proficiency tests. The ILTs struggled to develop the culture-based pedagogical materials that are best suited for the indigenous preschoolers, and they found the implementation of the final proficiency tests stressful. One factor in the difficulty posed by the tests is the small likelihood of completing all 184 testing items with a kindergartener in a single twenty-five-minute session, the length of which also might affect the student’s ability to fully concentrate throughout the process. Finally, the interviews and observations suggest that family and community participation is minimal, with families and other community members taking a passive role when they do participate. Although all ILTs and their teammates consider that collaboration is essential to the success of a preschool immersion classroom, more attention needs to be paid to helping the families and communities understand the significance of language revitalization and the value of their active participation in the ILIP.


7. Conclusion

In the context of Taiwan, grassroots efforts have faced a continuous long-term struggle for resources, as well as the inability or unwillingness of parents or elders in indigenous language communities to effectively implement language interventions aimed at language revitalization. Meanwhile, the use of the dominant language, Mandarin Chinese, continues to increase, while the indigenous languages continue to erode and decline. Languages like Truku are at a critical juncture, because while fluent speakers still remain, almost all of them belong to the elder generation, and the languages are no longer being transmitted to the younger generations. Hence, government-based language revitalization initiatives can play a very significant role in helping to stem further indigenous language erosion at this crucial moment. However, they require careful attention to several specific steps if they are to be effective. Specifically, it is suggested that the following changes would contribute to the effectiveness of the government-based immersion program in kindergartens for the purpose of indigenous language revitalization: (1) increasing the hours of immersion and co-teaching with elders, (2) increasing the number of indigenous teachers and strengthening their proficiency both in indigenous languages and in culture-based teaching, (3) encouraging better communication in the systems of administration, (4) enhancing collaboration with families and communities, and (5) developing effective pedagogical materials as well as proficiency tests for the indigenous preschoolers. However, it is too simplistic to suggest that effective language revitalization can be entirely dependent on top-down policies any more than it can be completely based on bottom-up strategies.
Rather, at this critical stage of indigenous language endangerment, revitalization efforts must be both top-down and bottom-up; both the government and the local community must be engaged in order to motivate language immersion and acquisition, especially at the early stages of life. This study can only be a very modest beginning in addressing the issues accompanying government-based language revitalization initiatives in an indigenous setting. The small numbers of qualified teachers, the inability of the programs to provide sufficient hours of immersion, the lack of appropriate teaching methodologies and pedagogical materials with indigenous epistemology (approaches to learning and knowing), and the difficulty of inspiring and maintaining family/community participation are significant problems whose solutions remain to be found through deeper investigations.

References

Auerbach, Elsa. 2000. “When Pedagogy Meets Politics: Challenging English Only in Adult Education.” In Language Ideologies: Critical Perspectives on the Official English Movement, edited by Roseann D. Gonzales, 177–204. Mahwah, NJ: Lawrence Erlbaum.
Chen, Chih-lieh. 2011. “Study on Indigenous Teacher Preparation Courses.” Paper presented at the Indigenous Education Conference, Council of Indigenous People and Ministry of Education, Taipei, Taiwan. [In Chinese]
Chen, Su-chiao. 2006. “Simultaneous Promotion of Indigenization and Internationalization: New Language-in-Education Policy in Taiwan.” Language and Education: An International Journal 20(4): 322–337.
Chen, Su-chiao. 2010. “Multilingualism in Taiwan.” International Journal of the Sociology of Language 205: 79–104.
Chou, Hsuan-Chen. 2011. “Practice of Localization Education: Case Study of the Tribe Language and Cultural Immersion Curriculum at an Early Childhood Caring Center of a Taiwanese Indigenous Tribe.” Journal of Educational and Multicultural Research 4: 73–117.
Chou, Hsuan-Chen. 2015. “A Preliminary Study of Immersion Teaching of Paiwan Language at a Kindergarten’s Experimental Class.” Taiwan Journal of Indigenous Studies 8(3): 91–119.
Christian, Donna. 1997. Profiles in Two-Way Immersion Education. Washington, DC: Center for Applied Linguistics/Delta Systems.
Cooper, Robert L. 1989. Language Planning and Social Change. Cambridge: Cambridge University Press.
Council of Indigenous Peoples (CIP). 2013. “Revitalization of Indigenous Languages in Taiwan: Past and Future.” http://ical13.ling.sinica.edu.tw/Sattelite_Event/Chinese/.
Eaton, Sarah E. 2010. “How Long Does It Take to Learn a New Language?” https://drsaraheaton.wordpress.com/2011/02/20/.
First Peoples’ Cultural Council (FPCC). 2014. “About Us: Publications.” www.fpcc.ca/about-us/Publications.
Fishman, Joshua A. 1991. Reversing Language Shift: Theoretical and Empirical Foundations of Assistance to Threatened Languages. Avon, UK: Multilingual Matters.
Fishman, Joshua A. 1996. Post-Imperial English: The Status of English in Former British and American Colonies and Spheres of Influence. Berlin: Mouton de Gruyter.
Freeman, Rebecca D. 2004. Building on Community Bilingualism. Philadelphia: Caslon.
Greymorning, Steve. 1997. “Going Beyond Words: The Arapaho Immersion Program.” In Teaching Indigenous Languages, edited by Jon Reyhner, 22–30. Flagstaff: Northern Arizona University.
Hinton, Leanne. 1994. Flutes of Fire: Essays on California Indian Languages. Berkeley, CA: Heyday Books.
Hinton, Leanne. 2001. “Teaching Methods.” In The Green Book of Language Revitalization in Practice: Toward a Sustainable World, edited by Leanne Hinton and Ken Hale, 179–189. San Diego, CA: Academic Press.
Hinton, Leanne and Ken Hale. 2001. The Green Book of Language Revitalization in Practice: Toward a Sustainable World. San Diego, CA: Academic Press.
Hirvonen, Vuokko. 2004. Saami Culture and the School: Reflections by Saami Teachers and the Realization of the Saami School. An Evaluation Study of Reform 97, translated by Kaija Anttonen. Kárášjohka, Norway: Cálliid Lágádus.
Hornberger, Nancy H. 2006. “Voice and Biliteracy in Indigenous Language Revitalization: Contentious Educational Practices in Quechua, Guarani, and Māori Contexts.” Journal of Language, Identity, and Education 5(4): 277–292.
Hornberger, Nancy H. 2008. Can Schools Save Indigenous Languages? Policy and Practice on Four Continents. New York: Palgrave Macmillan.
Huang, Lillian M. 2011. “Training of Indigenous Language Teachers in Taiwan: Past and Future.” Journal of Taiwanese Languages and Literature 6(1): 69–114.
Huang, Lillian M. 2014. “Revitalization of Indigenous Languages in Taiwan: Past and Future.” Journal of Taiwanese Languages and Literature 9(2): 67–88.
Jang, Shyue-chian. 2007. Toward a National Language Policy in Multicultural Taiwan. Taipei: Academia Sinica. [In Chinese]
Kamana, Kauanoe and William H. Wilson. 1996. “Hawaiian Language Programs.” In Stabilizing Indigenous Language, edited by Gina Cantoni, 153–156. Flagstaff: Northern Arizona University.
Kaplan, Robert B. and Richard B. Baldauf. 1997. Language Planning: From Practice to Theory. Clevedon, UK: Multilingual Matters.
Krauss, Michael. 2007. “Classification and Terminology for Degrees of Language Endangerment.” In Language Diversity Endangered, edited by Matthias Brenzinger, 1–8. Berlin: Walter de Gruyter.
Livaccari, Chris. 2012. “Instructional Strategies: Successful Approaches to Immersion Teaching; Chinese Language Learning in the Early Grades: A Handbook of Resources and Best Practices for Mandarin Immersion.” http://asiasociety.org/files/chinese-earlylanguage.pdf.
May, Stephen, Richard Hill, and Sarah Tiakiwai. 2004. “Bilingual/Immersion Education: Indicators of Good Practice.” Final Report to the Ministry of Education. Auckland, NZ: Ministry of Education. http://www.minedu.govt.nz.
McCarty, Teresa L. 2006. “Voice and Choice in Indigenous Language Revitalization.” Journal of Language, Identity, and Education 5(4): 308–315.
McIvor, Onowa. 2006. “Building the Nests: Early Childhood Indigenous Immersion Programs in BC.” http://www.fpcc.ca/files/PDF/language-nest-programs_in_BC.pdf.
National Academy for Educational Research. 2011. “An Integrated Study of Current Local Language Teaching in Elementary and Junior High Schools in Taiwan.” http://www.naer.edu.tw/ezfiles/0/1000/img/46/20118990.pdf. [In Chinese]
Pease-Pretty on Top, Janine. 2003. Native American Language Immersion: Innovative Native Education for Children & Families. Battle Creek, MI: W. K. Kellogg Foundation/American Indian College Fund.
Penfield, Susan D. and Benjamin V. Tucker. 2011. “From Documenting to Revitalizing an Endangered Language: Where Do Applied Linguists Fit?” Language and Education 25(4): 291–305.
Ricento, Thomas. 2006. “Methodological Perspective in Language Policy: An Overview.” In An Introduction to Language Policy: Theory and Method, edited by Thomas Ricento, 129–134. Malden, MA: Blackwell.
Royal Commission of Aboriginal Peoples. 1996. Gathering Strength: Report of the Royal Commission on Aboriginal Peoples. Ottawa, Canada: Minister of Supply and Services.
Shohamy, Elana. 2006. Language Policy: Hidden Agendas and New Approaches. New York: Routledge.
Spolsky, Bernard and Elana Shohamy. 2001. “Hebrew After a Century of RLS Efforts.” In Can Threatened Languages Be Saved?, edited by Joshua A. Fishman, 349–362. Clevedon, UK: Multilingual Matters.
Tan, Hui-bi. 2008. “Opinions on Teaching Mother Languages: Examples from Language Teaching Activities in Kaohsiung City.” MA thesis, University of Taitung, Taiwan.
Tang, Apay Ai-yu. 2011. “From Diagnosis to Remedial Plan: A Psycholinguistic Assessment of Language Shift, L1 Proficiency, and Language Planning in Truku Seediq.” PhD diss., University of Hawai‘i at Mānoa, Honolulu.
Tang, Apay Ai-yu. 2014. “Preliminary Results of a Community-based Language Revitalization Initiative in Truku Seediq.” Journal of Taiwanese Languages and Literature 9(2): 1–38.
Zeitoun, Elizabeth, Ching-hua Yu, and Cui-xia Weng. 2003. “The Formosan Language Archive: Development of a Multimedia Tool to Salvage the Languages and Oral Traditions of the Indigenous Tribes of Taiwan.” Oceanic Linguistics 42(1): 218–232.
Zhang, Jin-lian. 2009. “Research on the Learning Adaptation and Native Language Learning Attitudes of Junior-High School Aborigine Students.” MA thesis, University of Taitung, Taiwan.

Chapter 26

Māori: Revitalization of an Endangered Language

Jeanette King

1. Introduction

Te Reo Māori, the Māori language, the indigenous language of Aotearoa New Zealand, is one of the best-known endangered languages and is regularly included as a case study in the international language revitalization literature (see, for example, Fishman 1991; Benton and Benton 2001). This is partly because Māori language revitalization efforts, which began in the early 1980s, have been at the vanguard internationally. Having achieved a degree of success, the Māori language occupies a special position amongst the languages of indigenous peoples as an inspiring example of how a position of endangerment can be counteracted. As with all endangered languages undergoing revitalization, Māori is unique with respect to some aspects of its situation but also has similarities and innovations which resonate with other groups working to revitalize their own languages. This chapter gives an overview of the history of both the decline and revitalization of Māori, along with a discussion of knowledge acquired over the thirty-year period of revitalization efforts, with the aim of providing information that may be of use in the revitalization efforts of other endangered languages. In accordance with print-style standards in New Zealand, the English plural suffix -s will not be used on Māori words.

2. Decline The history of the Māori language and its contact with a language of colonization, in this case English, parallels the experience of many endangered languages. The language of the colonizer, after a period of relatively stable bilingualism, comes to be seen

Māori: Revitalization of an Endangered Language 593 as a necessary means of advancement in the dominant culture. A generation of passive bilinguals emerges, followed by generations who do not know the heritage language. At some subsequent point, when the extent of language loss is realized, a move toward language regeneration is begun. After European discovery of New Zealand in the seventeenth century, whalers and sealers, along with missionaries, began sustained engagement with the Māori populace from the early 1800s. After the signing of the Treaty of Waitangi between Māori and the British Crown in 1840 mass British immigration soon led to European domination and the formation of Western political and economic power structures. The gradual shift from Māori to English as the language of Māori homes was detailed by Benton (1991) in a sociolinguistic survey of 6,470 Māori households in the mid-1970s. Results showed that from 1900 on centres of Māori population closest to larger towns and cities were affected by language shift sooner than remote heartlands and that by 1955 most Māori communities were raising their children as English-only speakers. One of the main prompts for this community shift to English was the education system. There are numerous accounts of Māori children from the late nineteenth century onward being corporally punished for speaking Māori in school. The colonizing mind- set advanced by the schooling system denigrated the Māori language and contributed to an internalization of negative attitudes among the Māori populace. Thus, in the middle of the twentieth century increasing numbers of Māori parents started shifting toward speaking the colonizers’ language with the knowledge that a good grasp of English made it easier for their children to secure jobs, especially highly regarded positions in government departments. After World War II Māori families were attracted to urban areas by the promise of jobs and money. 
Urbanization was rapid; in 1956 the majority of Māori (76%) lived in rural areas, but by 1976 most (78%) had moved to towns and cities. This has been described as one of the most rapid urban migrations of any sizable ethnic population in human history (Gibson 1973). This urban shift meant generations of young Māori were being brought up away from the marae (“meeting place”), the hub of the rural Māori-speaking community, leading to a loss of language and culture. By the 1970s the main domains for the use of Māori had receded to the marae and the church. It was in this decade that the seeds of discontent which led to the current Māori language revitalization movement were sown.

3. Revitalization

The beginnings of the Māori language revitalization movement can be traced back to the Māori activist group Ngā Tamatoa in the early 1970s. Ngā Tamatoa (“the young warriors”) was a group of young Māori, the majority city-raised and university-educated, who became empowered by the worldwide civil rights and black consciousness movements of the era.

In 1972 Ngā Tamatoa presented a petition to Parliament lamenting the state of the Māori language and urging the government to provide teacher training to enable the Māori language to be taught in schools. The government agreed to this request and the date of the petition’s presentation to Parliament became known as Māori Language Day, soon becoming the Māori Language Week that is now celebrated annually. As Kāretu wrote:

It seems ironic, and yet not surprising, that all the efforts being expended in the revival of the language are by those whose loss has been the greatest and who are painfully aware of how great that loss is. (Kāretu 1993, 225)

This genesis of the Māori language revitalization movement parallels other places where young, urban, deracinated individuals have been the prime movers behind language revitalization. Probably the most influential development in the language revitalization movement in New Zealand has been the Māori immersion preschool initiative, kōhanga reo (“language nests”) (King 2001). This program has been successful in raising a new generation of Māori children as speakers of the language and has led to the development of Māori immersion education in both English medium schools and Māori-controlled kura kaupapa Māori (“Māori philosophy schools”). The concept has also been replicated by many other endangered indigenous language groups, most notably the Hawaiians. Most descriptions of kōhanga reo emphasize its role as a language revitalization initiative although those in charge of the movement have always insisted that it was much more.

3.1. Kōhanga Reo

Kōhanga reo arose from a radical new direction in the Department of Māori Affairs. In 1977 Kara Puketapu took over the leadership of the Department, whose role at the time was to promote “the social, cultural and economic well-being of the Māori people” (Puketapu 1982, 2). But in the urban situation where Māori unemployment and crime were becoming major issues, Puketapu decided that the Department needed to take a more active role to empower Māori development. The Tū Tangata (“stand tall”) program worked with communities to devise programs in response to community needs, thereby reversing the usual operating procedure of government departments which implemented policy from the top down. Another revolutionary aspect of the program was that Māori culture and language were to be seen not as part of the problem but as part of the solution.

At the Tū Tangata conference of elders in 1981 Māori language became one of the main focuses. Concern about the state of the language, heightened by the work of activist groups such as Ngā Tamatoa, was confirmed by Benton’s sociolinguistic survey, the results of which were known to leading Māori present at the conference. Conference participants were informed that

few, if any, Māori children were being raised as speakers of Māori. Not only did the conference make language revitalization an “urgent target,” the elders came up with a strategy to make it happen: they wanted “Māori-speaking supervisors to run day-care centres on maraes” (Hayes 1982, 3).

The first kōhanga was opened a few months later. The idea spread rapidly throughout the Māori community, with centers opening up on marae, in community halls, and even in private homes. By the end of the year there were 107 kōhanga, and three years later, 337. All were set up with the aim of fully immersing Māori children in the Māori language; that is, the Māori language would be the medium of communication and instruction.

Iritana Tāwhiwhirangi from the Department of Māori Affairs was given the responsibility of leading the new movement. A National Trust was set up to channel government funding to kōhanga. Each kōhanga was to be autonomous so parents would develop, among other things, transferable management skills and experience in collective decision-making. Tāwhiwhirangi has been staunch in maintaining that kōhanga was a Māori development initiative rather than a child-care initiative, or solely a language revitalization initiative (Tawhiwhirangi 2014).

Kōhanga members started to use the word whānau (“extended family”) to describe the collective grouping of parents forming around each centre. Traditionally the word whānau referred to descent groups, but since the 1970s the meaning of this word has expanded to include groupings where “the central principle of their recruitment and operation is not descent (whakapapa) but commitment to one or more common purposes (kaupapa)” (Metge 1995, 292). Despite the stated aim of having young children learning Māori from native speaking elders, from the beginnings of the movement the majority of teachers and adults in daily contact with the children were second-language speakers.
The numbers of kōhanga continued to increase in the 1980s and early 1990s, the growth peaking in 1993 with 14,027 children attending 809 kōhanga. Before the advent of kōhanga only about 20% of Māori children had been attending preschool, about half of the rate for non-Māori children. Participation rates have steadily increased over the last thirty years, and by 2016, 95% of Māori children entering school had participated in some form of early childhood education.

3.2. Te Ataarangi

Before the kōhanga reo movement began, Katarina Mataira and Ngoi Pēwhairangi had designed and implemented a successful Māori language teaching scheme for adults based on the Silent Way system of language teaching. Te Ataarangi involves small groups who are taught through the medium of Māori. Students do not use books; instead Cuisenaire rods are used by the teacher to illustrate concepts and sentence patterns. Students listen, look, and speak by repeating what the teacher says (Muller and Kire 2014).

Te Ataarangi worked within kōhanga to teach adults the Māori language and was seen as an integral part of the kōhanga reo experience. The program spread throughout the country and continues to be a popular method of learning Māori.

3.3. Schooling

As kōhanga reo blossomed in the early 1980s, parents started reporting that kōhanga graduates were losing their Māori language within months of entering English medium primary schooling. In response, parents began setting up kura kaupapa Māori so that their children could continue to be immersed in the Māori language during compulsory schooling. The first kura kaupapa Māori was established in 1985 on an urban marae in New Zealand’s largest city, Auckland. By 1998 there were sixty-one such schools throughout the country. At present there are seventy-three kura kaupapa Māori, twenty-four of which have wharekura (“secondary schools”).

At first these schools operated outside the state system, but in 1999 they were recognized in the Education Act. Kura kaupapa Māori receive state funding and teach the Māori-medium curriculum Te Marautanga o Aotearoa (Ministry of Education 2016b) and, as with all New Zealand schools, each has an elected Board of Trustees comprising parents and community members. In 1993 kura kaupapa Māori established a national collective, Te Rūnanga Nui o Ngā Kura Kaupapa Māori, which requires that each school adheres to Te Aho Matua, a set of guiding principles citing the importance for the school to support the physical, spiritual and emotional needs of the child and its whānau.

The challenge of providing Māori immersion schooling for kōhanga reo graduates also increased parental demands for the formation of immersion classes or units within English medium schools. While many of these units deliver the curriculum in Māori 81–100% of the time (designated as Level 1 immersion), a number of these classrooms teach children with lower levels of Māori immersion. Level 1 programs are generally regarded as the most effective both linguistically and educationally (May 2013).
In 2017 there were 14,260 Māori children receiving their education in Level 1 programs, with 52% of these children being in kura kaupapa Māori, indicating that this is the preferred option for Māori immersion schooling. A continuing challenge for the kura kaupapa Māori movement has been retaining children since many move to English medium schools for their secondary schooling in order to avail themselves of the wider range of subject choices at schools with larger enrollments. Nevertheless, on average, Māori students in Māori medium education leave school with higher qualification levels than Māori students not in Māori medium education (Ministry of Education 2016a, 21). Māori language education is reasonably well served by Ministry of Education curriculum guidelines and an increasing array of teaching and pedagogical resources available on the Te Kete Ipurangi website. Māori is also taught as a subject in New Zealand secondary schools.

There are also a number of post-secondary schooling Māori language and Māori immersion education options including three Māori tertiary institutions: Te Wānanga o Raukawa, Te Wānanga o Aotearoa, and Te Whare Wānanga o Awanuiārangi.

In Benton’s research during the 1970s the marae and the church were the two strongest Māori language domains. One of the major successes of kōhanga reo and kura kaupapa Māori has been to add school as another domain for the language. It is ironic that the education system which precipitated the loss of the Māori language is now the primary site for its revitalization (Benton 1986).

3.4. Media

Another important thrust in the revitalization of the Māori language has been delivered via broadcast media. The first dedicated Māori language radio station began broadcasting in Wellington in 1983 during Māori Language Week. In 1987 Te Reo Irirangi o Te Upoko o Te Ika began broadcasting full-time. Other regional stations soon followed, staffed by volunteers using whatever equipment they could obtain. Alongside this community action, legal cases were also made arguing that the Government had responsibilities toward the Māori language, and from 1990 onwards government broadcasting funding bodies began allocating broadcasting frequencies and providing money for iwi (“tribal”) radio stations (see Matamua 2014 for a detailed history and analysis).

Today there are twenty-seven bilingual iwi stations throughout the country, six being self-funded. Unlike many of the other Māori language revitalization initiatives, most stations are located outside the major urban areas, fulfilling a vital community function in spread-out rural districts. (See Te Rito 2014 for an account of the history of one such station.) An impact study in 2010 (Te Puni Kōkiri 2011b) found that 28% of Māori had listened to at least one iwi radio station within the previous twelve months. Eight percent of adult Māori respondents were listening to iwi radio daily. The survey also found that listening rates were higher among Māori language learners, indicating that the stations have a useful role in providing support and language exposure for those learning the Māori language. Because funding for the stations arose from claims regarding the Māori language, iwi stations are funded according to their percentage of Māori language content, thus incentivizing higher Māori content levels.

With regard to television, the first regular program was a weekday Māori news program called Te Karere (“the messenger”) which was first broadcast in 1982.
Initially five minutes long, it has gradually lengthened to a twenty-five-minute program. From its inception Te Karere presented the Māori news in Māori, not the news in Māori. Other documentary-style programs in Māori gradually followed throughout the 1980s (see Stephens 2014 for more details). Frequencies for a Māori television channel had been allocated in the early 1990s with the knowledge that a dedicated television presence would be highly valuable for the status and revitalization efforts for the language (Benton 1985; Grin and Vaillancourt 1998).

Māori Television began broadcasting nationally in 2004. The channel’s target audience is extremely broad: children to adults, Māori and non-Māori. The station has developed a reputation for innovative, engaging, and quality programs. At present, children can watch the cartoons SpongeBob SquarePants and Dora the Explorer in Māori and there is a wide and ever-changing range of news, reality, drama, sport, talent quest, and documentary programs. The channel also regularly broadcasts programs which teach the Māori language. Much of the broadcasting is in Māori, but some is in English. In 2008 Māori Television launched a second channel, Te Reo, which is Māori language only.

There has been increasing ownership of portable digital devices in recent years, with studies indicating that 15–24-year-old Māori have higher levels of ownership and use of various forms of digital technology than their non-Māori counterparts (Te Puni Kōkiri 2010). Encouragingly, this group is keen to access Māori language and culture content on the various technology platforms. There are already a number of Māori language apps available.

4. Government

The language revitalization literature notes that while grassroots determination and action are vital for successful language revitalization, it is also important to achieve outcomes in wider society which affirm the status of the language. Efforts on this front began with one of the earliest successful Waitangi Tribunal claims in 1985, which argued that the Crown had a duty to protect the Māori language. The Māori Language Act was passed in 1987, making Māori an official language of New Zealand and establishing the Māori Language Commission, which became known as Te Taura Whiri i te Reo Māori (“the rope binding together the Māori language”).

With the aims of promoting the Māori language and developing language policy, one of the main tasks of the Commission in its first decade was to produce vocabulary for the nascent Māori immersion education programs. Since Māori had not been a medium of educational instruction for over 100 years, words for the scientific and mathematical curricula, in particular, were needed, along with educational terminology (see Harlow 1993 and Keegan 2005 for an explanation of the processes used in the formulation of this vocabulary).

By this time there were many places to learn basic Māori language skills and the Commission decided to address the need to provide higher-level language-learning opportunities for the new generation of adult speakers. The Māori Language Commission began running week-long kura reo (“language schools”) at various marae around the country. Since the early 1990s the Commission, in conjunction with various tribal groups, has run three or more kura reo annually. The target group is teachers as well as those in the broadcast media.

Founding Māori Language Commissioner Tīmoti Kāretu saw the need to establish another venue to nurture those who had advanced to a higher competence in the language.
In 2004, twenty-five invited students from around the country were inducted into Te Panekiretanga o Te Reo Māori (“The Institute of Excellence in the Māori Language”). Participants fly from around the country once a month for intensive weekend live-in

seminars. The eleventh intake was admitted into the program in 2015 (see Gloyne 2014 for more details).

Calls for an overarching and cohesive national language policy have been made regularly (for example, Waite 1992; Human Rights Commission 2008; The Royal Society of New Zealand 2013) but have not yet received a government mandate. Despite this, several Māori language strategies have been released. The first, adopted in 2003, had a twenty-five-year vision that:

by 2028, the Māori language will be widely spoken by Māori. In particular, the Māori language will be in common use within Māori whānau, homes and communities. All New Zealanders will appreciate the value of the Māori language to New Zealand society. (Te Puni Kōkiri 2003, 5)

An updated Māori language strategy was published in 2014 (Te Puni Kōkiri). That Māori is the only language for which New Zealand has any strategy reflects the fact that Māori sits atop a well-established hierarchy of minority languages in New Zealand (de Bres 2015). The 2014 strategy also introduced the rationale for changes to governance structures to be enacted in a new Māori Language Act, which would see the formation of an independent entity, Te Mātāwai (“the source”), to oversee the Māori Language Commission and Te Māngai Pāho (the Māori broadcasting funding authority). The formation of Te Mātāwai was recommended by a panel of tribal language experts who had been commissioned by the Government to review the Māori language sector (Te Puni Kōkiri 2011a). Richard Benton notes that:

the proposals were criticised by some Māori groups and leaders for undermining the significance of the language for the nation as a whole, and for effectively handing over responsibility for the language to iwi organisations concerned primarily with commercial interests. (Benton 2015, 105)

Nevertheless the Māori Language Act was passed in 2016 and the process for setting up Te Mātāwai’s board has been initiated. The board comprises thirteen members from tribal, education, and broadcasting organizations.

5. Home and community

With strong beginnings in the education sector, information about the importance of supporting the use of Māori in the home started to be articulated from 1995 onward. As Pawley notes,

Mother tongue command of a language cannot be learned in school; the child must start by hearing and imitating native speakers using the language naturally during his or her early childhood. (Pawley 1989, 17)

Of concern were results from the National Māori Language Survey which showed that nearly half of Māori adults surveyed never spoke Māori at home and only 14% of respondents used Māori on a daily basis (Te Puni Kōkiri 1998, 49). From 2000, emphasis on home and community support has come to be the main focus of government and tribal language strategies and the Māori Language Commission has invested significant funding in initiatives to support home and community language use.

Since 2004, $1.5 million per year has been allocated to Te Ataarangi to deliver the Kāinga Kōrerorero (“speaking homes”) initiative which works with families to increase language use in the home. Each family must have a highly proficient speaker and the national network of mentors works with the families to offer language tips and strategies (see Muller and Kire 2014 for more details).

Since 2001 the Māori Language Commission has administered the Mā Te Reo fund which provides $2.5 million annually to community organizations to support Māori language projects, such as running language camps, devising language plans, or producing language resources. The rationale is to support initiatives which are community designed and led since these are more likely to be successful.

The other government department with a major responsibility for the Māori language is Te Puni Kōkiri (Ministry for Māori Development). Te Puni Kōkiri produces a wide range of surveys on many aspects of the Māori language, all available on their website.

6. Tribal initiatives

While most initial Māori language revitalization strategies emerged from a pan-Māori, urban base, one organization that has had a long-term community focus is a tribal grouping from the southern portion of the North Island. In 1975 this tribal confederation devised a twenty-five-year development strategy aimed at rejuvenating their marae and communities and improving the educational outcomes of their children (Winiata 1979). With few speakers under the age of 30, the revitalization of the Māori language was a chief objective of the program whose aims were largely channeled through the development of the tribes’ own tertiary education facility.

Te Wānanga o Raukawa opened in Ōtaki in 1981 and pioneered the concept of wānanga reo, language camps for adult learners. These camps address the importance of intergenerational transmission by normalizing the Māori language as a means of communication between adults in a range of everyday settings, thus patterning behaviors which can be applied to home and community situations (King 2006). The concept of wānanga reo spread throughout the country and was the model for the kura reo mounted by the Māori Language Commission and other tribal and educational groups.

It wasn’t until the late 1990s that other tribal groups started to organize their own language initiatives and strategies. This is because most tribes have spent a lot of political

focus and time in the last quarter of the twentieth century in presenting claims to the Waitangi Tribunal, predominantly over the loss of land. As these claims have been finalized, tribes have used the settlement proceeds to set up tribal infrastructure with commercial arms, and many are also now formulating and enacting tribal language strategies.

The first major tribal group in the post-settlement phase to engage with language revitalization was Ngāi Tahu, whose tribal area encompasses most of the South Island, and whose loss of language was more advanced than on the North Island. In 2000, Ngāi Tahu launched a twenty-five-year language revitalization initiative that has the home as the main focus. Kotahi Mano Kāika—Kotahi Mano Wawata (“a thousand homes—a thousand dreams”) aims to have 1,000 Ngāi Tahu Māori-speaking homes by 2025. The tribe’s language strategy unit runs regular language schools and has a wide range of resources available on their Kotahi Mano Kāika and Generation Reo websites.

There are now increasing numbers of North Island tribal groups producing strategies and initiatives. These strategies all put a focus on the support of tribal dialects as tribal identity becomes increasingly important in the post-settlement era. This is despite the fact that the majority of Māori live in urban areas, usually outside their dialect areas. Furthermore, most new speakers learn Māori from second-language speakers, resulting in a certain amount of dialect leveling (Keegan 2017).

7. Current situation

A question about language use has been included in each quinquennial census from 1996 onward. Currently 21.3% of the Māori population of just under 600,000 report being able to speak conversational Māori (a reduction from the 25% reported in 1996). Māori comprise 15% of New Zealand’s population of 4.5 million. Just under 4% of non-Māori report being able to speak the Māori language. However, the self-reported responses give no indication of how well the person speaks Māori. Te Puni Kōkiri conducts a survey on the health of the Māori language after each census with the most recent indicating that 14% of Māori adults assess themselves as being able to speak Māori “well” or “very well” (Te Puni Kōkiri 2008).

The age group which has the highest proportion of Māori speakers is those over 65 (of whom 39% report being able to speak Māori). However, this is a substantial drop since 1996 when 53% of Māori aged 65 and over reported being Māori speakers. This decline reflects the fact that the cohorts of older fluent speakers are rapidly dying out.

While these numbers may look less than optimal, it must be remembered that had the language question been included in earlier censuses, we would have been able to track a substantial increase in the reported ability to speak Māori among those who were under 40 years old in the 1970s, before revitalization strategies were enacted. The fact that we now have Māori speakers in all age bands throughout the Māori population

is a dramatic increase although concerns persist regarding the low levels of high-fluency speakers (Bauer 2008).

The late 1980s and early 1990s were the heyday of kōhanga reo with centers attracting over half of Māori enrollments in the early childhood education sector. However, since 1997 kōhanga has lost its preferred option status. In 2014 there were 455 kōhanga and just under 9,000 children (a decline of 36% in enrollments since 1993). One consequence of the falling enrollments in kōhanga reo is that the proportion of children reported as being able to speak Māori has been declining in the younger age groups: from 22% in the 5- to 9-year age band in 1996 to 17% in the 2013 census. Overall, the intergenerational transmission rate for Māori is 44%, that is, if a child is living in a household which contains a Māori-speaking adult there is a 44% chance that he or she is also reported as being able to speak Māori (King and Cunningham 2017).

Decline in the popularity of kōhanga reo is undoubtedly due to a number of reasons, including location and hours of service. There are also concerns about the operational and governance structures of kōhanga which have changed little since the beginnings of the movement. In addition, there are other Māori immersion centers offering a service similar to kōhanga reo. These centers appear to be attracting Māori urban professional parents who prefer a Māori immersion program that focuses on child development rather than whānau development (King and Gully 2009). There were thirteen such centers throughout New Zealand in 2015.

Internationally, there has been much attention recently on “new speakers,” that is, speakers “with little or no home or community exposure to a minority language but who instead acquire it through immersion or bilingual programs, revitalization projects or as adult language learners” (O’Rourke, Pujolar, and Ramallo 2015, 1).
By this definition, most current speakers of Māori under the age of 60 are “new speakers.” As in other parts of the world, there have been some tensions in the relationships between new speakers and older fluent speakers who grew up and were socialised in Māori-speaking communities. Unsurprisingly, new speakers of Māori speak Māori differently from older fluent speakers. The phonology of New Zealand English has affected their pronunciation of Māori (Watson et al. 2016) and there have also been some changes in the syntax of the Māori language, also due to the effect of English (Harlow 1991; Kelly 2014). In addition, because of the large vocabulary expansion required for the Māori language education system, new speakers often use vocabulary unfamiliar to older fluent speakers (Christensen 2003, 49). All of this can lead older fluent speakers to express dislike toward the speech of new speakers. Hōhepa quotes an elder as saying, “if that’s the language of my grandchildren, better that it be allowed to die” (2000, 2). In particular, highly fluent new speakers who have attended Te Panekiretanga o Te Reo Māori courses are often teased for using a register of Māori that places emphasis on idiomatic, metaphorical speech and use of obscure vocabulary which many elders are not familiar with. Nevertheless, new speakers are pervasive in the education and broadcasting arenas, outnumbering older fluent speakers by fifteen to one (Christensen 2003, 49). Consequently, Māori has not been too greatly affected by the sort of purism which is

often espoused by older fluent speakers and which can undermine revitalization efforts (Dorian 1994). In recent years, visits to New Zealand by Ghil’ad Zuckermann and his descriptions of revivalistics have led to increasing acceptance that languages change and adapt, especially when in contact with other languages (Zuckermann and Walsh 2011). Regardless, despite any tensions, the speech of older fluent speakers is still regarded as the exemplar to which most new speakers aspire.

With regard to New Zealand more generally, there are many aspects of Māori language and culture which have become emblematic for New Zealand identity. One is the haka (“danced chant”) performed by the New Zealand Rugby Union team, The All Blacks. Since 1999, following the controversy caused by singer Hinewehi Mohi singing the National Anthem in Māori at an international rugby match, CDs sent to all schools have resulted in new generations of New Zealand school children becoming confident in singing the first verses of both the Māori and English versions of the anthem (in that order) and this has now become the customary delivery at all official events.

It has been noted that the most distinctive aspect of New Zealand English is the number of words borrowed from the Māori language (Deverson 1991, 18). While we cannot be sure how many Māori words are recognized by New Zealanders who don’t speak Māori, Deverson (1984) estimated the number would be about forty to fifty. This has since been revised upward by Macalister to seventy to eighty (2004), indicating that wider society may be becoming more familiar with and accepting of the Māori language and aspects of Māori culture. An increase in tolerability (de Bres 2008) is one of the aims of the current Māori Language Strategy.

7.1. Positives

Although classified as an endangered language (Catalogue of Endangered Languages 2015), the Māori language has several distinctive aspects which have contributed to its relatively positive position. In contrast to many endangered languages, Māori is the only indigenous language of New Zealand. This makes it somewhat easier to lobby Parliament and government departments, and effect actions and strategies when resources and effort are not spread among a number of languages. The national educational system in New Zealand has also enabled schools to be established and funded relatively easily compared to other jurisdictions where elected district school boards have control of funding and curricula.

Māori language has also benefited from the terms of the Treaty of Waitangi. Successful claims to the Waitangi Tribunal have resulted in recognition of Māori as an official language and consolidation of the obligations of the Crown in Māori language revitalization (Waitangi Tribunal 2011).

Another positive factor is that the Māori language has an alphabet which has been largely agreed on since 1826 (Parkinson 2016, 37). Many other languages are not in such a fortunate position. Māori is also one of the most well-documented Polynesian languages, with a large number of dictionaries and grammars. Particularly useful in the

digital age is the comprehensive online dictionary maintained by John Moorfield (www.maoridictionary.co.nz). Within the last ten years there have also been two monolingual Māori language dictionaries published, one for schoolchildren (Huia Publishers 2006) and another for adults (Te Taura Whiri i te Reo Māori 2008).

Another important factor which benefits the Māori language lies in aspects of Māori culture itself. The epitome of Māori cultural expression is to be found on the marae in rituals of encounter. These interactions, in particular those of welcome (pōwhiri) and funeral farewells (tangihanga), are largely enacted outside and in public. The non-secretive nature of these ceremonial aspects and the traditions encapsulated therein, traditions that are mandatorily conducted in Māori, have led to them being incorporated into wider New Zealand society and involving interaction with non-Māori. Most educational institutions, as well as local and central government organizations, regularly hold welcome or celebration ceremonies which involve Māori protocols. If an actual marae is not available, the nature of the ceremonies allows most public venues to serve as temporary marae. It is common for visiting dignitaries and celebrities to be accorded a pōwhiri on arrival to New Zealand. These public and inclusive rituals no doubt contribute to why Māori culture is seen as an important part of New Zealand identity (Albury 2015).

Another historical factor which has been of immense importance to the revitalization of Māori was the sociolinguistic survey work in the 1970s which generated the necessary momentum for the revitalization of Māori (Benton 1991). Since language shift away from a minority language usually occurs unconsciously at the micro-level, the results of research can be the catalyst which allows community members to become aware of the macro-view, leading to the conscious decision to attempt to maintain or revitalize the language.
Māori performing arts, or kapa haka, are an increasingly important part of Māori culture. Kapa haka involves groups of up to forty participants who perform a bracket of songs in the Māori language. Since 1972 there has been a biennial national competition for adult performers (now known as Te Matatini). Anecdotally, the conversational use of Māori among the judges and audiences at the regional and national kapa haka finals appears to be increasing. These competitions are important in that large sections of the Māori community attend as either competitors or supporters and all age groups are involved, thus providing an opportunity for normalized intergenerational interaction in the Māori language.

7.2. Lessons Looking back over the last thirty-five years we can note several phases in the Māori language revitalization movement: disruption, institutionalization, and normalization. The trajectory of these phases can be seen in Māori language educational initiatives. In the first phase, the existing order is disrupted. The education initiatives were an example of self-determination at the grassroots level involving young and older members

of the community which disrupted both the trajectory of the numbers of children being raised as speakers of Māori and also the education scene. New communities of practice, whānau, formed. This phase is characterized by participants feeling a sense of breaking frontiers and of uniting in a common cause. In the initial phase of the educational initiatives many people were volunteers rather than paid employees. After several years an institutionalization phase emerged where procedures and practices were developed. Employment contracts were set up and training procedures established. There is still a sense of group purpose, but it is often focused more on managerial aspects. Typically, at this stage government funding has been secured. In the normalization phase the initiative becomes regarded as a normal part of life. Today all Māori children grow up in a world where kōhanga reo and kura kaupapa Māori are present. Even if they do not participate in Māori immersion education, current generations can’t remember a world where these initiatives didn’t exist. After thirty-five years the Māori language has become normalized in education and broadcasting domains and the concern is that complacency will hold back further gains. Of course, normalization doesn’t necessarily lead to complacency; one possible outcome is that it becomes normal to speak Māori in certain situations, putting an onus on people to improve their language ability in order to participate. In recent years both government and tribal groups have become concerned about complacency and have been developing new approaches to inspire vitality. There is now an understanding that it is necessary to regularly revise and recalibrate, not the targets necessarily, but the manner of approach, in order to encourage and support the speaking and learning of Māori. For example, the annual Māori Language Week has, since 2004, promoted a different theme each year.
Since 2014, instead of focusing on just one week of the year the strategy has been to promote a new Māori word or phrase each week for the whole year, as a way to build up vocabulary and maintain a wider profile for the Māori language. The Māori Language Commission is also diversifying its approach and strategies. For example, in 2015 it began funding kura whakarauora (“survival schools”). These two- to three-day workshops teach the basics of language planning and provide information and tools for attendees to take back and action in their communities. The Ngāi Tahu tribe have found it important to regularly refresh and rebrand their language revitalization effort in order to maintain commitment and to encourage new involvement. Their tribal magazine Te Karaka (“the call”) regularly profiles members who are learning and using Māori to encourage other families and support normalization of the language. In recent years there have been increasing efforts to encourage learners and speakers rather than castigate them. One mantra often used in the 1980s and 1990s was: me kōrero Māori i ngā wā katoa, i ngā wāhi katoa (“speak Māori at all times and in all places”) (Jenkins and Ka’ai 1994, 163). At the time this was an important and necessary edict to remind parents, most of whom were new learners of the language, not to switch to English. Experience had shown that if English was introduced into the environment it tended to remain. However, this stricture can lead to people feeling that they shouldn’t

even try to speak Māori if they can’t make a full commitment. The purism associated with this mantra has evolved into a recognition that encouragement is needed. One example of the focus on encouragement is the ZePA model (Higgins and Rewi 2014). An acronym for “Zero, Passive, Active,” the ZePA principle is that, even if you are at the Zero level, by gradually learning more Māori words you can become a Passive learner of the language and move toward becoming an Active speaker. This model emphasizes the lifelong nature of the learning experience and that even a little can start you on the way. While there is a tendency to shift leftward along this continuum (from Active to Passive to Zero), the idea is to encourage people, no matter where they are on the continuum, to keep right-shifting.

7.3. Challenges While there are many positive aspects of the situation in New Zealand which have helped in the maintenance and revitalization of the Māori language, there are still a number of challenges in the present environment, beyond the perpetual need for more Māori language teachers and teaching resources. New Zealand’s linguistic diversity is increasing with increased migration from India and China, particularly since 2009. The country’s largest city, Auckland, with a population of just over 1.4 million and about a third of the country’s population, is now officially defined as “superdiverse” (Spoonley and Bedford 2012). Over 40% of Auckland’s population was born overseas; there are 200 ethnicities and 160 languages spoken. While in New Zealand overall, Māori is the most commonly spoken language other than English, in Auckland it is now the fourth most common, behind Samoan, Hindi, and Mandarin. Cantonese is not far behind. But paradoxically, there are some signs that the increasing multilingualism could be leading to a greater acceptance of New Zealand’s indigenous language and culture as an essential part of the national identity. Unlike other languages, including English, the Māori language uniquely belongs in Aotearoa New Zealand. There also seems to be increasing acceptance of non-Māori speakers of Māori. Another challenge is that there is not an accepted written standard for Māori. Māori has several mutually intelligible dialects with differences on the phonological, lexical, and syntactic levels, but linguistically these differences are relatively small (Harlow 2007). The Māori Language Commission has provided useful orthographic conventions (Te Taura Whiri i te Reo Māori 2012). Also much written teaching material has been produced by older fluent speakers from one dialect area (Ngāti Porou), so that dialect has become somewhat normative. Written standards in language typically evolve gradually among those involved in a print culture.
While there was a flourishing Māori print culture in the second half of the nineteenth century there have been few signs of its redevelopment in the current revitalization phase. With the emphasis on tribal regeneration, dialect differences are becoming more important as part of expressing tribal identity. Some tribal material is beginning to use

spelling to indicate pronunciation features of their dialect. For example, O’Regan from Ngāi Tahu (2014) writes with what is termed the Ngāi Tahu “k,” where a merger between /k/ and /ŋ/ in this dialect is represented orthographically by using ⟨k⟩ for both what is ⟨k⟩ and ⟨ng⟩ in other varieties of Māori. If increased publishing by tribal groups begins to incorporate tribally based spellings this could negatively affect any move toward a standard (Keegan 2017).

8. The way ahead In his seminal work on reversing language shift Fishman (1991) gave a summary of the Māori language situation and noted that revitalization efforts should be more focused on level 6 of his GIDS scale, namely, embedding the language in the home, neighborhood, and community. Although this has been a strong priority since 2000, the first revitalization efforts in New Zealand were not concentrated on the home; rather it was the education system which became the prime focus. This stemmed from the idea that education initiatives were the most efficient and effective way of raising a new generation of Māori-speaking children. The focus on schooling initiatives did have a community aspect in the form of the whānau that built up around schools. In other ways the revitalization measures in New Zealand have followed Fishman’s interpretation. As we have seen, all the early efforts came from the bottom up: the ideas were formed at community level with volunteers and the pooling of resources and funding. In addition, the passion of the early pioneers was focused on ethnic and cultural goals, that is, self-determination and development goals with Māori language and culture at their heart (Fishman 1991, 18–21). For the last sixteen years there has been a funding prioritization at the government and tribal levels on supporting home and community language use. However, there is very little mention of the importance of neighborhoods. While some tribes have considered forming physical language enclaves, the practical issues of forming Māori-speaking neighborhoods have so far quashed any such aspirations. However, a Māori-speaking neighborhood has formed in Ōtaki, a small town with a population of 6,000 in the lower North Island, where the Māori tribal tertiary education institution, Te Wānanga o Raukawa, is located. The population of this area is 34% Māori, double the New Zealand average.
Commentators report that Māori language can be heard regularly on the street and in shops in Ōtaki (Smale 2016). This is the only current example of the reclamation of a Māori-speaking locale. How successful has reversing language shift been in New Zealand? First, the answer rather depends on what success is (Hinton, Huss, and Roche 2018). There have been few articulations of the macro-aim of Māori language revitalization (Christensen 2003). Some suggest it would be having a large proportion of Māori being able to speak Māori (both Bauer 2008 and Ruckstuhl and Wright 2014 suggesting 80% or more). Some commentators have suggested working toward a situation where Māori would be the

preferred language in particular domains (Chrisp 1997; Harlow 2003). However, the Māori Language Strategies have been careful not to make any definite statements regarding the desired numbers or percentages of Māori speakers. In order to improve levels of engagement with the language, Hinurewa Poutu (2015, 385) notes that there is a need to “whakacoolngia te reo” (“make the Māori language cool”). New speakers of Māori have not typically been motivated by a desire to revitalize the language; rather they feel that the language revitalizes them (King 2014). Encouragingly, recent research indicates that new speakers are becoming motivated by the ability to participate in events where Māori is the main language of interaction (Te Huia 2015, 627). A real need to speak the language is the most effective motivation of all, so there is a need to harness the FOMO (fear of missing out) principle. To date, the revitalization of the Māori language has had a good measure of success, but as Leanne Hinton notes, “success is not an endpoint but a process” (Hinton, Huss, and Roche 2018, 499). It’s all about the journey, and, to invoke a Māori canoe metaphor, while not everyone will be involved along the way, inspired leadership will bring others on board. In Aotearoa New Zealand the main lesson has been: change is constant, so keep changing. “A waka cannot change the winds, but it can change its sails to match” (Tarena-Prendergast 2016, 38).

References Albury, Nathan J. 2015. “Your Language or Ours? Inclusion and Exclusion of Non-indigenous Majorities in Māori and Sámi Language Revitalization Policy.” Current Issues in Language Planning 16(3): 315–334. Bauer, Winifred. 2008. “Is the Health of Te Reo Māori Improving?” Te Reo 51: 33–73. Benton, Richard A. 1985. The Role of Television in the Survival of the Maori Language: A Statement to the Waitangi Tribunal, Waiwhetu Marae, 8 October 1985 (Te Wāhanga Māori Occasional Paper No. 18). Wellington: New Zealand Council for Educational Research. Benton, Richard A. 1986. “Schools as Agents for Language Revival in Ireland and New Zealand.” In Language and Education in Multilingual Settings, edited by Bernard Spolsky, 53–76. Clevedon, UK: Multilingual Matters. Benton, Richard A. 1991. The Māori Language: Dying or Reviving? Honolulu: East West Center. (Reprinted by New Zealand Council for Educational Research 1997.) Benton, Richard A. 2015. “Perfecting the Partnership: Revitalizing the Māori Language in New Zealand Education and Society 1987–2014.” Language, Culture and Curriculum 28(2): 99–112. Benton, Richard A. and Nina Benton. 2001. “RLS in Aotearoa/New Zealand 1989–1999.” In Can Threatened Languages Be Saved? Reversing Language Shift Revisited: A 21st Century Perspective, edited by Joshua A. Fishman, 423–450. Clevedon, UK: Multilingual Matters. Catalogue of Endangered Languages. 2015. “Māori.” The University of Hawaii at Manoa and Eastern Michigan University. Accessed October 20, 2016. http://www.endangeredlanguages.com/lang/3571. Chrisp, Steven. 1997. “Home and Community Language Revitalisation.” New Zealand Studies in Applied Linguistics 3: 1–20.

Christensen, Ian S. 2003. “Proficiency, Use and Transmission: Māori Language Revitalisation.” New Zealand Studies in Applied Linguistics 9(1): 41–61. de Bres, Julia. 2008. “Planning for Tolerability in New Zealand, Wales and Catalonia.” Current Issues in Language Planning 9(4): 464–482. de Bres, Julia. 2015. “The Hierarchy of Minority Languages in New Zealand.” Journal of Multilingual and Multicultural Development 36(7): 677–693. Deverson, Tony. 1984. “‘Home Loans’: Māori Input into Current New Zealand English.” English in New Zealand 33: 4–10. Deverson, Tony. 1991. “New Zealand English Lexis: The Maori Dimension.” English Today 26: 18–25. Dorian, Nancy. 1994. “Purism vs. Compromise in Language Revitalization and Language Revival.” Language in Society 23: 479–494. Fishman, Joshua A. 1991. Reversing Language Shift. Clevedon, UK: Multilingual Matters. Gibson, Campbell. 1973. “Urbanization in New Zealand: A Comparative Analysis.” Demography 10(1): 71–84. Gloyne, Paraone. 2014. “Te Panekiretanga o Te Reo.” In Te Hua o Te Reo Māori: The Value of the Māori Language, edited by Rawinia Higgins, Poia Rewi, and Vincent Olsen-Reeder, 305–317. Wellington: Huia Publishers. Grin, Francois and Francois Vaillancourt. 1998. Language Revitalisation Policy: An Analytical Survey Theoretical Framework, Policy Experience and Application to Te Reo Māori. Wellington: New Zealand Treasury. http://www.treasury.govt.nz/publications/research-policy/wp/1998/98-06. Harlow, Ray. 1991. “Contemporary Māori Language.” In Dirty Silence: Aspects of Language and Literature in New Zealand, edited by Graham McGregor and Mark Williams, 29–38. Auckland: Oxford University Press. Harlow, Ray. 1993. “Lexical Expansion in Māori.” The Journal of the Polynesian Society 102(1): 99–107. Harlow, Ray. 2003. “Issues in Māori Language Planning and Revitalization.” Journal of Māori and Pacific Development 4(1): 32–43. Harlow, Ray. 2007.
Māori: A Linguistic Introduction. Cambridge: Cambridge University Press. Hayes, Lindsay. 1982. “Whakatauira 1981—Māori Leaders Proposals.” Tū Tangata 4: 3–6. Higgins, Rawinia and Poia Rewi. 2014. “ZePA—Right-shifting: Reorientation Towards Normalisation.” In Te Hua o Te Reo Māori: The Value of the Māori Language, edited by Rawinia Higgins, Poia Rewi, and Vincent Olsen-Reeder, 7–32. Wellington: Huia Publishers. Hinton, Leanne, Leena Huss, and Gerald Roche. 2018. “Conclusion: What Works in Language Revitalization.” In The Routledge Handbook of Language Revitalization, edited by Leanne Hinton, Leena Huss, and Gerald Roche, 495–502. Hōhepa, Pat. 2000. “Towards 2030 AD (2) Māori Language Regeneration Strategies, Government, People.” He Pūkenga Kōrero 5(2): 10–15. Huia Publishers. 2006. Tirohia, kimihia. Wellington: Huia Publishers. Human Rights Commission. 2008. Statement on Language Policy. Wellington: Human Rights Commission. Jenkins, Kuni and Tania Ka’ai. 1994. “Māori Education: A Cultural Experience and Dilemma for the State—A New Direction for Māori Society.” In The Politics of Learning and Teaching in Aotearoa-New Zealand, edited by Eve Coxon, Kuni Jenkins, James Marshall, and Lauran Massey, 148–179. Palmerston North, NZ: The Dunmore Press.

Kāretu, Tīmoti. 1993. “Tōku reo, tōku mana.” In Te ao mārama 2. He whakaatanga o te ao: The Reality, edited by Witi T. Ihimaera with Haare Williams, Irihapeti Ramsden, and Don S. Long, 222–229. Auckland: Reed. Keegan, Peter J. 2005. “The Development of Maori Vocabulary.” In Languages of New Zealand, edited by Alan Bell, Ray Harlow, and Donna Starks, 131–148. Wellington, NZ: Victoria University Press. Keegan, Peter J. 2017. “Māori Dialect Issues and Māori Language Ideologies in the Revitalisation Era.” MAI Journal: A New Zealand Journal of Indigenous Scholarship 6(2): 129–142. Kelly, Karena. 2014. “Iti Te Kupu, Nui Te Kōrero—The Study of the Little Details That Make the Māori Language Māori.” In Te Hua o Te Reo Māori: The Value of the Māori Language, edited by Rawinia Higgins, Poia Rewi, and Vincent Olsen-Reeder, 255–267. Wellington, NZ: Huia Publishers. King, Jeanette. 2001. “Te Kōhanga Reo: Māori Language Revitalization.” In The Green Book of Language Revitalization in Practice, edited by Leanne Hinton and Ken Hale, 118–128. San Diego, CA: Academic Press. King, Jeanette. 2006. “Wānanga reo—Māori Language Camps for Adults.” In One Voice, Many Voices, Recreating Indigenous Language Communities, edited by Teresa L. McCarty and Ofelia Zepeda, 73–86. Tucson: Arizona State University Center for Indian Education, University of Arizona American Indian Language Development Institute. King, Jeanette. 2014. “Revitalising the Maori Language?” In Endangered Languages: Beliefs and Ideologies in Language Documentation and Revitalization, edited by Peter K. Austin and Julia Sallabank, 215–230. Oxford: Oxford University Press/British Academy. King, Jeanette and Nichole Gully. 2009. “Nōku Te Ao: A New Generation of Māori Immersion Preschools.” In Proceedings of the 2nd International Conference on Language, Education and Diversity (LED), Hamilton, New Zealand, November 21–24, 2007. CD. Hamilton: Wilf Malcolm Institute of Educational Research.
King, Jeanette and Una Cunningham. 2017. “Tamariki and Fanau: Child Speakers of Māori and Samoan in Aotearoa/New Zealand.” Te Reo 60: 27–44. Macalister, John. 2004. “A Survey of Maori Word Knowledge.” English in Aotearoa 52: 69–73. Matamua, Rangi. 2014. “Te Reo Pāpāho me te Reo Māori—Māori Broadcasting and te Reo Māori.” In Te Hua o Te Reo Māori: The Value of the Māori Language, edited by Rawinia Higgins, Poia Rewi, and Vincent Olsen-Reeder, 331–348. Wellington, NZ: Huia Publishers. May, Stephen. 2013. “Indigenous Immersion Education: International Developments.” Journal of Immersion and Content-Based Education 1(1): 34–69. Metge, Joan. 1995. New Growth from Old—The whānau in the Modern World. Wellington, NZ: Victoria University Press. Ministry of Education. 2016a. Ngā Haeata Mātauranga: Assessing Māori Education. Wellington, NZ: Ministry of Education. Ministry of Education. 2016b. Te Marautanga o Aotearoa. Wellington, NZ: Ministry of Education. Muller, Maureen and Andrea Kire. 2014. “Kotahi Kapua i te Rangi he Marangai ki te Whenua: The Philosophy and Pedagogy of Te Ataarangi.” In Te Hua o Te Reo Māori: The Value of the Māori Language, edited by Rawinia Higgins, Poia Rewi, and Vincent Olsen-Reeder, 291–303. Wellington, NZ: Huia Publishers. O’Regan, Hana. 2014. “Kia Matike, Kia Mataara: Te Huanui o Kotahi Mano Kāika.” In Te Hua o Te Reo Māori: The Value of the Māori Language, edited by Rawinia Higgins, Poia Rewi, and Vincent Olsen-Reeder, 109–122. Wellington, NZ: Huia Publishers.

O’Rourke, Bernadette, Joan Pujolar, and Fernando Ramallo. 2015. “New Speakers of Minority Languages: The Challenging Opportunity—Foreword.” International Journal of the Sociology of Language 231: 1–20. Parkinson, Phil G. 2016. “The Māori Grammars and Vocabularies of Thomas Kendall and John Gare Butler.” Asia-Pacific Linguistics 26: 1–163. Pawley, Andrew. 1989. “Can the Māori Language Survive?” Hurupā 10: 12–23. Poutu, Hinurewa. 2015. “Kia Tiori ngā Pīpī: Mā te aha e korero Māori ai ngā taitamariki o ngā wharekura o Te Aho Matua.” PhD diss., Massey University, Palmerston North, NZ. Puketapu, Kara. 1982. Reform from Within. Wellington, NZ: Department of Māori Affairs. Royal Society of New Zealand. 2013. Languages in Aotearoa New Zealand. Wellington: The Royal Society of New Zealand. Ruckstuhl, Katharina and Janine Wright. 2014. “The 2014 Māori Language Strategy, Language Targets.” In Proceedings of the International Indigenous Development Research Conference (IIDRC), Auckland, New Zealand, November 25–28. Auckland: Ngā Pae o te Māramatanga. Smale, Aaron. 2016. “Rescuing the Reo.” Mana Magazine, May 27. Spoonley, Paul and Richard Bedford. 2012. Welcome to Our World? Immigration and the Reshaping of New Zealand. Wellington, NZ: Dunmore Publishing. Stephens, Tainui. 2014. “He Kura Takitahi—He Kura Takimano.” In Te Hua o Te Reo Māori: The Value of the Māori Language, edited by Rawinia Higgins, Poia Rewi, and Vincent Olsen-Reeder, 369–383. Wellington, NZ: Huia Publishers. Tarena-Prendergast, Eruera. 2016. “Indigenising the Corporation.” Te Karaka 70: 36–38. Tawhiwhirangi, Iritana. 2014. “Kua Tū Tāngata E!—Moving a Critical Mass.” In Te Hua o Te Reo Māori: The Value of the Māori Language, edited by Rawinia Higgins, Poia Rewi, and Vincent Olsen-Reeder, 33–52. Wellington, NZ: Huia Publishers. Te Huia, Awanui. 2015.
“Exploring Goals and Motivations of Māori Heritage Language Learners.” Studies in Second Language Learning and Teaching 5(4): 609–635. Te Puni Kōkiri. 1998. The National Māori Language Survey. Wellington, NZ: Te Puni Kōkiri. Te Puni Kōkiri. 2003. The Māori Language Strategy. Wellington, NZ: Ministry of Māori Development. Te Puni Kōkiri. 2008. The Health of the Māori Language in 2006. Wellington, NZ: Te Puni Kōkiri. Te Puni Kōkiri. 2010. Te Reo Pāho: Use of Broadcasting and e-Media, Māori Language and Culture. Wellington, NZ: Te Puni Kōkiri. Te Puni Kōkiri. 2011a. Te Reo Mauriora—Review of the Māori Language Sector and the Māori Language Strategy. Wellington, NZ: Te Puni Kōkiri. Te Puni Kōkiri. 2011b. Impact of Iwi Radio on the Māori Language. Wellington, NZ: Te Puni Kōkiri. Te Puni Kōkiri. 2014. Māori Language Strategy 2014. Wellington, NZ: Te Puni Kōkiri. Te Rito, Joseph S. 2014. “Radio Kahungunu: Tribal Language Revitalisation Efforts.” In Te Hua o Te Reo Māori: The Value of the Māori Language, edited by Rawinia Higgins, Poia Rewi, and Vincent Olsen-Reeder, 349–367. Wellington, NZ: Huia Publishers. Te Taura Whiri i te Reo Māori. 2008. He Pātaka Kupu: te kai a te rangatira. Auckland: Raupo. Te Taura Whiri i te Reo Māori. 2012. Guidelines for Māori Orthography. Wellington, NZ: Te Taura Whiri i te Reo Māori. Waitangi Tribunal. 2011. Ko Aotearoa Tēnei: Te Taumata Tuarua. Wellington, NZ: Legislation Direct.

Waite, Jeffrey. 1992. Aoteareo, Speaking for Ourselves: A Discussion on the Development of a New Zealand Languages Policy. Wellington, NZ: Learning Media. Watson, Catherine I., Margaret Maclagan, Jeanette King, Ray Harlow, and Peter Keegan. 2016. “Sound Change in Māori and the Influence of New Zealand English.” Journal of the International Phonetic Association 46(2): 185–218. Winiata, Whatarangi. 1979. Whakatupuranga rua mano: Generation 2000. An Experiment in Tribal Development. Wellington: New Zealand Planning Council. Zuckermann, Ghil’ad and Michael Walsh. 2011. “Stop, Revive, Survive: Lessons from the Hebrew Revival Applicable to the Reclamation, Maintenance and Empowerment of Aboriginal Languages and Cultures.” Australian Journal of Linguistics 31(1): 111–127.

Chapter 27

Language Revitalization in Africa Bonny Sands

1. Introduction Compared to efforts on other continents, language revitalization in Africa lags behind. This is in part due to the dearth of resources at both local and national levels and in part due to the fact that language shift is, in many cases, fairly recent. In Africa, language shift typically occurs from minority languages to locally dominant languages, which themselves may be minority languages at the national level. Community organizations, linguists, missionaries, and governments are involved in various language revitalization projects. Efforts typically place a great deal of focus on the development of an orthography, as this both increases the visibility of a minority language in the public sphere and helps valorize the language. The development of basic descriptive materials such as a grammar sketch and dictionary is also typically pursued. This chapter begins by surveying the extent of African language vulnerability, reasons for language endangerment, and factors specific to African sign languages. An overview of the major players in African language revitalization is followed by case studies of revitalization of small, medium, and large languages. Problems with the current emphasis on orthography and literacy development in language revitalization are also discussed.

Many thanks to Friederike Lüpke who made valuable comments on an early draft of this chapter and also to Document Delivery Services at Northern Arizona University’s Cline Library for locating the many references cited in this paper.


2. Survey of language endangerment in Africa Before looking at language revitalization in Africa, it is important to first understand the extent of language endangerment on the continent and the factors which have led to language shift. I also survey the special situation of African sign languages since this has largely been ignored in previous surveys.

2.1. Extent of Endangerment Nearly a third of the world’s languages are in Africa and some 10–28% of these are seriously endangered, based on conservative estimates (Brenzinger and Batibo 2010; Simons and Lewis 2013; Lewis, Simons, and Fennig 2016). When factors such as overall population size are considered, a much higher percentage of languages in Africa must be considered to be at risk (Lüpke 2015; Sands 2017). It has been suggested that a language must have at least 100,000 speakers to be considered “Safe” (Lee and Van Way 2016, 281), which would mean that some two-thirds of African languages must be considered “unsafe.” In fact, numerous instances can be found of languages considered to be “vigorous” by Ethnologue (Lewis et al. 2016) which have recently shown signs of language shift, e.g., Vute (Mutaka 2008), Langi (Dunham 2016, 261), and Oko (Adegbija 2001). Languages of other continents with similar population profiles and signs of shift are widely recognized as endangered, e.g., Irish, Navajo, Frisian, and Hopi. The number of threatened languages in Africa is difficult to pinpoint precisely since language documentation efforts tend to have the effect of revealing the existence of languages that were either unknown to linguists or thought to be dialects, e.g., Daats’iin (Ahland 2016) in Ethiopia, Tsupamini (Blench 2012) in Nigeria, and Sasi (Collins and Gruber 2014) in Botswana.

2.2. Reasons for Endangerment The causes of language endangerment in Africa are varied but are similar to those seen in other parts of the world. What is particular to the situation in Africa is the pressure on vulnerable languages by other minority languages; language shift tends to be from a minority language to a locally dominant language rather than to a national or global language (Batibo 2005, 2013a). Surveys of language shift in Africa include: Nyombe (1997), Batibo (2005), Childs (2006), Dimmendaal and Voeltz (2007), Idiata (2009), and Lüpke and Storch (2013). Case studies of language shift include Sommer (1995), Bagamba (2007), and Robson (2011). There are many languages in Africa which are spoken by very small populations which still manage to be socioeconomically dominant over even smaller populations.

Nkọrọọ, spoken in Nigeria by about 2,000 people, is “severely endangered” (Obikudo et al. 2015), yet Defaka speakers are shifting to Nkọrọọ. In Botswana, ǃXóõ (Taa), ǂHoan (Nǃaqrike), and Sasi speakers are shifting to Shekgalagari, even though speakers of that language are undergoing shift to Setswana (Lukusa 2000). In other cases, minority languages are undergoing shift to regionally dominant languages spoken by much larger populations. For instance, in Tanzania, many Hadza speakers to the west of Lake Eyasi have shifted to Sukuma, a Bantu language that is spoken by a fairly large population (> 7 million). Sukuma is itself under pressure from the national language, Kiswahili, and younger speakers have lost a significant number of lexical items known to older speakers (Batibo 2013b). Ngasa in Tanzania have completely shifted to Chaga (Legère 2012), and Akie have largely shifted to Maa and Nguu as well as Kiswahili (Heine, König, and Legère 2016). Many other Tanzanian languages are seeing shift directly to Kiswahili, e.g., Ngoni (Rosendal 2016) and Vidunda (Legère 2007). Lack of prestige of minority languages certainly is a factor in language shift. Former colonial languages play a role in lowering the prestige of local languages (Tamanji 2008; Connell 2015) even when they are not the direct targets of language shift. The use of a global language, lingua franca, or urban vernacular may be valued as a way of indexing one’s outlook as being modern, anti-tribalist, urban, and sophisticated; local languages may conversely be associated with the disruption of ethnic equality or national unity (Orcutt-Gachiri 2009, 2013; Anchimbe 2013). In Kenya, even a very robust language such as Gĩkũyũ is affected by shift to Kiswahili and Sheng (Orcutt-Gachiri 2009).
Positive attitudes toward dominant languages and ambivalence toward local languages can be seen in many studies across the continent, e.g., in Cameroon (Mundt 2016), Nigeria (Senayon 2016a), and Sudan (Garri and Mugaddam 2015). It should be kept in mind that global languages may be starting to pose a greater threat to African language maintenance than has been the case in the past (Connell 2015). Local languages may be seen as a barrier to acquiring English, French, or other languages which are perceived as being necessary for international communication and travel (Anchimbe 2013; Mitchell 2015). Even local educational policies tend to favor ex-colonial languages in most countries. In public domains such as school, the workplace, and politics and in the media (e.g., Nwagbara 2013), even fairly large languages may be marginalized compared to majority or state-sponsored languages. Changes in economic and political activity have led to disruptions in linguistic ecologies, favoring some languages over others. Languages spoken by hunter-gatherers and pastoralists are vulnerable as people move away from these traditional economic pursuits (Batibo 2008). Population movements toward urban areas have certainly caused language shift away from minority languages (Mufwene 2012), but recent studies have highlighted the fact that rural populations are now also being affected by language shift (e.g., Gibson and Bagamba 2016; Rosendal 2016). Population movements due to war and political instability can also lead to language instability. Refugees from the Nuba Mountains living in Khartoum face pressure to shift to Arabic (Kajivora 2015), for instance. Refugees from conflicts in the Democratic Republic of Congo and South Sudan encounter pressure to shift to Kiswahili.


2.3. Sign languages There are at least twenty-three different African sign languages (Kamei 2004) including Tunisian Sign Language, Tanzanian Sign Language, Gambian Sign Language, etc. Sign languages are relatively vulnerable minority languages based on population size. For the most part, efforts to develop sign languages are not revitalization efforts, per se. Language development and institutional support are generally seen as positive indicators for sign language vitality (Bickford, Lewis, and Simons 2015), but support from development projects and non-governmental organizations (NGOs) can pose a threat to indigenous sign languages. For instance, Ugandan Sign Language is influenced by signed English and American Sign Language (Lutalo-Kiingi and De Clerck 2015) and foreign signs have influenced West African Sign Languages (Nyst 2010; Nyst, Sylla, and Magassouba 2012). Missionaries using Finnish Sign Language and Swedish Sign Language influenced Eritrean Sign Language, and recent development efforts have focused on removing these lexical influences from the language (Moges 2015). Ghanaian Sign Language, based on American Sign Language, is used in schools for the deaf in Ghana attended by deaf users of Adamarobe Sign Language (Kusters 2014). The lack of institutional support for Adamarobe Sign Language is not the most serious threat to the language. Rather, a proscription against marriages between deaf people issued in 1975 by Chief Nana Kwaakwaa Asiampong has meant that fewer deaf children have been born than in the past, increasing the likelihood that the language will not be passed on to future generations (Kusters 2012). Algerian Jewish Sign Language is used by a small percentage of a small community from Ghardaia, Algeria, now living in Israel and France (Lanesman and Meir 2012).
Although the language has survived for some fifty years in Israel alongside Israeli Sign Language, most users marry outside the Algerian diaspora community (Lanesman and Meir 2012), meaning that the language is not being passed on to future generations.

3. Who is involved in African language revitalization? Language revitalization in Africa has been pursued through a number of different channels. The major actors have been individuals, grassroots organizations, missionaries, and academics (who themselves may be supported by NGOs, governments, or universities). Generally, governments have not played a major role in the funding of revitalization projects, but language policies in individual nations vary greatly in the extent to which minority language rights are given constitutional protection. For instance, Tamazight (Berber) language activism in Algeria has been threatened by government suppression (Mezhoud 2005), but the Namibian government through the Namibian Institute for Educational Development (NIED) aims to support education in mother tongue languages from grades 1–3 (Hays 2011). Websites such as Facebook and YouTube also play a role in language revitalization, particularly with diaspora populations. Efforts by certain individuals have been, and continue to be, important to the documentation, development, and revitalization of African languages. In 1962, Pitoro Seidisa risked trying to develop a Yeyi orthography with Ernst Westphal of the University of Cape Town and was arrested upon his return to Botswana (Nyati-Ramahobo 1998; Mooko 2006, 120). More recently, Gopolang Otukiseng Maropamabi Sakuze created a Facebook page “LETS LEARN SHIYEYI” in 2011 which had more than 216 users in 2012 (Veith 2012). Esther Senayon has documented her efforts to increase the use of Ogu in her family, which include translating English songs and hymns into Ogu, and giving her children Ogu names (Senayon 2016b). Documentation of Tima began when a spokesman for the Tima contacted linguist Gerrit Dimmendaal in 2003 in order to help revitalize the language (Dimmendaal 2015). Grassroots efforts to promote language revitalization include those led by small groups of people, communities, and organizations. Countries which have seen a great many grassroots efforts at language revitalization include Cameroon (Mitchell 2015), Nigeria (Alimi 2016a, 2016b), Kenya (Ogweno 2016), and Botswana (Mooko 2006; Alimi 2016a, 2016b). In Botswana, for instance, the Yeyi language is promoted by the Kamanako Association (Alimi 2016a, 2016b). The majority of efforts are focused on the development of literacy and language materials, but some organizations have wider aims. A number of community-driven organizations have been formed to promote Nubian languages. The Nubian Language Society (Taamenn Orban) and Nubian Studies and Documentation Centre (NSDC) are both interested in literacy and language materials (Jaeger 2008).
The Dongola Association for Nubian Culture and Heritage in Sudan, however, is focused on revitalizing traditional music and culture (Jaeger 2008). A different approach to Nubian revitalization has been the suggested use of Nubian toponyms in the Egypt-Sudan border area (Bell and Sabbār 2011; Sabbār 2011). In Tunisia, the Association de Sauvegarde de la Nature et de Protection de l’Environnement du Douiret (ASNAPED) was founded to preserve local culture (including Tunisian Berber) and to promote ecotourism (Gabsi 2011). Missionaries have been at the forefront of African language description and development and also have increasingly played a role in language revitalization. The Summer Institute of Linguistics (SIL) supports language development for numerous small “vital” languages which should perhaps be considered endangered languages. In Botswana, the Lutheran Bible Translators have a Yeyi language project (Veith 2012). The Naro Literacy Project is supported by the Reformed Church of D’Kar and the Kuru Development Trust (Visser-Wiegel 2001). Funding agencies have played a vital role in promoting the documentation of endangered languages by foreign and local academics. The Endangered Languages Documentation Programme (ELDP) and Endangered Language Fund (ELF) are two programs which have supported a number of researchers at African universities working on endangered African languages. European, American, and Japanese research agencies have also funded numerous research and documentation projects on endangered African languages.

An increasing number of academics come from endangered-language communities. In quite a number of cases, individuals have pursued training in linguistics with the particular goal of documenting the endangered language of their community, e.g., Admire Phiri (2015) working on Tjwao and Blesswell Kure (Kure and McGregor 2011) working on Shua. Linguists Andy Chebanne and Kemmonye Monaka, of the University of Botswana, work on documenting their vulnerable mother tongues (Kalanga (Chebanne and Schmidt 2010) and Shekgalagadi (Monaka 2009)) as well as doing work on highly endangered Khoe and San languages of Botswana (e.g., Monaka and Chebanne 2005). Language revitalization often involves multiple actors, as in the case of Yeyi mentioned above. Missionary linguists are often also academics, and linguists may also be community members or involved in community organizations or NGOs. Because it is oftentimes difficult for academics living abroad to have connections to individual language communities in Africa, linguist Roger Blench has created a YouTube channel where speakers of Nigerian languages in need of documentation and revitalization such as Mupun, Mwaghavul, and Ywom can connect with linguists. In the case of Ekegusii, language development has been supported by the Kenya Institute of Education (a governmental organization), the East African Centre for Professional Studies and Local Language Promotion (a community organization), and the Institute on Field Linguistics and Language Documentation (InField) (a group established by linguists) as well as by individual linguists and community activists (Oiruria and Clayre 2010; Nash 2017). Diaspora populations have played a role in the revitalization and maintenance of some African languages. The Paris-based NGO Congrès Mondial Amazigh advocates for Berbers at the United Nations, for instance, and radio programs and music in Berber are produced in France (Mezhoud 2005).
The endangered KiAmu dialect of Kiswahili is used in Facebook posts by speakers on Lamu, and outside the dialect area and country (Hillewaert 2015). There is also a Facebook page for Maay, one of the languages spoken by refugees from Somalia.

4. Case studies of language revitalization of small languages Revitalization efforts involving languages spoken by very small populations face many obstacles, as shown in this section which looks at case studies from Kenya, the Kalahari desert, and other parts of the continent.

4.1. Issues in revitalizing small languages Small languages are defined here as those with 0–10,000 mother-tongue speakers. Some of these languages are obviously more endangered than others, but all face a number of challenges when it comes to language revitalization. Languages spoken by very small populations such as these are less likely to have a mother-tongue speaker with academic training in linguistics, and less likely to receive significant governmental support for literacy efforts and curricula development. Although some of these languages have received a great deal of attention from linguists, their support tends to be in the form of language documentation rather than in effective community-based, long-term language revitalization programs. Communities facing loss of land and economic instability may not prioritize language maintenance. In other cases, there is a lack of support and knowledge of how to go about revitalizing a language, particularly if it is already obsolescent. Often, language development is a first step used to counteract the negative attitudes that are frequently associated with languages spoken by historically marginalized groups. For instance, languages of formerly enslaved peoples in Ethiopia are still stigmatized; in the Oromiya region, Mao are ridiculed for using their language outside the home (Meckelburg 2015). In the Benishangul-Gumuz region, however, primary school books are being produced in Komo and Mao (Gwama), supporting the use of the languages in public (Meckelburg 2015). There is a danger that communities may be overly optimistic about the abilities of linguists to actually “save” a language. Schöpperle (2011) interviewed Chairman Ngoisolo of Ngapapa village in Tanzania who assured the researcher that the Akie language “shall be re-installed soon” given recent interest in the language by researchers. A grammar of the language has been published (König, Heine, and Legère 2015), along with audio and video documentation of the language (Legère 2015), but it is not yet clear that these efforts have encouraged any Akie to learn the language or to use it more frequently. The Rivers Readers Project (cf. 
Adegbija 1997) targeted small languages in Nigeria, but it is unclear how much the presence of basic literacy materials has done to promote language maintenance. Several small languages of Kenya and of the Kalahari have been the focus of community-based revitalization efforts, but there have been projects involving West African languages as well. Some projects involve elders who still speak the languages fluently while other projects are dependent on materials in related languages. Most projects involving small languages focus on language rights and literacy. An innovative approach to literacy in Mani (a small language spoken in Guinea) will be described below in section 7.

4.2. Kenyan languages Bong’om is a Nilotic language of Western Kenya that may have only a few, if any, speakers remaining (Mberia 2014). The language was already endangered in 1953 when Ojambo arap Kishero wrote to the district officer asking for assistance in documenting Bong’om history but was told that the government could not print books in the language or develop an orthography (Mberia 2014). There is a community-based Bong’om Language Project (Chepkotit arap Mungu 2011), which is trying to revitalize the language and culture, first by seeking to have the language recognized by the government. Bong’om has been regarded as one of the minor dialects of Sabaot (Larsen 1991), one of the languages supported by Kenya’s language-in-education policy. Even for Sabaot, there are challenges in implementing the policy since language shift has meant that many Sabaot teachers are not as fluent in the language as they are in Kiswahili or English (Jones and Barkhuizen 2011). In the 1930s, the Yaaku decided to abandon their language in favor of Maasai, but times have changed and there is now a desire to revitalize the language as well as the culture (Blonk, Mous, and Stoks 2005; Carrier 2011). The declaration by UNESCO that Yaaku was extinct (Mathenge 2002) was what reportedly spurred community members and activists Jeniffer Koinante and Manasseh ole Matunge to work on language revitalization (Carrier 2011). Koinante and Manasseh have engaged with media organizations and NGOs, academics, and government officials, and Manasseh has taught words, phrases, and songs to children in a Kuri Kuri school (Carrier 2011). The Elmolo of Kenya now speak Samburu but once spoke a Cushitic language (Tosco 2015). A former teacher and school official, Michael Basili, founded the Gura Pau community-based organization (CBO) and led efforts to revitalize the Elmolo language (Tosco 2015). Basili worked with elders to document lexical items beginning in the 1980s, and Gura Pau received support from the Christensen Fund which resulted in the development of additional language materials (Omondi 2008; Omondi and Otieno 2008, cited in Tosco 2015). Only some traces of Cushitic morphology survive in the Elmolo language being revitalized; Samburu morphology and phonology are dominant (Tosco 2015). Elmolo language revitalization is an ongoing process. 
It involves decisions about removing words with Samburu cognates from the lexicon (even when there is evidence that the word did once occur in Cushitic Elmolo) (Tosco 2015), showing that language revitalization is a means of indexing ethnic identity.

4.3. “Khoesan” languages There are several language revitalization efforts targeting Tuu, Kx’a, and Khoe (Khoesan, or Khoisan) languages of southern Africa. Languages targeted range from the very seriously endangered (Nǀuu, Tjwao) to those with small populations still learned by children as a mother tongue (Juǀ’hoan, Naro). Another form of Khoesan language revitalization has taken place in South Africa where languages such as ǀXam and Cape Khoe were once spoken; Khoekhoegowab teachers from Namibia have received funding from Khoesan activists and the Western Cape government to teach the language in South Africa (Williams 2011; Brown and Deumert 2017). Khoesan languages are among the most marginalized languages on the continent and changes in beliefs about the languages are important first steps to revitalization. For instance, one belief I have heard often is that the language is not a “real” language since it “cannot be written” (because of the clicks).

For the most seriously endangered Khoesan languages, changes in attitudes toward the endangered languages have begun with linguistic documentation and community development. In the case of Tjwao, documentation has only recently begun (e.g., Anderson et al. 2014; Ndlovu 2014; Phiri 2015). In the case of Nǀuu, speakers had a negative opinion of the language which started to turn around after the end of Apartheid through the documentation project led by Hugh Brody, which is archived at the University of Cape Town (Brody 2016a, 2016b). Subsequent documentation of the language followed, conducted by multiple linguists (including myself) from the United States, Europe, and South Africa, all using different orthographic schemes. Levi Namaseb, a Khoekhoegowab speaker from the University of Namibia, taught one generation of children how to spell and pronounce words in Nǀuu and say basic phrases. The most recent generation of children (grandchildren of the last mother-tongue speakers) have been taught by Katrina Esau (known as Geelmeid), along with her granddaughter Claudia du Plessis and Mary-Ann Prins. Katrina Esau’s efforts have won her national recognition: she received the National Order of the Baobab in silver in 2014 and was named the chief of the ǂKhomani/Western Nǀuu in 2015 (Witbooi 2015). In the case of Nǀuu, language reclamation has gone hand in hand with land reclamation and cultural revitalization (Chennells and du Toit 2004). For Juǀ’hoan and Naro, language revitalization is more a matter of language development than revitalization. The populations that speak these languages are small, but there has been relatively little language shift. Juǀ’hoan in Namibia has been supported by a mother-tongue education project (cf. Cwi and Hays 2011), which has been supported by the government, various NGOs, and academics (particularly anthropologist Megan Biesele, and linguists Patrick Dickens and Kerry Jones). 
Naro language development has been led by missionaries Coby and Hessel Visser and Naro language preschools have been sponsored by the Kuru Development Trust (Hitchcock 2013). The use of Naro in schools appears to be positively affecting attitudes about the language (Gabanamotse-Mogara and Batibo 2016). ǃXóõ is similar to Naro in that it is spoken in Botswana and acquired by children, yet it has not been the focus of a literacy campaign and language attitudes are relatively low (Gabanamotse-Mogara and Batibo 2016).

4.4. Other small languages Malawian Ngoni (Chingoni) is undergoing shift to Chitumbuka and Chichewa (Kishindo 2002). The Abenguni Revival Association was established in 1998 to revitalize Ngoni language and culture (Kishindo 2002). However, the association teaches language classes using Zulu language materials rather than by trying to expand Chingoni as still spoken in some parts of the Mzimba district of Malawi (Kishindo 2002), so these revitalization efforts do not directly target the endangered lect. Eegimaa is spoken in Senegal (Sagna 2016). In addition to language documentation and literacy efforts, Sagna collaborated with community members in order to air radio programs featuring the Eegimaa language, which were broadcast in 2015 (Sagna 2016). Revitalization of the Safaliba language of Ghana began after Mr. Edmund Kungi Yakubu contacted the Ghana Institute of Linguistics, Literacy and Bible Translation (GILLBT) asking for assistance in developing the language (Schaefer 2015). Mr. Iddi Bayaya was trained by GILLBT as a literacy facilitator and he has written a number of texts in Safaliba, and taught literacy classes to adults and children (Schaefer 2015).

5. Case studies of language revitalization of medium-sized languages Medium-sized languages may be roughly defined as those having between 10,000 and 1 million speakers. Given their relatively large size, these languages tend to miss out on funding opportunities available to document smaller languages that are perceived as more severely endangered. For a large number of medium-sized languages, language documentation and development is rudimentary and recognition of language shift and contraction is relatively recent. Revitalization efforts on medium-sized languages tend to come from grassroots community organizations and/or missionary-led organizations. For example, in Nigeria, community-led efforts to preserve Oko include an annual festival, Ogori Descendants Union meetings, and an Oko language magazine (Alimi 2016a), while efforts to promote literacy in the Tyap language are sponsored by the Nigeria Bible Translation Trust, Jos (Byat and Bivan 2015). In Mozambique, the Xironga language has been seen as threatened since it has lost ground to Portuguese and Xichangana (Lopes 2001). Ngiyana, or the Association of Natives and Friends of Maputo, was established in 1995 to promote the language as well as traditional culture, history, and values (Lopes 2001). A few years after the establishment of Ngiyana, the Maputo Municipal Assembly agreed to adopt the usage of Xironga in governmental meetings (Lopes 2001). Maurice Tadadjeu and others at the University of Yaoundé and SIL have played a major role in promoting minority language development in Cameroon since the 1980s (Wiesemann et al. 1983; Tadadjeu and Sadembouo 1984; Tadadjeu, Sadembouo, and Mba 2004). Community organizations have been founded across Cameroon to promote and revitalize languages such as Basaa, Duala, and Nugunu (Mitchell 2015). The Vute language is promoted by a cultural association called Assovute, which reaches out to speakers of other languages who have Vute ancestry (Mutaka 2008). 
Interest in language revitalization generally goes along with an interest in cultural revival. The Vute, for instance, are eager to revive traditional dances and traditional medicine as well as their language (Mutaka 2008).

Tonga is spoken in both Zambia and Zimbabwe, but cultural ties between the communities were disrupted due to the population displacement caused by the construction of the Kariba dam in the 1950s (Maseko and Moyo 2013). As is the case for many other cross-border languages, Tonga speakers are a minority group in each country where the language is spoken. Several grassroots efforts are involved in promoting/revitalizing the Tonga language: the Tonga Language and Cultural Committee (TOLACCO), the Basilwizi Trust and the Zimbabwe Indigenous Languages Promotion Association (ZILPA) (Maseko and Moyo 2013). The Suba language of Kenya is undergoing a shift to Dholuo (Obiero 2008). Suba language development has been supported by SIL, BTL (Bible Translation and Literacy), and the Kenya Institute of Language (Obiero 2008; Mberia 2014). Suba language lessons are among the programs broadcast from a small Suba radio station, Ekialo Kiona Suba Youth Radio Station 99 FM (Fox 2014, cited in Mberia 2014). Many medium-sized African languages have experienced effects of language contact and shift including changes to phonology, to lexicon, and to syntax. For instance, Urhobo, an Edoid language of Nigeria, has experienced rapid changes due to language contact (Aziza 2015). Nigerian Pidgin is the dominant language for those under age 30, rather than Urhobo (Aziza 2003), but the language is not considered threatened in Ethnologue 19 (Lewis et al. 2016). Loanwords from Kiswahili into Tanzanian Ngoni are so prevalent that they are even affecting basic vocabulary (Rosendal and Mapunda 2014) and knowledge of Ngoni proverbs and stories is declining (Mapunda 2015). Language communities likely vary in their awareness of language change and shift, just as they vary in their attitudes toward the same. 
Language shift affecting medium-sized languages may proceed rapidly, and language vitality assessments more than one generation (~ 25 years) old must therefore be considered unreliable indicators of the vitality of a language.

6. Case studies of language revitalization of large languages Language shift has affected large languages just as it has affected smaller ones. This has been particularly noticeable in the attrition of certain semantic domains such as indigenous numeral systems in languages such as Fulfulde, Yoruba, and Igbo (Ikọtun and Akanbi 2013; Muhammad and Alkali 2013; Prezi 2013). Languages such as Tiv and Fulfulde have seen attrition of genres such as folktales, songs, riddles and in more traditional genres such as bridal negotiations and funeral inquests (Moore 2006; Swande and Udu 2015; Vande-Guma 2015). Urban varieties of languages such as Berber and Wolof are characterized by code-switching and code-shifting, and are endangering more traditional, rural varieties (El Kirat 2001; McLaughlin 2015).

Much like Wolof, the Igbo language is undergoing change due to code-switching and code-shifting. In one study (Anyanwu 2016), for instance, more than 50% of respondents in the 25–34-year-old category were unable to produce Igbo agentive nouns for meanings such as “fisherman,” “blacksmith,” and “carpenter.” The use of Igbo has declined despite the use of the language in secondary education (Anyanwu 2015) and despite the presence of an orthography harmonizing dialects (Ekwueme 2011). Efforts to revitalize the language include the Society for the Promotion of Igbo Language and Culture founded by F. C. Ogbalu, and Asusuigbo Teta founded by Professor Eleazu (Ani 2012). Another individual who has promoted the language is radio presenter Uzoma Okpo, who encouraged programs to be broadcast in Igbo (Ani 2012). He also called upon preachers to use Igbo in sermons (Ani 2012). At least one radio station, 101.5 Unity FM, created a daily Igbo revival advertisement (Ani 2012). Although Ekegusii has over 2 million speakers (Lewis et al. 2016), it is a minority language of Kenya spoken by less than 6% of the population (Nash 2017). And, despite the introduction of Ekegusii into primary schooling (Oiruria and Clayre 2010), as recently as 2010, older schoolchildren have been shamed for speaking Ekegusii in school (Nash 2017). The community’s desire to revitalize the language has led to the production of a dictionary (Bosire and Machogu 2013), and an effort to support local musicians to work with college students to record music in Ekegusii (Oiruria and Clayre 2010). There are large Berber languages such as Kabyle and Tamazight as well as very small Berber languages such as Awjila, Ghomara, and Zenaga (Lewis et al. 2016). Berber languages are threatened by factors such as urbanization, emigration, and language shift, primarily to Arabic (Boukous 1995), but also to French, Spanish, and English (El Kirat El Allame 2008). 
Berber language revitalization has been undertaken by the Berber Academy (Académie Berbère, or Agraw Imazighen) in Paris which promotes the use of the neo-Tifinagh script (Mezhoud 2005) and a unified Berber identity. The Royal Institute of Amazigh Culture in Morocco promotes the development of curricula in Berber (Sadiqi 2011).

7. Orthographic development and literacy Much of the focus of language revitalization efforts is on the development of an orthography, typically as a first step toward the development of a grammar and text collection. Orthographies can valorize a language in the eyes of the larger society, showing that a minority language is a modern language that can be written, and therefore used in schools and commerce, and other aspects of public life. The development of an orthography is a first step toward literacy, which is attractive in that it allows community members to produce language materials which are meaningful to the community (including histories, public health literature, religious texts, etc.) without the direct assistance of linguists. The revitalization of ancient scripts such as the Old Nubian script (Jaeger 2008; Bell 2014) or Tifinagh can promote ties to past cultural achievements and bring together people speaking different dialects or languages. Written materials provide tangible evidence of the presence of a language, and as such, they may be mistakenly understood to be an indication of language revitalization or maintenance. The “obsession with literacy” as Lüpke (2015) refers to it, typically links language revitalization efforts to a Western model of education. Western models of literacy presume a high level of institutional support which is only available to a small number of (typically majority) African languages. However, urban or Western-oriented practices of literacy may conflict with the sociolinguistic factors which have promoted language maintenance, namely, use of the language among intimates, in traditional cultural or economic activities, in informal settings, or among women (cf. Hoffman 2006). The resistance of many Africans to mother-tongue education (Tamanji 2008) also makes this approach problematic for language revitalization. Other objections to an emphasis on literacy have been raised in several recent studies. As Rohloff and Henderson (2015, 187) point out, “fetishizing literacy is counterproductive” to expanding opportunities for language use since it tends to restrict the language to the classroom, or those who have attained literacy. Classroom use of language “thrives on prescriptivism and linguistic purism through standard languages” (Ameka 2015, 26), which can discourage creativity, variation, and ease of use. Often communities prefer audio or video documentation of their language rather than written texts (e.g., Morrison 2013; Essegbey 2015). 
People can often express themselves in their mother tongue using orthographic practices from other languages and may prefer this to learning a standardized version, as in the case of Nyagbo, a small Ghanaian language (Essegbey 2015). The time and effort that it takes to learn a standardized orthography may make people feel disconnected from their own language. Lüpke (2015, 2018) points out that practices of literacy in West Africa are highly multilingual, involving Ajami (Arabic) and Roman scripts; orthographic practices are innovative and flexible, rather than standardized. How literacy is practiced in Africa may thus run counter to revitalization efforts that focus on standardization and linguistic purity. On a practical level, the choice of a standardized orthography may privilege one dialect over another. Yet, the use of different orthographies for mutually intelligible lects can also be divisive (Chebanne and Mathangwane 2009). The Centre for Advanced Studies of African Society (CASAS) in Cape Town has promoted a number of harmonized orthographies in order to make it possible to develop language materials that can be shared across multiple languages (e.g., Chebanne 2016). But, as Makoni (2016, 223) points out, harmonization “may even accentuate differences within communities.” The supposed advantages for language development of a harmonized orthography may not benefit language revitalization if individuals do not feel that the orthography represents their ethnolinguistic identity. The unified Gbe orthography, for instance, “is seen as alien by Ewe speakers” (Ameka 2015, 24). In work revitalizing Mani, a language spoken in a small coastal area of Guinea, Childs (2017) tested several different methods of approaching literacy. The attempt to teach Mani literacy through classroom-based methods was an abject failure, but other methods addressing computer literacy and photography literacy were more successful. These methods emphasized ideas of emergent literacy, learner-led education, and community-based literacy goals (Childs 2017). Unlike many other literacy efforts, Childs’s efforts drew on recent work in education theory such as The Busy Intersection Model, Practice Engagement Theory, and the Hole-in-the-Wall initiative (cf. Childs 2017).

8. Conclusion There are African languages at every stage of endangerment, from moribund, with no fluent speakers, to very large languages which are beginning to undergo a shift. Political and socioeconomic pressures favor European languages over African languages, even languages spoken by millions of people. Languages of all sizes have been the focus of revitalization efforts. Language revitalization in Africa has largely been approached through traditional methods of language development with a heavy emphasis on literacy. These efforts are hampered by a lack of documentation and funding, even when language policies are favorable. The success of revitalization programs in Africa remains to be seen, but it seems fair to say that there is a great need for innovative approaches that are more in tune with local patterns of language use.

References

Adegbija, Efurosibina. 1997. “The Identity, Survival, and Promotion of Minority Languages in Nigeria.” International Journal of the Sociology of Language 125: 5–27. Adegbija, Efurosibina. 2001. “Saving Threatened Languages in Africa: A Case Study of Oko.” In Can Threatened Languages Be Saved? Reversing Language Shift, Revisited: A 21st Century Perspective, edited by Joshua A. Fishman, 284–308. Clevedon, UK: Multilingual Matters. Ahland, Colleen. 2016. “Daats’iin, a newly identified undocumented language of western Ethiopia: A preliminary examination.” In Selected Proceedings of the 46th Annual Conference on African Linguistics, edited by Doris L. Payne, Sara Pacchiarotti, and Mokaya Bosire, 417–448. Berlin: Language Science Press. Alimi, Modupe M. 2016a. “Micro Language Planning and Cultural Renaissance in Botswana.” Language Policy 15(1): 49–69. Alimi, Modupe M. 2016b. “Micro Language Planning, Minority Languages and Advocacy Groups in Botswana.” In Vanishing Languages in Context: Ideological, Attitudinal and Social Identity Perspectives, edited by Martin Pütz and Neele Mundt, 21–36. Frankfurt am Main: Peter Lang. Ameka, Felix K. 2015. “Unintended Consequences of Methodological and Practical Responses to Language Endangerment in Africa.” In Language Documentation and Endangerment in Africa, edited by James Essegbey, Brent Henderson, and Fiona McLaughlin, 15–35. Amsterdam: John Benjamins.

Anchimbe, Eric A. 2013. Language Policy and Identity Construction: The Dynamics of Cameroon’s Multilingualism. Amsterdam: John Benjamins. Anderson, Gregory D. S., Davy Ndlovu, Admire Phiri, Ngcoli Sibanda, Jeffrey Wills, and K. David Harrison. 2014. Tjwao Talking Dictionary. Salem, OR: Living Tongues Institute for Endangered Languages. http://talkingdictionary.swarthmore.edu/tjwao. Ani, Kelechi Johnmary. 2012. “UNESCO Prediction on the Extinction of Igbo Language in 2025: Analyzing Societal Violence and New Transformative Strategies.” Developing Country Studies 2(8): 110–118. Anyanwu, Ogbonna. 2015. “Stemming the Tide of Indigenous Language Endangerment in Nigeria by Teaching Indigenous Knowledge Systems in Schools: The Igbo Language Example.” Kiabara Journal of Humanities, University of Port Harcourt 21(1). Anyanwu, Ogbonna. 2016. “Endangered Indigenous Skills and Endangered Indigenous Vocabulary Items: Evidence from Igbo Indigenous Agentive Nouns.” Poster presented at ACAL 47, March 23–26, 2016, Berkeley, California. Aziza, R. O. 2003. “Threatened Languages and Cultures: The Case of Urhobo.” In Actes du 3e Congrès Mondial de Linguistique Africaine: Lomé 2000, edited by Kézié Kyenzi Lébikaza, 353–361. Cologne, Germany: Rüdiger Köppe. Aziza, Rose Oro. 2015. “The Effects of Language Contact on the Urhobo Phonology.” In Language Endangerment: Globalisation and the Fate of Minority Languages in Nigeria: A Festschrift for Appolonia Uzoaku Okwudishu, edited by Ozo-mekuri Ndimele, 575–583. Port Harcourt, Nigeria: M & J Grand Orbit Communications. Bagamba, B. Araali. 2007. “A Study of Language Shift in Rural Africa: The Hema of the North-East of the Democratic Republic of Congo.” PhD diss., Essex University, Essex, UK. Batibo, Herman M. 2005. Language Decline and Death in Africa: Causes, Consequences and Challenges. Clevedon, UK: Multilingual Matters. Batibo, Herman M. 2008.
“Poverty as a Crucial Factor in Language Maintenance and Language Death: Case Studies from Africa.” In Language and Poverty, edited by Wayne Harbert, Sally McConnell-Ginet, Amanda Miller, and John Whitman, 23–36. Clevedon, UK: Multilingual Matters. Batibo, Herman. 2013a. “Language Shift as an Outcome of Expansive and Integrative Language Contact in Africa.” In Language Contact: A Multidimensional Perspective, edited by Kelechukwu U. Ihemere, 129–139. Newcastle upon Tyne: Cambridge Scholars Publishing. Batibo, Herman M. 2013b. “Preserving and Transmitting Indigenous Knowledge in Diminishing Bio-Cultural Environment: Case Studies from Botswana and Tanzania.” African Study Monographs 34(3): 161–173. Bell, Herman. 2014. “A World Heritage Alphabet: The Role of Old Nubian in the Revitalization of the Modern Nubian Languages.” In The Fourth Cataract and Beyond: Proceedings of the 12th International Conference for Nubian Studies, edited by J. R. Anderson and D. A. Welsby, 1189–1194. Leuven, Belgium: Peeters. Bell, Herman and Halīm Sabbār. 2011. “Nubian Geographical Names on Both Sides of an International Border.” In Trends in Exonym Use: Proceedings of the 10th UNGEGN Working Group on Exonyms Meeting, Tainach, April 28–30, 2010, edited by Peter Jordan, Hubert Bergmann, Caroline Burgess, and C. Cheetham, 295–311. Hamburg, Germany: Verlag Dr. Kovač. Bickford, J. Albert, M. Paul Lewis, and Gary F. Simons. 2015. “Rating the Vitality of Sign Languages.” Journal of Multilingual and Multicultural Development 36(5): 513–527.

Blench, Roger. 2012. “Research and Development of Nigerian Minority Languages.” In Advances in Minority Language Research in Nigeria, vol. 1, edited by Roger M. Blench and Stuart McGill, 1–17 (Kay Williamson Educational Foundation, 5). Cologne, Germany: Rüdiger Köppe. Blonk, Matthijs, Maarten Mous, and Hans Stoks. 2005. “The Last Speakers—Yaaku Language Saved from Extinction.” http://www.matthijsblonk.nl/paginas/YaakuENG.htm [translated by Mous, Maarten, H. Stoks, and M. Blonk. 2005. “De laatste sprekers. I Indigo—Tijdschrift over Inheemse Volken 9–13]. Bosire, Kennedy and Gladys Machogu. 2013. Authoritative Ekegusii Dictionary. Mombasa, Kenya: Ekegusii Encyclopedia Project. Boukous, Ahmed. 1995. “La langue berbere: Maintien et changement.” International Journal of the Sociology of Language 112: 9–28. Brenzinger, Matthias and Herman Batibo. 2010. “Sub-Saharan Africa.” In Atlas of the World’s Languages in Danger, 3rd ed., edited by Christopher Moseley, 20–25. Paris: UNESCO Publishing. Brody, Hugh. 2016a. ǂKhomani San Hugh Brody Archive. Cape Town: University of Cape Town. http://digitalcollections.lib.uct.ac.za/khomani. Brody, Hugh. 2016b. Gazing at the Stars. DVD. N|uu School Films, University of the Fraser Valley, and Splash Films. http://digitalcollections.lib.uct.ac.za/khomani/language. Brown, Justin and Ana Deumert. 2017. “‘My Tribe Is the Hessequa. I’m Khoisan. I’m Africa’: Language, Desire and Performance Among Cape Town’s Khoisan Language Activists.” Multilingua: Journal of Cross-Cultural and Interlanguage Communication 36(5): 571–594. Byat, Grace Caleb and Amos D. Bivan. 2015. “Towards the Preservation of the Tyap Language.” In Language Endangerment: Globalisation and the Fate of Minority Languages in Nigeria: A Festschrift for Appolonia Uzoaku Okwudishu, edited by Ozo-mekuri Ndimele, 99–106. Port Harcourt, Nigeria: M & J Grand Orbit Communications. Carrier, Neil. 2011.
“Reviving Yaaku: Identity and Indigeneity in Northern Kenya.” African Studies 70(2): 246–263. Chebanne, Andy. 2016. “Writing Khoisan: Harmonized Orthographies for Development of Under-Researched and Marginalized Languages: The Case of Cua, Kua, and Tsua Dialect Continuum of Botswana.” Language Policy 15(3): 277–297. Chebanne, Andy and Joyce Mathangwane. 2009. “The Divisive Heritage: The Case of Missionary Orthography Development of African Languages of Botswana.” In The Role of Missionaries in the Development of African Languages, edited by Kwesi Kwaa Prah, 91–122. Cape Town: Centre for Advanced Studies of African Society. Chebanne, Andy and Daniel Schmidt. 2010. Kalanga: Summary Grammar. Cape Town: Centre for Advanced Studies of African Society. Chennells, Roger and Aymone du Toit. 2004. “The Rights of Indigenous Peoples in South Africa.” In Indigenous Peoples’ Rights in Southern Africa, edited by Robert K. Hitchcock and Diana Vinding, 98–113. Copenhagen: International Work Group for Indigenous Affairs. Chepkotit arap Mungu, Edward. 2011. “Bong’om of Kenya—the Forgotten People?” OGMIOS Newsletter (Foundation for Endangered Languages) 44: 5. Childs, G. Tucker. 2006. “Language Endangerment in West Africa: Its Victims and Causes.” In The Joy of Language: Proceedings of a Symposium Honoring the Colleagues of David Dwyer on the Occasion of His Retirement. East Lansing, MI, Michigan State University. https://www.msu.edu/~dwyer/16-Childs.doc.

Childs, G. Tucker. 2017. “Busy Intersections: A Framework for Revitalization.” In Africa’s Endangered Languages: Documentary and Theoretical Approaches, edited by Jason Kandybowicz and Harold Torrence, 145–164. Oxford: Oxford University Press. Collins, Chris and Jeff Gruber. 2014. A Grammar of ǂHȍã. Cologne: Rüdiger Köppe. Connell, Bruce. 2015. “The Role of Colonial Languages in Language Endangerment in Africa.” In Language Documentation and Endangerment in Africa, edited by James Essegbey, Brent Henderson, and Fiona McLaughlin, 107–129. Amsterdam: John Benjamins. Cwi, Cwisa and Jennifer L. Hays. 2011. “The Nyae Nyae Village Schools 1994–2010: An Indigenous Mother Tongue Education Project After 15 Years.” Diaspora, Indigenous and Minority Education 5(2): 142–148. Dimmendaal, Gerrit J. 2015. “Different Cultures, Different Attitudes, But How Different Is the African Situation Really?” In Language Documentation and Endangerment in Africa, edited by James Essegbey, Brent Henderson, and Fiona McLaughlin, 37–57. Amsterdam: John Benjamins. Dimmendaal, Gerrit J. and F. K. Erhard Voeltz. 2007. “Africa.” In Encyclopedia of the World’s Endangered Languages, edited by Christopher Moseley, 579–634. London: Routledge. Dunham, Margaret. 2016. “Tense and Aspect in Langi.” In Aspectuality and Temporality: Descriptive and Theoretical Issues, edited by Zlatka Guentchéva, 231–263. Amsterdam: John Benjamins. Ekwueme, A. C. 2011. “Mass Media and the Declining Fortunes of Indigenous Languages: An Appraisal of Igbo Language (1960–2000).” Journal of Igbo Language & Linguistics 3: 84–94. El Kirat, Yamina. 2001. “The Current Status and Future Use of the Amazigh Language in the Beni Iznassen Community.” Languages and Linguistics/Langues et linguistique/al-Lughat wa-al-lisaniyat 8: 81–96. El Kirat El Allame, Yamina. 2008.
“Bilingualism, Language Teaching, Language Transmission and Language Endangerment: The Case of Amazigh in Morocco.” In Endangered Languages and Language Learning: Proceedings of the Conference FEL XII, September 24–27, 2008, Fryske Akademy, It Aljemint, Ljouwert/Leeuwarden, The Netherlands, edited by Tjeerd de Graaf, Nicholas Ostler, and Reinier Salverda, 123–130. Bath, UK: Foundation for Endangered Languages. Essegbey, James. 2015. “‘Is This My Language?’: Developing a Writing System for an Endangered-Language Community.” In Language Documentation and Endangerment in Africa, edited by James Essegbey, Brent Henderson, and Fiona McLaughlin, 153–176. Amsterdam: John Benjamins. Fox, Jan. 2014. “Saving the Suba.” The East African, February 7. http://www.theeastafrican.co.ke/magazine/Saving-the-Suba-language/434746-2196564-o6llag/index.html. Gabanamotse-Mogara, Budzani and Herman Michael Batibo. 2016. “Ambivalence Regarding Linguistic and Cultural Choices Among Minority Language Speakers: A Case Study of the Khoesan Youth of Botswana.” African Study Monographs 37(3): 103–115. Gabsi, Zouhir. 2011. “Attrition and Maintenance of the Berber Language in Tunisia.” International Journal of the Sociology of Language 211: 135–164. Garri, Dhahawi S. A. and Abdel Rahim Hamid Mugaddam. 2015. “Language and Identity in the Context of Conflict: The Case of Ethnolinguistic Communities in South Darfur State.” International Journal of the Sociology of Language 235: 137–167. Gibson, Maik and B. Araali Bagamba. 2016. “Language Shift and Endangerment in Urban and Rural East Africa: Three Case Studies.” In Endangered Languages and Languages in

Danger: Issues of Documentation, Policy, and Language Rights, edited by Luna Filipović and Martin Pütz, 351–360. Amsterdam: John Benjamins. Hays, Jennifer. 2011. “Educational Rights for Indigenous Communities in Botswana and Namibia.” International Journal of Human Rights 15(1): 127–153. Heine, Bernd, Christa König, and Karsten Legère. 2016. “Reacting to Language Endangerment: The Akie of North-Central Tanzania.” In Endangered Languages and Languages in Danger: Issues of Documentation, Policy, and Language Rights, edited by Luna Filipović and Martin Pütz, 313–333. Amsterdam: John Benjamins. Hillewaert, Sarah. 2015. “Writing with an Accent: Orthographic Practice, Emblems, and Traces on Facebook.” Journal of Linguistic Anthropology 25(2): 195–214. Hitchcock, Robert K. 2013. “Indigenous Children’s Rights and Well-Being: A Perspective from Central and Southern Africa.” In Vulnerable Children: Global Challenges in Education, Health, Well-Being, and Child Rights, edited by Deborah Johnson, DeBrenna Agbenyiga, and Robert K. Hitchcock, 219–238. New York: Springer. Hoffman, Katherine E. 2006. “Berber Language Ideologies, Maintenance, and Contraction: Gendered Variation in the Indigenous Margins of Morocco.” Language and Communication 26(2): 144–167. Idiata, Daniel Franck. 2009. Langues en danger et langues en voie d’extinction au Gabon. Paris: L’Harmattan. Ikọtun, Reuben Oluwafẹmi and Timothy Adeyẹmi Akanbi. 2013. “Saving the Yorùbá Counting System from Extinction.” In The Numeral Systems of Nigerian Languages, edited by Ozo-mekuri Ndimele and Eugene S. L. Chan, 279–293. Port Harcourt: Linguistic Association of Nigeria. Jaeger, Marcus. 2008. “Indigenous Efforts to Revitalize and Digitize the Nubian Languages.” Sudan Studies Association Newsletter 26(3): 13–22. Jones, Jennifer M. and Gary Barkhuizen. 2011.
“‘It Is Two-Way Traffic’: Teachers’ Tensions in the Implementation of the Kenyan Language-in-Education Policy.” International Journal of Bilingual Education and Bilingualism 14(5): 513–530. Kajivora, Edward Riak. 2015. “The Nuba Moro Literacy Program.” In Language Vitality Through Bible Translation, edited by Marianne Beerle-Moor and Vitaly Voinov, 91–95. New York: Peter Lang. Kamei, Nobutaka. 2004. “The Sign Languages of Africa.” Afurika Kenkyu/Journal of African Studies 64: 43–64. Kishindo, Pascal J. 2002. “‘Flogging a Dead Cow?’: The Revival of Malawian Chingoni.” Nordic Journal of African Studies 11(2): 206–223. König, Christa, Bernd Heine, and Karsten Legère. 2015. The Akie Language of Tanzania: A Sketch of Discourse Grammar. Tokyo: Research Institute for Languages and Cultures of Asia and Africa, Tokyo University of Foreign Studies. Kure, Blesswell and William B. McGregor. 2011. “Recommendations for Writing Shua.” http://www.hum.au.dk/ling/research/Shua/Recommendations_for_writing_Shua.pdf. Accessed August 1, 2016. Kusters, Annelies. 2012. “‘The Gong Gong Was Beaten’—Adamarobe: A ‘Deaf Village’ in Ghana and Its Marriage Prohibition for Deaf Partners.” Sustainability 4: 2765–2784. http://www.mdpi.com/2071-1050/4/10/2765. Kusters, Annelies. 2014. “Language Ideologies in the Shared Signing Community of Adamarobe.” Language in Society 43(2): 139–158. Lanesman, Sara and Irit Meir. 2012. “The Survival of Algerian Jewish Sign Language Alongside Israeli Sign Language in Israel.” In Sign Languages in Village Communities: Anthropological

and Linguistic Insights, edited by Ulrike Zeshan and Connie de Vos, 153–179. Boston: Mouton de Gruyter. Larsen, Iver. 1991. “A Puzzling Dissimilation Process in Southern Nilotic.” In Proceedings of the Fourth Nilo-Saharan Conference, Bayreuth, Aug. 30–Sep. 2, (1989), edited by M. Lionel Bender, 263–272. Hamburg, Germany: Helmut Buske. Lee, Nala Huiying and John Van Way. 2016. “Assessing Levels of Endangerment in the Catalogue of Endangered Languages (ELCat) Using the Language Endangerment Index.” Language in Society 45: 271–292. Legère, Karsten. 2007. “Vidunda (G38) as an Endangered Language?” In Selected Proceedings of the 37th Annual Conference on African Linguistics, edited by Doris L. Payne and Jaime Peña, 43–54. Somerville, MA: Cascadilla Proceedings Project. Legère, Karsten. 2012. “Endangered Languages in Africa: Focus on Tanzania’s Ngasa and Akie.” In Issues of Language Endangerment, edited by Xu Shixuan, Tjeerd de Graaf, and Cecilia Brassett, 89–102. Beijing: Chinese Academy of Sciences. Legère, Karsten. 2015. “Language Erosion and Maintenance Among the Akie (Tanzania).” Paper presented at the Eleventh Conference on Hunting and Gathering Societies (CHAGS 11), September 7–11. Vienna. Lewis, M. Paul, Gary F. Simons, and Charles D. Fennig, eds. 2016. Ethnologue: Languages of the World. 18th ed. Dallas, TX: SIL International. Accessed August, 2016. http://www.ethnologue.com. Lopes, Armando Jorge. 2001. “Language Revitalisation and Reversal in Mozambique: The Case of Xironga in Maputo.” Current Issues in Language Planning 2(2/3): 259–267. Lukusa, Stephen T. M. 2000. “The Shekgalagadi Struggle for Survival: Aspects of Language Maintenance and Shift.” In Botswana: The Future of the Minority Languages, edited by H. M. Batibo and B. Smieja, 55–77. Düsseldorf, Germany: Peter Lang. Lüpke, Friederike. 2015.
“Ideologies and Typologies of Language Endangerment in Africa.” In Language Documentation and Endangerment in Africa, edited by James Essegbey, Brent Henderson, and Fiona McLaughlin, 59–105. Amsterdam: John Benjamins. Lüpke, Friederike. 2018. “Escaping the Tyranny of Writing: West African Regimes of Writing as a Model for Multilingual Literacy.” In The Tyranny of Writing: Ideologies of the Written Word, edited by Kasper Juffermans and Constanze Weth, 129–148. London: Bloomsbury. Lüpke, Friederike and Anne Storch. 2013. “Language Dynamics.” In Repertoires and Choices in African Languages, edited by Friederike Lüpke and Anne Storch, 267–344. Berlin: Mouton de Gruyter. Lutalo-Kiingi, Sam and Goedele A. M. De Clerck. 2015. “Ugandan Sign Language.” In Sign Languages of the World: A Comparative Handbook, edited by Julie Bakken Jepsen, Goedele de Clerck, Sam Lutalo-Kiingi, and William B. McGregor, 871–900. Berlin: Mouton de Gruyter. Makoni, Sinfree. 2016. “Romanticizing Differences and Managing Diversities: A Perspective on Harmonization, Language Policy, and Planning.” Language Policy 15(3): 223–234. Mapunda, Gastor. 2015. “An Analysis of the Vitality of the Intangible Cultural Heritage of the Ngoni People of Tanzania: Lessons for Other Ethnolinguistic Groups.” Nordic Journal of African Studies 24(2): 169–185. Maseko, Busani and Mthokozisi Moyo. 2013. “Minority Language Revitalisation in Zimbabwe—Fundamental Considerations for Tonga Language in the Zambezi Valley.” International Journal of Arts and Humanities 2(10): 248–259. Mathenge, Gakuu. 2002. “Unesco Certifies Yaaku Language and 30 Others in EA Dead.” The East African, September 16. http://allafrica.com/stories/200209160529.html.

Mberia, Kithaka wa. 2014. “Death and Survival of African languages in the 21st Century.” International Journal of Linguistics and Communication 2(3): 127–144. McLaughlin, Fiona. 2015. “Can a Language Endanger Itself? Reshaping Repertoires in Urban Senegal.” In Language Documentation and Endangerment in Africa, edited by James Essegbey, Brent Henderson, and Fiona McLaughlin, 131–151. Amsterdam: John Benjamins. Meckelburg, Alexander. 2015. “Slavery, Emancipation, and Memory: Exploratory Notes on Western Ethiopia.” International Journal of African Historical Studies 48(2): 345–362. Mezhoud, Salem. 2005. “Salvation Through Migration: Immigrant Communities as Engine Rooms for the Survival and Revival of the Tamazight (Berber) Language.” In Creating Outsiders: Endangered Languages, Migration and Marginalisation: Proceedings of the Ninth FEL Conference, Stellenbosch, South Africa, November 18–20, 2005, edited by Nigel Crawhall and Nicholas Ostler, 109–116. Bath, UK: Foundation for Endangered Languages. Mitchell, Rebecca. 2015. “‘To Be a Good Westerner, You Need to Know Where You Come From’: Challenges Facing Language Revitalisation In Central Africa.” In Policy and Planning for Endangered Languages, edited by Mari C. Jones, 188–204. Cambridge: Cambridge University Press. Moges, Rezenet Tsegay. 2015. “Challenging Sign Language Lineages and Geographies: The Case of Eritrean, Finnish and Swedish Sign Languages.” In It’s a Small World: Inquiries into International Deaf Spaces, edited by Michele Friedner and Annelies Kusters, 83–94. Washington, DC: Gallaudet University Press. Monaka, Kemmonye C. 2009. “Mother Tongue Education: Prospects for Bakgalagari Learners.” In Multilingualism in Education and Communities in Southern Africa, edited by Gregory Kamwendo, Dudu Jankie, and Andy Chebanne, 69–79. Gaborone, Botswana: UBTromso Collaborative Programme for San Research & Capacity Building. Monaka, Kems C. and Andy M. Chebanne. 2005.
“San Relocation: Endangerment Through Development in Botswana.” In Creating Outsiders: Endangered Languages, Migration and Marginalisation: Proceedings of the Ninth FEL Conference, Stellenbosch, South Africa, November 18–20, 2005, edited by Nigel Crawhall and Nicholas Ostler, 101–105. Bath, UK: Foundation for Endangered Languages. Mooko, Theophilus. 2006. “Counteracting the Threat of Language Death: The Case of Minority Languages in Botswana.” Journal of Multilingual and Multicultural Development 27(2): 109–125. Moore, Leslie C. 2006. “Changes in Folktale Socialization in an Urban Fulbe Community.” In West African Linguistics: Studies in Honor of Russell G. Schuh, edited by Paul Newman and Larry M. Hyman, 176–187. Columbus: Department of Linguistics and the Center for African Studies, Ohio State University. Morrison, Michelle. 2013. “Documenting ‘Middle-Sized’ Languages: Pitfalls and Potentials.” Paper presented at the 3rd International Conference on Language Documentation and Conservation, February 28, University of Hawai‘i at Mānoa, Honolulu, HI. Mufwene, Salikoko S. 2012. “What Africa Can Contribute to Understanding Language Vitality, Endangerment, and Loss.” In Proceedings of the 6th World Congress of African Linguistics, Cologne, August 17–21, 2009, edited by Matthias Brenzinger and Anne-Maria Fehn, 69–80. Cologne, Germany: Rüdiger Köppe. Muhammad, Abubakar and Abubakar Alkali. 2013. “The Fulfulde Numeral System.” In The Numeral Systems of Nigerian Languages, edited by Ozo-mekuri Ndimele and Eugene S. L. Chan, 51–62. Port Harcourt: Linguistic Association of Nigeria.

Mundt, Neele. 2016. “Endangering Indigenous Languages: An Empirical Study of Language Attitudes and Identity in Post-Colonial Cameroon.” In Vanishing Languages in Context: Ideological, Attitudinal and Social Identity Perspectives, edited by Martin Pütz and Neele Mundt, 103–132. Frankfurt am Main: Peter Lang. Mutaka, Ngessimo M. 2008. “Ecosystem of the Vute-Banyo Area and Language Endangerment.” In Explorations into Language Use in Africa, edited by Augustin Simo Bobda, 95–109. Frankfurt am Main: Peter Lang. Nash, Carlos. 2017. “Documenting Ekegusii: How ‘Empowering’ Research Fulfills Community and Academic Goals.” In Africa’s Endangered Languages: Documentary and Theoretical Approaches, edited by Jason Kandybowicz and Harold Torrence, 165–186. Oxford: Oxford University Press. Ndlovu, Davy. 2014. Tikwa Tshwao Kwi (Tshwao na dam’). Tsholotsho, Zimbabwe: Tso-ro-tso San Development Trust. Nwagbara, Grace U. 2013. “Indigenous Language News and the Marginalization of Some Ethnic Groups in the Nigerian Broadcast Media.” Kamla-Raj 2013 Stud Tribes Tribals 11(2): 153–160. Nyati-Ramahobo, Lydia. 1998. “Language Development for Literacy: The Case of Shiyeyi in Botswana.” Paper presented at “Non-Formal Education: New Directions for the Year 2000” Africa Regional Literacy Forum, Dakar, Senegal, March 16–20. Accessed October 2016. https://archive.org/details/ERIC_ED463394. Nyombe, B. G. V. 1997. “Survival or Extinction: The Fate of the Indigenous Languages of the Southern Sudan.” International Journal of the Sociology of Language 125: 99–130. Nyst, Victoria. 2010. “Sign Language Varieties in West Africa.” In Sign Languages, edited by Diane Brentari, 405–432. Cambridge: Cambridge University Press. Nyst, Victoria, Kara Sylla, and Moustapha Magassouba. 2012. “Deaf Signers in Douentza, a Rural Area in Mali.” In Sign Languages in Village Communities: Anthropological and Linguistic Insights, edited by Ulrike Zeshan and Connie de Vos, 251–276.
Berlin: Mouton de Gruyter. Obiero, Ogone John. 2008. “Evaluating Language Revitalization in Kenya: The Contradictory Face and Place of the Local Community Factor.” Nordic Journal of African Studies 17(4): 247–268. Obikudo, Ebitare, Bruce Connell, Inoma Essien, Akin Akinlabi, Ozo-mekuri Ndimele, and Will Bennett. 2015. “The Sociolinguistic Setting of the Defaka and Nkọrọọ People.” Journal of the Linguistic Association of Nigeria 18(1): 135–141. Ogweno, Bernard. 2016. “Meet a Community Member: The Africa Foundation for Endangered Languages and Cultural Heritages.” OGMIOS Newsletter 60: 3. Oiruria, Everlyn and Beatrice Clayre. 2010. “Language Revitalisation and Reversing Language Shift: Special Problems in a Multi-Lingual and Cultural Context—The Case of Vernacular Languages in Kenyan Education.” In Reversing Language Shift: How to Re-awaken a Language Tradition: Proceedings of the Fourteenth FEL Conference, Carmarthen, Wales, September 13–15, 2010, edited by Hywel Glyn Lewis and Nicholas Ostler, 157–163. Bath, UK: Foundation for Endangered Languages. Omondi, Odero Erick (assisted by Odero Isack Otieno, Lengosira Andrea, Lengutuk Asunta, and Akolong Joseph). 2008. The Reconstruction and Documentation of the Elmolo language. Unpublished manuscript. Omondi, Odero Erick and Odero Isack Otieno (with Lengosira Andrea, Lengutuk Asunta and Akolong Joseph). 2008. Elmolo-English Dictionary. Unpublished manuscript.

Orcutt-Gachiri, Heidi. 2009. “Kenyan Language Ideologies, Language Endangerment, and Gikuyu (Kikuyu): How Discourses of Nationalism, Education, and Development Have Placed a Large, Indigenous Language at Risk.” PhD diss., University of Arizona, Tucson, AZ. Orcutt-Gachiri, Heidi. 2013. “How Can a Language with 7 Million Speakers Be Endangered?” In The Persistence of Language: Constructing and Confronting the Past and Present in the Voices of Jane H. Hill, edited by Shannon T. Bischoff, Deborah Cole, Amy V. Fountain, and Mizuki Miyashita, 229–255. Amsterdam: John Benjamins. Phiri, Admire. 2015. “The Phoneme Inventory of Tjwao.” MA thesis, University of Zimbabwe, Harare, Zimbabwe. Prezi, Grace O. 2013. “The Igbo Numeral System in Danger of Extinction: The Way Out.” In The Numeral Systems of Nigerian Languages, edited by Ozo-mekuri Ndimele and Eugene S. L. Chan, 209–216. Port Harcourt: Linguistic Association of Nigeria. Robson, Laura. 2011. “The Documentation of the Language Ecology of Njanga, a Moribund Language of Cameroon.” PhD diss., University of Kent, Canterbury, UK. Rohloff, Peter and Brent Henderson. 2015. “Development, Language Revitalization and Culture: The Case of the Mayan Languages of Guatemala, and Their Relevance for African Languages.” In Language Documentation and Endangerment in Africa, edited by James Essegbey, Brent Henderson, and Fiona McLaughlin, 177–194. Amsterdam: John Benjamins. Rosendal, Tove. 2016. “Language Transmission and Use in a Bilingual Setting in Rural Tanzania: Findings from an In-Depth Study of Ngoni.” In Endangered Languages and Languages in Danger: Issues of Documentation, Policy, and Language Rights, edited by Luna Filipović and Martin Pütz, 335–349. Amsterdam: John Benjamins. Rosendal, Tove and Gastor Mapunda. 2014. “Is the Tanzanian Ngoni Language Threatened? A Survey of Lexical Borrowing from Swahili.” Journal of Multilingual and Multicultural Development 35(3): 271–288. Sabbār, Halīm. 2011.
“The Toponymy of Ishkéed and the Revitalization of an Endangered Nubian Language.” In Trends in Exonym Use: Proceedings of the 10th UNGEGN Working Group on Exonyms Meeting, Tainach, April 28–30, 2010, edited by Peter Jordan, Hubert Bergmann, Caroline Burgess, and C. Cheetham, 313–317. Hamburg, Germany: Verlag Dr. Kovač. Sadiqi, Fatima. 2011. “The Teaching of Amazigh (Berber) in Morocco.” In Handbook of Language and Ethnic Identity, vol. 2: The Success-Failure Continuum in Language and Ethnic Identity, edited by Joshua Fishman and Ofelia García, 33–44. Oxford: Oxford University Press. Sagna, Serge. 2016. “‘Research Impact’ and How It Can Help Endangered Languages.” OGMIOS Newsletter 59: 5–8. Sands, Bonny. 2017. “The Challenge of Documenting Africa’s Least Known Languages.” In Africa’s Endangered Languages: Documentary and Theoretical Approaches, edited by Jason Kandybowicz and Harold Torrence, 11–38. Oxford: Oxford University Press. Schaefer, Paul. 2015. “Hot Eyes, White Stomach: Emotions and Character Qualities in Safaliba Metaphor.” In Language Endangerment: Disappearing Metaphors and Shifting Conceptualizations, edited by Elisabeth Piirainen and Ari Sherris, 91–110. Amsterdam: John Benjamins.

Schöpperle, Florian. 2011. “The Economics of Akie Identity: Adaptation and Change Among a Hunter-Gatherer People in Tanzania.” MA thesis, African Studies Centre, University of Leiden, The Netherlands. Senayon, Esther. 2016a. “Ethnic Minority Linguistic Ambivalence and the Problem of Methodological Assessment of Language Shift Among the Ogu in Ogun State, Nigeria.” International Journal of the Sociology of Language 242: 119–137. Senayon, Esther. 2016b. “Non-Native Speaker Mother, Personal Family Efforts and Language Maintenance: The Case of Ogu (Nigeria) in My Family.” In Vanishing Languages in Context: Ideological, Attitudinal and Social Identity Perspectives, edited by Martin Pütz and Neele Mundt, 255–278. Frankfurt am Main: Peter Lang. Simons, Gary F. and M. Paul Lewis. 2013. “The World’s Languages in Crisis: A 20 Year Update.” In Responses to Language Endangerment: In Honor of Mickey Noonan, edited by Elena Mihas, Bernard Perley, Gabriel Rei-Doval, and Kathleen Wheatley, 3–19. Amsterdam: John Benjamins. Sommer, Gabriele. 1995. Ethnographie des Sprachwechsels: sozialer Wandel und Sprachverhalten bei den Yeyi (Botswana). Cologne, Germany: Rüdiger Köppe. Swande, Sewuese Veronica and T. Terver Udu. 2015. “Play Songs and Peer Group Activities as Resources for Revitalising the Tiv language.” In Language Endangerment: Globalisation and the Fate of Minority Languages in Nigeria: A Festschrift for Appolonia Uzoaku Okwudishu, edited by Ozo-mekuri Ndimele, 107–118. Port Harcourt, Nigeria: M & J Grand Orbit Communications. Tadadjeu, Maurice and Etienne Sadembouo. 1984. General Alphabet of Cameroon Languages/ Alphabet général des langues camerounaises. Yaoundé, Cameroon: University of Yaoundé, Department of African Languages and Linguistics. Tadadjeu, Maurice, Etienne Sadembouo, and Gabriel Mba. 2004. Pédagogie des langues maternelles africaines. Yaoundé, Cameroon: University of Yaoundé, Department of African Languages and Linguistics.
Tamanji, Pius N. 2008. “Globalization and African Languages: Regression in Linguistic Diversity.” In Explorations into Language Use in Africa, edited by Augustin Simo Bobda, 71–94. Frankfurt am Main: Peter Lang. Tosco, Mauro. 2015. “From Elmolo to Gura Pau: A Remembered Cushitic Language of Lake Turkana and Its Possible Revitalization.” Studies in African Linguistics 44(2): 101–135. Vande-Guma, C. D. S. 2015. “Applying Systemic-Functional Linguistics to Safeguard Indigenous Genres.” In Language Endangerment: Globalisation and the Fate of Minority Languages in Nigeria: A Festschrift for Appolonia Uzoaku Okwudishu, edited by Ozo-mekuri Ndimele, 57–84. Port Harcourt, Nigeria: M & J Grand Orbit Communications. Veith, Eshinee. 2012. “Grass-roots Language Revitalization via Social Media: ‘Lets Learn Shiyeyi’ on Facebook.” Paper presented at the 2nd Department of African Languages and Literature International Conference at the University of Botswana, Gaborone, Botswana, July 14. Visser-Wiegel, Coby. 2001. “Literacy Can Save Lives: The Naro Literacy Project.” In Education for Remote Area Dwellers in Botswana: Problems and Perspectives (Report on a three-day conference on education for Remote Area Dwellers in Botswana), edited by Otto Oussoren, 68–72. Gaborone: University of Botswana, Research and Development Unit.

636 Bonny Sands Wiesemann, Ursula, Etienne Sadembouo, and Maurice Tadadjeu. 1983. Guide pour le développement des systèmes d’écriture des langues africaines. Yaoundé, Cameroon: University of Yaoundé, Department of African Languages and Linguistics. Williams, Weaam. 2011. Reclaiming of the Mother Tongue, a Khoe Story. Cape Town: Shamanic Organic Productions. DVD. Witbooi, Reginald. 2015. “Khomani San Tribe to Inaugurate First Female Chief.” SABC News, May 30, 2015. Accessed August 1. http://www.sabc.co.za/news/a/8fea3a80488f789b83acab5b 3432783c/Khomani-San-tribe-to-inaugurate-first-female-chief.

Chapter 28

Planning Minority Language Maintenance: Challenges and Limitations
Sue Wright

1. Introduction

Language is the tool that allows human beings to remember and learn from the past, to cooperate and exchange in the present, to plan and prepare for the future. And because this human facility has developed as different languages and not as a universal language, it is also a powerful factor in the construction of groups. Throughout history, communities of communication have developed clear identities, separate from neighbors with different linguistic practices. Where differences are minor, communication can be achieved with goodwill on both sides of the interaction. However, many cleavages between language groups are profound and cannot be ignored by an act of will. This chapter traces a small part of this history of language in human groups, focusing particularly on Europe. It traces the complex relationship between purposeful policymaking and the effects of myriad individual choices and actions, which have resulted in the concept of “linguistic minority.” It discusses some of the difficulties inherent in minority language rights.

2. What is a minority? The nation-state and the making of majorities and minorities

To understand how and why certain groups are constructed as linguistic minorities within larger communities of communication we need to examine the processes of nation-building and nationalism. Before the advent of the nation-state the linguistic landscape was vastly different from what was to come after its development. Europe provides a good example of the transformation.

In the feudal period, European communities of communication were either more local or more “international” than today. Before 1500 most of the population were settled agriculturalists. Singman (1999) and Fossier (1970) estimate that more than 90% of the medieval European population were peasants or serfs engaged in producing food. At worst they were legally obliged to remain in the place of their birth; at best they were assigned a place in a rigid social order. Few had either social or geographical mobility. Living in small groups and traveling little, they would have only had knowledge of the local dialect of their village or hamlet.¹ And even though for most this dialect would have been situated along the great dialect continua that spanned Europe (Celtic, Germanic, Greek, Romance, Slavonic),² differing only in minor ways from its neighbors on the continuum, it is unlikely that individuals would have appreciated their place in the greater linguistic whole.

The opposite end of the social spectrum was far from monodialectal or monolingual. The feudal ruling class was “European” in that the most powerful families chose marriage partners, gave allegiance, and inherited and waged war on a pan-European scale and on a continental stage (Dewald 1996). At the pinnacle of the hierarchy, marriages and alliances were contracted among a very small number of royal dynasties. The consequent mixing promoted complex family multilingualisms. These dynasties were the focus of feudal political organization.
Their subjects were not linked definitively to the polity, and they, and the territory in which they dwelt, could be passed from the stewardship of one great family to another, depending on how alliances were brokered, marriages contracted, or wars won or lost. Loyalty was to the individual ruler, not to the idea of nation or territory in the political sense (Davis 1988). Like the feudal aristocracy, the clergy of the Catholic and Orthodox Churches had patterns of identities and allegiances which transcended the local. However, they were in a very different situation from the perpetually shifting power structures of feudalism. Of course, the upper echelons of the Christian hierarchy were drawn from the aristocracy, but when they became part of the church they entered tight hierarchies controlled from Rome or Byzantium and the “international” networks to which they belonged had a stability deriving from the institutions of the Catholic and Orthodox Churches. Rietbergen (1998) argues that the Christian church was the heir to the Roman Empire in many respects. It took on its bureaucratic structure, gave a moral/legal framework for the loosely organized kingdoms of the medieval world, and employed its languages,

1 This seems a reasonable claim by analogy with non-literate peasant farming cultures which persist in the present time, even though it is difficult to substantiate from contemporary evidence. As Singman (1999, 99) notes: “Although the peasantry constituted the largest part of the medieval population they remain its most elusive component.”
2 In the south of Europe, where Muslim groups had settled, dialects of Arabic and Turkish were also present. Islands of Finnish, Estonian, Hungarian, Greek, Albanian, and Basque speakers dovetailed into these dialect continua or existed at their edges.

Greek and Latin, as sacred lingua francas. From the tenth century on, Church Slavonic was a further sacred language and lingua franca. And since even the most lowly parish priest knew enough of these sacred languages to hold a service, the communities of communication built by religion extended across Europe.

Thus, in the medieval context, the modern concept of linguistic minority had little meaning. There was no majority to define minority. Where peoples could be gifted in marriage contracts, inherited by distant princes, won and lost in military struggle, there was little stability and thus little opportunity for feelings of political allegiance to develop toward a stable polity. Society remained fractured, with group membership typically understood in local terms and loyalty mostly consisting of service to a man or to a family. There were few societal forces that encouraged linguistic convergence within the dialect continua and many that favored fragmentation. Rulers showed little interest in the linguistic behavior of their people. The king ruled subjects, not citizens who needed to be consulted, and monarchs rarely appealed to the population for support. Because there were no pressures for linguistic homogenization or convergence for the mass of people, rulers of territory of any size always governed multidialectal if not multilingual populations. When they needed to reach down into their populations, the second tier of nobility that held the land in trust and swore allegiance to the great dynasties provided the bilinguals who maintained vertical lines of communication.
This is not to say, of course, that those who spoke differently might not attract resentment, aggression, or persecution, but rather that, where communication was highly contextualized and face to face, almost everyone from the peasant class was technically a minority in that they came from a small linguistic group, and any journey would take them out of their community of communication. In contrast, much of the European elite was bi- or multilingual to a degree.

In Western Europe, the forces that caused linguistic convergence emerged with the fading of this feudal world. As centralizing monarchs wrested power from their aristocracies, they turned to bureaucrats from the rising middle classes to run their kingdoms, and required they use the language of the king and court to do so.³ As they took responsibility for legal matters away from the Catholic Church they decreed that national law would be enacted in the national language.⁴ These political moves were largely administrative conveniences, and did not at first affect the linguistic practices of the vast majority. However, other developments did disturb the general linguistic status quo. In the fifteenth century Europe only had some 20,000 towns (Hohenberg and Hollen Lees 1995). Urbanization had been interrupted by plagues which had depopulated the countryside, making land available for all who wanted, and halting the move to the town. The demographic growth of the early modern period (and land enclosure in some states) changed this state of affairs. A crowded countryside prompted a move to the towns and there was some linguistic merging as they became melting pots.

Further convergence came with the Reformation and the printing press. Protestant religious observance centred on Bible study. The sacred texts were translated into the vernacular and a much greater proportion of the population became literate (Hastings 1997). Even families of modest means could aspire to owning a printed Bible. And, as Anderson (1983) remarked, the printers encouraged linguistic homogeneity by distributing a standard product to a large geographical market. The translation of the Bible was an important phase in the standardization of a host of languages. German, English, Swedish, and Danish were early examples of the process, but the complete list is extensive.

In the eighteenth and nineteenth centuries a number of developments further promoted the spread of national languages. The progressive mechanization of agriculture drove more people from the land and the needs of growing industrialization brought them together in new concentrations. Industrial areas were linguistic crucibles, where different rural dialects melded. Moreover, as Gellner (1983) noted, an industrializing social order needs to educate its workforce in a different way from agricultural societies. Such education is typically run and funded by the state, and in the national language.⁵

The most potent force for linguistic unity was the changing nature of political organization. A number of powerful dynasties secured their frontiers, and reigned with absolute authority in the seventeenth and eighteenth centuries. When rebellion and revolution eventually removed these absolutist monarchs, the desire for centralized power persisted. This is particularly clear in the French revolution.

3 E.g., the English king’s (Henry VIII) insistence that any Welshman in the king’s service use English, and specifically the variety of king and court.
4 E.g., the French king’s (François Ier) requirement in 1539 that all legal documents should be in his French variety.
In the French republic, the linguistic unity instituted as an administrative convenience under the monarchs grew into a moral imperative which concerned all citizens (Barère 1794). Becoming part of a national community of communication was the litmus test of loyalty and belonging, a prerequisite for membership of the demos and participation in the national conversation. In the following century, both ethnic nationalists, who held the nation to be a natural division of humanity since time immemorial, and civic nationalists, who believed the nation to be an elite-led construction, agreed that the nation needed to be a community of communication and were impatient with those who resisted linguistic assimilation. The view of J. S. Mill (1861[1972], 230) was typical:

Among a people without fellow feelings, especially if they read and speak different languages, the united public opinion necessary to the workings of representative institutions cannot exist.

Since national groups were too large for their members to actually know all their fellow citizens, the feeling of national identity had to derive from some other source than contact.

5 A larger number of people need a generic education to prepare them to cope with change, and this replaces (or supplements) learning by doing in apprenticeship. Where the state takes on the provision of such education, the national language tends to spread (e.g., France). Where the Church continues to deliver education, regional languages tend to develop (e.g., Spain).

Anderson (1983) identified the community of communication as the factor that allowed a national group to see itself as a cohesive whole. National education, national military service, and national media gave members of national groups the linguistic repertoire to communicate with each other. And although they were not going to interact directly with all their co-nationals they had the belief that they could if they wanted to. Thus they imagined themselves a community and this made possible a degree of identification and solidarity.

Such national identity turned out to be both beneficial and toxic. On the one hand identification with and loyalty to a national community permitted the introduction of the welfare state. Taxpayers seemed more willing to pay to support the old, the young, the sick, and the needy from their community of communication than to help outsiders (Svallfors 2007). On the other hand, it also encouraged extreme patriotism. Citizens’ loyalty to the nation replaced subjects’ loyalty to the person of the monarch. National identity was preeminent and other levels of identity were eclipsed or second tier. Such bonding permitted “total war” in the Clausewitzian sense (von Clausewitz 1830[1977]). Whole populations were pitched against each other, as in the horror of the two twentieth-century world wars.

The European linguistic landscape of the nationalist period was thus vastly different from that of the feudal era. The dialect continua were broken by relatively stable political frontiers. Communities adjacent on the continuum were separated as they were schooled in different national education systems, conscripted for military service in different national armies, subject to different national governments and administrations, and informed and influenced by different national media. Patterns of communication changed. The dialect continua broke down.
For example, until the turn of the twentieth century Ligurian and Piedmontese speakers had been able to negotiate meaning with Nissard and Provençal speakers, and the workforce along the Mediterranean coast had been very mobile. After nation-building, Italian and French speakers had no easy commonality and after the strengthening of frontiers no easy passage. In Gellner’s (1983) metaphor, Europe came to resemble a Modigliani painting with distinct blocks of color/language and no fuzzy edges. In another metaphor, states became containers, keeping populations securely enclosed within the national sphere and constructing high boundary fences, both actual and virtual (Joppke 1998; Giddens 2013).

3. The spread of the nation-state system and the difficult position of minorities

In the twentieth century, the dismantling of empires spread the nation-state system to many parts of the world. The treaties at the end of World War I redrew the map, following the principles of national self-determination and creating a host of new states in Europe and the Middle East. But, although US President Woodrow Wilson’s Fourteen Points were the bedrock of new state formation, the treaty makers were hard put to actually deliver congruence of linguistic/cultural group and political state. The fractured nature of the dialect continua made this difficult, quite apart from the interspersion of groups from separate dialect continua. Moreover, believing that nation-states had to be a certain size to survive, the treaty makers combined groups that did not see themselves as natural bedfellows, to create the necessary mass. For example, the southern Slavs were linked together in Yugoslavia. The resulting “national” group was largely Slavic speaking, but it also included speakers of varieties of German, Hungarian, and Italian.

Decolonization after the end of World War II created many more “nation-states” throughout the world. As in Europe, there were few situations of natural congruence between linguistic groups and state boundaries. In Africa in particular, the boundaries of colonies had been drawn largely without regard to culture and history (Touval 1966). Frontiers were often straight lines decided at a distance (notably at the 1884–1885 Berlin Conference) and without any respect for linguistic and cultural groupings. After independence, African political elites agreed to maintain these borders, and were on the whole minded to centralize. The aim of creating homogeneous national communities of communication was even more problematic in Africa than in Europe, since linguistic diversity was particularly complex in many states. Moreover, the language chosen to serve as national language was often that of the former colonial power, which brought its own problems. The elites learned it, which ensured class distinction in Bourdieu’s (1979) sense, but adopting the languages of former colonial masters has presented problems for national linguistic convergence (Myers-Scotton 1993). Thus nation-building in all parts of the globe came up against resistance.
Some “national” groups seemed to be like the proverbial Russian doll, containing within themselves other smaller groups that did not see the greater whole as their natural home. Prizing their own cultural and linguistic tradition, they resisted incorporation and assimilation. Some “national” groups contained those who felt that they shared greater cultural and linguistic heritage with those on the other side of state borders rather than with co-nationals. Some “national” groups were structured so that one of the constituent groups dominated, making the others discontent with their subordinate position. Some ruling groups excluded certain categories within national borders from full incorporation, either in political or in economic terms. In many cases such sentiments and inequalities provoked independence and irredentist movements. In confrontational situations, needless to say, the relentless move to national linguistic homogeneity did not take place, and thus minorities are often defined by their linguistic practices. The cause for schism between the majority and the minority may have many different origins (and religion may play a significant role here). Language is very rarely the root cause for division, but is a powerful expression and confirmation of it (Armstrong 1962).

In nation-building it has proven particularly difficult to accommodate language difference. In addition to the ideology which makes language use a marker of belonging and loyalty, the actual mechanics of running the modern state favor monolingualism.

A state can take an ecumenical approach with religious beliefs and, as the main religions have considerable moral overlap, emphasize what is common; languages, on the other hand, particularly in the case of the written languages of the law and administration, are systems complete in themselves that fill the public space (Zolberg and Woon 1999). Where one language is used in an institution, a second language can only be admitted in a hierarchically inferior position, as the translation of documents or interpretation of speech. Within institutions, language does seem to be a zero-sum game. And if power is devolved to the local level, allowing democracy to take place in local languages, there are still linguistic issues when the component parts come together to negotiate at national level. A hierarchy of languages is hard to avoid and this affects culturally diverse, active citizenship, and the general enjoyment of human rights (Nic Craith 2010).

In summary we can say that majorities often regard linguistic minorities with some wariness. They are salient in a nation-state system that promotes and prizes homogeneity and thus attract “suspicion and prejudice” (OHCHR 2015). Those who experience their minority status as oppression have traditionally sought self-determination. And, even where minorities do not explicitly seek to secede, the majority may suspect them of separatist ambitions or sentiments. From the inside, minority status can be experienced as disadvantage because the minority language(s) is unlikely to be used on an equal footing with the national state language. This disparity affects the social and economic mobility of minority group members and limits their participation in national political and cultural life. History provides many examples of the tensions which have resulted from disregarding language difference and which have led to conflict (cf. de Varennes 1999).
This is an area which has attracted international attention. As we have moved into a globalizing world where the absolute sovereignty of the nation-state has been challenged in a number of domains, the relationship of national majorities and minorities has become one of the areas in which the international community has tried to exert influence.

4. Human rights, minority rights, and linguistic rights

Universal human rights came to the fore in the mid-twentieth century, prompted in part by the widespread horror felt in reaction to the events of World War II and a general desire to protect the vulnerable from a repeat of abuses and atrocities (Eide 1996; Baehr 2001). At first there was no specific mention of minority rights in the discourse or the legislation. Neither the Charter of the United Nations (1945) nor the Universal Declaration of Human Rights (1948) makes specific provision for minorities. The climate post-1945 was highly influenced by recent history (Simon 2000). The actions taken by the Nazis on behalf of German-speaking minorities were regularly evoked as reasons against repeating the experiment of minority group protection included in the treaties at the end of World War I (Janics 1982).⁶ This had been the first time minorities had been accorded protection in international law (Thompson 2001), but it had been misused in the way its opponents had feared. Moreover, the majority of the drafters of the Declaration were inclined to a liberal political position and were generally unsympathetic to communitarianism.⁷ Their hope was that, if non-discrimination provisions were effectively implemented, special arrangements for the rights of minorities would not be necessary (UNHCR n.d.). The protection for the individual would safeguard the sum total of individuals in the group and thus the group itself. The Universal Declaration of Human Rights thus promoted “negative” rights (i.e., no one should be persecuted for being part of a minority), but it did not encourage the “positive” rights that would contribute actively to minority language maintenance (de Varennes 1996).

International laws designed to prevent discrimination against individual minority members developed independently in different sectors. The International Labour Organisation Convention (1958) gave the framework to challenge unfair distinction in employment. The UNESCO Convention against Discrimination in Education (1960) was designed to set standards of equal access to schooling. The UN’s International Convention on the Elimination of All Forms of Racial Discrimination (1965), the UNESCO Declaration on Race and Racial Prejudice (1978), the Declaration on the Elimination of All Forms of Intolerance and of Discrimination based on Religion or Belief (1981), the Convention on the Rights of the Child (1989), and the International Labour Organization’s Protection of the Rights of Indigenous and Tribal Populations (1989) were all possible instruments that could be employed to challenge discrimination on linguistic grounds. The acceptance of the group dimension of minority rights came slowly.
An early mention of “group” is contained in Article 27 of the UN’s International Covenant on Civil and Political Rights (1966). This was the most widely accepted legally binding provision for the rights of minorities as such (UNHCR n.d.), and recognized that an individual right may have little meaning unless enjoyed in community with others:

In those states in which ethnic, religious or linguistic minorities exist, persons belonging to such minorities shall not be denied the right in community with the other members of their group to enjoy their own culture, to profess and practise their own religion or to use their own language. (UN 1966: Article 27)

6 The treaties required that Austria, Hungary, Bulgaria, Turkey, Yugoslavia, Germany, and Poland respect minority communities, including positive language rights (e.g., mother-tongue instruction in minority schools and freedom for cultural organizations to promote minority culture) (Capotorti 1979, 47). There were no equivalent requirements of the victors.
7 There was a Western bias in the group. Western individualism prevailed over more communitarian traditions. Individual human rights are rooted in Western tradition, for example, British and American Bills of Rights and the French Declaration of the Rights of Man.

Although this remained a negative right, the text included acknowledgement that the right to use one’s own language is not simply the right to speak it when and where one wants but also, and crucially, the right to be understood and to understand. And that is a constraint on interlocutors. Requiring another person to use one’s language in order to secure one’s own right has implications for that person’s rights. Moreover, the freedom of individuals to use whichever language they choose is circumscribed by the competence and choices of others (Thornberry 2013). Language is de facto a group right whether acknowledged as such or not. Perry (2012, 39) provides a useful definition, framing linguistic rights as “goods”:

If these goods are divisible, one can safely conclude that the right is “individual” in character. If these goods are indivisible, one can conclude the opposite, that the right is “group” in character.

It was only in 1992 that the international legal community began to address this complexity and the UN’s Declaration on the Rights of Persons Belonging to National or Ethnic, Religious and Linguistic Minorities (1992) was a new departure. In addition to the usual negative freedoms, this text clearly covers certain positive freedoms, requiring governments to promote as well as protect the identity of minority groups. Article 1 affirms that:

States shall protect the existence and the national or ethnic, cultural, religious and linguistic identity of minorities within their respective territories and shall encourage conditions for the promotion of that identity.

The term “encourage” implies duties for the state, in particular that there should be policies to promote the language. Within the detail of the Declaration there is overt direction on how this should be accomplished. Article 4.3 requires governments to allow minorities adequate opportunities to learn their mother tongue or have instruction in their mother tongue. Other articles give implicit direction on language. Articles 2.2 and 2.3 require governments to aid minorities to participate in public life and share in decisions which affect them on the national and regional levels. These articles imply that political debate must be in a language which minorities can understand. In the commercial sphere, Article 4.5 requires states to make it possible for minorities to participate fully in the economic progress and development in their country, which again presupposes that there will be no language barrier.

The date of the Declaration (1992) is significant. The evolution in philosophy can be traced in part to the political events of the early 1990s (Phillips 1995; Thornberry 2013). The conflicts arising among groups in the former communist states had brought pressure to bear on the international community to accept that measures were needed in order to protect persons belonging to minorities from discrimination (UNHCR n.d.).

Following the 1992 declaration, there were other initiatives at the UN level. The post of United Nations High Commissioner for Human Rights was created in 1993 and has been a guardian of minority rights since then. The Office of the Commissioner for Human Rights (OHCHR) produces an annual report monitoring significant issues. A United Nations Working Group on Minorities was set up in 1995 (Thompson 2001). One of the working group’s key tasks was to define the groups that could claim minority status to benefit from international legislation and from programs developed by UN agencies such as UNESCO, UNDP, and WHO. In 2007, the Working Group was replaced by the Forum on Minority Issues, with the mission to further implement the Declaration.

A number of non-discrimination clauses were also included in all the basic regional human rights documents. The European institutions that set new standards for minorities, and framed and policed new legislation, were the Council of Europe and the European Union (although the issue was actually outside the remit of the latter). The OSCE⁸ focused on minority protection in the east of the continent. The Organisation of American States produced the American Convention on Human Rights, and the Organisation of African Unity⁹ the African Charter on Human and Peoples’ Rights, which in their main texts and subsequent protocols guaranteed negative language rights. Internationally, the Council of Europe’s Charter for Regional or Minority Languages is perhaps the most ambitious initiative. Signed and ratified by twenty-five states (as of February 2018), it requires governments to promote as well as protect minority languages. Nationally, the South African constitution is one of the most progressive on language issues and enshrines the right to non-discrimination on the basis of language, the right to information in a language one can understand, and the right to development of one’s language (Perry 2012).

In the twenty-first century there has been a small but steady stream of initiatives to safeguard minority speakers’ interests. For example, an International Association of Language Commissioners (IALC) was founded in Dublin in April 2013 with the aim of unifying efforts worldwide in the defense of language rights.
Its membership includes representatives from states such as Canada, Ireland, and Spain where there is legislation enshrining some language rights.

8 Up until 1995 this body was the Conference on Security and Co-operation in Europe (CSCE) with a smaller remit.
9 African Union from 2002.

5. The problems with rights: the problem of language

Despite this slight change in climate on minority language rights, there remain several fundamental problems with their implementation. The first has to do with the nature of language itself. Most minority languages have not been codified, standardized, and spread in the same way as national majority languages. And without such planning, the natural dialogic creativity of speakers has allowed variation. This is to be expected where there are no constraints except for the need to maintain mutual comprehension within the face-to-face group. Even the written forms of allied varieties can diverge. Unless there is an overarching literary canon, legal documents, or institutions of governance or a similar brake, the tendency of language is to be centrifugal (Bakhtin 1981).

When rights are accorded, and education, administration, and media are to be delivered in a minority language, this usually means grouping together a number of linguistically similar groups and requiring them to act as if they constituted a homogeneous community of communication. In such cases an individual (or set of individuals) takes on the role of language planner, codifies and standardizes one variety, and imposes it as the language of the greater group. Thus according significant minority language rights has often turned out to be nation-building writ small (Wright 2016a), a homogenizing process on a smaller scale.

This can lead to a situation of double minority status for some individuals. For example, when Galician was made an official language in Galicia it was standardized with large input from intellectuals based in urban areas. These were mostly new speakers of Galician, i.e., Galician had been acquired after their first language (usually Castilian). The resulting standard put many rural speakers of Galician into a position of double linguistic minority: they were speakers neither of standard Galician nor of standard Castilian (Hoffmann 1996). Such standardization can lead to accelerated loss of the minority language. For example, the Arbresh community in Italy, which had conserved its language for five centuries, is in a period of shift at the present time, in part because government recognition of the language led to norms that many rejected (Perta 2004). The Ladin-speaking community in the Italian Dolomites has experienced the same problem.
Language rights led to standardization by scholars and linguists from outside the community and to a feeling of alienation (Wright 2007). The crux of the matter is this: when one decides to maintain one’s language for purposes of identity, it is one’s own language one wishes to use, not one that is similar but slightly alien. If preserving the minority language involves a shift, it may be that there is a stronger argument for shifting to the national language than accommodating to a regional Dachsprache.

We see this situation in Spain where some Catalan speakers would like to include Valencian speakers in their community of communication in order to have larger numbers and thus greater political weight. Valencian and Catalan are close on the dialect continuum. However, many Valencians reject such incorporation, seeing little point in maintaining a language which is not totally theirs. Of course, the issue is not only linguistic but cultural and political as well; the question is whether Barcelona is a more attractive and acceptable power centre than Madrid (Wright 2016b).

And even where there is no official policy to standardize, the factors that have traditionally contributed to (national) language convergence and homogeneity may be at work. This is the case among Sami speakers, where the Nordic states explicitly recognize diversity. However, the fact that the most influential Sami media tend to be in Northern Sami, the language of the largest group, exerts pressure for convergence on the other Sami languages (Markelin, Husband, and Moring 2013).

Thus, in general, where we find acceptance of linguistic variation among the component parts of a group perceived as a minority, we will mostly find that speakers have minimal language rights. This is the case of Occitan speakers in Southern France, for example. Occitan has a very limited role in the public space. It is available as a school subject but not promoted. It is not used in local administration. It has a very minor presence in the media. However, both speakers and authorities accept that Occitan is a pluricentric language, and in the few cases where it is taught in schools and used in the media one finds a number of varieties. Thus the compromise seems to be that where the minority language remains marginal, a measure of diversity is accepted and accommodated and speakers with slightly different practices negotiate meaning and accommodate difference when they interact. As the language does not play a significant role in the public life of the state, the pressure to standardize is weaker and can be resisted,10 but at the same time it is not a language of power which brings its speakers cultural capital.

So the first problem stems from the group nature of language rights. Even when the linguistic demarcation between minority and majority is clear, the linguistic cohesion within the group may not be so evident. Given the belief that a group’s language must be unified and stable for language rights to be accorded in any positive way, linguistic minorities usually come under pressure to accept some convergence to a standard.

10 See, for example, the exchanges in Glottopol in the mid-2000s on the pluricentric standardization of Occitan (http://glottopol.univ-rouen.fr/numeros_precedents.htm).

6. The problems with rights: the problem of advocacy

The second problem with language rights concerns agency. Language militancy is not always (or often) a bottom-up, grassroots action. In nation-building, elites were usually at the forefront. As Nairn (1977, 340) famously put it, “the new middle class intelligentsia of nationalism had to invite the masses into history.” The top-down approach extended to language choice and language use. If “the invitation had to be written in a language they understood” (Nairn 1977, 340), this had to be engineered. As discussed above, there was rarely one code that all understood. Thus intellectuals were at the heart of the language planning necessary for nation-building from the very beginning of the process. From the French revolutionaries who required use of French as proof of loyalty to the Estonian folklorists searching for authentic Estonian lexis to the Tanzanians devising terms to allow Swahili to be used in all domains, intellectuals drove nationalist language planning.

This is equally true of language rights. Much militant action is expert led. In particular, lobbying is an elite activity and those who undertake it enter a professional world for which they must be trained. The OHCHR’s Guide for advocates (2015) makes this clear. The expert panel, Strengthening Minority Rights Advocacy through Implementation Mechanisms, was set up to develop “strategies and practical measures to reinforce capacity of minority rights activists to engage with mechanisms at the international, regional, and national levels to ensure better protection of minority rights.”

In recent work, Pupavac (2012) discusses the professionalization of advocacy. In her analysis she suggests that while the language rights lobby is against language homogenization it is actually for legislative homogenization and that an unhelpful “one size fits all” mentality prevails. Pupavac suggests that “language advocacy . . . favours linguistic governance” (2012, 261), that language activism can be as authoritarian as nation-building, and that there may be an attempt to impose rights. There are obvious parallels. For example, Skutnabb-Kangas (1998), a high-profile figure in minority language rights advocacy, has termed it a “duty” for minority members to learn their mother tongue. This mirrors the language requirements of many constitutions.

7. The problem with rights: understanding versus identity

There is a clear tendency to underplay communication in advocacy, and curiously it is disparaged as instrumental (Ives 2005). The focus among the minority rights activists is on language as constitutive of identity or at the very least, significant for identity. May (2012) discusses whether language is a contingent or determining factor for identity. He quotes Renan’s remark “that language may invite us to unite but it does not compel us to do so” (May 2012, 134). He is right in that membership of allied dialect groups is not a sufficient reason for association. The Czechs and the Slovaks illustrate this. Adjacent on the Slavic dialect continuum, they were grouped together in 1919, but their different historical legacies, religious affiliations, and economic situations proved significant and their power relations imbalanced. They divorced amicably in 1993.

But if a community of communication is not a sufficient reason to unite, is it a necessary factor in groupness? Whether the members of a group can communicate easily among themselves before political association, or whether the community of communication is achieved in and after the process of political union, the language question is central to association. As Steiner (1975, 58) observed, at best incomprehension produces “zones of silence” and “cultural isolation,” while at worst it fosters the construction of those with whom we cannot communicate as “Other” or “Enemy.” I find it difficult to envisage how language could be contingent for membership of a group, however the linguistic dimension is managed.

So if we agree that part of the definition of a group is that it must be a community of communication, that its members must have some means of communicating among themselves, the question then arises, does this language need to be an ancestral language of the group, or can group identity be maintained where there is language shift? Fishman (1991) argues that language shift inevitably implies some cultural change too, as the group opens to other influences; it becomes harder to maintain distinction, as exterior influences are no longer constrained by the defining boundaries of language difference. He wonders if X-men can retain their X-ishness when they have shifted to speaking Y. He concludes that there will be a redefinition of X-ishness but that it can survive, perhaps in a new hybrid, hyphenated form. Fishman does not promote this as desirable but notes that it is possible to preserve cultural identity and knowledge in a different language.

This position is at odds with those who argue that culture is intimately bound with language, and that where a group loses its language it also loses its unique worldview, its understanding of its environment. This theory that human beings dissect nature along lines laid down by their languages and organize experience largely according to the linguistic systems in their minds was most clearly developed by the Americans, Sapir and Whorf, in the mid-twentieth century. Sapir (1958, 69) argued that

[N]o two languages are ever sufficiently similar to be considered as representing the same social reality. The worlds in which different societies live are distinct worlds, not merely the same world with different labels attached. . . . We see and hear and otherwise experience very largely as we do because the language habits of our community predispose certain choices of interpretation.

Such linguistic determinism has been largely discredited since, with translation, it is clearly possible to build bridges among linguistic systems. However, the weaker version of the Sapir-Whorf hypothesis, linguistic relativism, suggests that we are influenced by those with whom we interact either face to face or through texts, and that this is language constrained. This does seem better rooted in evidence, and has some currency (cf. Lucy 1992; Gumperz and Levinson 1996).

Linguistic relativism provides structure to some of the current campaigning for minority language maintenance under the banner of language ecology. The argument here is that societies in various bio-niches have developed languages to fit their needs in much the same way that organisms evolve to make best use of resources. This provides a fund of human knowledge adapted to the natural environment that is an invaluable way of understanding our world and crucial to our survival in a time of rapid environmental change. This has proved an attractive concept and has gained traction.11 However, there is a fundamental flaw in the analogy, in that languages are not agents in the same way as organisms. Languages do nothing; it is only their speakers who act. Haugen, who was the first to use the term “ecology for language,” was concerned principally to direct attention to the fact that language is not a fixed system but a social phenomenon which changes (i.e., that its users adapt) according to the environment in which it is used (Haugen 1972, 1987; Eliasson 2013). It is not clear that he intended the metaphor to be developed as far as it has.12

We can perhaps take from these various theories that language is constitutive of identity but does not determine it in the Sapir-Whorfian sense. And crucially perhaps we should understand language not as a fixed and stable structure (de Saussure’s langue), but rather focus on communication as a messy human behavior that adapts and flexes with new pressures, reflecting identity and helping create it (de Saussure’s parole). This could put us in a better position to deal with the complexity of human language rights.

11 See Language & Ecology Research Forum (http://www.ecoling.net) and the Ecolinguistics Website (http://www-gewi.kfunigraz.ac.at/ed/project/ecoling).
12 He himself makes the point that the term has been ill-used (Haugen 1979).

8. The problem with rights: the problem of stasis

Although this perspective might seem logical, phrased in these terms, it is far from universally accepted. If we consider language practices in general in the twenty-first century, we can see that attempts to respond to new situations with linguistic adaptation and flexibility are mostly resisted. One example of this concerns the flows, exchanges, and networks of globalization which all require some means of communication. Among the elites who fill the managerial, scientific, technical, professional, and intellectual roles of inter- and trans-national business, governance, research, and education, English is widely used as a lingua franca and the varieties that dominate are UK or US standard English.13 There is a general rejection of Globish, which is dismissed as a simplified form of English, with restricted lexis. There is little acceptance that a negotiated lingua franca is actually fit for purpose in many interactions. There is little evident support for the development of Globish as a slightly different variety that could function as a lingua franca belonging to all.

Another example of how the twenty-first-century world prizes “langue” above “parole” concerns migration and the languaging competence which develops in linguistically superdiverse conditions (here languaging means making meaning in collaborative dialogue). Negotiation and accommodation allow migrants from a multitude of different origins living in close proximity in the megalopolises of the world, in refugee camps, or in transitory settlements, to manage the linguistic complexity in which they find themselves (Arnaut et al. 2015). However, such language skills are not prized and those whose ambition is to stay in the target countries of migration must learn the official, standard languages of those states.

My third example concerns interaction on the internet. Most of the 3 billion plus users (www.internetlivestats.com/internet-users/) of the internet communicate with those who share a common language. But, where individuals seek information or interaction outside their community of communication, we see all kinds of linguistic creativity. However, much deviation from norms draws condemnation. Internet discussion boards provide many examples where participants have criticized the non-standard forms used by opponents in debate. The fact is that members of literate societies in the twenty-first century are used to language being fixed and stable, to a world of nation-state languages with sharp (Modigliani-like) divisions; we despise strategies of negotiation, branding them as error or “pidgin,” with all the negative connotation we (perhaps wrongly) accord that word.

Despite this, as many scholars from Dryden (1680) to Bakhtin (1981) have pointed out, language is not immutable and speakers are constantly pushing the language of their group in the direction of heteroglossia.14 Where there is no systematic prescription or pressure to maintain cohesion, there is the likelihood of the development of differentiation within subgroups of the speech community, as individuals engage in the creation of terms and forms which suit their purposes. Without constraint, new language forms emerge. But in the present world we treat mutation as “error” and so rein in the process. The language ideology of the nation-state era continues to apply a very strong brake to any new linguistic development. This is a pity since it seems clear that many new contexts need new linguistic solutions which might arise organically if permitted, accepted, and encouraged.

In the area of minority language rights, more insouciant attitudes toward language change could have a profound effect. However, acceptance of change is very difficult in the present linguistic situation. Currently we tend to regard the 7,000 or so languages of the human race as our inheritance, and activists seek to conserve even the smallest community of communication and to maintain stability in the larger. We do not usually recognize that the present linguistic landscape is simply a snapshot in time, that it was different in the past and could be different in the future. If we were to accept linguistic development and change, we might consider whether we need all the current 7,000 or so languages, or whether we need these particular 7,000. We could perhaps accept that some of them will die out if their speakers cease to see a reason to retain them. And perhaps more flexibility toward “loss” would change our attitudes toward “gain,” the development of new varieties for new groupings, new flows, and new exchanges.

However, we are influenced, inevitably, by the past. The circumstances of historic language loss were often violent. Languages have been destroyed by genocide, colonization, and violent assimilation. This tends to color our view of language shift. But we should also remember that in situations where languages have been put aside because human beings have come together in larger groups and have needed a common code for coexistence, perhaps shift was not such a tragedy, particularly if the resulting cohabitation was successful, peaceful, and productive of human happiness.

Of course, even in peaceful and constructive language shift, there is loss: the disadvantage of those, particularly older adults, who find mastery of new forms beyond their competence; the breaking of the communication link between generations that so easily happens; the eclipse of valuable, language-borne culture; the weakening of small group networks as individuals enter the bigger linguistic community. Even when culture is imported into the new setting, there will always be change and maybe loss (Fishman 1991). Cultural practice may become largely symbolic. Furthermore, inherited power differentials between majority and minority members are exceptionally difficult to dismantle.

Nevertheless, trying to preserve the linguistic practices of a group when its members are making the decision to shift for economic, political, or social reasons is almost always a lost cause. The incremental effect of choices made under the pressures of economic necessity, political loyalty, social mobility, or desire for inclusion cannot easily be countered. And perhaps this is not a disaster. Indeed an ecology of language more faithful to the biological metaphor would accept that there are situations where languages no longer serve and that their disappearance as languages of community may not always be a great catastrophe. It can be a greater violation of linguistic human rights to constrain the rights of individuals to choose their patterns of association in order to preserve an endangered language.

13 See Lauder (2015) on standard English as gatekeeper to elite education and its role in the construction of a global elite.
14 Bakhtin first used the term to indicate a diversity of voices within a text. It has since come to be used to draw attention to the coexistence of distinct varieties within a single “language.”

9. Conclusion

In essence, we need the liberty to go out in the wider world and the liberty to retain the elements of our condition that maintain our integrity and make us what we are. Language is fundamental to these liberties. To participate fully in the modern state, to interact across national boundaries in a globalizing world and also to remain anchored in our neighborhoods may require a complex linguistic repertoire. It would be a scandal if language policy worked against an individual or a group and prevented them acquiring the linguistic competence that permitted them to realize their personal projects and trajectory. On the other hand, if individuals and families take a purely instrumental view, they lose the dimension of language as an expression of identity, as a link with heritage, as a privileged way of cementing group solidarity, as a “component of our deepest self, the spectacles through which we identify experiences as valuable,” the shaper of “us as group member” (Dworkin 1985, 228).

In essence, we need to be able to exit essentialist scenarios, but, at the same time, if we wish to move into new communities of communication, we should have the right and the opportunity to integrate our former languages and cultures into our new settings. Kymlicka (2001, 210) argues that individuals not only need access to information, the education that allows them to evaluate it reflectively, and freedom of expression and association to act on it, but also “access to a societal culture” in order to make sense of it in terms that are relevant to them. This does not just (or necessarily) imply accommodation of bilingualism and multilingualism with their parallel trajectories but also an acceptance of communicative complexity and linguistic innovation. Achieving hybrid and multi-layered identities may resolve some of the binary clashes of culture set up by nation-state ideology (Bhabha 1994).

But we should not be over-optimistic in the case of language, as May (2012) reminds us. The rigidity of the language regimes of the nation-state era and the de Saussurean tradition which sees language as a stable, self-contained system are still extremely strong. It would need a major paradigm shift for literate societies to accept the concept of a more fluid and creative linguistic repertoire. But, until and unless such a conceptual revolution happens, minority rights language planning is likely to remain “nation-building writ small.”

References

Anderson, B. 1983. Imagined Communities. London: Verso.
Armstrong, J. 1962. Nations Before Nationalism. Chapel Hill: University of North Carolina Press.
Arnaut, K., J. Blommaert, B. Rampton, and M. Spotti. 2015. Language and Superdiversity. London: Routledge.
Baehr, P. 2001. Human Rights—Universality in Practice. Basingstoke, UK: Palgrave Macmillan.
Bakhtin, M. 1981. The Dialogic Imagination (trans. C. Emerson and M. Holquist). Austin: University of Texas Press.
Barère, B. 1794. Rapport du Comité de Salut Public sur Les Idiomes. 8 pluvose an 2. Paris.
Bhabha, H. 1994. The Location of Culture. London: Routledge.
Bourdieu, P. 1979. La distinction. Critique sociale du jugement. Paris: Minuit.
Capotorti, F. 1979. Study on the Rights of Persons Belonging to Ethnic, Religious and Linguistic Minorities. New York: United Nations Press.
Davis, R. 1988. A History of Medieval Europe. 2nd ed. Harlow: Longman.
de Varennes, F. 1996. Language, Minorities and Human Rights. The Hague, The Netherlands: Martinus Nijhoff.
de Varennes, F. 1999. “Les droits de l’homme et la protection des minorités linguistiques.” In Langues et droits, edited by H. Guillorel and G. Koubi, 129–142. Bruxelles: Bruylant.
Dewald, J. 1996. The European Nobility 1400–1800. Cambridge: Cambridge University Press.
Dryden, J. 1680. Ovid’s Epistles. London: Jacob Tonson.
Dworkin, R. 1985. A Matter of Principle. Cambridge, MA: Harvard University Press.
Eide, A. 1996. “Ethnic Conflicts and Minority Protection: Roles for the International Community.” In Ethnicity and Power in the Contemporary World, edited by K. Rupesinghe and V. Tishkov. New York: United Nations University Press.
Eliasson, S. 2013. “Language Ecology in the Work of Einar Haugen.” In Language Ecology for the 21st Century: Linguistic Conflicts and Social Environments, edited by W. Vandenbussche, E. Jahr and P. Trudgill. Oslo: Novus Press.
Fishman, J. 1991. Reversing Language Shift. Clevedon, UK: Multilingual Matters.
Fossier, R. 1970. Histoire sociale de l’Occident médiéval. Paris: Armand Colin.
Gellner, E. 1983. Nations and Nationalism. Oxford: Blackwell Publishing.
Giddens, A. 2013. The Nation-State and Violence. Chichester, UK: Wiley.
Gumperz, J. and S. Levinson. 1996. Rethinking Linguistic Relativity. Cambridge: Cambridge University Press.
Hastings, A. 1997. The Construction of Nationhood: Ethnicity, Religion and Nationalism. Cambridge: Cambridge University Press.
Haugen, E. 1972. The Ecology of Language. Redwood, CA: Stanford University Press.
Haugen, E. 1979. “Language Ecology and the Case of Faroese.” In Linguistic Method: Essays in honour of Herbert Penzl, edited by I. Rauch and G. Carr. The Hague, The Netherlands: Mouton de Gruyter.

Haugen, E. 1987. Blessings of Babel: Bilingualism and Language Planning. Berlin: Mouton de Gruyter.
Hoffmann, C. 1996. “Twenty Years of Language Planning in Spain.” In Monolingualism and Bilingualism: Lessons from Canada and Spain, edited by S. Wright, 59–90. Clevedon, UK: Multilingual Matters.
Hohenberg, P. and L. Hollen Lees. 1995. The Making of Urban Europe, 1000–1994. Cambridge, MA: Harvard University Press.
Ives, P. 2005. “Language, Agency and Hegemony: A Gramscian Response to Post-Marxism.” Critical Review of International Social and Political Philosophy 8(4): 455–468.
Janics, K. 1982. Czechoslovak Policy and the Hungarian Minority 1945–1948. New York: Columbia University Press.
Joppke, C. 1998. Challenge to the Nation-State: Immigration in Western Europe and the United States. Oxford: Oxford University Press.
Kymlicka, W. 2001. Politics in the Vernacular: Nationalism, Multiculturalism and Citizenship. Oxford: Oxford University Press.
Lauder, D. 2015. “International Schools, Education and Globalization: Towards a Research Agenda.” In The Sage Handbook of Research in International Education, edited by M. Hayden, J. Levy, and J. Thompson, 172–181. Los Angeles: Sage Publications.
Lucy, J. 1992. Language Diversity and Thought: A Reformulation of the Linguistic Relativity Hypothesis. Cambridge: Cambridge University Press.
Markelin, L., C. Husband, and T. Moring. 2013. “Sámi Media Professionals and the Role of Language and Identity.” Sociolinguistica 27(1): 101–115.
May, S. 2012. Language and Minority Rights: Ethnicity, Nationalism and the Politics of Language. 2nd ed. London: Routledge.
Mill, J. S. 1861[1972]. Considerations on Representative Government. London: Dent.
Myers-Scotton, C. 1993. “Elite Closure as a Powerful Language Strategy: The African Case.” International Journal of the Sociology of Language 3(1): 149–164.
Nairn, T. 1977. The Break-Up of Britain. London: New Left Books.
Nic Craith, M. 2010. “Linguistic Heritage and Language Rights in Europe: Theoretical Considerations.” In Cultural Diversity, Heritage and Human Rights, edited by M. Langfeld, W. Logan, and M. Nic Craith, 45–62. London: Routledge.
OHCHR (Office of the High Commissioner for Human Rights). 2015. Minority Rights Protection in the UN System. www.ohchr.org/Documents/HRBodies/HRCouncil/MinorityIssues/Session8/CN_ProtectionUNSystem.pdf.
Perry, T. 2012. “Language Rights, Ethnic Politics: A Critique of the Pan South African Language Board.” www.praesa.org.za/files/2012/07/Paper12-text-layout.pdf.
Perta, C. 2004. Language Decline and Death in Three Arbëresh Communities in Italy: A Sociolinguistic Study. Alessandria: Edizioni dell’Orso.
Phillips, A. 1995. “Minority Rights in Europe.” Paper given at the Centre for Research in Ethnic Relations, Warwick University, UK, November 1995.
Pupavac, V. 2012. Language Rights—From Free Speech to Linguistic Governance. Basingstoke, UK: Palgrave.
Rietbergen, P. 1998. Europe, a Cultural History. London: Routledge.
Sapir, E. 1958. Culture, Language and Personality. Selected Essays. Berkeley: University of California Press.
Simon, T. 2000. Protecting Minorities in International Law. Trento: Centri Studi per la Pace.
Singman, J. 1999. Daily Life in Medieval Europe. Westport, CT: Greenwood Press.

Skutnabb-Kangas, T. 1998. “Human Rights and Language Wrongs—A Future for Diversity?” Language Sciences 20(1): 5–27.
Steiner, G. 1975. After Babel. Oxford: Oxford University Press.
Svallfors, S. 2007. The Political Sociology of the Welfare State: Institutions, Social Cleavages, and Orientations. Redwood, CA: Stanford University Press.
Thompson, C. 2001. “The Protection of Minorities within the United Nations.” In Minority Rights in Europe, edited by F. de Varennes, 115–137. The Hague, The Netherlands: Asser Press.
Thornberry, P. 2013. Indigenous Peoples and Human Rights. Manchester: Manchester University Press.
Touval, S. 1966. “Treaties, Borders and the Partition of Africa.” Journal of African History 7: 279.
United Nations. 1966. “Article 27 Human Rights Declaration” (HRI/GEN/1/Rev/1 at 38).
UNHCR. n.d. Minority Rights Fact sheet 18, posted at http://www.unhchr.ch/html/menu6/2/sheets.htm.
von Clausewitz, C. 1830[1997]. On War (trans. J. J. Graham). Ware, UK: Wordsworth.
Wright, S. 2007. “The Right to Speak One’s Own Language: Reflections on Theory and Practice.” Language Policy 6(2): 203–224.
Wright, S. 2016a. Language Policy and Language Planning. 2nd ed. London: Palgrave.
Wright, S. 2016b. “Language Choices: Political and Economic Factors in Three European States.” In The Palgrave Handbook of Economics and Language, edited by V. Ginsburgh and S. Weber, 447–488. London: Palgrave.
Zolberg, A. and L. L. Woon. 1999. “Why Islam Is Like Spanish: Cultural Incorporation in Europe and the United States.” Politics and Society 27(1).

Part IV

Endangered Languages and Biocultural Diversity

Chapter 29

Congruence Between Species and Language Diversity

David Harmon and Jonathan Loh

1. The fundamental units of diversity

Classifications are fashioned out of the elemental differences that exist in nature and culture. The two fundamental units of this variety on earth are species and languages: the natural world of living organisms can be classified into species and the cultural world of human beings into linguistic groups. Granted, species and languages are in a sense only mid-level proxies: species sit halfway between the base particularities of DNA and the grand integrations of ecosystems, while languages occupy a middle ground between individual memes and entire societies. Nevertheless, when it comes to our understanding of diversity, species and languages are the primary currency.

When we look at how they are distributed across the terrestrial portion of the planet, a striking fact has recently begun to come into focus: the areas of highest species diversity overlap with those of highest linguistic diversity. Not only that, biological diversity and linguistic diversity show extraordinary parallels in terms of their evolution. Both species and languages have evolutionary histories that can be traced back in time to earlier, ancestral species and languages. Both are related through nested patterns of descent, and can be classified in such a way as to show the phylogenetic relationships derived from a common ancestor (Figure 29.1; see also Gavin et al. 2013). Their histories are traced by increasingly similar taxonomic methods and result in comparable phylogenetic trees that reflect an evolutionary branching process. The processes that form new species, speciation, have their equivalents in different forms of language genesis. Endemism (the quality of being restricted to or deeply associated with a particular place or region) is of special interest in both, so that endemic species and indigenous languages are considered to be especially important (Skutnabb-Kangas and Harmon 2018). It may even be that the evolutionary mechanisms that give rise to species diversity and linguistic diversity are similar, with both the result of the action of replication, variation, and selection working on hereditary material.

Figure 29.1. Parallel hierarchies of biological and linguistic nested groupings. Source: Loh 2017.

These similarities are not intuitively obvious. On the contrary, it has been more common (at least in the Western tradition) to consider nature and culture as two solitudes, with little to suggest that they might be related. Now, at last, it is being demonstrated that they are. In this chapter, we shall consider a specific instance of that interrelationship: the congruence between patterns of diversity in the distribution of species and languages. We examine the evidence for the congruence and how it has evolved in the past three decades, some possible explanations for the overlap, the effect of scale on the distributions, the current status of and trends in species and language diversity, and the forces that are driving a parallel extinction crisis in both.

2. Recognizing the overlap

Most of earth’s terrestrial species are found closer to the Equator than toward either pole. That core biogeographical fact was first grasped by Darwin’s contemporary, the pioneering naturalist and evolutionary biologist Alfred Russel Wallace (Stevens 1989). Confirmed as the years went by, the pattern is now known in the scientific literature as the latitudinal species gradient (or the latitudinal biodiversity gradient). A related though more disputed hypothesis, called Rapoport’s rule, claims that the latitudinal ranges of plant and animal species are generally smaller the nearer they are to the Equator. Although there are significant exceptions, together the latitudinal species gradient and Rapoport’s rule offer a generally accurate picture of global terrestrial species distribution: there are more species, but with smaller ranges, in low latitudes (i.e., the tropics) than elsewhere.

Awareness of affinities between species and languages traces back to Darwin himself. Toward the end of the Origin of Species, he observed that “a perfect pedigree of mankind” could be derived if we could only properly classify the world’s languages, living and extinct, with an account of intermediate dialects, which he immediately compared with varieties of species (Darwin 1859, 422–423). He took the analogy further in The Descent of Man and recognized parallels in both their evolution and extinction: “The formation of different languages and of distinct species, and the proofs that both have been developed by a gradual process, are curiously parallel. . . . Languages, like organic beings, can be classed in groups under groups. . . . Dominant languages and dialects spread widely, and lead to the gradual extinction of other tongues. A language, like a species, when once extinct, never, as Sir C. Lyell remarks, reappears” (Darwin 1874, 90). Darwin was enthusiastic about parallels between species and languages, as were some linguists of his day.
But soon a reaction set in among the latter, with many objecting to any line of thought that could cause languages to be regarded as akin to natural organisms (Harmon 2002). As Maffi (2005, 600) notes, “Analogies between languages and species became discredited and were relegated to the shelves of misconceived ideas until recently.” Things began to change in the 1980s when biologists began warning of the possibility of a large-scale extinction of species in the decades to come, the first such event in the planet’s history to be caused by humans. The moral dimensions of this alarming prospect (since confirmed; see Barnosky et al. 2011; Ceballos et al. 2015; Régnier et al. 2015) helped persuade many scientists to set aside their aversion to advocacy and join a new field, conservation biology, that melded impartial research with dedicated action to protect nature. At the same time, a revolutionary principle took shape: that we should value not just individual charismatic species (such as pandas or lions) but rather the

entire range of evolutionary variety among all species, no matter how inconspicuous or seemingly inconsequential to humans. Instead of saving endangered wildlife, conservation aimed to maintain biological diversity, or biodiversity for short. By the end of the decade the concept of biodiversity had gained a solid foothold in academia, and was being adopted by groups such as the International Union for Conservation of Nature (IUCN) as the basis for their work in protected area conservation. Well-known scientists threw themselves into a public campaign to raise awareness of earth’s natural variety (see, e.g., Wilson 1992).

The rise of the biodiversity concept resonated with many cultural anthropologists and linguists, whose very disciplines were predicated upon a recognition of diversity in culture. The International Society of Ethnobiology issued a declaration in 1988 proclaiming an “inextricable link” between cultural and biological diversity, and this was soon followed by more detailed speculations about how the two might be related (e.g., Dasmann 1991; Nietschmann 1992). In a seminal 1992 paper, the linguist Michael Krauss was the first to propose that a massive extinction of languages was underway and to explicitly link it with species extinctions. “Language endangerment,” he wrote, “is significantly comparable to—and related to—endangerment of biological species in the natural world” (1992, 4).

So by the mid-1990s the idea of comparing the diversity of species and languages had gained intellectual respectability. The stage was now set for quantitative studies to back up general observations and assertions. Perhaps the first breakthrough came in a study by Mace and Pagel (1995), who demonstrated that the distribution of languages present in North America at the time of European contact followed the same latitudinal gradient as do mammal species, with many more found in southerly areas.
The next step, generalizing Mace and Pagel’s findings to a global level, was forwarded by the realization that the concept of endemism, the quality of a species being restricted to a particular geographical area, can be applied to languages as well. Endemism is a relative term, and can be used in reference to very large areas such as regions or even continents. In fact, though, most species and languages are found only in fairly restricted areas. Harmon (1996) used a comparison of endemism in species and languages at the national level to produce simple but indicative tables, and a global map, that identified significant overlap between the two. He found that ten of the top twelve “megadiversity” countries for biodiversity (as defined by IUCN) were also among the top twenty-five most linguistically diverse countries (as measured by language richness). His global cross-mapping of languages and higher vertebrate species (see Maffi 1998 for the earliest printed version of this map) identified various countries in Central and South America, central Africa, South and Southeast Asia, and the Pacific as among the most bioculturally diverse on the planet (Figure 29.2).

At about the same time, Nettle produced a global map (Figure 1 in Nettle 1998) of the most linguistically rich large countries (those more than 50,000 km²), statistically adjusted for area. His map confirmed earlier assertions (e.g., Breton 1991; Mace and Pagel 1995) that global language distributions generally mirror the latitudinal species gradient. In addition, language range sizes tend to be smaller at low latitudes,


Figure 29.2. An early, coarse-scale comparison of species and language diversity: endemism by country. Source: Harmon 1996, as published in Maffi 1998.

suggesting a linguistic parallel to Rapoport’s rule. But, just as with species, there are exceptions (Amano et al. 2014). At high southern latitudes, areal calculations of language distributions are skewed by the domination of European colonial languages (e.g., Portuguese in Brazil, English in Australia). Furthermore, some Arctic languages have small range sizes. Nonetheless, it is clear that species and languages have similar distributions at the global level (Loh and Harmon 2005; Gavin and Stepp 2014; cf. Figure 1a in Gavin et al. 2013 with the various maps at www.biodiversitymapping.org).

3. Elaborating the overlap

These foundational studies established correlations at the global level between species and language distributions. The next wave of studies (2000–2005) began to fill in the picture at various levels of detail. Using Ethnologue data, Oviedo and colleagues (2000) mapped the distribution of the world’s ethnolinguistic groups onto 200 areas identified by the World Wildlife Fund as being a comprehensive, representative sample of global biodiversity at the ecoregion scale; they found an overlap in two-thirds of the cases. In a study focused on Africa, Moore et al. (2002, 1649–1650) presented several lines of quantitative evidence to conclude “that not only do language richness and species richness show similar latitudinal gradients, but there is a correspondence between the distributions of species and language richness.” In an analysis of joint extinction risk

biogeographical correlations, Sutherland (2003) found that areas with high language diversity also have high bird and mammal diversity, and that both forms of diversity are positively associated with overall area, low latitude, forest area, and (for languages and birds) maximum altitude. He did not find any evidence for an earlier suggestion by Nettle (1999) that length of the growing season might influence language diversity. Using geographic information systems (GIS), Stepp et al. (2004) globally mapped plant diversity against language diversity (Figure 29.3) and found a strong co-occurrence, especially in Mesoamerica, the Andes, West Africa, the Himalayas, and South Asia/Pacific. In another global study, Loh and Harmon (2005) used species–language comparisons (among others) as part of a first attempt to quantify global biocultural diversity by means of a country-level index. Their Index of Biocultural Diversity (IBCD) affirmed the general patterns of congruence between species and languages, and called out three areas of exceptional biocultural diversity: the Amazon Basin, Central Africa, and Indomalaysia/Melanesia.

Not all studies from this period agree, though. Looking for correlations between biological and cultural diversity within ten “culture areas” of indigenous peoples in North America north of Mexico, Smith (2001) arrived at a mixed bag of results: tree species diversity was well correlated with native linguistic diversity, but diversity in other vascular plants and in birds/mammals was not. Smith cautioned that his results were “very

Figure 29.3. A more recent, finer-scale comparison of species and language diversity: areas of high plant diversity (darker colors) strongly correlate with the diversity of languages (black dots). Source: Stepp et al. 2004.

tentative” (2001, 110) and, as Moore et al. (2002) later pointed out, a small sample size and failure to control for area effects make the results difficult to interpret. A more cautionary note was struck by Manne (2003), who focused on separate areas in Central and South America. She found concordance between the distributions of languages and Passeriform (perching) bird species at broad scales, but a weaker fit between them when one “zooms in” to finer scales. Moreover, she found no coincidence, at any resolution, between endangered species and endangered languages.

Nonetheless, the most recent studies (2006 to the present) have consistently found congruence between biological and linguistic diversity. They have substantially advanced the conversation by taking new approaches, employing more sophisticated modes of statistical analysis, and looking at the overlap at finer scales. In a novel line of reasoning, Fincher and Thornhill (2008) hypothesized that the diversity of human parasites acts like an evolutionary wedge driving people apart, with groups maintaining separation to avoid contagion. The upshot would be that parasite diversity helps promote cultural and linguistic divergence. Therefore, they predicted that human parasite diversity should be positively correlated with linguistic diversity. They were right: this particular subset of biodiversity is strongly related to the worldwide distribution of indigenous language diversity.

Other studies have looked at various environmental factors as they might relate to language differences. Such factors rarely stand alone, however. Analyzing a data set of 264 Pacific islands with 1,640 languages, Gavin and Sibanda (2012) used three models to test whether a number of areal, climatic, geographic, or physiographic variables predicted language richness, either singly or jointly.
They found that language diversity relates strongly to island area, and, after controlling for area, with variables linked to isolation, but concluded that human diversity patterns appear to be influenced by economic, political, and social factors as well. This observation is borne out in a study by Currie and Mace (2012), who found that the relationship between ethnolinguistic distribution and environmental variables is stronger in societies whose primary mode of subsistence is foraging than it is with agriculturalists, with net primary productivity being a good predictor. In a global study, Gavin and Stepp (2014) explicitly linked environmental factors (as expressed through Rapoport’s rule) to social factors (strong group boundary formation between sociolinguistic groups) as the two key reasons for the similar latitudinal gradients of species and languages. Just as social variables often factor into studies comparing the distribution of species and languages, the outcome of these studies can have implications for social policy, particularly strategies for integrated conservation of biological and cultural/linguistic diversity. For example, Larsen, Turner, and Brooks (2012, e36971: 6) studied the Alliance for Zero Extinction (AZE) global network of priority sites for biodiversity conservation to find whether protecting these sites would also yield benefits for human well-being. They found that the AZE sites “lie in areas of significantly higher linguistic diversity of both all languages and threatened languages (i.e., those spoken by