Linguistics: An Introduction 9781350164277, 1350164275

What is Linguistics? How do languages work? Why is this important? Answering these questions and more, Linguistics: An I


English Pages 613 [545] Year 2024


Table of contents :
Cover
Halftitle page
ALSO AVAILABLE FROM BLOOMSBURY
Title page
Copyright page
Contents
Figures
Maps
Tables
Preface
Acknowledgements
Notes on the Text
Organization and presentation
Structure of the book
About the third edition
Guide for the student
Abbreviations and Conventions Used in Examples
1 Introduction
1.1 What is linguistics?
1.2 Fundamental concepts
1.3 Design features of human language
1.4 Outline of modern linguistics from a historical perspective
Summing up
Guide to further reading
Issues for further thought and exercises
Research project
Part I Language: System and Structure
2 Sounds of Language: Phonetics and Phonology
2.1 Fundamental properties of speech sounds
2.2 The vocal tract
2.3 Types of phones
2.4 Some additional features
2.5 Prosodies
2.6 Phonology
2.7 How to establish the phonemes of a language
2.8 Transcription
Summing up
Guide to further reading
Issues for further thought and exercises
Research project
3 Structure of Words: Morphology
3.1 Words
3.2 Morphemes, allomorphs and morphs
3.3 Main types of morphemes
3.4 Allomorphs and allomorph conditioning
3.5 Morphological description
3.6 Morphological analysis
Summing up
Guide to further reading
Issues for further thought and exercises
Research project
4 Lexicon
4.1 The lexicon
4.2 Ways of making new words
4.3 Ways of using old forms to get new meanings
4.4 Fixed expressions
4.5 What’s in a word?
Summing up
Guide to further reading
Issues for further thought and exercises
Research project
5 Structure of Sentences: Syntax
5.1 What is syntax?
5.2 Hierarchical structure in sentences
5.3 Syntactic units
5.4 The structure of clauses
Summing up
Guide to further reading
6 Meaning
6.1 What is meaning?
6.2 Semantics
6.3 Pragmatics: the meaning of utterances
Summing up
Guide to further reading
Issues for further thought and exercises
Research project
Part II Language in Use
7 Sociolinguistics: Language in Its Social Context
7.1 Language as a social phenomenon
7.2 Social varieties and variation
7.3 Varieties and variation according to use
7.4 Language use in bilingual communities
7.5 Language shift and endangerment
Summing up
Guide to further reading
Issues for further thought and exercises
Research project
8 Text and Discourse
8.1 Preliminaries
8.2 Text organization
8.3 Discourse: language in interactive use
Summing up
Guide to further reading
Issues for further thought and exercises
Research project
9 Investigating Language in Use: Corpus Linguistics
9.1 What is a corpus and what is corpus linguistics?
9.2 Types of corpora
9.3 Building a corpus of your own
9.4 Analysing a corpus
Summing up
Guide to further reading
Issues for further thought and exercises
Research project
Part III Language: A Human Phenomenon
10 Language in Its Biological Context
10.1 Natural communication systems of other animals
10.2 Teaching human language to animals
10.3 Origins and evolution of human language
Summing up
Guide to further reading
Issues for further thought and exercises
Research project
11 Psycholinguistics: Language, the Mind and the Brain
11.1 Language and cognition
11.2 Language processing
11.3 Language and the brain
Summing up
Guide to further reading
Issues for further thought and exercises
Research project
12 Language Learning
12.1 Major features of child language learning
12.2 Strategies for child language learning
12.3 Second-language learning
Summing up
Guide to further reading
Issues for further thought and exercises
Research project
Part IV Language: Uniformity and Diversity
13 Gesture and sign languages
13.1 The visual-gestural medium
13.2 Primary sign languages
13.3 Alternate sign languages
Summing up
Guide to further reading
Issues for further thought and exercises
Research project
14 Writing
14.1 The visual-inscribed medium
14.2 Types of writing system
14.3 The English writing system
14.4 Writing systems in society
14.5 Linguistic features of some written varieties
Summing up
Guide to further reading
Issues for further thought and exercises
Research project
15 Unity and Diversity in Language Structure
15.1 Preliminaries to the study of the unity and diversity of languages
15.2 Universals of language
15.3 Typology
15.4 Explaining unity and diversity of language structure
Summing up
Guide to further reading
Issues for further thought and exercises
Research project
16 Language Change
16.1 Major characteristics of language change
16.2 Sound change
Assimilation
16.3 Morphological change
16.4 Syntactic change
16.5 Grammaticalization
16.6 Semantic change
16.7 Causes of language change
Summing up
Guide to further reading
Issues for further thought and exercises
Research project
17 Languages of the World
17.1 Number and variety of the world’s languages
17.2 Relations among the languages
17.3 Seven (putative) language families
17.4 Contact languages
Summing up
Guide to further reading
Issues for further thought and exercises
Research project
Glossary
Notes
References
Language Index
Name Index
Subject Index

Linguistics


ALSO AVAILABLE FROM BLOOMSBURY
Why Do Linguistics? Fiona English and Tim Marr
Discourse Analysis, Brian Paltridge
Research Methods in Linguistics, Lia Litosseliti


Linguistics An Introduction Third Edition

William B. McGregor


BLOOMSBURY ACADEMIC Bloomsbury Publishing Plc 50 Bedford Square, London, WC1B 3DP, UK 1385 Broadway, New York, NY 10018, USA 29 Earlsfort Terrace, Dublin 2, Ireland BLOOMSBURY, BLOOMSBURY ACADEMIC and the Diana logo are trademarks of Bloomsbury Publishing Plc First published in Great Britain 2009 This edition published 2024 Copyright © William B. McGregor, 2024 William B. McGregor has asserted his right under the Copyright, Designs and Patents Act, 1988, to be identified as Author of this work. For legal purposes the Acknowledgements on p. xi constitute an extension of this copyright page. Cover design: Tjasa Krivec Cover image © Santy silvi/Getty Images All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage or retrieval system, without prior permission in writing from the publishers. Bloomsbury Publishing Plc does not have any control over, or responsibility for, any third-party websites referred to or in this book. All internet addresses given in this book were correct at the time of going to press. The author and publisher regret any inconvenience caused if addresses have changed or sites have ceased to exist, but can accept no responsibility for any such changes. A catalogue record for this book is available from the British Library. A catalog record for this book is available from the Library of Congress. ISBN:

HB: 978-1-350-16426-0
PB: 978-1-3501-6425-3
ePDF: 978-1-3501-6428-4
eBook: 978-1-3501-6427-7

Typeset by RefineCatch Limited, Bungay, Suffolk

To find out more about our authors and books visit www.bloomsbury.com and sign up for our newsletters.

Online resources to accompany this book are available at https://www.bloomsburyonlineresources.com/linguistics-an-introduction-3. If you experience any problems, please contact Bloomsbury at: [email protected]


Contents

List of Figures
List of Maps
List of Tables
Preface
Acknowledgements
Notes on the Text
Abbreviations and Conventions Used in Examples
1 Introduction
Part I Language: System and Structure
2 Sounds of Language: Phonetics and Phonology
3 Structure of Words: Morphology
4 Lexicon
5 Structure of Sentences: Syntax
6 Meaning
Part II Language in Use
7 Sociolinguistics: Language in Its Social Context
8 Text and Discourse
9 Investigating Language in Use: Corpus Linguistics
Part III Language: A Human Phenomenon
10 Language in Its Biological Context
11 Psycholinguistics: Language, the Mind and the Brain
12 Language Learning
Part IV Language: Uniformity and Diversity
13 Sign Languages
14 Writing
15 Unity and Diversity in Language Structure
16 Language Change
17 Languages of the World
Glossary
Notes
References
Language Index
Name Index
Subject Index

Figures

Figure 1.1 Saussure’s conceptualization of the linguistic sign
Figure 1.2 Some iconic signs
Figure 1.3 Some symbolic signs
Figure 2.1 Sound wave of the author’s production of The farmer kissed the duckling
Figure 2.2 Human vocal tract
Figure 2.3 Upper vocal tract showing main places of articulation of consonants
Figure 3.1 Conceptual representation of the differences between the three main types of bound morpheme
Figure 6.1 Aspects of linguistic meaning
Figure 6.2 Small portion of a taxonomic hierarchy for plant in English
Figure 6.3 Partial meronymic hierarchy for Gooniyandi body-part terms
Figure 9.1 A ‘slip’ (A6 size) from the corpus compiled by Randolph Quirk with an example of a noun phrase
Figure 9.2 Survey of English Usage research room, early 1970s
Figure 9.3 Concordance lines for prevaricate followed by a preposition in COCA
Figure 10.1 Indication of direction of nectar source by honey bee’s tail-wagging dance
Figure 10.2 Vocal tract of the chimpanzee
Figure 10.3 Nim Chimpsky signing me hug cat to trainer
Figure 11.1 Version of Levinson’s spatial arrangement experiment
Figure 11.2 Major structures of the human brain
Figure 11.3 Some of the major areas of the left hemisphere of the brain
Figure 13.1 Beginning of the gesture accompanying (13-1), by Maudie Lennard
Figure 13.2 Middle-finger gesture produced simultaneously with wuba ‘little’ in the utterance of (13-2), by Maudie Lennard
Figure 13.3 Four signs of Auslan
Figure 13.4 Content interrogative in Finnish Sign Language
Figure 13.5 Three signs in Tshàúkák’ùí, signed by Maxwell Kebuelemang
Figure 14.1 Trilingual sign in Yanbian Prefecture, China
Figure 15.1 Language types according to airstream mechanisms used contrastively
Figure 15.2 Location of the 23 sample languages on two dimensions of morpheme integrity
Figure 15.3 Animacy hierarchy
Figure 16.1 Grimm’s law chain shift
Figure 17.1 Indo-European family tree
Figure 17.2 Some major groupings within the Niger-Congo family

Maps

Map 1 Approximate homeland locations of main languages mentioned in this book
Map 7.1 Varieties of the Western Desert language
Map 7.2 Two isoglosses in Danish
Map 17.1 Location of the main groups of the Indo-European family prior to the sixteenth century
Map 17.2 Approximate locations of the putative language families of Africa
Map 17.3 Location of the Sino-Tibetan family
Map 17.4 Location of Khoisan languages of Southern Africa

Tables

Table 2.1 Main symbols of the IPA
Table 2.2 Chart of vowels of BBC English showing IPA representation and example
Table 2.3 Diphthongs of three dialects of English
Table 3.1 Summary of major morpheme types
Table 6.1 Syntactic forms and their typical illocutionary forces in English
Table 7.1 Model of the major phenomena relevant to the sociolinguistics of language use
Table 7.2 Bound pronouns in women’s and men’s varieties of Yanyuwa
Table 7.3 Some allomorphs of two case suffixes in Gurindji
Table 8.1 Stages in narrative structure
Table 8.2 Structural analysis of the urban legend of the tough professor
Table 8.3 Structure of the model answer exposition
Table 8.4 Hierarchy of discourse units
Table 9.1 Frequency list in three corpora
Table 9.2 Ten last frequency positions in AmE06
Table 9.3 Comparison of the strongest collocates of strong and powerful in the COCA corpus, listed for each part of speech in order of strength
Table 11.1 Five classical types of aphasia
Table 14.1 Some differences between speech and writing
Table 15.1 Four types of language universal
Table 15.2 Frequencies of word order types
Table 16.1 Some cognates in English and French illustrating Grimm’s law
Table 17.1 Some basic words in four languages of the north-west of Australia
Table 17.2 A selection of basic words in six African languages

Preface

My intention in writing this book is to provide a basic introduction to modern linguistics that conveys an idea of the scope of the subject, and a feeling for the excitement of doing linguistics – the excitement of finding out about language and languages, your own and others. I hope it will stimulate your understanding of the subject, rather than encourage mere rote memorization of facts. I would also like to convey some appreciation of the reasons why linguists do what they do, and for the approaches and methods they adopt in studying languages. The third thing I would like to promote is the development of your powers of observation, as well as your critical and creative faculties.

There are many excellent introductory textbooks on linguistics. Why another? My initial motivation lay mainly in dissatisfaction with certain aspects of the existing textbooks. None offered what I wished in terms of manner of presentation, pedagogic philosophy, the range and type of information presented and theoretical stance. As a result of teaching an introductory course in linguistics in 2002, I was convinced of the need to write my own textbook to remedy these dissatisfactions. This resulted in the first edition, published in 2009. Errors in that edition, omission of important topics and new developments in linguistics necessitated a second edition, published in 2015. Seven years on, further updating, revision and extension necessitate a third edition.

Hobro, January 2023


Acknowledgements

For useful comments on draft chapters for earlier editions I thank Peter Bakker, Andreas Højlund, Jan Rijkhoff, Alan Rumsey, Jean-Christophe Verstraete and six anonymous referees. I am also grateful to Chris Butler, Lise Fontaine, Hilde Hasselgård, Gunther Kaltenböck, Dirk Noel, Anne-Marie Simon-Vandenbergen and Jean-Christophe Verstraete for comments on the new chapter on corpus linguistics and to Reinhild Vandekerckhove for references on instant messaging. Various anonymous referees provided useful commentaries on the proposals for the second and third editions. They have provided much useful advice, and prevented me making a number of mistakes. I alone am responsible for any remaining errors of fact or interpretation. I would have liked to include more of the additional topics and themes that they suggested, but limitations of space precluded doing so.

Thanks also to the classes of 2004 (Grundkursus) and 2005–22 (At forstå lingvistik/Understanding linguistics) at Aarhus University for their helpful input as end-users. A number of students from elsewhere – too many to list individually – have also provided valuable feedback.

Many people have contributed to this book by providing information on languages they have expert knowledge of, including: Paula Andersson, Peter Bakker, John Bowden, Hilary Chappell, G. Tucker Childs, Ann Elveberg, Nick Enfield, Anne-Maria Fehn, Michael Fortescue, Yoko Fukuda, Xinyi Gao, Tom Güldemann, Gerd Haverling, Birgit Hellwig, Judit Horváth, Blesswell Kure, Tine Larsen, Eva Lindström, Haicun Liu, Susanne Mohr, Hitomi Ono, Andy Pawley, Alan Rumsey, Johanna Seibt, Paul Sidwell, Tsunoda Tasaku and Hein van der Voort. Speakers (many deceased) of a number of other languages have generously shared their language with me over the past forty or so years; without their input my knowledge would be severely restricted. I might single out here Maxwell Kebuelemang, who demonstrated words and sentences in Ts’ixa Sign Language (Tshàúkák’ùí) specifically for this book. Many thanks to all of these people (who are acknowledged by name in my publications on the particular language).

Aleksandr Kibrik (1939–2012) kindly gave permission to use the Archi problem in Chapter 5. Bas Aarts and Sean Wallis of University College London kindly provided the heritage photographs from the Survey of English Usage in its pre-digital days in the 1970s, as well as information about them. The videos of ASL included on the accompanying website are courtesy of the National Center for Sign Language and Gestures Resources at Boston University (directors Carol Neidle and Stan Sclaroff). Thanks also to the signer, Lana Cook, for permission to use the videos, and Carol Neidle for links to useful material on their website, as well as other assistance and advice. Credits for graphical and other materials are given in the captions.


Thanks also to my editor at Bloomsbury, Laura Gallon, for her patience with an extremely slow author. Every effort has been made to trace copyright holders and to obtain their permission for the use of copyright material. However, if any have been inadvertently overlooked, the publishers will be pleased, if notified of any omissions, to make the necessary arrangement at the first opportunity.

Notes on the Text

Organization and presentation

Manner of presentation

This book employs a clear physical layout of information, presenting the material in brief, clear sentences and sections. Each chapter begins with a brief abstract, a detailed table of contents, a list of the main goals of the chapter, and a checklist of the major terms and concepts introduced. It concludes with a summary, a guide to further reading, a list of problems and issues for further thought, and a project suggestion. The book is accompanied by a website containing additional information, including a set of multiple-choice questions designed to test your understanding of the main points of each chapter. Address: https://www.bloomsburyonlineresources.com/linguistics-an-introduction-3. This manner of organization should permit the book to be used not just as a textbook for a course in linguistics, but also by an independent reader wanting to find out about linguistics.

Pedagogic philosophy

My two major concerns are first to encourage and facilitate understanding linguistic concepts and how to use them, and second to promote observation of language. Thus the work aims to present not just ‘facts’ but also ways of dealing with them, ways of understanding them in relation to the broad issues of concern to linguistics. To this end, each chapter includes a set of questions for further thought and exercises (some easy, some quite challenging). I believe that it is through attempting to solve simple – and difficult! – problems that students can learn and understand a subject, more so than by reflecting on larger philosophical issues.

You have to understand linguistics to do it. But at the same time, you have to do it to understand it: you have to get your hands dirty by engaging with data – grappling with it, attempting to understand it and relating it to what you already know (or think you know) about language or about a particular language. Until you begin doing both of these, it is pointless, in my opinion, to dwell on the philosophical issues surrounding the subject, as fascinating as they may be.

A part of what is especially attractive about linguistics is that there is still a lot to learn about every one of the roughly 7,000 languages spoken in the world today. Even a student new to the subject can, if they are attentive to speech – and writing – around them, learn something new (or not widely known) about their own language. This I know, having seen some nice examples from introductory students in the past.

Not only do doing and understanding go together, but a third component is essential, namely the ‘facts’, the knowledge about language and languages. Notice that I said in the first paragraph of this section that my aim was to present ‘not just “facts”’; I didn’t say or mean to suggest that facts are unimportant or uninteresting! To the contrary, the ‘facts’ are extremely interesting, and to ignore them would be suicidal for a linguist, as it would be for any scientist or researcher. Some chapters rely rather more heavily on facts than others.

Another aspect of understanding linguistics is to see it in a historical perspective: part of understanding why linguistics is as it is, and why linguists do things the way they do, requires an appreciation of the intellectual traditions of the subject. The introductory chapter contains a section presenting modern linguistics from a broad historical perspective; it also contains a project directing students to find out about particular linguists. The website for the book expands on these by presenting a brief overview of the history of the subject.

Range and type of information

Many introductory textbooks focus almost exclusively on one language, English. This book also uses many examples from English, presuming that anyone who can read it will have sufficient knowledge of the language to permit it to be used as a foundation on which an understanding of concepts and arguments can be constructed. However, numerous examples are given from other languages, many ‘exotic’ and/or endangered, including languages I have first-hand experience of myself. Partly this is a statement that other languages are as important as English to linguistics, and indeed are crucial to the subject. These examples are also intended to encourage you to try to understand and appreciate the ways other languages do things, which can be very different to the way English does things.

Theoretical framework

Modern introductory textbooks tend either to acknowledge no particular theoretical framework, purporting to be either atheoretical or catholic in orientation, or to adopt the dominant theoretical framework in linguistics, generative grammar (see §1.4). Almost no introductory textbook presents linguistics from any of the many alternative perspectives. For these other perspectives one must go to more advanced textbooks on specific topics. This book is intended to fill the lacuna and present beginning linguistics from an alternative perspective, specifically one in which both meaning and use play absolutely central roles. While I have my own minority theoretical perspective, I do not attempt to present or argue it here; rather I stand back from it, and adopt a more general stance that includes many theories within the so-called functionalist, usage-based and cognitivist paradigms. Needless to say, not all practitioners in these theories will agree with everything I say.

There are other reasons why I believe it is unhelpful to adopt too non-partisan an approach in an introductory text. It is important for beginning students to get the feel for working and thinking within one approach. This has the advantage of permitting them to go more deeply into a topic and gain ‘hands-on’ experience in doing things according to that approach. By contrast, presentation of theoretical variety – perhaps chaos – tends to leave students bewildered, and frustrated with the sketchy treatment of topics. There is also the danger that they acquire no usable skills.

Structure of the book

Aside from Chapter 1, Introduction, which sets the scene, this book is divided into four parts. Part I, Language: System and Structure, comprising Chapters 2–6, focuses on the structure and system of human languages. These chapters present a number of central notions of modern linguistics that are essential to an understanding of the subject, both in the remainder of the book and in your later courses in linguistics. They are fairly demanding on understanding, and you are likely to find them fairly heavy going.

It is perfectly normal for beginning students to feel lost in the first few weeks of their introductory linguistics course: it takes time to get the ‘feel’ of what linguistics is all about, and to appreciate the unfamiliar ways of thinking about language. Things should start to become clearer in the second month; if not, you should consult your lecturer or tutor.

Part II, Language in Use, comprises three chapters, Chapters 7–9. These chapters explore issues in the use of language, including its social use and its use in the context of human interaction. The final chapter of this part is more practical in orientation, and deals with ways of exploring how language is actually used by speakers and writers.

Part III, Language: A Human Phenomenon, consists of Chapters 10–12. These chapters explore language as a human phenomenon. They examine such issues as how children learn their first language, the relation between language and the human mind and brain, and the relation between human languages and the communication systems of other animals.

The remaining five chapters, Chapters 13–17, make up Part IV, Language: Uniformity and Diversity. These chapters deal with variety and variation in languages: with the different ‘mediums’ for the expression of languages, the cross-linguistic range of variation in language structures, language change over time, and the linguistic diversity of the world. You will probably find Parts II–IV easier going conceptually than Part I, though probably more taxing on memory.

The chapters of Parts II through IV are relatively independent, and may be read or taught in almost any order. All, however, presume knowledge and understanding of the basic notions presented in Part I. In particular, notions of the phoneme and morpheme are presumed in many places. With a little additional introductory material, or explanation of concepts as they are encountered, it would be possible to present Parts II and III prior to Part I; Part IV, however, demands more of the notions developed in Part I, and would not be very satisfactory prior to Part I. I teach, and have placed, Part I first in order to give students a longer time to think about and reflect on fundamental notions such as the phoneme and morpheme; this means also that students have most of the semester to practise doing basic problems in phonology and morphology in the tutorials (which focus on the first six chapters of the book throughout the semester; tutorials are not so essential for the remaining chapters). The remainder of the course then is much less conceptually demanding, though there is a slight upturn in the final couple of weeks with Chapters 15–17; however, students who have mastered the first six chapters should have no trouble with these.

About the third edition

This edition corrects a number of errors in the first and second editions, among them typos and errors of fact and interpretation, including, I am ashamed to say, a few instances where I uncritically repeated widespread linguistic myths. Doubtless other such lapses remain, and I can only ask the reader to advise me so that I can correct them in future editions. I have also updated the references in the Guide to further reading for each chapter. Each chapter also includes a new section labelled Research project, which suggests a possible idea for a research project that a student might undertake, or which might be used as inspiration for an essay for the course. (The plan had been to include instead a section outlining research methods for each chapter; this turned out not to be feasible for an introductory text such as this, especially for the first six chapters and Chapter 9.)

This edition includes a new chapter on corpus linguistics, the addition of which necessitated a reorganization of the parts of the book, specifically the inclusion of a new part, Part II, dealing with language in use.

I had considered undertaking more radical revisions of the chapters on psycholinguistics and language learning, orienting them more to functional and usage-based approaches. In the end I opted to stick with the content of the previous editions since it seems to me that much of that represents knowledge that students should be aware of. As much as I am impressed by Michael Tomasello’s approach to first-language learning, it seems to me that students also need to be exposed to earlier work that was not informed by a usage-based approach. Similarly for psycholinguistics and neurolinguistics. The main change to Chapter 12 on language learning is that it consistently uses the term language learning instead of language acquisition. The terms learning and acquisition mean different things to different linguists, but in changing the terminology I wanted to focus on the fact that the child actively engages in constructing their language through interaction with others, and uses learning strategies that are not peculiar to language. In using this new terminology I align myself with a constructivist approach to language learning, as per e.g. Michael A. K. Halliday and Michael Tomasello.

One suggestion for the third edition was to retire the chapter on gesture and sign language. This I was more than reluctant to do. I believe that a textbook on linguistics should discuss gesture and sign languages, and that it is appropriate to treat the latter separately from spoken languages. Doing this is consistent with the current trend in sign language linguistics to pay attention to what is special about sign languages, and what these peculiarities might tell us about the nature of human language. This represents a pendulum swing from an earlier trend to downplay the differences, and focus on the commonalities – which was motivated by a need to argue for deaf sign languages as genuine languages. This is now accepted in linguistics, and no longer needs to be argued. Similar reasoning motivates a separate chapter on writing, Chapter 14.

This edition retains the same general functional/usage-based stance adopted in the first two editions. Although I work within such a theoretical approach, more specifically a Neo-Firthian approach, I have avoided use of a specific theory, and preferred to present things as far as possible from a more general functionalist/usage-based approach. This is the main reason why I have not taken the advice of Bartlett (2011: 124) and gone deeper into the Hallidayan interpretation of the ‘metafunctional’ structure of the clause. There is of course a great diversity of opinion among functionalists and usage-based theorists; however, it seems to me that sufficient is shared at the general level of the present text to permit one to adopt a general non-formalist perspective.

The reviewer just mentioned has also commented: ‘McGregor is sticking rather too rigidly to mainstream syllabuses rather than providing the radical functional and cognitivist approach promised’ (Bartlett 2011: 127). I regret that my text suggested a promise for a radical functional and cognitivist approach; in fact, I never intended this. First, my own theoretical inclinations aside, I believe it is important in introductory courses on linguistics to attempt to give a feeling for the landscape of the subject. A textbook providing the suggested radical approach would be nice, but in my opinion for a non-introductory course – for students who have already been exposed to the subject. Moreover, as I see it, such a textbook would have difficulty covering fundamental topics including phonetics, phonology and morphology in a manner suitable for beginners – a bit like presenting relativity to primary school students. Nor would I like my account of the linguistic diversity of the world to be construed from such a perspective. Second, my intention was always to cater to mainstream syllabuses. Third, I have had since the early 1990s serious reservations about labels such as functional, usage-based, which seem to me to suggest function and usage are more important than form. In my view each should be accorded equal theoretical importance.

The second edition of the textbook came with an accompanying manual for instructors containing answers to exercises included at the end of each chapter. For this edition the manual takes the form of a pdf file that can be downloaded by teachers from the website for the book.

Guide for the student

This book contains too much material to be covered in a one-semester introductory course on linguistics. Your instructor will be selective in the range of chapters covered and the material from each chapter that is used. Nevertheless, my advice is to read the entire book, including the chapters not covered in your course, and the material on the accompanying website. You should of course focus more on the chapters covered in your course – but do read the others. They provide valuable additional information and perspectives on the subject, which may provide useful background information for later courses.



I advise students to read and attempt to understand each chapter before the class on that topic. But don’t get too bogged down in details. If you don’t understand something after making an honest attempt at it, move on, keep reading. The classes will present the fundamental ideas of each chapter in a different medium, orally, providing you with another chance to understand the topic. Your lecturer or tutor can also be consulted on points you have difficulty with. But you should first make a serious attempt to understand and attempt to formulate precise questions. Your lecturer or tutor will usually be able to answer a specific question, though they will be hard pushed to help you if you can’t formulate a question. It is very hard to provide guidance if you can only say you don’t understand! I always advise my students to formulate at least one question about each chapter and send it to me or the tutor prior to the class. I can then tailor the contents of the class accordingly.

Here is my advice on how to attack each chapter. Begin by examining the preliminary materials, which give an idea of the scope, contents, goals and organization of ideas in the chapter; the list of key terms highlights concepts to take particular notice of as you read the body of the chapter. With this background, read the chapter through. I would recommend first reading it rapidly, and then going back and reading it more carefully, focusing in particular on the places where you had difficulties in understanding. After you finish reading the chapter, go back over the list of key terms, and check that they now make sense to you. If they don’t, review that part of the text, and refresh your memory and understanding. I also advise doing this after the class: that is, review what you have read, and the notes you have taken. After the class try to summarize the chapter in a few sentences, in your own words. Compare your summary with the summary included at the end of the chapter.

Can you answer the basic question: what is the chapter about?

Each chapter, as mentioned above, contains exercises and questions for further thought. There are almost certainly too many for you to complete each week. You will need to be selective – read each question through, and select the ones that you find most interesting and attempt them first. (Don’t just start at the first, and go until you run out of steam or time.) Answers are not provided – it is just too tempting to look at an answer before thinking a problem through. Similarly, read the research project, and give some thought to it even if you don’t actually do it. If your course is graded on an essay, the research project idea might suggest a topic or direction.

One of the skills that you need to learn at university is how to be selective in what you do and think carefully about, and how to make good choices in your selections. This is a skill that you should attempt to develop over time: don’t expect it to come immediately and naturally to you. One of the ways you can develop the skill is by comparing your summary of the chapter with the summary in the book. Have you identified all of the major points identified in the chapter summary? Have you focused on a minor issue?


A set of multiple-choice questions can be found on the website for the book. The idea of these is to test your understanding of the main ideas of each chapter. You should attempt these each week to keep track of your progress. Some feedback is provided at the completion of the test, when you submit your responses. Your overall performance is indicated, and brief comments are provided on each of the questions you got wrong. These comments will often direct you to places in the text where you will find, or be able to work out, the correct answer. The questions have been revised many times over the past two decades in an attempt to make them maximally accessible to nonnative speakers of English, and to remove all (apparent) ‘trick questions’. What else should you read? Each chapter contains a list of further readings on the topics covered. No one expects you to read all of these things: that would take you much longer than a semester. The references are mentioned for your information, for you to follow up if you are especially interested in some topic. (Here again you need to hone your skills in selection.) This could be after the course, perhaps even at a later time in your linguistic studies. Glancing over this section of the chapter may also alert you to interesting issues not dealt with in the text, or only briefly dealt with. Sometimes reading another treatment of a topic will assist your understanding of it; but if you have difficulty in understanding something, take the advice given above, rather than attempt to read about it in a dozen different books – as likely as not you will still not understand!

I am certainly not discouraging you from reading. What I do want to discourage is reading as a replacement for thinking. Reading should be, rather, an enhancement to your thinking. Read as much additional material as you can find the time for, focusing in particular on those issues that most interest you or are most relevant to your course.

I am sometimes asked, ‘What do I need to remember?’ This is an impossible question to answer. As indicated above, I consider understanding the subject more important than rote memorization of numerous facts. Understanding, however, also involves memory, and if you understand a particular point, you will want to remember the general drift of ideas leading up to it, even if you don’t remember every tiny detail. The general patterns are much more important to remember than the details; specific minor details you can find by referring to this or some other book. But think about it! The less you remember, the more you need to rely on finding or re-finding the information. Imagine if every time you added 2 and 3 you had to work it out from first principles, on your fingers, or with the calculator app on your phone or tablet, or if you had to check the sequence of counting numbers on a website each time you needed to count the number of bananas to buy. This would be more than a little impractical when you go shopping. My advice is that you should remember the main concepts – i.e. both the terms and their meanings – highlighted at the beginning of each chapter. These are notions that are likely to be used in later chapters, lectures and other courses. If you have to look them up every time you encounter them your understanding will be seriously impaired.


For advice on studying linguistics I recommend Bauer (2021) and Sakel (2015), both of which provide much practical information and advice, including on fundamental notions and conventions, reading linguistics, and writing essays. Wray and Bloomer (2012) also provides valuable information and advice to students on doing research on linguistics, including essay writing. These works will be particularly useful to those who intend to continue studying linguistics. The website for this book also contains my advice on essay writing.

Abbreviations and Conventions Used in Examples

I have avoided using abbreviations as far as possible, in most cases restricting them to glosses in example sentences, occasionally for cited forms of morphemes in the text. Just a few of the technical terms used in the text are abbreviated; these are not included in the list below, but can be found in the Glossary at the end of the book. Following the normal practice of sign language linguistics, I have – in the text and examples of Chapter 13 – adopted the convention of representing morphemes of sign languages by their English gloss, given in capitals. Many sign languages are commonly referred to by acronyms – e.g. ASL for American Sign Language. These abbreviated labels are specified on the first mention of the language name, and are not listed here. Similarly for many corpora that are referred to by acronyms (e.g. COCA for Corpus of Contemporary American English).

The following is a list of the main abbreviations used in the example sentences. Where possible they follow the recommendations of The Leipzig glossing rules: conventions for interlinear morpheme-by-morpheme glosses, available online at https://www.eva.mpg.de/lingua/pdf/Glossing-Rules.pdf. In a few cases an abbreviation has more than one different interpretation; it should be obvious from context which is intended.

ABS     absolutive (case of object and intransitive subject)
ACC     accusative (object case)
ACT     active (case of active participant)
AdjP    adjectival phrase
AdvP    adverbial phrase
APP     applicative (‘do with (something)’)
AUX     auxiliary
C       clause; consonant
CL      classifier
COMP    complementiser (like that in think that X)
DAT     dative (‘for’)
DEC     declarative
DET     determiner
ERG     ergative (case of transitive subject)
FUT     future
GER     gerund
INACT   inactive (case of an inactive participant)
IND     indicative
INTER   interrogative word
IO      indirect object (e.g. recipient)
IRR     irrealis (‘didn’t or mightn’t happen’)
MAS     masculine (gender of nouns referring to things classified like male human beings)
N       noun, nominal
NEUT    neuter (gender of nouns referring to things)
NF      non-finite
NOM     nominative (case of transitive and intransitive subjects)
NP      noun phrase
NPST    non-past (present or future)
O       object (grammatical relation)
OBJ     object case form
OBL     oblique form (i.e. a case form other than nominative or accusative)
P       phrase
PART    participial
PFV     perfective
PL      plural (‘many’)
POSS    possessive
PP      prepositional phrase; postpositional phrase
PROG    progressive
PRF     perfect
PRS     present
PST     past
REL     relative clause marker (e.g. who in the woman who saw it)
RPST    recent past
S       subject (grammatical relation)
SG      singular (‘one’)
SUB     subject case form
SUBJ    subjunctive (‘it is hoped or wished that’)
TR      transitive
V       verb; vowel
VOL     volitional (‘to want to do something’)
VP      verb phrase
1       first person (‘I’, ‘we’)
2       second person (‘you’)
3       third person (‘he’, ‘she’, ‘it’, ‘they’)
*       ungrammatical or unacceptable sentence in syntax; proto-form in historical linguistics
?       expression of questionable grammaticality



-       morpheme boundary
=       beginning of an overlapping stretch of speech
→       acting on (in glosses); is realized as (in phonological and morphological rules); changes historically into (in historical rules)
(x.y)   a pause of x.y seconds
(.)     a pause of less than 0.2 seconds
[ ]     phonetic representation
/ /     phonemic representation
< >     graphemic (written) representation


Map 1 Approximate homeland locations of main languages mentioned in this book.


1 Introduction

In this chapter we introduce some fundamental concepts of linguistics, including the notion of the sign, and outline the major features of human language. We also present linguistics as a science, and overview its main concerns.

Chapter contents

Goals
Key terms
1.1 What is linguistics?
1.2 Fundamental concepts
1.3 Design features of human language
1.4 Outline of modern linguistics from a historical perspective
Summing up
Guide to further reading
Issues for further thought and exercises
Research project

Goals

The goals of the chapter are to:
● present linguistics as a science and outline the main concerns of the discipline;
● lay out the main orientations of modern linguistics and situate them in a historical perspective;
● introduce some fundamental concepts of modern linguistics;
● overview the main characteristics (‘design features’) of human languages; and
● distinguish between three mediums of language, auditory-vocal, visual-gestural and visual-inscribed.


Key terms

arbitrariness
auditory-vocal medium
cultural transmission
design features
displacement
duality
formalism
functionalism
icon
paradigmatic relation
productivity
reflexivity
Saussure
scientific method
sign
sign languages
speech
structuralism
symbol
syntagmatic relation
visual-gestural medium
visual-inscribed medium
writing

1.1 What is linguistics?

David Crystal’s Dictionary of Linguistics and Phonetics begins the entry for linguistics with the words ‘[t]he scientific study of language’ (Crystal 2003: 272). He goes on to say that it is also called linguistic science, and refers to it as an academic discipline. Ask any linguist what linguistics is and you are likely to be given a similar answer, mentioning both its scientific character and its subject matter – language.

Linguistics as a science

What does it mean to say that linguistics is a science or scientific field of study? To begin with, it says something about the approach taken to the subject matter. A scientific approach to the study of language involves a critical and inquiring attitude, and refusal to accept uncritically, on faith, or on authority, ideas or ways of thinking about language. It strives for objectivity, for developing hypotheses and putting them to the test by confronting them with observations. This means that linguistics is empirically grounded: it is based on actual language data, including observations of language use by speakers, and their intuitions about their language. Linguistics is thus descriptive rather than prescriptive: its primary goal is to describe languages as they are actually spoken, indicating what they are like and how they are used, rather than prescribe how they ought to be spoken. Many people are concerned about how their language ought to be spoken, as a glance in a newspaper is likely to reveal: people often comment on ‘wrong’ grammar or pronunciation that people (usually others!) use.1 At school you may have learnt that you should say That is the child whom the dog bit and not That is the child who the dog bit. But in modern English (Indo-European, England)2 most people say the latter, and few could use the school rule consistently and appropriately without consciously thinking about it. Linguistics is concerned with what people actually say, not with what they should say or think they should say.


A scientific approach is not purely empirical in the sense of merely collecting observations. It involves formulation and testing of hypotheses and generalizations, as well as theory development, development of ways of understanding language. This calls for rigorous and explicit formulation of ideas, as well as rigour in testing them. Linguistics as a scientific endeavour is as much a theoretical enterprise as an empirical one: whatever observations one makes are useful and make sense only in relation to hypotheses and theories.

As a science, linguistics is concerned with developing theories that account for and explain the phenomena of language and language use. Doing linguistics is concerned with theory development and testing, and with making generalizations about language – with uncovering regularities and repeated general characteristics. Exceptions play a crucial role, as in all sciences: they challenge the generalizations, and force the investigator to rethink matters, and refine or revise their ideas. We will see in this book places where exceptions loom large in scientific thinking about language, and have resulted in significant new developments. An important skill to develop is the ability to recognize the significance of observed phenomena as exceptional or unexpected. Linguistics is a relatively new science, and it is possible for beginners to observe new things about their language (even as well studied a language as English), including things that challenge existing theories. While reading this book you should be constantly thinking about and observing the language in use around you, and linking your observations to the discussion and generalizations we make.

Linguistics is often regarded as a humanities (or arts) subject, though in many ways it straddles the boundaries between humanities and sciences, with a foot in both camps. Links to the humanities include language history and philosophy, as well as ancient and modern ‘language’ subjects taught in universities, such as English, French (Indo-European, France), German (Indo-European, Germany), Ancient Greek (Indo-European, Greece), Sanskrit (Indo-European, India) and so on; links to the social sciences include sociology, psychology, anthropology and archaeology. But there are also links to the ‘hard’ sciences such as biology, physiology, physics and mathematics, most obviously in the production and perception of speech. Statistical methods have become increasingly important in many domains of linguistics. The human side of linguistics is as central as its scientific face. Language is a human artefact, and many types of linguistic research involve interaction between the linguist and other human beings, speakers of languages. Their work thus not infrequently confronts linguists with human considerations, such as provision of professional expertise or services and research ethics.

The subject matter of linguistics

The subject matter of linguistics is, of course, language, and as a scientific subject linguistics is in principle concerned with all aspects of language. But first what is language? The term has many senses. People talk of the language of bees, the language of the genetic code, the language of science, the language of mathematics, the language of flowers, the language of love, body language, computer language(s), the English language, the American language and so on. In this book we use the term specifically in reference to natural human languages, such as French, Mandarin Chinese (Sino-Tibetan, People’s Republic of China), Basque (Isolate, Spain and France) and Hausa (Afroasiatic, Nigeria). This is of course not a definition; to provide one now would be premature, as it would presuppose much of the content of this book. In §1.3 we make a beginning by discussing some features that characterize human language and distinguish it from other communication systems, including those of other animals. We return to the question a number of times throughout the book, not always explicitly – so keep awake! By the time you have finished reading the book, you should have a clearer notion of what linguists mean by the terms language and languages, and an appreciation of some of the difficulties in defining them.

Let us now discuss some aspects of language that are of interest to linguists. The best approach for our purposes is to outline the core branches of linguistics as a discipline, linking them to the chapters of this book. (There are of course many other less central fields and subfields in linguistics.) If you do a degree in linguistics, you are likely to study many of these branches, some of which will be covered in their own course.
● Phonetics and phonology deal with the sounds of languages. Phonetics is concerned with the ways speech sounds are produced, their nature (the physics of sound waves) and how they are perceived. Phonology is concerned with the ways sounds are patterned in a language, with those characteristics that are significant in the sound system of the language. These two branches are dealt with in Chapter 2.
● Morphology deals with the way the words of a language are structured, how they are made up of smaller meaningful parts. For example, reads is made up of read and the ending s, which tells you that the reading is being done by one person (who is not the speaker or hearer) either at the present time or generally. Morphology is treated in Chapters 3 and 4.
● Syntax is concerned with the ways words go together to form sentences, and how the words are related to one another. For instance, The boy reads comics consists of a subject or doer of the action, the boy; a verb representing an event, reads; and an object or patient of the action, comics. Sometimes words go together to make up constructions of intermediate size – larger than words, but smaller than sentences. An example is the boy in our previous sentence. Syntax is the topic of Chapter 5. Syntax and morphology together make up the core of grammar.
● Semantics and pragmatics deal with meaning. Semantics is concerned with the aspects of meaning that are encoded by words and grammar. Pragmatics handles the aspects of meaning of an utterance that come from its use in a particular context. The sentence Come again! is made up of two words each of which has a meaning, as does the entire sentence; these matters are the concern of semantics. But you can use this sentence in different circumstances and in different ways to mean different things. If said when farewelling a visitor it could be interpreted as an invitation to return at a later time. In other contexts it could be interpreted as an expression of disbelief, or a request that the hearer repeat what they have just said. Such interpretations are the concern of pragmatics. Chapter 6 treats semantics and pragmatics.
● Sociolinguistics is concerned with language in its social context, with the relations between language and society. It explores the variation in languages associated with social phenomena such as the social group to which speakers and/or hearers belong (for instance, differences in speech according to class in Western societies). Other topics of interest in sociolinguistics are multilingualism, language choice (what motivates the choice of language in multilingual settings), attitudes to languages and language variation, and standard and non-standard varieties of a language. Anthropological linguistics has basically the same range of concerns as sociolinguistics, but takes inspiration more from anthropology than sociology, and usually deals with small-scale non-Western cultures. Sociolinguistics is dealt with mainly in Chapter 7.
● Discourse analysis examines stretches of language, both spoken and written, larger than the sentence. It attempts to find regularities in the formation of these stretches, and correlations with grammatical, phonological, lexical and semantic phenomena. Among the issues that have attracted interest are: how sentences are connected; how texts are made coherent; and the use of words like well, like, huh and so on. Conversation analysis focuses attention on the properties of everyday conversation, including turn-taking (how conversation partners organize the exchange of speaker and hearer roles), negotiation of interactive expectations and goals, use of discourse markers (words like well, however, nevertheless, etc.) and conversational coherence. Discourse analysis is the topic of Chapter 8.
● Corpus linguistics is an increasingly popular field of linguistics that analyses language in use by applying quantitative and qualitative empirical methods to corpora, or collections – usually stored in electronic format – of naturally occurring spoken and written texts. Corpus linguistic methods have been applied to virtually every subject domain of linguistics. Chapter 9 presents some of the basic notions and methods of corpus linguistics.
● Evolutionary linguistics is concerned with the origins of language, with how we came to speak. A fundamental question is why are we the only species with language? Is language a part of our genetic make-up as human beings, or does biology merely permit us to speak? Some ideas about language origins and evolution are discussed in Chapter 10, which also sets human language in its biological context.
● Psycholinguistics and neurolinguistics are concerned with the processes involved in language production (e.g. speaking and writing) and comprehension (e.g. listening and reading). Psycholinguistics investigates the mental processes underlying language processing, while neurolinguistics is biologically oriented, focusing on the brain’s language-processing activities. Psycholinguistics tends to adopt methods of psychology; neurolinguistics, medical methods and technology. These topics are treated in Chapter 11.
● Language learning is the field of linguistics that investigates processes of attaining comprehension and production of a language. It is concerned with both how children learn their first language (native language or mother tongue) and how adults learn second or later languages. Chapter 12 discusses processes of language learning.
● Sign language linguistics studies languages that employ the medium of non-vocal bodily gestures, mainly of the hands and face, and include the languages of deaf communities. (See further under the section Mediums of language, on p. 12 below.) These are fully-fledged languages with their own distinctive grammars and lexicons, different from the grammars and lexicons of the surrounding spoken languages. Chapter 13 discusses some of the main characteristics of sign languages; it also deals with gestures that accompany speech.
● Written language linguistics (not a standard term) studies the characteristics of language in another non-vocal medium, namely writing (see again p. 12 below). Writing in electronic media of various sorts has attracted a considerable amount of attention in recent years not only among linguists, but also the general public. In Chapter 14 we discuss some of the main linguistic features of writing, and how and why written varieties of a language differ from spoken varieties in both structure and use.
● Typology and universals are concerned with the range and limitations on variation among languages. Typology seeks to discover and account for the variation by classifying languages into types according to some structural feature (for instance, the order of subject, verb and object), and classifying linguistic structures according to their similarities and differences (e.g. whether possession is expressed by a ‘have’ verb, or a verb ‘be at’). The study of language universals is concerned with identifying features common to all of the world’s languages. These matters are the concern of Chapter 15.
● Historical linguistics studies how languages change over time. Languages never remain static for long; indeed they change rapidly. Historical linguistics has methods for working out what changes are likely to have happened over time to a language or group of languages. It is also concerned with establishing genetic relations among languages: that is, with showing that certain languages are related by having evolved from the same ancestor language. The comparative method is a technique devised for this purpose. Chapter 16 deals with historical linguistics.
● Diversity linguistics is concerned with documenting the linguistic diversity of the world, in terms of both structural diversity (the concern of linguistic typology) and genetic diversity (the concern of historical linguistics). It is also concerned with other dimensions of variation in language. Chapter 17 discusses this topic, and overviews some of the main language families of the world.

1.2 Fundamental concepts The sign One of the most important concepts of modern linguistics is the notion of the sign, a fundamental unit used in the representation and conveyance of information. The sign involves a pairing of a form (roughly, something perceivable) and a meaning (a mental notion or idea). Some examples of written (or graphic) signs are: ♀, meaning ‘female’; ♂, meaning ‘male’; €, meaning ‘euro’; &, meaning ‘and’; and 3 meaning ‘three’.3 A gesture such as the ‘thumbs-up’ is also a sign, since it pairs the hand-shape with a meaning like ‘OK, right, go ahead’. Signs can also involve sound forms, that can be heard rather than seen, as in the case of spoken words – for example, the spoken words ten and tree.


Figure 1.1 Saussure’s conceptualization of the linguistic sign. Here ‘sound-image’ refers to ‘form’ (the idealized sound-shape of a word, ignoring variations in particular instances of production); ‘concept’ refers to ‘meaning’, illustrated here by means of an explanatory definition, and visually (see Saussure 1974/59: 66–7). © 2009 William B. McGregor and his licensors. All rights reserved.

The fundamental properties of the sign are illustrated in Figure 1.1, which is based on Ferdinand de Saussure’s (1857–1913) famous diagram of the word as a linguistic sign, exemplified by the English word tree. Saussure likened the sign to a coin: just as both faces are essential for a coin to count as an object that can be used in economic transactions, so also are form and meaning both essential to the sign as a unit in information exchange. Without a meaning we have no sign: the letter h of the Latin alphabet has no meaning in written English words, and so is not a sign: it can no more be used in information conveyance than the image of a head on a coin can be used in a shop. (Can the letter h ever be a sign? If so, when?) Nor is a disembodied meaning or concept without a form a sign.
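For readers who find it helpful to think in terms of data structures, the coin analogy can be sketched in a few lines of Python. This is only an illustration, not part of Saussure's account: the class name Sign, the helper try_make_sign and the choice to treat an empty string as a missing face are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sign:
    """A form-meaning pairing (hypothetical class, for illustration only)."""
    form: str      # something perceivable, e.g. a written shape or a sound-image
    meaning: str   # the associated concept

    def __post_init__(self):
        # Like the two faces of a coin, both sides must be present;
        # an empty string stands in here for a missing face (an assumption).
        if not self.form or not self.meaning:
            raise ValueError("a sign needs both a form and a meaning")


def try_make_sign(form, meaning):
    """Return a Sign, or None if either face is missing."""
    try:
        return Sign(form, meaning)
    except ValueError:
        return None


euro = try_make_sign("€", "euro")
# The letter h inside a written English word: a form with no meaning of its own.
letter_h = try_make_sign("h", "")
# A disembodied concept with no form is not a sign either.
formless = try_make_sign("", "tree")

print(euro)
print(letter_h)
print(formless)
```

The sketch simply refuses to build a sign when one face is absent, which is all the coin analogy claims; it says nothing about how forms and meanings come to be paired.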

Relations between form and meaning in the sign

Depending on how the form and meaning of a sign are related we can talk of iconic signs and symbolic signs. A third type, indexical signs, is identified and discussed in §10.1.

Iconic signs

An iconic sign or icon is a sign that has a form resembling its meaning in some way: the form shows some characteristics of the corresponding concept. Figure 1.2 gives some examples. Notice that the form of an icon is never an exact representation of the meaning; it shows salient features in stylized ways, ignoring other features. Different forms can iconically represent the same concept by selecting different features of the concept. The first two icons, (a) and (b), represent the same concept, ‘telephone’, although (b) depicts only a single aspect of the concept, the receiver. Many manual gestures are iconic: holding up a hand with the digits spread out to represent the number ‘five’ is iconic.

Figure 1.2 Some iconic signs. The forms of (a) and (b) visually depict salient characteristics of a telephone, and thus iconically represent ‘telephone’; (c) depicts salient features of a mobile phone, and is also iconic; and (d) depicts characteristics of an hour glass in operation, and thus is sometimes used to indicate the passage of time as a computer processes data. (Note that (d) does not iconically represent time.)

Symbolic signs

A symbolic sign or symbol is a sign the form and meaning of which are related purely by convention, being established and acquired through repeated instances of use in communication: the form bears no obvious similarity to the meaning. Figure 1.3 gives some examples. The line between symbols and icons is not clear-cut, and they are not really different types of sign. What is a symbol to one person might be an icon to another. To someone who knows only mobile phones, (a) and (b) in Figure 1.2 might appear completely arbitrary and inexplicable as signs for phones generally, established purely by convention. Iconic signs always involve some degree of conventionality and/or arbitrariness in the form–meaning link; they are not connected by necessity, and could be otherwise. Think of the equals sign =, which has an obvious iconic basis in the identity of lengths of the lines, and was first used by the English mathematician Robert Recorde (1510–58) with this in mind. Its orientation on the page is arbitrary, and some mathematicians of Recorde’s time used the equally iconic ║ (now used to iconically represent parallel lines).

Figure 1.3 Some symbolic signs. (a), the symbol for the mathematical operation of division in the UK, USA and Australia, shows no likeness to the operation itself, and in Denmark represents instead subtraction. The cross in (b) indicates ‘wrong, incorrect’ when placed by a teacher next to an answer on a school test. This is a purely conventional link, and is often used in boxes on multiple-choice questions to indicate the correct option. (c) is used in comics to indicate that the words enclosed in it are representations of the thoughts of a character. The link between the graphic form and meaning is not based on any actual resemblance – thoughts do not look like (c) (although one might suggest a link via the notion that thoughts are fluffy things like clouds, to which (c) shows some similarity).

Language as a sign system The examples discussed in the previous sections illustrate non-linguistic signs. It was one of Saussure’s important insights that human languages are systems of signs. This means on the one hand that human languages are made up of signs, and on the other that the signs interrelate and form a system; they do not exist in isolation from one another.

Nature of signs in human language Symbolic signs in language We have already said that the word tree is a sign, being constituted in speech by a phonetic (sound) form and in writing by an orthographic (written) form in association with a meaning. The same goes for the word for ‘tree’ in many other languages: qoqa in Aymara (Aymaran, Peru), icimuti in Bemba (Niger-Congo, Zambia), miistsís in Blackfoot (Algonquian, Canada and USA), træ in Danish (Indo-European, Denmark), tree in English, girili in Gooniyandi (Bunuban, Australia), fa in Hungarian (Uralic, Hungary), arbor in Latin (Indo-European, Italy), uhs in Papago (Uto-Aztecan, USA), laau in Samoan (Austronesian, Samoa) and dji: in Shua (Khoe-Kwadi, Botswana). Clearly these word-signs are symbolic. There is no natural connection between the sound or orthographic forms and the meaning; each form is as good as another for expressing the meaning ‘tree’, none is in any way suggestive of the meaning (if you did not know the language you would not be able to guess the meaning if you heard the form), and there is little similarity among the various forms (except in the case of the two closely related languages Danish and English). Most words in human languages are symbols.

It is often said that linguistic signs are typically ‘arbitrary’ (see also §1.3). This is a potentially misleading statement: it does not mean that ‘anything goes’, that a speaker is free to choose whatever form or meaning they like to associate together in a sign. Humpty Dumpty may have believed that he could: ‘“When I use a word,” Humpty Dumpty said, in a rather scornful tone, “it means just what I choose it to mean – neither more nor less”’ (Carroll 1899: 123). Clearly communication would be impossible with such anarchy. Arbitrariness refers to the non-necessary relation between the form and the meaning of a sign.

Iconic signs in language There are exceptions. Some words are iconic. The phonetic forms of words like woof-woof, cock-a-doodle-do, baa-baa, meow, ding-dong, pop and ping are quite suggestive of the meanings, which are sounds: those made by dogs, cockerels, sheep and so on. The spoken form is somewhat similar to the sound it represents; such words are onomatopoeic. (The written forms of these words, however, do not resemble the sounds made by the animals or things.) Many languages have onomatopoeic words for the characteristic calls of animals. These are not normally exactly the same in different languages – remember that icons also involve conventional associations of form and meaning – though they are often similar. The noise made by a cat is meow (there are different spellings) in English, miau in Hungarian (pronounced almost exactly as in English), mjá in Icelandic (Indo-European, Iceland), nyao in Japanese (Japanese, Japan), miook in Bulu (Niger-Congo, Cameroon), mya:u(:) in Hindi (Indo-European, India), meu-meu in Bengali (Indo-European, Bangladesh), niaou in Greek (Indo-European, Greece), miao in Mandarin Chinese, and ngeong in Indonesian (Austronesian, Indonesia). No one would mistake these for the vocalizations of a dog or horse. But we sometimes find no phonetic similarity in different onomatopoeic representations of the vocalizations of particular animals: both woof-woof and bow-wow are onomatopoeic of the noise of a dog; these represent different sounds made by the same animal. Young children often call a dog a bow-wow, and a sheep a baa-baa. In fact, in many languages we find words for at least a few animals (especially birds) that are identical with or similar to an onomatopoeic sign for their characteristic call. In Gooniyandi minyawoo is the word for ‘cat’; the word for ‘peewee, peewit, mudlark’ is diyadiya, for ‘galah’ is gilinygiliny and for ‘brolga’ is goorrarlga. There are very similar representations of the characteristic vocalizations of these animals in the language – e.g. diyadiya repeated two to four or five times for the peewee and uttered with a high pitch – which anyone who has heard the calls will immediately recognize to be onomatopoeic. These terms for animals are not onomatopoeic, but they do show iconic components: they depict the characteristic sounds of the animals. A more complex example of iconicity in words is drawing out the pronunciation of the word long to loooong or big to biiiig. The increased length of the word represents increased size – that the thing is very long or big. Other languages allow similar things: in Gooniyandi you can lengthen girabingarri ‘long’ to giraaabingarri to mean ‘very long’ and nyamani ‘big’ to nyaaamani ‘very big’.
It is not the phonetic form of the words loooong or giraaabingarri that iconically represents the meaning ‘long’. That meaning is associated with the word-forms long and girabingarri themselves. The iconicity comes in at a different level: the phonetic difference between long and loooong represents the meaning difference between ‘long’ and ‘very long’. Here we have a sign with the form ‘extra length word-form’ and intensifying meaning ‘very word-meaning’. This is why teeny can be lengthened to teeeeeeny in English, and jiginya ‘small’ to jigiiiinya ‘really small’ in Gooniyandi; the lengthened words obviously do not convey a sense of ‘larger in size’.
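The idea that lengthening is itself a sign, pairing the form ‘extra length word-form’ with the meaning ‘very word-meaning’, can be made concrete in a short Python sketch. This is only an illustration under simplifying assumptions: the names Sign and intensify are invented here, written letters stand in for sounds, the position of the vowel to be stretched has to be supplied by hand, and the gloss is always built with ‘very’.

```python
from typing import NamedTuple


class Sign(NamedTuple):
    """A sign as a pairing of a form with a meaning (hypothetical helper type)."""
    form: str
    meaning: str


def intensify(sign: Sign, vowel_index: int, extra: int = 2) -> Sign:
    """Apply the lengthening sign to another sign.

    The letter at vowel_index is repeated `extra` additional times, and the
    meaning is wrapped as 'very <meaning>'. Nothing here depends on what the
    base meaning is, which is why 'small' words can be lengthened too.
    """
    letter = sign.form[vowel_index]
    if letter.lower() not in "aeiou":
        raise ValueError(f"{letter!r} is not a vowel letter")
    if extra < 1:
        raise ValueError("extra must be at least 1")
    stretched = (
        sign.form[:vowel_index] + letter * (1 + extra) + sign.form[vowel_index + 1:]
    )
    return Sign(stretched, "very " + sign.meaning)


very_long = intensify(Sign("long", "long"), vowel_index=1, extra=3)
very_small = intensify(Sign("jiginya", "small"), vowel_index=3, extra=3)
```

The function never inspects the base meaning: the extra length contributes only the intensification, so stretching jiginya ‘small’ yields a form glossed ‘very small’, not something larger in size.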

Relations between linguistic signs This brings us to the second aspect of language as a sign system, the notion of system: the notion that the signs of any human language interrelate to form a coherent whole. This happens on two dimensions, syntagmatic and paradigmatic.

Syntagmatic In everyday speech and writing, linguistic signs occur in combination with other signs. Human beings often put together a number of signs to convey complex meanings; they are not restricted to producing single-sign utterances like one-year-old children and many animals. In speech, word-signs follow one another in order, even though the boundaries between them are often fuzzy (see next chapter); in writing, they follow one another in a conventional spatial sequence (in the writing traditions of Europe, from left to right, top to bottom).

This dimension is called syntagmatic. The signs that go together to make up an utterance are not put together randomly, but are related in specific ways to one another. In I will never forget that terrible day the order of signs plays an important function. The fact that I precedes will tells us that the utterance is a statement. If these two words had occurred in the reverse order, we would have a question: Will I never forget that terrible day? Relations between signs that appear in the presence of one another are syntagmatic relations. For example, terrible describes day, and is dependent on it (you can omit it, but you can’t omit the following word day). The words never and forget are also syntagmatically related; but the relation is different: never does not describe forget in the way terrible indicates a quality of the day in question. The term syntagm refers to any coherent grouping of signs that form a unit together. Thus I will never forget that terrible day is a syntagm; so also is that terrible day: these three words belong together and function as a single unit (they cannot be split up or separated) in a way that never forget that does not.

Paradigmatic Not only do speakers put signs together in strings, but they choose the signs that go in the sequence from a range of possible alternative signs that could have been used instead. This gives us the paradigmatic dimension, the notion that each sign invokes a contrast with other signs that might have been used instead; signs so related are in a paradigmatic relation.4 Signs in paradigmatic relation form a paradigm. The paradigmatic dimension is important because the set of signs in paradigmatic relation with a particular sign in a syntagm is restricted. In our example sentence I will never forget that terrible day, I contrasts with you, he, she, my brother, John, John’s older brother, and many other signs, simple and complex. But it does not contrast with, and cannot be replaced by, hit, and, not, up, won’t and so on. The existence of such restrictions is evidence that the signs in the syntagm are genuinely syntagmatically related, that there is structure on the syntagmatic dimension, and that the signs are not arbitrarily placed in sequence one after the other. If we examine the signs in paradigmatic alternation with I in our example sentence, it is clear that they relate in different ways to one another. I, you, he and she are more closely related to one another than any is to John or John’s older brother. Imagine a game in which I say a word, and you respond with as many words as come to mind in 30 seconds. Most likely, if I say I, you would respond with you, he, she, we; responses John, John’s older brother and hit would be less likely. If I were to say brother, the chances are that you would respond with words like sister, father, mother, son, sooner than we, you, atom or star. The signs in the groups of likely responses have similar meanings. For brother and sister the difference is in terms of the sex of the relative; for brother and father it is in terms of the genetic relation. 
These dimensions of contrast recur throughout the paradigm of kin terms in English. The meaning of a sign in a language is dependent in part on the other signs in close paradigmatic relationship with it. In English we means ‘me and someone else’; it contrasts with I in terms of the number of persons specified. Gumbaynggirr (Pama-Nyungan, Australia) has four words for ‘we’, ngalii, ngiyaa, ngaligay and ngiyagay, as well as ngaya ‘I’. The first two of these, ngalii and ngiyaa, are used if the group includes the hearer; the second pair, ngaligay and ngiyagay, if it does not. The first word of each pair is used if there are just two persons in the ‘we’ group, the second, if there are more.

The Gumbaynggirr word ngalii does not mean the same thing as English we partly because of the other words in paradigmatic contrast to it.
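The contrast between the two paradigms can be laid out as a small lookup table. The Python sketch below is an illustration only: the function names and the feature labels (includes_hearer, ‘dual’, ‘plural’) are chosen here for convenience and are not taken from a description of Gumbaynggirr; the word-forms and the conditions on their use are those given above.

```python
# First-person non-singular forms, keyed by the two contrasts described in
# the text: whether the hearer is included, and whether the group has
# exactly two members ("dual") or more ("plural").
GUMBAYNGGIRR_WE = {
    (True, "dual"): "ngalii",
    (True, "plural"): "ngiyaa",
    (False, "dual"): "ngaligay",
    (False, "plural"): "ngiyagay",
}


def gumbaynggirr_first_person(group_size: int, includes_hearer: bool = False) -> str:
    """Pick the Gumbaynggirr first-person form for a group containing the speaker."""
    if group_size < 1:
        raise ValueError("the group must contain at least the speaker")
    if group_size == 1:
        if includes_hearer:
            raise ValueError("a group of one contains only the speaker")
        return "ngaya"
    number = "dual" if group_size == 2 else "plural"
    return GUMBAYNGGIRR_WE[(includes_hearer, number)]


def english_first_person(group_size: int, includes_hearer: bool = False) -> str:
    """English ignores both contrasts once the group is larger than one."""
    if group_size < 1:
        raise ValueError("the group must contain at least the speaker")
    return "I" if group_size == 1 else "we"
```

Four distinct Gumbaynggirr outputs correspond to the single English output we. The meaning of ngalii is therefore narrower than that of we, precisely because three other forms share the territory with it.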

The meaning of a stretch of language depends both on the signs present in it and on the signs absent from it. The same goes for its grammatical structure. The two dimensions, paradigmatic and syntagmatic, are important both to meaning and to form; just as the meaning and form of a sign are inseparable, so also are the paradigmatic and syntagmatic dimensions.

Mediums of language The auditory-vocal medium is the primary medium for language: most natural human languages are usually conveyed by speech. Exceptional are the sign languages of the deaf, which use the visual-gestural medium of the eyes, hands, face and body. As will be shown in Chapter 13, these are full human languages satisfying Hockett’s design features (see §1.3), and are structurally distinct from the spoken languages surrounding them. This book is primarily about spoken language, and unless otherwise stated it should be assumed that we are talking specifically about spoken language. Another medium of language is writing. Writing is derivative from speech, and secondary to it. It is a system of representing the words of a language by visual forms and their combinations. In this regard writing is distinct from other systems of visual representation such as paintings, murals, carvings, notches on sticks and so on, which do not represent the words of a language. For want of a better technical term, we will call this medium the visual-inscribed medium. It is important to note, however, that writing is not just speech converted into the visual-inscribed medium. Although writing represents the spoken language, there are real differences in written and spoken varieties of one and the same language, some of which result from the different natures of the mediums. Modern technologies are bringing writing closer in some ways to speech, e.g. in instant messaging. Linguists are generally more interested – or claim to be more interested – in speech than in writing, although both are appropriate topics for linguists to study. We discuss writing in Chapter 14.

It is important not to confuse speech and writing. Beginning students frequently make this mistake, and are apt to be misled by features of the way their language is written. For instance, many beginners believe that English has five vowels because five vowel letters are used in writing the language: a, e, i, o, u. In fact, as we will see in Chapter 2, most dialects of English have more than a dozen vowels, as well as a number of diphthongs (double vowel sounds such as in the pronunciation of the word pie). The term letter should be reserved for talking about writing. It is misleading to speak of the letters of spoken language; instead, the terms phone or sound should be used.

1.3 Design features of human language Many animals use signs to communicate with other members of their species. Some species of bees, for instance, use dances to indicate the location of a source of nectar (see §10.1). Human beings, however, are obsessed with signs, and can’t help seeing them everywhere. Clothing is a sign system; so also are the Hindu/Arabic numerals (1, 2, 3, . . .), the Chinese numerals (一, 二, 三, . . .) and traffic lights. Human language occupies a privileged place among sign systems. It is a particularly elaborate sign system that has properties not manifested, or weakly manifested, in other sign systems. What might these features be? The American linguist Charles Hockett (1916–2000) proposed a set of design features of human language, a set of features useful for understanding the differences between human languages and the communication systems of other animals. This set has undergone modifications and additions since it was proposed by Hockett (1960). Below we discuss six of the most important features.5 Some of these will be taken up again in our discussions of animal communication in Chapter 10, sign languages in Chapter 13 and writing in Chapter 14.

Arbitrariness We have already mentioned arbitrariness as a property of word-signs in human languages, and explained that it is to be understood in the sense that the form and meaning of a word-sign are not connected by necessity. Arbitrariness is a matter of degree, and ranges from highly iconic and motivated (though never bereft of some conventionalization) to purely symbolic. In the animal world, too, most signs show some degree of conventionalization. In some cases the signs are quite iconic – the dance of some bee species iconically represents the direction to a nectar source by one of the axes of their figure-eight dance (see §10.1 below for more information). But the forms for this meaning could easily have been otherwise, and do in fact differ among bee types. Mating and territorial calls and dances of animals are generally conventionalized.

Displacement People often talk about things that are not present. They speak about events and things from distant times and places – about things that happened long ago in faraway places. Indeed, these may be entirely imaginary, like unicorns and time travel. This book would not have been possible otherwise, if language could only be used to describe what is actually physically present in the writer’s environment. This is called displacement. Animal communication systems sometimes allow limited displacement, for signalling things that aren’t physically present and perceivable. The dances of bees can signal the presence of nectar at a distance of some kilometres from the hive. Some studies have shown that chimpanzees can sign about items that are not visible. In one study it was shown that the chimpanzee Panzee, using a system of signs on a monitor, could call attention to items of food it had observed hidden by a trainer, sometimes days previously. But the displacement revealed in these examples is limited: what is communicated about is something that is relevant to the present circumstances. Thus the invisible food Panzee indicated seems to have always been an item of food recently hidden and not yet retrieved; the communication was concerned with its retrieval. Displacement is a matter of degree rather than an all-or-nothing thing.

Displacement is not always a good thing to have in a sign system. The system of alarm calls of vervet monkeys (see §10.1) would be compromised if it permitted displacement. It would then no longer be a system of alarm calls, but of calls sometimes used as alarms calling for immediate evasive action, and sometimes referring to the presence of a predator from a different occasion. Similarly, the system of sirens used on emergency vehicles would be of little use if it allowed displacement!

Cultural transmission Children learn to speak the language or languages used in the environment in which they are reared; they do not inherit their language via parental genes, in the way they inherit hair and skin colour. Languages are passed on by cultural transmission. Many of the world’s languages are endangered due partly to interruptions in transmission across the generations. Animal communication systems by contrast seem to be to a large extent instinctive and genetically conditioned. Think, for instance, of the meowing of the domestic cat. To most of us meowing is an archetypal trait of the species and found in cats regardless of whether they live in Europe or New Zealand, and regardless of whether they were reared by humans in the virtual absence of other cats.6 Some birds do require exposure to the songs of other members of their species. Lacking this, they still instinctively produce songs, but these will be abnormal in some way. This is like some types of body behaviour in human beings such as laughing, smiling and crying: though universal, they admit cultural modifications and elaborations. Although the language a person speaks is culturally transmitted, the ability to speak a language is a genetic predisposition. The extent of this predisposition – what aspects of language are genetically encoded – is a controversial issue on which linguists take conflicting positions.

Duality Utterances in human languages are patterned simultaneously on two levels, the level of form and the level of meaning. This is called duality, or duality of patterning. The Warrwa (Nyulnyulan, Australia) word yila ‘dog’ is made up of sounds that are meaningless in themselves, but when put together in a certain way make up the sign-form. Put together in a different way – for instance as layi – we get a different word, meaning ‘alone, singly’. Put together in yet other ways – for example, iayl – we get forms that are not possible words in the language. Duality of patterning permits many different words to be made up from a small number of meaningless elements that are put together in various ways.
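A few lines of Python give a rough feel for how duality of patterning works. The sketch makes several assumptions that should be kept in mind: each written letter is treated as a single sound, the ‘lexicon’ contains only the two Warrwa words cited above, and the names SOUNDS, TOY_LEXICON, arrangements and gloss are invented for the illustration. A form missing from this toy lexicon is not thereby shown to be an impossible word of the language; that is a separate question about permitted sound sequences.

```python
from itertools import permutations

# Four sounds that are meaningless in themselves.
SOUNDS = ("y", "i", "l", "a")

# Toy lexicon: only the two Warrwa words mentioned in the text.
TOY_LEXICON = {
    "yila": "dog",
    "layi": "alone, singly",
}


def arrangements(sounds):
    """Return every ordering of the given sounds as a string."""
    return ["".join(order) for order in permutations(sounds)]


def gloss(form):
    """Look up a form in the toy lexicon; None if it is not listed there."""
    return TOY_LEXICON.get(form)


forms = arrangements(SOUNDS)
attested = {form: gloss(form) for form in forms if gloss(form) is not None}
```

Four sounds yield twenty-four orderings. Two of them are paired with meanings in the toy lexicon, and the meanings have nothing to do with one another or with the individual sounds: the patterning of forms and the patterning of meanings are independent.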

Duality of patterning seems not to be found in animal communication systems. Their sign-forms are simple in the sense that they cannot be analysed into components that are reused in other signs; there is an absence of patterning in the forms – as also in the meanings. Each form is completely different from every other form, and does not involve components that are reused to make other forms. The various calls your cat produces are separate whole units, and cannot be divided into parts that can be reused to make other calls with different meanings. The meow of a cat is not composed of separate sounds like m and i (ee) that could be used in different orders, to produce different calls, say im. The English word meow is, however, composed of reusable parts that are found in other words of the language.

Productivity Productivity or creativity is the characteristic whereby speakers can make new meanings by producing new expressions and utterances. Linguistic signs can be put together to form sequences that may never have been produced before; and even if they are not entirely novel, they may be innovative in that they are not drawn from memory. Not only do we effortlessly create such utterances, but hearers have little difficulty understanding them.

A good deal of what we say is not new: we use formulaic greetings and farewells many times in the average day. We express meanings that have been expressed, with variations in sound and perhaps also slight variation in wording, innumerable times before, as in the case of poetry, jokes, oral traditions and urban myths, for instance.

Another aspect of the productivity of language is that speakers can invent new words to express new ideas and new objects and events that they encounter. No living human language has a rigidly closed class of words that admits no new members. Think of the number of new English words that have been invented in recent years to facilitate talking about and on computers and the internet. Some of the main ways that new words are incorporated into a language are discussed in §4.2. The communication systems of non-human animals, by contrast, are typically non-productive, and do not admit new combinations of signs or the invention of new signs for new meanings. The systems allow for the expression of a small set of possible meanings. The honeybees’ dance that indicates the location of a nectar source (see §10.1) is restricted to the horizontal dimension, and bees are incapable of specifying that the location of a nectar source is vertically above the hive.

Reflexivity This book is about human language, and is written in a human language. Your lectures on linguistics are about language and are spoken in a human language. All human languages can be, and often are, used in this way, for conveying information about themselves. This need not be abstruse linguistic information; it could be something as simple as ‘that word is not nice to use in polite company’. This property is reflexivity. No known animal communication system shows reflexivity. There is no evidence that dogs can bark about barking, that cats ever meow about meowing, or that bees dance about their dancing. Likewise, many sign systems human beings employ cannot be used to convey information about themselves. Traffic lights do not allow for messages about themselves, and nor do gestures or facial expressions.

1.4 Outline of modern linguistics from a historical perspective People everywhere talk about language: they have ideas about its nature, uses, origins, structure, how it is learnt and so on. Some of these notions are enshrined in mythology (e.g. the Tower of Babel story). In some sense the things people say and believe about language could qualify as linguistics, perhaps folk linguistics. But, as we are using it, the term linguistics refers to a scientific system of knowledge. Before we go deeper into the subject, it is useful to overview the main trends, situating them in a broad historical perspective. The earliest concrete evidence of discourse about language dates to about 4,000 years ago, when scribes in ancient Mesopotamia listed forms of Sumerian nouns and verbs on clay tablets. They did this because Sumerian, the language of religion and the law, was no longer in everyday use, and it had to be taught as a foreign language. Traditions of linguistics also emerged in ancient India and Greece for similar reasons: an earlier language was important for some purposes, but had changed so much that it could no longer be understood by the ordinary speaker. The study of linguistics intensified in the Middle Ages. Subsequently, with the advent of European colonialism in the fifteenth century, Europeans came into contact with an unexpected diversity of languages and peoples. From information gathered by travellers, missionaries and others, it became apparent that some languages are related to one another. Procedures for establishing these relationships were gradually honed, until the late nineteenth century, by which time the comparative method (see §16.2) had been largely perfected. Modern linguistics emerged soon after, with a change of focus from historical concerns to the notion of language as a system, the basis of structuralism, which still permeates the subject. 
The Swiss linguist Ferdinand de Saussure (whom we have already met) was a key figure in this refocusing of interest, and is regarded as the founding father of modern linguistics. His Cours de linguistique générale [Course in general linguistics] was published posthumously in 1916, reconstructed from his students’ lecture notes. As we have already noted, modern linguistics is an empirical endeavour, concerned with describing and accounting for patterns of speech and language. To account for the patterns means to explain them; for this, theory is essential. As in other social sciences, there is considerable theoretical diversity. This theoretical diversity characterizes all domains of modern linguistics, including grammar. Grammatical theories tend to cluster into two main types, formal and functional, according to whether the primary emphasis is on language as an algebraic system of symbols put together according to rules, or on language as a system that has developed in particular ways in order to serve functions in human life. We discuss these two approaches in the following subsections. It should be noted, however, that the formal–functional opposition is not restricted to grammatical theories. It underpins theoretical approaches to virtually all domains of linguistics: we can talk of formal and functional linguistics according to the approach to grammar assumed. For instance, studies of both psycholinguistics and sociolinguistics are markedly different in the questions considered significant, methodologies, and so on, depending on whether a formal or a functional approach is presumed.

Formal linguistics In America, from the 1930s onwards, mainstream structuralism became increasingly algebraic in orientation and focused increasingly on syntax. In 1957 this tradition suffered a major blow with the publication of Noam Chomsky’s (1928–) Syntactic Structures. Influenced by developments in mathematical logic, Chomsky’s programme explicitly rejected some of the dominant preoccupations of American structuralism, including its empiricist philosophy (that knowledge derives from sense experiences). Instead, Chomsky advocated a rationalist philosophy (that knowledge is based on reason). Chomsky’s thought quickly became dominant, not just in America but also in Europe and elsewhere; it has effectively defined mainstream linguistics ever since. Grammar is considered to be a formal system comprising rules and/or other mechanisms for generating the grammatical sentences of a language; and for this reason the tradition is called Generative Grammar. Generative theory developed rapidly, and mainstream Chomskyan generative grammar has undergone numerous substantial changes and renovations over the past half a century or so. Alternative generative theories were also developed by others.

Functional linguistics The late 1950s also saw new developments in linguistics in Europe, that took off in functional directions, stressing both the meaning side of the Saussurean sign and the idea that language developed the way it did because of the uses it is put to. Key figures were André Martinet (1908–99) and Michael Halliday (1925–2018). The functionalist schools they initiated continue to this day as minor but significant forces on the linguistic landscape. Later, other functionally oriented schools emerged, mainly in opposition to generative grammar. One was Functional Grammar, developed from the late 1960s by the Dutch linguist Simon Dik (1940–95). A rather amorphous tradition arose in the USA around the same time, West Coast Functional Grammar. Prominent in this tradition was the idea that grammatical categories are functional, that they arose to serve a purpose. More recently, two more coherent schools arose in the USA largely replacing West Coast Functional Grammar: Cognitive Grammar (associated with Ronald Langacker (1942–)) and Construction Grammar (Charles Fillmore (1929–2014) and associates). Both assign a prominent place to the Saussurean sign.

Scope of modern linguistics Contemporary linguistics is a richly diversified field, with so many specializations that no scholar can possibly cover them all. Many branches acquired their separate identities and methodologies in the second half of the twentieth century, although most had been investigated previously. Generative Grammar remains a major force determining the orientations and goals of many branches, although other theories have had some impact. The majority of the approximately 7,000 languages spoken in the world today and in the recent past have yet to be adequately documented and described. Many linguists are engaged in gathering data on the poorly documented languages, normally by doing fieldwork in remote locations, and describing them by writing grammars and compiling dictionaries and collections of texts. Missionary linguists, many working under the umbrella of SIL International, a missionary organization established in the USA in 1934, continue to play a prominent role. Over 1,000 languages are currently under investigation by SIL linguists. Speakers of poorly documented languages are increasingly playing prominent roles, both as gatekeepers determining access to speech communities and controlling the direction of research and its applications, and in describing and documenting their languages. The need for this descriptive work is underlined by the fact that many of the world’s languages are endangered, and unlikely to survive into the next century (see §7.4). Despite the political rhetoric, this field does not occupy a prominent or powerful position in linguistics, or on the agenda of many research-funding bodies, and a relatively small proportion of linguists are active in it. Like other sciences, linguistics has applications, including to language learning, literacy and translation. 
Indeed, many branches of linguistics have contributed to applied linguistics, the field concerned with applications of linguistics: for example, descriptive linguistics to maintaining and strengthening endangered languages; psycholinguistics to assisting individuals with language difficulties (e.g. resulting from strokes); pragmatics and conversation analysis to cross-cultural communication; and sociolinguistics to the educational field. Recent years have seen linguists increasingly called on for expert advice in the legal domain, including speaker identification and land-rights for Indigenous peoples. Another major area of application is in the computational field, including to machine generation and recognition of speech, automatic parsing of texts, translation, and building and maintaining large corpora (collections of texts).

Forensic linguistics – the branch of linguistics concerned with applications to the legal domain – does not have quite the popular appeal of forensic anthropology or archaeology, and I am not aware of any popular mystery fiction with a forensic linguist in a lead role, comparable with Simon Beckett’s series with forensic anthropologist Dr David Hunter. Nevertheless, it is not uncommon for something spoken or written to provide a critical clue in crime fiction. For example, in Agatha Christie’s first book, The Mysterious Affair at Styles, Poirot infers that someone had written a will from scribblings on an envelope. And in an episode of Midsomer Murders entitled Dead Man’s Eleven, the crucial clue comes when Barnaby understands – or thinks he understands – the significance of two people’s use of a non-standard expression; he even uses the word linguistic.


Summing up Linguistics, the scientific study of language, has its roots in our everyday knowledge of, thinking about, and talking about language. This everyday thought is often prescriptive. By contrast, modern linguistics has a descriptive orientation: it is an empirical endeavour, concerned with describing and accounting for patterns in speech and language. To do this theory is essential; modern linguistics is dominated by two opposing theoretical orientations, formal and functional. One of the most fundamental notions of modern linguistics is the sign, a unit made up of a form paired with a meaning. Most linguistic signs are arbitrary, the connection between the form and the meaning being established purely by convention. Such signs are symbols. Some linguistic signs, however, display likeness between their form and meaning: these are icons. The signs of a language form a system, a primary characteristic of which is the relationships among the signs, which are either paradigmatic or syntagmatic. To highlight the differences between human languages and systems of communication of other animals, Hockett proposed a set of design features of human language, including: arbitrariness, displacement, cultural transmission, duality, productivity and reflexivity. Speech is the primary medium of human languages. It is historically prior to writing, which is a recent invention, dating back just a few thousand years. Most languages are virtually exclusively spoken; many writing systems have emerged only in the last century, and many are used quite rarely. Another medium for the representation of languages is gesture, and in many deaf communities sign languages are used in which words are represented by gestures. These are full languages satisfying Hockett’s design features.

Guide to further reading Of the enormous range of introductory textbooks on linguistics, my recommendations are: Bolinger (1975); Fromkin and Rodman (1974), which has subsequently appeared in many editions (e.g. Fromkin et al. 2014); Hudson (2000); Jackson and Stockwell (2011); Yule (2022); and Burridge and Stebbins (2020). Six other introductory books make excellent reading: Hudson (1984); Matthews (2003); Parkvall (2006); Rickerson and Hilton (2012); Pullum (2018); and English and Marr (2023). It is advisable to have a good dictionary of linguistics, such as Crystal (1980/2003), Matthews (2007) or Trask (1998). Encyclopedias such as Frawley (2003) and Crystal (1987) are also worth digging into. Aronoff and Rees-Miller’s The Handbook of Linguistics (2017) contains thirty-three readable articles covering many fields of modern linguistics. Design features of human language were first proposed by Charles Hockett (1958, 1960); the list has been subsequently modified and expanded. Kaplan (2002) gives an indication of the scope of applied linguistics, while Oaks (2001) is an excellent collection of articles illustrating the applications of linguistics to education, law, medicine, the film industry, business, etc. Olsson (2009, 2018) provides fascinating accounts of a number of cases in which he was involved as a forensic linguist, and reveals some of the techniques and uncertainties in the field.


On the nature of science and the scientific method, see especially Chalmers (2013) and Okasha (2016). Horgan (1996) contains interviews with leading scientists on the limitations of science, and gives insights into the lives and work of scientists. The best introduction to the history of linguistics is Robins (1984); for a brief overview, see the website for this book.

Issues for further thought and exercises 1 Depicted below are the forms of various signs in everyday use. What are their meanings? Is the sign an icon or a symbol? Justify your answers.

2 Traffic lights form a sign system. Describe the system of traffic lights in use in your country. To do this you should identify the range of signs belonging to the system, specifying their forms and meanings. Answer also the following questions. What combinations of signs are permitted? How would you describe their syntagmatic relations? Which of Hockett’s design features are satisfied, and to what extent?

3 We have seen (pp. 10–12) that Saussure distinguished between paradigmatic and syntagmatic axes in language as a sign system. Saussure is famous for a number of other such dichotomies. Find out what they are, and write a paragraph description of one of them in relation to language.

4 On p. 9 above it was remarked that the word for ‘tree’ is non-iconic in many languages. Why do you think this might be so? Do you know any language in which the word is iconic, and/or can you think of any type of language in which you would not be surprised if it was? Explain your reasoning.

5 If you were to ask me for the loan of a book, I might reply with a simple No! If I had replied in a very loud voice, No! this might be understood as an emphatic and unequivocal refusal. What meaning would you say loudness conveys, and do you consider loudness to be iconic of this meaning? Can you think of other iconic ways of expressing similar meanings?

6 Collect comments on ‘incorrect’ or ‘sloppy’ English (or another language spoken in your community) from the media and everyday speech. What aspects do they target (e.g. pronunciation, meaning, grammar)? What is the basis for the claim (are arguments produced, and if so, what are they)? What do they reveal about the author of the comment?

7 The male Australian lyre bird’s mating song is made up of sequences of songs copied from other bird species, in various selections (depending on the range of other birds it has heard) and coming in orders that differ from bird to bird. Does this illustrate duality of patterning? Explain your answer.

8 Does written English show duality of patterning? What about written Mandarin Chinese? Explain. (See §14.2 for basic information on Chinese writing.) If your answer to both questions is ‘yes’, is the duality manifested in the same or different ways in the two types of writing?

9 We discussed six design features of human languages. As remarked, another seven were proposed by Hockett (1960), and more have been proposed since. Find out what others there are, and think about their usefulness and the extent to which they distinguish human language from traffic lights or another system of signs used by humans or animals. (Begin by searching the internet.)

10 We said that spoken and written language differ in certain respects. Is a good piece of writing also a good piece of speech if it is read aloud? What differences would you expect to find between speech and writing in the ways that things are expressed? What (if any) grammatical differences would you expect?

11 Writing does not only influence the way that people think about their language, but can also influence speech. What are some of the ways your language (and opinions about it) has been influenced by the way it is written?

12 We have mentioned a few branches of linguistics in this chapter. The list was selective, and there are many more named branches. Here are some: internet linguistics, descriptive linguistics, documentary linguistics, sign language linguistics, philosophical linguistics, dialectology, computational linguistics, onomastics, stylistics, mathematical linguistics, philology, contrastive linguistics, lexicography and narratology. Look one or more of these terms up in an encyclopedia or dictionary of linguistics, and/or on the web, and write a paragraph description in your own words explaining what the branch studies.

13 What mediums do the communication systems of animals employ other than the visual and auditory? Write a brief description of one such medium, and the type of information conveyed in the communication system. To what extent does the system satisfy Hockett’s design features? Do any systems of human communication employ other mediums? Can human language be realized on any other medium? See if you can find any examples, and write a brief overview.

Research project It is people, mostly linguists, who have made modern linguistics what it is. It is worth meeting some of the personages. See what you can learn about one (or more) of the following linguists: Leonard Bloomfield, Franz Boas, Dwight Bolinger, Joan Bresnan, William Bright, Arthur Capell, Yuan Ren Chao, Noam Chomsky, Bernard Comrie, Simon Dik, J. R. Firth, Joseph Greenberg, Mary Haas, William Haas, Michael A. K. Halliday, Louis Hjelmslev, Charles Hockett, Otto Jespersen, Daniel Jones, Ronald Langacker, Stephen Levinson, John Lotz, Johanna Nichols, Kenneth Pike, Edward Sapir, Nikolai Trubetzkoy and Benjamin L. Whorf. Write a biography of your chosen linguist. This should include basic information about their life, such as when and where they were born and educated, where they worked, and the range of non-linguistic interests they had. It should also discuss the type of linguistics they did, including their main interests in the field, the main influences on their thought, their major publications, and what they are best known for.


Part I Language: System and Structure


2 Sounds of Language: Phonetics and Phonology

This chapter deals with spoken language, and sets up the basic framework for describing speech sounds. The bulk of the chapter deals with the ways speech sounds are produced. It also explores the ways speech sounds pattern in the sound-systems of languages, which leads us to the notion of distinctive sounds or phonemes.

Chapter contents
Goals
Key terms
2.1 Fundamental properties of speech sounds
2.2 The vocal tract
2.3 Types of phones
2.4 Some additional features
2.5 Prosodies
2.6 Phonology
2.7 How to establish the phonemes of a language
2.8 Transcription
Summing up
Guide to further reading
Issues for further thought and exercises
Research project


Goals The goals of the chapter are to:
● describe the basic structure of the speech organs or vocal tract;
● explain how speech sounds are produced;
● identify the main types of speech sounds and how they are classified;
● present the essentials of the main system for representing speech sounds, the International Phonetic Alphabet;
● outline the basic prosodic properties of speech (pitch, tone, intonation and stress) and how they are used in different languages;
● explain the notions of phoneme and allophone;
● show how to determine the phonemes of a language; and
● describe the major methods of transcribing speech.

Key terms acoustic; allophone; articulatory phonetics; auditory phonetics; click; complementary distribution; consonant; diphthong; ejective; free variation; glottalic airstream; implosive; International Phonetic Alphabet (IPA); intonation; manner of articulation; minimal pair; phone; phoneme; phonetics; pitch; place of articulation; pulmonic airstream; stress; suspicious pair; syllable; tone; velaric airstream; vocal tract; voicing; voice onset time; vowel

2.1 Fundamental properties of speech sounds The speech chain A simple and influential model of speech communication, the so-called speech chain model (presented in diagrammatic form in Denes and Pinson 1993: 5), identifies the following steps in conveying a message from speaker to hearer.

● A thought emerges in the brain of the speaker and is encoded in language.
● Messages are sent through nerves from the brain to the vocal apparatus – the muscles and organs that act together to produce speech sounds.
● The muscles and organs are positioned and set into motion.
● As a result, sounds are produced that travel through the air.
● These sounds reach the ears of the hearer, which ‘process’ the sounds, converting them into nervous signals.
● These signals travel along the auditory nerves to the brain of the hearer.
● The hearer’s brain decodes these impulses, arriving at a thought (which is hopefully similar to the thought that started in the speaker’s brain).

The last three steps also apply to the speaker, in a feedback loop: the sound reaches the speaker’s ear, is converted to electrical signals that travel to the brain, which decodes them and compares the spoken utterance with the intended utterance.

The three processes in the middle of this list (the positioning and movement of the speech organs, the travel of the sounds through the air, and the processing of the sounds by the hearer’s ears) are the main concerns of phonetics, and give us the three primary divisions of the subject: articulatory phonetics (concerning the production of speech sounds), acoustic phonetics (concerning the physical properties of the sound waves) and auditory phonetics (concerning the perception of speech sounds). We deal mainly with articulatory phonetics in this chapter.

Phones Readers of this book will be familiar with the idea that a stretch of speech such as a spoken version of The farmer kissed the duckling can be divided up into a sequence of phones or sounds coming one after another. First, there is the sound written th and pronounced in the same way as in they and them; this is followed by another sound written e and pronounced like the a in sofa. Then comes the f sound (as in frog), an a sound as is also found in father, and so on. But the reality is not so simple. Figure 2.1 shows the sound wave for my production of this short sentence. What you can see immediately is that the sound is a continuous stream: it is not made up of separate blocks of sound separated by pauses. There is no precise point where you could say that the sound written th ends, and the one written e begins – the best we can say is that the e sound begins at around 0.06 seconds. Nor is there a clear division between the words, and the f sound runs on directly from the e sound at around 0.11 seconds, and is followed immediately by the a sound, at around 0.25 seconds. There is no interruption in the sound-stream throughout the word farmer. Later, between about 0.97 and 1.03 seconds there is a significant reduction in the sound wave. This might seem to be between kissed and the, but in fact it is within the word kissed! The final sound of this word, despite the spelling, is a t-sound, and that sound extends from about 0.91 to about 1.03, progressively becoming weaker. Similar remarks apply to the remainder of the utterance. In particular, the other apparent break in the sound wave – between 1.42 and 1.45 seconds – occurs within the k sound of duckling, not between the k sound and the following l sound.


You will have to take my word for the approximate divisions between the sounds. It will be impossible to verify them from Figure 2.1. In order to decide on the approximate transitions between the sounds I played the sound file many times while looking at the wave representation in a sound-analysis program called Praat (freely available at https://www.fon.hum.uva.nl/praat/). Even then there were uncertainties that I could only resolve by simultaneously examining another representation of the sounds, called a sound spectrogram. It is beyond the scope of the present chapter to discuss these acoustic representations. If you are interested in finding out more about spectrograms, you could consult Ladefoged and Disner (2012).

The same goes for the pronunciation of the sentence. The parts of the mouth involved in producing the sounds are in continuous motion; they do not move in a series of jerks from one fixed position to another. You do not first put your tongue in the position to make the th sound, then shift it instantaneously to the position to produce the e sound, then immediately put your lower lip against the upper teeth to make the f sound and so on. Even if you cannot draw an exact boundary between each of the phones in the way that you can between the letters making up the words of the printed sentence (an ordinary handwritten version would raise similar difficulties of division), it does seem that the stream is made up of sequences of sounds of various types, following one after another. The idea that you can divide stretches of speech into phones is a good approximation. To study the sounds of a language it is useful to represent phones in writing. For this purpose we use the International Phonetic Alphabet (IPA), a set of symbols based primarily on the Latin alphabet, and extensive and flexible enough to accommodate the sounds of any language. Table 2.1 shows the main symbols from the latest version of the IPA (dated 2015).

Figure 2.1 Sound wave of the author’s production of The farmer kissed the duckling.


Table 2.1 Main symbols of the IPA (excerpted and slightly rearranged from IPA Chart, http://www.internationalphoneticassociation.org/content/ipa-chart, available under a Creative Commons Attribution-Sharealike 3.0 Unported License. Copyright © 2015 International Phonetic Association.)

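Consonants (pulmonic)

Stop, voiceless: p (bilabial); t (alveolar); ʈ (retroflex); c (palatal); k (velar); q (uvular); ʔ (glottal)
Stop, voiced: b (bilabial); d (alveolar); ɖ (retroflex); ɟ (palatal); g (velar); ɢ (uvular)
Nasal: m (bilabial); ɱ (labio-dental); n (alveolar); ɳ (retroflex); ɲ (palatal); ŋ (velar); ɴ (uvular)
Fricative, voiceless: ɸ (bilabial); f (labio-dental); θ (dental); s (alveolar); ʃ (alveo-palatal); ʂ (retroflex); ç (palatal); x (velar); χ (uvular); ħ (pharyngeal); h (glottal)
Fricative, voiced: β (bilabial); v (labio-dental); ð (dental); z (alveolar); ʒ (alveo-palatal); ʐ (retroflex); ʝ (palatal); ɣ (velar); ʁ (uvular); ʕ (pharyngeal); ɦ (glottal)
Affricate, voiceless: ʦ (alveolar); ʧ (alveo-palatal)
Affricate, voiced: ʣ (alveolar); ʤ (alveo-palatal)
Lateral, voiceless: ɬ* (alveolar)
Lateral, voiced: l (alveolar); ɭ (retroflex); ʎ (palatal); ʟ (velar)
Rhotic, tap or trill: ɾ, r (alveolar); ɽ (retroflex); ʀ (uvular)
Rhotic, approximant: ɹ (alveolar); ɻ (retroflex)
Glides: j (palatal); w† (labio-velar); ɰ (velar)

Notes * ɬ is a voiceless lateral fricative; the other laterals (on the following line) are approximants. † w is a voiced labio-velar glide.

Consonants (non-pulmonic)

Clicks: ʘ Bilabial; ǀ Dental; ǃ Retroflex; ǂ Alveo-palatal; ǁ Lateral
Voiced implosives: ɓ Bilabial; ɗ Dental-alveolar; ʄ Palatal; ɠ Velar; ʛ Uvular
Ejectives: ʼ Examples: pʼ Bilabial; tʼ Dental-alveolar; kʼ Velar; sʼ Alveolar fricative

Diacritics

n̥ d̥ Voiceless
b̤ Breathy voiced
b̰ Creaky voiced
tʰ ɖʰ Aspirated
d̪ n̪ Dental
ɫ Velarized
tʷ Labialized
ẽ Nasalized
u̟ Advanced
u̠ Retracted
n̩ Syllabic
d˺ Unreleased
k͡p Coarticulated
ˈ Primary stress
ˌ Secondary stress
ː Long
ˑ Half long
. Syllable boundary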


Vowels

            Front     Central   Back
High        i y       ɨ ʉ       ɯ u
            ɪ                   ʊ
High Mid    e ø                 ɤ o
                      ə
Low Mid     ɛ œ       ɜ         ʌ ɔ
            æ         ɐ
Low         a                   ɑ ɒ

Where symbols are paired, the first symbol indicates the unrounded vowel, the second the corresponding rounded vowel.

2.2 The vocal tract The organs involved in producing the sounds of speech are referred to collectively as the vocal tract. These are the lungs, the larynx, the oral cavity and the nasal cavity; see Figure 2.2.

Figure 2.2 The human vocal tract.


The lungs Most speech sounds are produced on a stream of air forced out from the lungs, through the trachea or windpipe, and then through the upper vocal tract, where the airstream is modified in various ways to produce different sounds. This stream of air is called an egressive pulmonic airstream. Speech in English and most other languages is usually produced on egressive pulmonic air. It is also possible to produce speech sounds on air drawn into the lungs, on an ingressive pulmonic airstream. This is like speaking while breathing in. Although not as often used as egressive pulmonic air, in some languages it is used to convey emotional effects. For instance, in Danish and other Scandinavian languages, words – for example, ja ‘yes’ – are sometimes produced on an ingressive airstream to indicate sympathy or commiseration. Other airstream mechanisms used in human languages will be discussed in §2.4.

The larynx In the larynx the airstream passes between a pair of muscular flaps called the vocal folds or cords, which can be drawn together to interrupt the airstream, or left open. If you bring the vocal folds together closely but not too tightly, forcing air between them will cause them to vibrate regularly. This is how you produce aaa, the sound you might produce after the first sip of a long-awaited drink, or when lying down for a well-needed nap. These vibrations of the vocal folds, called voicing, can be felt by holding your thumb and first finger against your Adam’s apple. Now say ssss, the sound of a deflating tyre. You will notice that the vibrations are not present while you produce this sound. Sounds like aaa that are produced with voicing are called voiced phones, while those produced without it, like ssss are called voiceless. Compare the and thin. You will notice that the first phone of one of them is voiced, the other voiceless. Which is which? In producing voiced speech sounds the vocal folds vibrate regularly, usually at between 80 and 400 cycles per second or hertz (Hz). This means that they move from the closed position to the open position and back again to the closed position between 80 and 400 times in a second, with each complete cycle taking about the same time. While you are producing an aaa sound tighten your vocal folds further until the airstream is blocked off completely. Release it, continuing with the aaa sound. What you got in the middle is called a glottal stop, written ʔ in IPA. This sound can be heard in some English words – for example, the interjection written uh-oh. From now on, we will use the IPA to represent phones, and enclose them in square brackets – thus [ʔ] is the glottal stop.
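The relation between the rate of vocal fold vibration and the length of each cycle can be worked through with a small calculation. The sketch below is only an illustration: the function name period_ms is invented here, and the two input values are simply the ends of the 80 to 400 Hz range mentioned above.

```python
def period_ms(frequency_hz: float) -> float:
    """Duration of one closed-open-closed cycle of the vocal folds, in milliseconds."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    # One second is 1000 ms; at f cycles per second each cycle lasts 1000 / f ms.
    return 1000.0 / frequency_hz


# The two ends of the range given in the text (illustrative inputs).
for hz in (80, 400):
    print(f"{hz} Hz -> {period_ms(hz):.1f} ms per cycle")
# 80 Hz -> 12.5 ms per cycle
# 400 Hz -> 2.5 ms per cycle
```

So at the low end of the range each cycle lasts about an eightieth of a second, and at the high end about a four-hundredth.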

The oral cavity After travelling through the larynx, the airstream passes through the oral cavity (mouth) and/or nasal cavity (nose), and then to the outside air. In the oral cavity the airstream can be modified in many different ways to produce different sounds. The main organs used to make these modifications are the tongue and lips; the jaw is also involved, though in more subtle ways. The phones [b] and [z] are produced on the same physical input, a voiced egressive airstream. But they sound quite different, and are formed differently. [b] is produced by first completely blocking the airstream through the oral cavity at the lips, allowing no air to escape to the outside; air from the lungs passes through the vibrating vocal folds and builds up behind the lip closure. Then the lips are parted and air behind the closure is released. [z] is produced by partially blocking the airstream from the lungs at the alveolar ridge and allowing the air to escape continuously with a noisy sound. These two phones differ in both the way they are produced and where they are produced.

The nasal cavity At the back of the roof of the mouth is the velum or soft palate, which can be raised or lowered. When lowered, the airstream can enter the nasal cavity; when fully raised, the airstream is channelled into the oral cavity. In uttering the word bad [bæd] the velum remains raised throughout, and the air passes entirely through the oral cavity; all of these phones are oral. To produce the word man the only difference is that the velum is lowered so that the airstream enters into the nasal cavity; unless you speak very carefully, the velum will remain lowered throughout the word, including on the vowel, giving [mæ̃n]. (In the IPA, a tilde ˜ over a symbol indicates that the sound is nasalized.) Phones like [m], [æ̃] and [n] are called nasals.

2.3 Types of phones Speech sounds are divided into two main types, consonants and vowels. Consonants involve a constriction in the vocal tract, obstructing the flow of air; the airstream is impeded or interrupted somewhere along the path from the lungs to the outside. Vowels are produced with no significant obstruction to the passage of air through the oral cavity, and the air exits unimpeded through the oral cavity (and perhaps the nasal cavity as well). Vowels are the most resonant phones, those that resound or re-echo the most, like the chime of Big Ben. Consonants are either non-resonant (like [d] and [f]), or are somewhat resonant (like [m] and [l]), though these are less resonant than a vowel (compare these phones with [i] or [a]).

Consonants Consonants are described in terms of the point where the airstream is impeded, and how it is impeded. These two properties are called the place of articulation and the manner of articulation. By convention, the places of articulation are shown across the top row of the IPA chart, the manners of articulation down the first column.
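One way to make this kind of description concrete is to store each consonant as a bundle of three labels (voicing, place and manner) and then compare bundles. The sketch below is an illustration only: the names Consonant, CONSONANTS, describe and differences are invented for this example, and the small inventory uses a handful of symbols with the labels they carry in Table 2.1.

```python
from typing import NamedTuple


class Consonant(NamedTuple):
    voicing: str
    place: str
    manner: str


# A few symbols only; labels follow the rows and columns of the IPA chart.
CONSONANTS = {
    "p": Consonant("voiceless", "bilabial", "stop"),
    "b": Consonant("voiced", "bilabial", "stop"),
    "m": Consonant("voiced", "bilabial", "nasal"),
    "f": Consonant("voiceless", "labio-dental", "fricative"),
    "v": Consonant("voiced", "labio-dental", "fricative"),
    "z": Consonant("voiced", "alveolar", "fricative"),
    "ŋ": Consonant("voiced", "velar", "nasal"),
}


def describe(symbol: str) -> str:
    """Spell out the three-part label for a consonant symbol."""
    c = CONSONANTS[symbol]
    return f"[{symbol}] is a {c.voicing} {c.place} {c.manner}"


def differences(a: str, b: str) -> list[str]:
    """Names of the properties on which two consonants differ."""
    ca, cb = CONSONANTS[a], CONSONANTS[b]
    return [name for name, x, y in zip(Consonant._fields, ca, cb) if x != y]


print(describe("b"))          # [b] is a voiced bilabial stop
print(differences("b", "z"))  # ['place', 'manner']
print(differences("p", "b"))  # ['voicing']
```

The second line of output restates the earlier observation that [b] and [z] differ both in where and in how they are produced, while sharing voicing.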


Places of articulation The main places of articulation are illustrated schematically in Figure 2.3. Below we make a few remarks on each of them.

Labial Labial sounds are made with the lips. If both lips are used, the phone is called a bilabial. The initial sounds of bad [bæd] and mad [mæd] are bilabials: they are made by bringing the upper and lower lips together; so also is the initial sound of pad [pʰæd]. Instead of bringing both lips together, you can bring the lower lip into contact with the upper teeth to form a labiodental. The phones [f] and [v], as in the beginning of the English words fix [fɪks] and Vicks [vɪks], are produced in this way.

Dental Dental sounds are formed with the tongue and the upper teeth. Usually the tip of the tongue is used, and touches the upper teeth, as in English [θ] and [ð], the initial phones of three and this, respectively. In some dialects of English, including Californian English, the tip of the tongue protrudes between the upper and lower teeth for these consonants, which are called interdentals. The [n] and [d] phones in French are made with the tip of the tongue touching the upper teeth; this can be specified in the IPA by the tooth symbol under the letter, as in [d̪]. (In English the contact is slightly further back, at the alveolar ridge.) In some languages dental sounds are made not with the tip of the tongue against the teeth, but the blade of the tongue, the part behind the tip. Many Australian languages have such phones. The Yindjibarndi (Pama-Nyungan, Australia) word thugu ‘young boy’, begins with this type of phone, as does nhuurga ‘ankle’. (In Yindjibarndi spelling, the h indicates that it is the blade of the tongue that makes contact with the upper teeth.)

Figure 2.3 The upper vocal tract showing main places of articulation of consonants: 1, labial; 2, dental; 3, alveolar; 4, post-alveolar; 5, palatal; 6, velar; 7, uvular; 8, pharyngeal; 9, glottal. Arrows emanate from the main active parts (articulators) and point towards typical targets (passive articulators). More precise descriptions of places of articulation can be given by first specifying the active articulator, then the passive one, thus: apico-alveolar; lamino-palatal, dorso-velar and so on. (The tongue tip gives apico-; the blade gives lamino-; and the dorsum gives dorso-.)

Alveolar Sounds made with the front part of the tongue – usually the tip, sometimes the blade – touching or almost touching the alveolar ridge (the ridge on the roof of the mouth just behind the upper teeth) are called alveolars. The initial phones of the English words top, dog, log, nag, rag, and sag are all alveolars. If the tip of the tongue makes contact just behind the alveolar ridge, the sound is sometimes called post-alveolar or retroflex.1 Post-alveolar phones occur in the Yindjibarndi words marda ‘blood’ (the rd indicates that the d-sound is made further back than ordinary [d]) and thurla ‘eye’.

Palatal The palate or hard palate is the large region of the roof of the mouth extending from a little behind the alveolar ridge to the soft palate or velum. Sounds made with contact (or approximation) between the tongue and the palate are called palatals. If this is in the front part of the palate, immediately behind the alveolar ridge, the sound is an alveo-palatal. The English words shingle, jungle and child begin with alveo-palatals. For the first phones of the Hungarian words nyak ‘neck’ and gyufa ‘match’ (represented in spelling by ny and gy) the contact is a little further back, and these phones are palatals.

Velar Behind the hard palate lies a soft area called the soft palate or velum. Consonants made with the back of the tongue touching the velum are called velar sounds. (Although it is physically possible to touch the velum with the tip of the tongue, it is not easy, and no language is known to employ this combination.) Velar sounds begin the English words cull, kill, car, go, grip and give. (The first three words begin with the voiceless velar stop despite the different spellings.) The nasal sound at the end of the English word sing is also a velar (IPA [ŋ]).

Uvular The uvula is the appendage hanging down at the back of the velum. The relatively few phones made with the back of the tongue and the uvula are uvulars. The r-sounds of Parisian French, many dialects of Dutch (Indo-European, Europe) and German, and a few dialects of English (such as Northumbrian English) are often, though not always, realized as uvular trills (IPA [ʀ]), made with rapid vibration of the uvula (see p. 37).


Pharyngeal The pharynx is the chamber behind the back of the tongue, above the larynx, and roughly at right angles to the oral cavity. Pharyngeal consonants are made by pulling the root of the tongue back to narrow the pharynx so that the air passes through noisily. Pharyngeals are not found in English; Arabic (Afroasiatic, Arabian peninsula and North Africa), however, has them. So does Danish, in the r-sound of words like råd ‘council’.

Glottal Glottal consonants involve a constriction of the glottis or opening between the vocal folds. We have already encountered one segment made at this place, namely the glottal stop (see §2.2), which is produced by holding the vocal folds together and then releasing them. The initial phones of the English words hot [hɒt] and hill [hɪɫ] are also glottal.

Manners of articulation Seven main manners of articulation are employed in human languages: stops, nasals, fricatives, affricates, laterals, rhotics and glides. In addition to these, the term approximant is used for phones that are produced by bringing the articulators towards one another, but not close enough to produce either a complete blockage or a turbulent noise.

Stops A phone that has complete closure or blockage of the airstream is a stop or plosive. Stops can be made at (almost) any place of articulation, from the glottis ([ʔ]) to the lips (e.g. [p], [b]). At the beginning of a word such as bed or pet, the bilabial phone involves complete blockage of the airstream, followed by an abrupt release of the pent-up air behind the blockage. When an English speaker says bed, the vocal folds are vibrating during the production of the initial stop. But for pet the vocal folds do not begin to vibrate until a short time after the first stop has been released. Stops like [b] are voiced; those like [p] are voiceless. The period of time (measured in milliseconds or thousandths of a second) between the release of a stop and the onset of vibration of the vocal folds is called voice onset time (VOT). For the English [b] this is usually a negative number, as voicing begins before the release of the stop. For the English [p], at the beginning of a word, voice onset time is usually a little under 50 milliseconds (ms). For Danish, the voice onset times of stops are generally slightly longer, while in Spanish they are shorter. When the VOT of a stop is rather long, as in the case of English and Danish voiceless stops, a puff of air follows the release of the stop, before the regular voicing of a following vowel begins. This is called aspiration, and indicated in IPA by a small raised ʰ, as in [pʰ]. You can observe this puff of air if you hold a small piece of paper loosely in front of your mouth when you say pin. Now say spin. What do you notice? The stops at the end of the English words bed and pet are voiced and voiceless, respectively. Voicing extends into the closure of the alveolar stop in bed, but not (or more accurately, only a short way) into the closure of the alveolar stop in pet. The stop at the end of both words can be either released or unreleased: that is, you can either release or not release the pent-up air behind the


alveolar blockage, resulting in a noise. If you say my pet dog Rover, it is likely that you will not release the [t] of pet; but if you say this phrase very carefully, separating pet from dog, you might release the [t] before forming the following [d]. Unreleased stops are indicated in the IPA by a raised angle ˺. So the English word pet could be pronounced as either [pʰɛt] or [pʰɛt˺].
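The definition of voice onset time given above can be sketched as a small calculation. This is an illustration only: the function names, the sample measurements and the 30 ms cut-off for aspiration are assumptions made for the sketch, not values taken from the text.

```python
def voice_onset_time(release_ms: float, voicing_onset_ms: float) -> float:
    """VOT = time at which voicing starts minus time of the stop release (ms)."""
    return voicing_onset_ms - release_ms


def describe_stop(vot_ms: float, aspiration_threshold_ms: float = 30.0) -> str:
    """Label a stop from its VOT; the default threshold is an illustrative assumption."""
    if vot_ms < 0:
        # Voicing begins before the release, as for English [b].
        return "voiced"
    if vot_ms >= aspiration_threshold_ms:
        # A long lag leaves time for a puff of air, as for English [pʰ].
        return "voiceless aspirated"
    return "voiceless unaspirated"


# Hypothetical measurements: (release time, voicing onset time) in ms.
print(describe_stop(voice_onset_time(100.0, 80.0)))   # VOT = -20 ms
print(describe_stop(voice_onset_time(100.0, 145.0)))  # VOT = 45 ms
```

Real VOT values vary with language, speaker and context, so any fixed threshold of this kind is a simplification.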

Nasals Nasals have already been described (§2.2) as phones produced by lowering the velum to permit air to flow through the nose. Nasal consonants are like stops in having a complete blockage of the airstream through the oral cavity; but they allow the air to flow freely through the nasal cavity. Thus for [m] there is a complete closure of the oral cavity at the lips, as in [b] and [p]; but the velum is lowered, and the air travels out through the nose. Nasals are normally voiced. Burmese (Sino-Tibetan, Myanmar), however, has voiceless bilabial, dental, palatal and velar nasals.

Fricatives Fricatives are produced with incomplete closure at the place of articulation. A narrow passage is left open through which the airstream is forced, giving rise to a ‘noisy’ sound a bit like the sound you get by rubbing your hands on your clothing, or scratching your head. Fricatives are found in many, though not all, languages, and can be produced at any place of articulation, from the lips to the glottis. English has both voiced and voiceless fricatives. There is a voiced labiodental fricative, [v] as in vale, and a voiceless fricative at the same place of articulation, [f] as in fail. Other fricatives in English are dental [θ] and [ð], alveolar [s] and [z], alveopalatal [ʃ] and [ʒ], and glottal [h]. Ewe (Niger-Congo, Ghana) has both voiceless and voiced bilabial fricatives [ɸ] and [β], as well as labiodental fricatives [f] and [v]. Velar fricatives are found in German (as in Buch [buːx] ‘book’), Old English, and many other languages, as well as in some dialects of English such as Scottish English (as in the word loch [ɫɔx] ‘lake’).

Affricates Phones produced by combining a stop and a fricative are called affricates. English has two affricates, both alveopalatal: voiceless [ʧ] (the first and last sounds of church) and voiced [ʤ] (the first phone in jungle). The alveopalatal region is the most common place of articulation for affricates, though other places are possible. For instance, Mandarin Chinese has both aspirated and unaspirated alveolar and retroflex affricates, and Beembe (Niger-Congo, Democratic Republic of Congo) has labiodental affricates, again both aspirated and unaspirated.

Laterals Phones like the initial [l] of English laugh are called laterals because the sides of the tongue are lowered at the point of articulation, allowing air to pass on both sides of a central closure. Sometimes laterals are produced with just one side of the tongue lowered. The most common place of articulation of laterals is at the alveolar ridge. Dental and postalveolar or retroflex laterals are also possible. A fair number of languages have palatal laterals, produced with contact between the blade of the tongue and the hard palate, with the sides open.


This sound is found in Castilian Spanish (Indo-European, Spain) llave ‘key’. Velar laterals are also possible, though rare. In my dialect of English (Australian) words such as milk involve a velar lateral; the tip of the tongue is not used in this sound. Melpa (Papuan, New Guinea) also has a velar lateral. In many dialects of English the back of the tongue is raised in the production of the lateral, especially when it occurs at the end of a word, as in ball or school; this lateral has a ‘dark’ sound quality. The IPA symbol for this sound is [ɫ]. In Australian English the alveolar lateral is usually dark, even at the beginning of words. By contrast, the lateral of Danish and French is always ‘clear’ in sound, and is produced without raising the back of the tongue towards the velum. Like nasals, laterals are normally voiced; just a few languages have voiceless laterals. Burmese and Welsh (Indo-European, Wales) both have the voiceless alveolar lateral [ɬ]. It should be noted that this lateral is a lateral fricative, in contrast with the other laterals, which are not fricatives, but approximants (see note * to Table 2.1).

Rhotics The term rhotic is used for r-like sounds, those phones represented in IPA by some variant of the Latin letter r. Rhotics come in a wide variety of shapes and forms, the main ones being taps, trills and approximants. Taps (or flaps) involve a single rapid short closure (shorter than for a stop), usually between the tip of the tongue and the teeth or alveolar ridge, as in the Spanish words caro [kaɾɔ] ‘expensive’ and pero [peɾɔ] ‘but’. This is the usual rhotic of Scottish English. Trills are phones consisting of two or more rapid taps one after another. Usually trills are produced with the tip of the tongue, as in Spanish perro [perɔ] ‘dog’. They can also be produced by vibrating the uvula rapidly, as in Parisian French, Standard German and some varieties of Southern Swedish (Indo-European, Sweden) and Dutch. Bilabial trills – like the disrespectful ‘raspberry’ in English – are found in a few languages, for instance in Kele (Papuan, New Guinea). Rhotic approximants involve proximity, rather than contact, at the place of articulation. The r-sound of most dialects of English is an approximant. However, it differs considerably between dialects. In Australian English and Estuary English (London) it is an apical approximant, in which the tip of the tongue points towards the alveolar ridge, but does not make contact with it. Uvular approximants are found in Northumbrian English (the uvular trill is sometimes also used in this dialect), and sometimes in German and French.

Glides Glides or semivowels are the most vowel-like of the consonants, having the least constriction at the point of articulation. They are characterized by movement of one articulator, which travels towards but does not reach the other. The y-sound of English, IPA [j], is a glide in which the blade of the tongue moves towards the palate. The blade of the tongue can also move towards the teeth. Bunuba (Bunuban, Australia) and Unggumi (Worrorran, Australia) have such glides (as well as the palatal glide). The so-called ‘soft-d’ of Danish (as in mad ‘food’) is a dental approximant; however, it involves movement of the tip rather than the blade of the tongue towards the teeth.


English has a second glide, the labiovelar [w], produced by moving the back of the tongue towards the velum and at the same time rounding the lips. This phone occurs in about three-quarters of the world’s languages (the palatal glide is found in a slightly higher fraction of the languages, about 85 per cent). A few languages have a plain velar glide [ɰ], unaccompanied by lip rounding, and even fewer have a bilabial or labiodental glide without accompanying velarization.

Vowels Vowels are speech sounds produced without interruption to the passage of air through the vocal tract. The vocal tract is used as a resonating chamber for an airstream vibrating from the action of the vocal folds; and as this suggests, vowels are normally voiced in all languages. The cavities above the glottis act rather like an organ pipe or the chamber of a wind instrument, except that it has a characteristic right-angled bend between the vertical pharynx and the horizontal oral cavity. The shape of the cavities can be modified by positioning and shaping the tongue in different ways, which has an effect comparable to directing the airstream in an organ into different sized pipes. The position of the high point of the tongue during the production of a vowel effectively defines the size and shape of the two resonating chambers, and thus the quality of the vowel. If the high point is high up and towards the front of the mouth, you get vowel sounds like the one usually written ee in English, as in beet [biːt], or the one written i in bit [bɪt]. These vowels are called high front vowels. If instead the high point of the tongue is high up and towards the back of the oral cavity, you get a vowel like the one written u in put [pʰʊt]. These are high back vowels. For high vowels the body of the tongue is raised above its neutral or rest position. If the body of the tongue rests at a relatively neutral height, as for the vowel of bed [bɛd] we have a mid vowel; lowering the body of the tongue results in a low vowel, such as the vowel of fat [fæt]. The front-to-back dimension is also usually divided into three: front, with the high point relatively front; back, with the high point towards the back; and central, with the high point in between, in the central region. Table 2.2 shows the vowels of BBC English, the variety used by national newscasters in Britain, with the IPA symbols and illustrative examples. 
Other dialects of English have slightly different ranges of vowels. General American English, for instance, the variety used by many national broadcasters in the USA, lacks the low back rounded vowel [ɒ], and has [ɚ] in place of [ɜː]. There are also minor differences in the qualities of the vowels, due partly to differences in the positioning of the high point of the tongue. New Zealand English has the high central vowel [ɨ] instead of the [ɪ] of other dialects.2 This vowel, though not common, is found in various other languages, including Amharic (Afro-Asiatic, Ethiopia) and Nimboran (Papuan, West Papua). High- and mid-back vowels are usually accompanied by lip-rounding, and so are called rounded vowels. High- and mid-front vowels are usually produced with spreading of the lips, while low vowels are usually produced with the lips in a neutral position.


Table 2.2 Chart of vowels of BBC English showing IPA representation and examples (based on Ladefoged and Disner 2012: 30).

        Front             Central                                Back
High    iː bead, ɪ bit                                           uː boot, ʊ put
Mid     ɛ bed             ə the (as normally spoken), ɜː bird    ɔː ought
Low     æ bat             ʌ cut                                  ɒ cot; ɑː bard

These correlations are imperfect. Some languages have high- and/or mid-front rounded vowels. Danish has three rounded front vowels: high-front rounded [y]; mid-high front rounded [ø]; and mid-low front rounded [œ]. These have the same tongue positions as the three non-low front vowels [i], [e] and [ɛ],3 respectively. Fewer languages have high- and/or mid-back unrounded vowels. But Vietnamese (Mon-Khmer, Vietnam) has three unrounded back vowels, [ɯ], [ɤ] and [ʌ], which have the same heights as the rounded [u], [o] and [ɔ]. The velum can be lowered during the production of a vowel, allowing air to pass through the nasal cavity. This gives a nasal vowel. We have already seen that vowels in English can be nasal before a nasal consonant, as in man [mæ̃n]. French has nasal vowels, as in lent [lã] ‘slow’.

2.4 Some additional features The previous section presented an overview of the basic types of speech sounds found in the world’s languages. In this section we mention a few additional important features of the phonetics of human languages.

Airstream mechanisms In addition to the pulmonic airstream, two other airstreams are used for the articulation of speech sounds: glottalic and velaric.

Glottalic A number of languages of Africa, India and the Americas employ a glottalic airstream in the production of some (never all) phones. This airstream is created by closing the vocal folds and raising or lowering the larynx, while a closure – complete or partial – is made somewhere in the oral cavity. Ejectives are produced on an egressive (outgoing) glottalic airstream, formed by raising the larynx so as to compress the air behind the oral closure; this closure is released while the glottal


closure remains, resulting in a popping sound. About a fifth of the world’s languages have ejectives. Quechua (Quechuan, Peru) has three, with alveopalatal, velar and uvular places of articulation: [ʧ’aka] ‘hoarse’, [k’ujui] ‘to twist’ and [q’aʎu] ‘tomato sauce’. Implosives are produced by pulling the larynx downwards during oral closure, and releasing the oral closure, resulting in an audible inrush of air. Only about 10 per cent of the world’s languages have implosives. Sindhi (Indo-European, India) has bilabial (as in [ɓəni] ‘field’), retroflex (as in [ɗ̣ɪnu] ‘festival’), palatal (as in [ʄətu] ‘illiterate’) and velar (as in [ɠənu] ‘handle’) implosives, in addition to ejectives.

Velaric The ‘tutting’ sound written tsk tsk (sometimes tut tut) in English involves an ingressive velaric airstream. It is produced by forming closures between the tip of the tongue and the alveolar ridge and the back of the tongue and the velum or uvula, keeping these closures in place while the body of the tongue is drawn downwards, rarefying the enclosed air. Then the contact between the tip of the tongue and the alveolar ridge is released, with the result that air is drawn into the mouth. The kissing sound is made in the same manner, except that front closure is at the lips. These sounds are called clicks. So-called Khoisan languages and some neighbouring Bantu languages of south-east Africa employ the velaric airstream in speech sounds. !Xóõ (Taa) (Tuu, Botswana and Namibia) has bilabial, dental, alveolar, lateral and palatal clicks, along with a number of accompaniments including voicing, nasality, aspiration, ejection and so on. The five clicks, without accompaniments, are illustrated in: [ʘân] ‘sleep’, [ǀān] a type of shrub, [ǃāo] ‘remain’, [ǁâe] ‘three’ and [ǂám] ‘return, go back (home)’.

Coarticulation It was mentioned above that the w-sound of English is produced with simultaneous movement of the lips and of the back of the tongue towards the velum. Such sounds, with simultaneous constriction at two places of articulation, involve coarticulation or double articulation. Stops and nasals can also be coarticulated. The most common are bilabial-velar combinations. Various languages of West Africa have such phones. Idoma (Niger-Congo, Nigeria) has coarticulated voiceless and voiced stops, and a coarticulated nasal: [àk͡pà] ‘bridge’, [àg͡bà] ‘jaw’ and [aŋ͡màa] ‘body painting’. Other combinations are rarer. Yélî Dnye (Papuan, Rossel Island) also has bilabial-alveolar coarticulated stops and nasals, in addition to labio-velars, as in [t͡pɛnɛ] ‘lung’ and [n͡mo] ‘bird’. Perhaps the most unusual coarticulated phone is found in Wari’ (Chapacura-Wanham, Brazil): a voiceless dental stop coarticulated with a voiceless bilabial trill, [t͡ʙ̥].

Diphthongs Vowels are produced with the tongue in a relatively steady state throughout their articulation. A diphthong is produced when instead the tongue is in constant motion throughout, travelling from one vowel position to another. English dialects generally have a fair number of diphthongs, as can be seen in Table 2.3. (There are additional minor differences in the beginning and end points of the diphthongs that are not shown.)

Table 2.3 Diphthongs of three dialects of English (based on Ladefoged and Disner 2012: 28, 30 and my own dialect)

BBC English    Australian English    General American English    Examples
[aʊ]           [æʊ]                  [aʊ]                        bowed, loud
[əʊ]           [oʊ]                  [oʊ]                        grow, owe
[ɔɪ]           [ɔɪ]                  [ɔɪ]                        boy, quoit
[eɪ]           [eɪ]                  [eɪ]                        day, say
[aɪ]           [aɪ]                  [aɪ]                        my, dry
[ɪə]           [ɪə]                  –                           beer, leer
[eə]           [ɛə]                  –                           hair, lair
[ʊə]           [ʊə]                  –                           poor, boor
[aə]           –                     –                           hire, pyre

Syllables Phones combine into larger units called syllables. Syllables are surprisingly difficult to characterize precisely, and there is disagreement among phoneticians concerning their definition. Nevertheless, syllables seem to have more psychological reality for speakers than do phones, and native speakers of any language are usually able to divide spoken utterances into syllables without difficulty.4 For instance, the word phonetics has three syllables (the boundaries of which are indicated by dots), [fə.nɛ.tɪks]. Even if you are not a native speaker of a language it is not normally too difficult to guess what the syllable division of a word is. The Gooniyandi word girili ‘tree’ has three syllables, and you can probably guess what they are.5 Syllables generally consist of a vowel together with one or more consonants, usually before the vowel, sometimes after it. Some languages allow syllables consisting of just a vowel; some allow syllables consisting of a consonant, generally a nasal or lateral. English has syllables of both of these types: consider about [ə.bæʊt] and medal [mɛd.ɫ̩]; these words also illustrate syllables with a consonant before and after the vowel (or diphthong) – that is, CVC syllables (where C stands for consonant, and V for vowel or diphthong). English has a considerable number of syllable types (including CV, CVC, V, VC, CCV and C), and thousands of possible syllables. Languages differ considerably in terms of the syllable types they allow. In Gooniyandi all syllables must contain a vowel; usually it is preceded by a consonant, giving the most common


form of the syllable CV; syllables can, however, also end in consonants. CV syllables are the most common syllables in human languages, and are the most frequent in most languages.
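The C/V notation for syllable types can be made concrete with a short sketch. It assumes that each word has already been divided into syllables and each syllable into phones, and that a diphthong counts as a single V, as in the text. The vowel set and the function name are illustrative assumptions covering only the three English examples discussed above.

```python
# Illustrative vowel set, sufficient only for the examples below (an assumption).
VOWELS = {"ə", "ɛ", "ɪ", "æʊ"}


def syllable_type(syllable: list[str]) -> str:
    """Map a syllable, given as a list of phones, to its C/V skeleton."""
    # Anything not listed as a vowel, including a syllabic consonant, counts as C.
    return "".join("V" if phone in VOWELS else "C" for phone in syllable)


words = {
    "phonetics": [["f", "ə"], ["n", "ɛ"], ["t", "ɪ", "k", "s"]],
    "about": [["ə"], ["b", "æʊ", "t"]],
    "medal": [["m", "ɛ", "d"], ["ɫ\u0329"]],  # syllabic dark l
}

for word, syllables in words.items():
    print(word, ".".join(syllable_type(s) for s in syllables))
# phonetics CV.CV.CVCC
# about V.CVC
# medal CVC.C
```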

2.5 Prosodies Vowels and consonants are segments that come one after another in speech (although the boundaries between them are generally not precisely delimited, as we have seen). Some phonetic properties are spread over sequences of phones. These are called prosodies or suprasegmentals. Two prosodies are discussed in this section, pitch and stress. Others include loudness, tempo, length and rhythm.

Pitch Pitch refers to the frequency of vibration of the vocal folds. When you speak the pitch varies from moment to moment. Variations in pitch are used in two main ways in languages: to distinguish between words; and to convey different inflections on the meaning of an utterance. In the former case, we speak of tone; in the latter, of intonation.

Tone Many languages use different patterns of pitch to distinguish words; these pitch differences are called tones. Languages that use tones are called tone languages. Cantonese (Sino-Tibetan, China) is, like many nearby languages, a tone language. Differences in the pitch on the syllable [si] give six different words: with high falling tone, it is the word ‘poem’; with mid-level, ‘to try’; with low level, ‘matter’; with extra low, ‘time’; with high rising, ‘to cause’; and with mid-rising, ‘city’.6 In fact, Cantonese has three more tones, making nine in total.

Intonation All languages use variation in pitch over an utterance to convey modulations of the meaning expressed by the words. If you say I’ll see you tomorrow as a plain statement, you will probably say it with a fall in pitch at the end. If you want to ask your friend whether they will be coming into the university on the following day, you would normally utter it with a rise in pitch on the final word. Produced with a rising-falling intonation contour on tomorrow, the utterance would convey a degree of insistence. Pitch variations also convey other kinds of information, including information about grammatical structure and the speaker’s emotional state – for example, whether they are angry, happy or sad. The different intonation patterns and the variations of meaning that are expressed by them are difficult to specify exactly, and differ somewhat among dialects of English – as well as between


languages. For instance, in my dialect (Australian English) ordinary statements are often made on a rising intonation contour, which generally signifies a question in other dialects (see previous paragraph).

Stress Syllables can be produced with different degrees of forcefulness or lung energy, normally accompanied by differences in the tension of the vocal folds. Increasing the energy gives greater intensity, loudness and usually higher pitch. Syllables with greater energy are called stressed syllables, indicated in the IPA by a ˈ before the syllable. Other phonetic differences sometimes correlate with stress. For instance, in English the vowel of an unstressed syllable is often schwa, as in the usual unstressed utterance of words such as the ([ðə]) and a ([ə]), and the second syllable of farmer ([ˈfɑːmə]). This vowel normally does not occur in stressed syllables. In some languages stress always falls on a particular syllable of a word. In most Australian languages the first syllable of a word is always stressed. The following examples from Walmajarri (Pama-Nyungan, Australia) illustrate this: ngarpu [ˈŋaɻbu] ‘father’, kurrapa [ˈkuɾapa] ‘hand’ and martuwarra [ˈmaʈuˌwaɾa] ‘river’. As the last example shows, the third syllable of a word with four syllables is also stressed, though usually less strongly than the first, indicated by the ˌ. Stress in Hungarian also goes to the first syllable of a word. In Swahili (Niger-Congo, Democratic Republic of Congo) and Polish (Indo-European, Poland), by contrast, it is the second-last syllable of a word that normally gets stress. In English, stress is not predictable and goes on different syllables, depending on the word. Compare the placement in the three trisyllabic words photograph, diploma and disagree, which are stressed on the first, second and third syllable, respectively. There are a fair number of pairs of nouns and verbs in English that are distinguished by placement of stress. Stress goes on the first syllable of the noun, as in an import, a convict and an insult, but on the second syllable of the corresponding verb: to import, to convict and to insult.
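Where stress is fixed, its position can be stated as a simple rule over the syllables of a word. The following is a minimal sketch of the two fixed patterns mentioned above, initial and second-last. The function name, the pattern labels and the syllable divisions are assumptions made for illustration, and secondary stress is not modelled.

```python
def place_stress(syllables: list[str], pattern: str) -> str:
    """Return the word with the IPA primary stress mark before the stressed syllable."""
    if pattern == "initial":
        index = 0
    elif pattern == "penultimate":
        # In a one-syllable word the only syllable carries the stress.
        index = max(len(syllables) - 2, 0)
    else:
        raise ValueError(f"unknown stress pattern: {pattern}")
    marked = list(syllables)
    marked[index] = "ˈ" + marked[index]
    return "".join(marked)


# Walmajarri kurrapa 'hand', with an assumed syllable division.
print(place_stress(["ku", "ɾa", "pa"], "initial"))      # ˈkuɾapa
# The same syllables under a second-last rule, purely for contrast.
print(place_stress(["ku", "ɾa", "pa"], "penultimate"))  # kuˈɾapa
```

No such function could be written for English without listing words individually, since English stress is not predictable from position alone.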

2.6 Phonology How many phones does English have? Every time you utter a word or sentence there will be slight differences in the precise configuration of your vocal tract and the surrounding air. With sufficiently accurate instruments you could find minor differences in the shape of the vocal tract, and in the sound wave. (It’s a bit like not being able to step into the same river twice!) In this sense you might say that English effectively has an unlimited number of phones. Many of the differences are too small to be perceived. Some differences are perceptible in principle – that is, are within the distinguishing capabilities of the human ear – but are ignored by speakers. Phonology investigates the sound differences that are linguistically relevant in a language, and how the sounds pattern as a system.


Phonemes and allophones The bilabial stops in the three words ban, pan and span are all phonetically different: [b], [pʰ] and [p]. However, to native speakers of English the [pʰ] and [p] of the second and third words are the ‘same’ sound, while the [b] of the first word is perceived as a different sound. To a native speaker of Nyulnyul (Nyulnyulan, Australia) with no knowledge of English all three phones would sound the same, and they would experience difficulty in telling them apart. On the other hand, in the simple CV syllables [ba], [pʰa] and [pa], native speakers of Thai (Tai-Kadai, Thailand) or Shua would have no trouble hearing all three bilabial stops as different. This is not because Nyulnyul speakers have worse ears than us, and Thai and Shua speakers have better ears than we do. Rather, it is because the differences are not important in Nyulnyul, whereas they are important in Thai and Shua. How can this be? For the speaker of English [pʰ] and [p] of pin and spin are both instances of the voiceless bilabial stop. You cannot get a different word by replacing one of the phones by the other – at worst, you will get a weird-sounding production of the same word. If you said [spʰɪn] instead of [spɪn] no one would think this was a new word of English; they would still interpret it as spin. Similarly, at the end of a word such as sip you could produce either [pʰ] or [p˺], and no difference would result. But you do get a difference if you replace the [pʰ] at the beginning of pin with a [b]: you have another word, bin, with a completely different and unrelated meaning. In Thai, replacing [p] by either [b] or [pʰ] makes a difference. The word [pâː] means ‘aunt’; if you were to replace the first phone by a [b] you would get [bâː] ‘crazy’, and if you replace it by [pʰ] you get [pʰâː] ‘cloth’. These are three completely different words in Thai, no more related than English pin and bin. A similar situation exists in Shua. 
The phones [p], [pʰ] and [p˺] do not contrast with one another in English; the differences between them are said to be non-contrastive. The differences are not used in the language to distinguish between words. The three phones are said to be allophones (the allo- comes from Greek allos ‘other’ – literally, ‘other sounds’). The difference between [pʰ] and [b] is significant in English; [pʰ] and [b] contrast, and the difference between them is contrastive. These phones are not allophones in English, but represent distinct phonemes. In Thai and Shua, [b], [p] and [pʰ] all contrast, and so represent three distinct phonemes. In Nyulnyul all three phones can be heard in speech, but do not contrast with one another. They are thus allophones of a single phoneme. English, Thai, Shua and Nyulnyul organize the phonetic ‘reality’ in three different ways. Phonemes are distinguished from phones in writing by putting them in slanted brackets, //. Thus /p/ indicates the voiceless bilabial stop phoneme that has (in English) allophones [pʰ], [p] and [p˺], and is different from the phoneme /b/.

Non-contrastiveness of allophones By definition, allophones do not contrast with one another: no allophone can be used in place of another to make a different word in a language. This can be for one of two reasons. First, the


phones might be able to occur in the same phonetic context, without having any effect on meaning; in this case the phones are said to be in free variation. Second, the phones might be unable to occur in the same phonetic environment; if this is so, we speak of complementary distribution.

Do you understand why phones in either free variation or complementary distribution cannot contrast? Think carefully about it – this is the crucial point to understand!

Free variation Free variation can be illustrated by the stop phones at the end of words in English, which may be either released or unreleased.7 The word slab can be produced with either a final released [b] or an unreleased [b˺]. In Gooniyandi the phones [ʎ] and [l̪] (a lateral made with the blade of the tongue touching the back of the upper teeth) are in free variation at the end of a syllable and after a low vowel. So the word galyba ‘soft’ can be pronounced as either [kaʎbɐ] or [kal̪bɐ]. And at the end of a word in Gooniyandi the tap/trill [r] and flap [ɾ] are in free variation. The word bananggarr- ‘snatch’ can be uttered as either [banɑŋgʌr] or [banɑŋgʌɾ]. We also speak of free variation across speakers or dialects when some speakers use one phone, others another. For instance, the English rhotic is produced in a variety of different ways in different dialects, including [ɹ], [ɾ], [r] and [ʀ].

Complementary distribution The phones [pʰ] and [p] in English are in complementary distribution. Unaspirated [p] occurs after [s], and within a word before an unstressed vowel. Aspirated [pʰ] occurs elsewhere: at the beginning of a word, and within a word before a stressed vowel; it never follows [s]. Where [p] occurs, [pʰ] does not, and vice versa.8 The two vowel phones [ɪ] and [ɪ̃] are also in complementary distribution in English. The oral vowel occurs in words like sip and pill; the nasal vowel occurs in words like sin and sing. More precisely, the nasal vowel occurs before a nasal consonant in the same syllable; the oral vowel occurs elsewhere.
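The idea of complementary distribution can be checked mechanically on a small data set: collect the environments in which each phone occurs and see whether the two sets overlap. The sketch below does this for the oral and nasal vowels of sip, pill, sin and sing. The function names and the narrow transcriptions are assumptions made for illustration, and the words are supplied already segmented into phones.

```python
def environments(words, phone):
    """Collect (preceding, following) contexts of a phone; '#' marks a word edge."""
    found = set()
    for word in words:
        padded = ["#"] + list(word) + ["#"]
        for i in range(1, len(padded) - 1):
            if padded[i] == phone:
                found.add((padded[i - 1], padded[i + 1]))
    return found


def in_complementary_distribution(words, phone_a, phone_b):
    """True if the two phones never share an environment in the data."""
    return environments(words, phone_a).isdisjoint(environments(words, phone_b))


ORAL, NASAL = "ɪ", "ɪ\u0303"  # [ɪ] and its nasalized counterpart
words = [
    ["s", ORAL, "p"],    # sip
    ["pʰ", ORAL, "ɫ"],   # pill
    ["s", NASAL, "n"],   # sin
    ["s", NASAL, "ŋ"],   # sing
]
print(in_complementary_distribution(words, ORAL, NASAL))  # True
```

A result of True on four words is only suggestive. With so little data, two phones may fail to share an environment by accident, so a real analysis needs a much larger word list and a phonetically sensible statement of the conditioning environment.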

Rules of realization The allophones of a phoneme are its phonetic realizations, the phonetic substances that as it were make up the abstract phonemic unit. Below are a few phonemes of English together with their allophonic realizations. (This list is incomplete: there are other phonemes in English, and those shown have other allophones.)


/p/    [p], [pʰ], [p˺]
/b/    [b], [b˺]
/t/    [t], [tʰ], [t˺], [tʷ], [ʈ], [ɾ], [ʔ]
/d/    [d], [d˺], [ɖ]
/k/    [k], [kʰ], [k˺], [kʷ], [k̟], [k̟ʰ]
/g/    [g], [g˺], [g̟]
/m/    [m], [ɱ]
/n/    [n], [n̪], [n̩], [ɳ]
/l/    [l], [ɫ], [l̪], [ʟ], [l̩]
/s/    [s]
/θ/    [θ], [θ̟]
/ð/    [ð], [ð̟]
/ɪ/    [ɪ], [ɪ̃], [ɨ]
/i/    [iː], [ĩː]
/ʊ/    [ʊ], [ʊ̃]
/ɛ/    [ɛ], [ɛː], [ɛ̃]

Some of the allophones shown above are in free variation. Other allophones are conditioned by the surrounding phonetic context: that is, the allophone is chosen in a particular instance because of the phones in the neighbourhood, or because of a prosodic feature such as stress or length. For example, the [ɱ] allophone of /m/ occurs only before a labiodental fricative, as in bumf; it is conditioned by the following sound. We can express this as a rule:

● /m/ is realized as labiodental [ɱ] before a labiodental fricative; otherwise it is bilabial.

To give a more complex example, we can express some of the allophonic variation of the voiced velar stop in English as follows:

● /g/ is realized by the voiced velar stop [g], except when: (i) it is followed by a high front vowel, in which case it is advanced towards the back of the palatal region, [g̟]; (ii) it is followed by the high back vowel [u] or the labio-velar glide [w], in which case it is labialized, [gʷ]; and (iii) before another stop or nasal in the same word, and optionally at the end of a word, it is unreleased [g˺].

You should verify this rule by finding examples of the phoneme in the various environments and checking that the particular allophone does indeed occur there.

It is sometimes useful to express realization rules more formally, to give a clearer and more succinct statement. The above two rules can be written as follows:

/m/ → [ɱ] / _{[f], [v]}
      [m] / otherwise

/g/ → [g̟] / _{[i], [ɪ]}
      [gʷ] / _{[u], [w]}
      [g˺] / _{stop, nasal consonant}
      [g˺] / _# (optional)


In these rules the arrow → indicates ‘is realized as’; the slash / is to be read ‘in the environment’; the underline _ indicates where the segment occurs; the hash symbol # marks a word boundary; and braces (either a single one, or a pair) { } indicate alternatives. Other standard symbols not used in this rule include: round brackets () for optional items and $ for a syllable boundary.
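Realization rules of this kind translate directly into conditional statements. The sketch below restates the two rules for /m/ and /g/, taking the following segment as input and using '#' for a word boundary. The function names and the list of stops and nasals are assumptions made for illustration. Because the unreleased allophone is only optional at the end of a word, the sketch returns plain [g] there.

```python
LABIODENTAL_FRICATIVES = {"f", "v"}
HIGH_FRONT_VOWELS = {"i", "ɪ"}
ROUNDING_TRIGGERS = {"u", "w"}
# Illustrative list of English stops and nasals (an assumption for the sketch).
STOPS_AND_NASALS = {"p", "b", "t", "d", "k", "g", "m", "n", "ŋ"}

ADVANCED_G = "g\u031f"  # [g] with the IPA 'advanced' diacritic


def realize_m(following: str) -> str:
    """/m/ is labiodental before a labiodental fricative, otherwise bilabial."""
    return "ɱ" if following in LABIODENTAL_FRICATIVES else "m"


def realize_g(following: str) -> str:
    """Choose an allophone of /g/ from the following segment ('#' = word boundary)."""
    if following in HIGH_FRONT_VOWELS:
        return ADVANCED_G
    if following in ROUNDING_TRIGGERS:
        return "gʷ"
    if following in STOPS_AND_NASALS:
        return "g˺"
    # Word-finally [g˺] is optional; this sketch keeps the released stop.
    return "g"


print(realize_m("f"))   # ɱ, as in bumf
print(realize_g("u"))   # gʷ
print(realize_g("#"))   # g
```

The order of the tests does not matter here because the conditioning environments do not overlap; where they do, a rule set must also say which rule takes priority.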

2.7 How to establish the phonemes of a language All human languages distinguish a limited number of phonemes, between ten for Pirahã (Mura, Brazil) and just over a hundred for some Khoisan languages of southern Africa. For example, !Xóõ (Taa) is often said to have the largest phoneme inventory, with 164 consonants and over thirty vowels according to some sources; the actual numbers vary depending on the interpretation of certain sounds as single phones or sequences of phones. English is a relatively typical language with some thirty-eight (some dialects of American English) or thirty-nine phonemes (British English, Australian English). How does one go about determining the phonemes of a language? This section outlines the procedure and reasoning. This may perhaps give the misleading impression that it is a mechanical process, which it certainly is not, when you are confronted with a real language. But it is important to explain the basic steps since beginning students often experience difficulties in putting the considerations outlined in §2.6 into practice. The two basic steps can be summarized as follows. First, look for suspicious pairs of phones, phones that are phonetically similar enough to count as possible allophones of a single phoneme. Next, examine their distribution to see if they are in complementary distribution, free variation or contrastive variation (i.e. occur in the same environment). Let’s put this to work.

Example

Gooniyandi has stop phones [d] and [ɖ], both made with the tip of the tongue in roughly the same region: the alveolar ridge, and just behind it. They are illustrated in the following short but representative list:

[ɟʊdu] ‘straight’
[laɾgaɖi] ‘boab tree’
[ŋaɭʊdu] ‘three’
[lambaɖi] ‘little’
[waɖa] ‘star’
[bɪɖi] ‘thigh’
[lambadi] ‘father-in-law’
[mɑːdi] ‘cold’
[ɟʊɖu] ‘dust’
[t̪aɻɪdi] ‘heavy’
[bɪdi] ‘they’
[lawɑdi] ‘shoulder’

The phones [d] and [ɖ] are phonetically similar, and so form a suspicious pair; we must determine whether they are allophones of a single phoneme (like their counterparts in some dialects of English) or separate phonemes. Before reading on, examine the list carefully yourself, and see if you can come to a conclusion. We begin by looking for minimal pairs, pairs of words that differ only in that one has the phone [d] and the other has [ɖ]. This is because if we can find such pairs of words, we know that the two phones occur in the same phonetic environment, and that they contrast. The list has three minimal pairs: [ɟʊdu] ‘straight’ and [ɟʊɖu] ‘dust’; [lambadi] ‘father-in-law’ and [lambaɖi] ‘little’; and [bɪdi] ‘they’ and [bɪɖi] ‘thigh’. The pairs differ only in that where there is a [d] in the first member of each pair there is a [ɖ] in the second. We have to conclude that [d] and [ɖ] contrast, and cannot be allophones of a single phoneme. They must represent two separate phonemes, which we naturally write /d/ and /ɖ/.
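The search for minimal pairs can be sketched in a few lines of Python. This is an illustration only: the function name minimal_pairs is hypothetical, and the comparison assumes that words can be lined up symbol by symbol (one Unicode character per position). That assumption holds for this list, but transcriptions with variable diacritics would need a proper segmentation step first.

```python
from itertools import combinations

# The Gooniyandi list from the example (phonetic form -> gloss).
WORDS = {
    "ɟʊdu": "straight", "laɾgaɖi": "boab tree", "ŋaɭʊdu": "three",
    "lambaɖi": "little", "waɖa": "star", "bɪɖi": "thigh",
    "lambadi": "father-in-law", "mɑːdi": "cold", "ɟʊɖu": "dust",
    "t̪aɻɪdi": "heavy", "bɪdi": "they", "lawɑdi": "shoulder",
}

def minimal_pairs(words, phone_a, phone_b):
    """Return (word with phone_a, word with phone_b) for every pair of words
    that are identical except for phone_a versus phone_b in one position."""
    pairs = []
    for w1, w2 in combinations(words, 2):
        if len(w1) != len(w2):
            continue  # words of different length cannot differ in just one symbol
        diffs = [(a, b) for a, b in zip(w1, w2) if a != b]
        if len(diffs) == 1 and set(diffs[0]) == {phone_a, phone_b}:
            # Put the word containing phone_a first.
            pairs.append((w1, w2) if diffs[0][0] == phone_a else (w2, w1))
    return pairs

for with_d, with_retroflex in minimal_pairs(WORDS, "d", "ɖ"):
    print(f"[{with_d}] ‘{WORDS[with_d]}’ : [{with_retroflex}] ‘{WORDS[with_retroflex]}’")
```

Run on the list above, the sketch reports the same three pairs identified by inspection.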

Although the existence of minimal pairs is evidence for a phonemic contrast, their absence does not prove allophone status. This is an important point to appreciate, because minimal pairs are rare in some languages. Lacking minimal pairs, one sometimes has to be satisfied with near minimal pairs – that is, with pairs of words that differ not just in the targeted segment but in others as well. Examples of near minimal pairs for [d] and [ɖ] in the list above are [lawɑdi] ‘shoulder’ and [laɾgaɖi] ‘boab tree’, and [t̪aɻɪdi] ‘heavy’ and [bɪɖi] ‘thigh’. The first pair shows that both [d] and [ɖ] can occur between a low vowel and a high vowel, while the second shows they can occur between high vowels. Nothing in the wider phonetic environment is likely to guarantee the occurrence of one or the other of these phones, and we can conclude that they belong to separate phonemes.

There are two other suspicious pairs in the above list, [i] and [ɪ], and [u] and [ʊ]. In this case there are no minimal pairs for either contrast. Examination of the list reveals that the first member of each pair, the higher vowel phone, occurs exclusively at the end of a word, while the second member, the lower phone, occurs within words. They are in complementary distribution. If this data is representative, we are justified in identifying two separate phonemes, /i/ and /u/, each with two allophones.
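The distributional check can also be sketched in Python. The names environments and complementary are hypothetical, and the test used here (no shared immediate environment) is deliberately crude: with a small sample, phones that really contrast may happen not to share an exact environment, so a positive result is a prompt for closer inspection rather than proof.

```python
# The Gooniyandi forms from the example list.
WORDS = ["ɟʊdu", "laɾgaɖi", "ŋaɭʊdu", "lambaɖi", "waɖa", "bɪɖi",
         "lambadi", "mɑːdi", "ɟʊɖu", "t̪aɻɪdi", "bɪdi", "lawɑdi"]

def environments(words, phone):
    """Collect the (preceding, following) symbols for every occurrence of
    `phone`; '#' marks a word boundary."""
    envs = set()
    for word in words:
        padded = "#" + word + "#"
        for i in range(1, len(padded) - 1):
            if padded[i] == phone:
                envs.add((padded[i - 1], padded[i + 1]))
    return envs

def complementary(words, phone_a, phone_b):
    """True if the two phones never occur in the same immediate environment."""
    return environments(words, phone_a).isdisjoint(environments(words, phone_b))

for a, b in [("i", "ɪ"), ("u", "ʊ"), ("d", "ɖ")]:
    print(a, b, "complementary" if complementary(WORDS, a, b) else "shared environment")
```

Inspecting environments(WORDS, "i") shows that every occurrence is followed by #, which is the word-final restriction described above.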

Beginning students sometimes wonder which of the allophone symbols should be chosen as the label for the phoneme. The answer is that it doesn’t matter. We could equally have chosen to call the two phonemes just discussed /ɪ/ and /ʊ/ – or even /£/ and /ҹ/. Practicality is the main guiding principle: it makes sense to use familiar symbols that are easiest to type. It is also more sensible to use the symbol representing a widespread allophone than one that occurs in just one narrow phonetic environment.

Why focus on suspicious pairs?

In deciding on the phonemes of a language it is sensible to focus on phonetically similar phones, suspicious pairs. Phones that are phonetically very different are unlikely to be allophones of a single phoneme in a language: if they are very different it is unlikely that they will be treated as ‘the same’, and speakers are unlikely to regard them as variants. This is so even if the phones are in complementary distribution. The phones [h] and [ŋ] are in complementary distribution in English: [h] can only occur at the beginning of a syllable, and [ŋ] only at the end.9 But they are phonetically very different from one another, and are not normally regarded as allophones. And consistent with this, native speakers of English do not feel that they are in any sense the same sound. Or consider [p] and [kʰ], which are in complementary distribution in English. No one would regard them as allophones, when [p] is clearly much more similar to [pʰ]. This is why we begin by looking for suspicious pairs. You do not need to slavishly go through every pair of phones and decide whether or not they contrast. Nevertheless, you do need to be awake to the possibility that phones that initially seem phonetically different to you might be allophones in the language. For instance, [r] and [l] will doubtless sound very different to you if you are a native speaker of English; but they are allophones of a single phoneme in Japanese.

2.8 Transcription

An accurate representation of speech in writing is called a transcription. Three types of transcription are: phonemic, broad phonetic and narrow phonetic.

Phonemic transcription represents the spoken word phonemically, and by convention is enclosed in slashes, //. A phonemic transcription can be useful for many purposes. For instance, in a dictionary, accompanying the standard orthographic representation of a word the phonemic representation might provide information not predictable from the spelling. With knowledge of the rules of realization (such as outlined in §2.6) and dialectal variation, a user could pronounce the word accurately. It is more economical for the dictionary maker to use phonemic representations than to try and represent the range of phonetic variation of each word.

Phonetic transcription represents the phonetic substance of speech, and by convention is enclosed in square brackets, []. The IPA (or a variant of it) is generally used for phonetic transcription. Phonetic transcription can never be absolutely and completely precise, and the two types, broad and narrow, refer to extremes of accuracy in representation.

Broad phonetic transcription is the least accurate, and ignores many of the precise phonetic details, especially those predictable by general rules. Thus, in a broad transcription of English aspiration of voiceless stops is likely to be ignored, since it is predictable. Generally speaking the use of diacritics in broad transcription is kept to a minimum.

Broad transcription is quite similar to phonemic transcription. However, they are not the same thing, since the units represented are completely different! It is quite possible, for example, to make a broad transcription of utterances in a language you do not know, and have not analysed. This would not be a phonemic transcription, which presumes an analysis of the sound system of the language.

In narrow transcription one represents as much phonetic detail as possible given the circumstances. This means that you should not represent more than can be perceived by ear, eye and/or instrumental means; to do so would misrepresent rather than improve the accuracy of the transcription. In a narrow transcription of English one would indicate such things as the aspiration of voiceless stops, velar quality of the lateral and so forth.

Broad and narrow phonetic transcriptions differ in degree, in the accuracy of representation. It is perfectly reasonable to combine them, representing some things narrowly, others broadly. In a phonetic study of stop consonants in a language it would make sense to represent vowels and other consonants fairly broadly (especially those not in the immediate environment of the targeted phones), and the stops narrowly.

Below are three transcriptions of a simple English sentence to illustrate the three types of transcription. Notice that in this case the phonemic and broad phonetic transcriptions are slightly different in that stress is indicated in the phonemic transcription (since it is phonemic in English) but omitted in the broad phonetic transcription, though it might have been included. What is represented in each transcription, however, is different: phonemes in the first case, phones in the second. The narrow transcription represents a spoken version of the sentence in Standard Australian English, which can be found on the website for this book; it is not as narrow as it could have been: for instance, stress and intonation are not represented. (As an exercise, you could try to improve its accuracy by marking in these features.) On the other hand, words have been separated even though no phonetic or phonemic indications occur; this is for reasons of comprehensibility.

Orthographic representation: The farmer kissed the duckling
Phonemic transcription: /ðə ˈfɑːmə ˈkɪst ðə ˈdʌklɪŋ/
Broad phonetic transcription: [ðə fɑːmə kɪst ðə dʌklɪŋ]
Narrow phonetic transcription: [ðə fãːmə k̟ʰɪst̚ ðə dʌkɫɪŋ̟̃]

Summing up

The scientific study of speech sounds is phonetics, and is divided into three main branches: articulatory, acoustic and auditory. Speech sounds are produced by continuous movement of the speech organs. This continuous stream can be divided into discrete segments called phones. Phones are produced by movement of air through the vocal tract. This air normally comes from the lungs, and is pushed through the
vocal tract and out of the mouth and/or nose. This is the egressive pulmonic airstream. Speech sounds can also be produced on an ingressive pulmonic airstream. Some languages use velaric (giving clicks) or glottalic (giving ejectives and implosives) airstreams. Two main classes of phones are consonants and vowels. Consonants involve a constriction at some point in the vocal tract. They are defined by place and manner of articulation. The vocal folds may be drawn together to vibrate regularly, giving voicing, or held apart, giving voiceless phones. For stops, voice onset time is relevant. Some phones involve two simultaneous places or manners of articulation; this is called coarticulation. Vowels do not involve a constriction in the vocal tract. They are resonant sounds, characterized by the position of the highest point of the tongue, and the shape of the lips. Vowels are almost always voiced, and are normally oral; nasal vowels do occur in some languages. Diphthongs are vocalic phones characterized by movement of the tongue throughout. Consonants and vowels combine together to form syllables. Length, pitch, loudness and stress are prosodic features. These apply over phones, syllables or larger stretches of speech. Phonology is concerned with the sound system of a language. The most important concept is the phoneme, which is realized as a set of allophones. To establish the phonemes of a language one first looks for suspicious pairs of phones. Next one looks for minimal (or near minimal) pairs. If these can be found, the phones must represent distinct phonemes. If the phones are in either complementary distribution or free variation they are allophones. Phonological rules describe where the different allophones occur. Transcription is the systematic representation of the sounds of a stretch of speech, usually using the International Phonetic Alphabet. The main modes of transcription are phonetic (either broad or narrow) and phonemic. 
Phonetic transcriptions are enclosed in square brackets, []; phonemic transcriptions, between slashes, //.

Guide to further reading

Most introductory textbooks discuss the basics of phonetics and phonology (see the list at the end of Chapter 1). Collins and Mees (2003) gives a good overview of English phonetics and phonology; Hughes et al. (2012) provides detailed information on phonetic and phonological differences among British dialects. An excellent introductory textbook on phonetics is Ladefoged and Disner (2012); Laver (2001) is a useful synopsis of the subject; Ladefoged and Maddieson (1996) surveys the phonetics of the world’s languages; and Denes and Pinson (1993) is a comprehensible account of all three major branches of phonetics. The Handbook of the International Phonetic Association (1999) is a necessary reference work for the serious student of phonetics. Samples of phones in most of the languages illustrated in this handbook (including varieties of British, American and Australian English) can be accessed at the website of the International Phonetic Association, https://www.internationalphoneticassociation.org/. Chapters 20 and 21 of Bauer (2007) provide useful information on the IPA and its usage.

This chapter touches on the very basics of phonology, the identification of phonemes. Modern phonology has more theoretical agendas, and is concerned with identification of distinctive patterns of sound in human languages, and how these are best captured, represented and accounted for. A succinct account is Cohn (2017); introductory textbooks include Clark and Yallop (1990), Carr (1993) and Gussenhoven and Jacobs (2003/1998). A useful interactive course on phonetics that includes numerous illustrations of sounds of the world’s languages is available on the internet at https://australianlinguistics.com/.

Issues for further thought and exercises

1 Transcribe the following English words and expressions, pronounced as in your dialect, in broad phonetic, narrow phonetic and phonemic transcriptions:

butter singer antiques

mutton ginger ask

tarry lymph button

really rouge canyon

blur catches atom

phew brr (expression of cold) Adam

2 Transcribe the following English sentences, pronounced as in your dialect, in broad phonetic, narrow phonetic and phonemic transcriptions:

Would you please stop that racket?
I haven’t seen my brother for ages.
Did you talk to her about it yesterday?
‘Seems like we’ll have to wait for another killing,’ Bony said calmly.
I haven’t got the faintest idea where you put it.

3 Below are some words written in a broad phonetic transcription. Identify the words, and write them in ordinary English orthography and in phonemic transcription. The transcription represents the author’s pronunciation of the words (in his dialect of Australian English), and may differ from yours. If there are differences, re-transcribe the word in the IPA according to your pronunciation.

[səbmɪt] [θɪðə] [hjɯ̟ːmn̩] [sɪgnɫ̩] [fɪŋ̃gə̃neɪɫ]
[kʰɜːɫi] [mɛɹi] [əbɹʌp˺tʰ] [stɹɒŋɫi] [ʤɛñəsaɪd]

4 Divide the following English words into syllables and indicate stress:

hesitate sentential supermarket

influence essence bookcase

influenza essential notebook

habitual sentence economy

habit raspberry economics

5 Pronounce the following phones (if necessary, practise making the phone, adding in a vowel where necessary) and explain in as much detail as you can their articulatory features:

[ə̃] [æ]

[ɳ] [ø]

[ɲ] [ʊ]

[ɾ] [ɒ]

[ʎ] [ɔ]

[ɴ] [ʔ]

[ʙ] [ʤ]

[ɣ] [ɨ]

[ʃ] [ŋ]

[ɭ] [!]

6 Give one minimal pair for each of the following contrasting phonemes in English:

/s/, /z/ /l/, /r/ /ɛ/, /æ/

/s/, /ʃ/ /n/, /ŋ/ /ɒ/, /ɔ/

/z/, /ʒ/ /ʧ/, /ʤ/ /ɔ/, /ʊ/

/θ/, /ð/ /f/, /v/ /ʊ/, /u/

7 Based on the data below, say whether the placement of stress is predictable in Taba (Austronesian, Halmahera). If it is, state the rule.

[ˈplaŋ] ‘fly’
[ˈpo.jo] ‘head’
[ˈbub] ‘hornet’
[ma.ˈni.tap] ‘work’
[su.ˈsa.ra] ‘twins’
[ˌsa.ko.ˈa.mo] ‘to insert’
[ka.ˈʧu.paŋ] ‘grasshopper’
[ˌpa.ra.ˈdi.du] ‘run-away child’
[ˈlu.ri] ‘rosella’
[ˌma.nu.ˈsi.a] ‘people’
[ˌku.pat.ˈba.waŋ] ‘small woven rice basket’
[kam.ˌkum.pa.ˈppi.do] ‘large woven rice basket’

8 We gave a number of allophones for English /t/ and /n/ in §2.6. List as many examples as you can of words with these allophones. What factors motivate the choice of allophone? Can you write a rule to explain some or all of the allophonic distribution?

9 Gooniyandi has velar nasal phones [ŋ] and [ŋ̟] that differ in terms of how far back the point of contact between the tongue and velum is. (The + diacritic under the engma indicates that it is fronted.) Based on the following data, are they allophones of a single phoneme, or do they contrast? Justify your answer.

[ŋ̟iːdi] ‘we’
[jiŋ̟i] ‘name’
[juʊŋgu] ‘scrub’
[ŋɑːbu] ‘father’
[ŋɑːŋ̟g̟i] ‘your’
[ŋ̟ela] ‘east’
[mɑŋu] ‘Mangu’ (a place name)
[ŋʊmbaɳɒ] ‘husband’
[bɪɾaŋ̟i] ‘their’
[ŋaɭʊdu] ‘three’

10 Based on the following data, what are the distributions of the bilabial stop phones [b], [p], [pʰ] and [ɓ] in Goemai (Afroasiatic, Nigeria)? Pay attention to the positions of the phones in the words, and specify whether they contrast, or are in free variation or complementary distribution in the specified positions. How many phonemes do they represent? How would you describe what happens at the end of words?

[baŋ] ‘gourd’
[gəba] ‘one who returned’
[paŋ] ‘stone’
[bukʰ] ‘return’
[pʰaŋ] ‘snake’
[pʰaːtʰ] ‘five’
[ɓaŋ] ‘red’
[petʰ] ‘exist’
[poːtʰ] ‘narrow’
[mʉep] ‘they’
[pen] ‘remove’
[gəɓaːr] ‘one who saluted someone’
[reːp] ‘girl’
[mʉepʰ] ‘they’
[gəpaːɾ] ‘one who sent something’
[bi] ‘thing’
[ba] ‘return’
[tʰebul] ‘table’
[kaɓan] ‘face down’
[ɓoːtʰ] ‘able’
[gəpʰaːr] ‘one who jumped’
[reːpʰ] ‘girl’
[ɓetʰ] ‘belly’
[pʰe] ‘place’
[pʰepʰe] ‘cover’
[ɓakʰ] ‘here’

11 Based on the following words, say what types of syllable are found in Kuot (Papuan, New Ireland) – that is, indicate what shapes of syllable are found in terms of the sequences of consonants and vowels. Syllable boundaries have been marked in words of more than one syllable. What rules of syllabification have been observed?

[dʊs] ‘stand’
[u.waʊ] ‘cloud’
[ɛs.pan] ‘sun’
[sə.gər] ‘egg’
[lɛj.lom] ‘dolphin’
[pa.ku.ɔ] ‘taro leaf’
[lə.le.u.ma] ‘termite’
[na.bwaj.ma] ‘ant species’
[kejn] ‘type of basket’
[dan.wot] ‘river’
[dʊ.ri] ‘sleep’
[nʊ.nə.map˺] ‘life’
[u.de.bʊn] ‘banana plant’
[ʊt˺] ‘be full’
[lə.kə.bwon] ‘stick of firewood’
[fa.nu.ɔ] ‘short side of house’
[mwa.ba.ri] ‘sun’
[mus.gju] ‘bird species’
[siːgeː] ‘spoon’
[a.fa.ji] ‘raintree’

Research project

On p. 42 we mentioned but did not discuss the prosody of rhythm. One commonly drawn distinction is between stress-timed and syllable-timed languages. Find out what differences there are between these two rhythmic types, and write a paragraph description of each. Find examples of languages that are stress-timed and examples of languages that are syllable-timed. Which category does English – or your own language, if not English – belong to? Discuss consequences of the timing differences for the structure of syllables, the realization of stress, and vowel quality and quantity.

3 Structure of Words: Morphology

In this chapter we turn attention to what is perhaps the most salient unit in the grammars of human languages, and also to speakers: the word. The boundaries of words in spoken utterances are not overtly marked, so we need criteria for their identification. We introduce the widely used notion of the word as a minimal free form. We also examine the internal structure of words – that is, how they can be divided into smaller meaningful units. The scientific investigation of this domain is called morphology.

Chapter contents

Goals
Key terms
3.1 Words
3.2 Morphemes, allomorphs and morphs
3.3 Main types of morphemes
3.4 Allomorphs and allomorph conditioning
3.5 Morphological description
3.6 Morphological analysis
Summing up
Guide to further reading
Issues for further thought and exercises
Research project

Goals

The goals of the chapter are to:

● explain the concept of word as a minimal free form;
● introduce and explain the key concepts in the study of the structure of words, including morpheme and allomorph;
● distinguish and classify the main types of morpheme according to their behaviour;
● exemplify the main methods and techniques of identifying morphemes and allomorphs;
● show how the structure of words in a language can be described; and
● raise the question of the psychological reality of morphemes and morphological analysis.

Key terms

allomorph
bound morpheme
case
clitic
derivational morpheme
enclitic
free morpheme
grammatical morpheme
infix
inflection
inflectional morpheme
lexical morpheme
morph
morpheme
morphophonemic form
morphophonemic rule
noun
number
person
prefix
root
stem
suffix
suppletion
tense
verb
word
zero morpheme

3.1 Words

Notion of the word

Speakers generally have some notion of what constitutes a word in their language, and all languages probably have a word for ‘word’ – that is, a word that can be used to refer to a linguistic unit of this type, and that translates into English as word in certain contexts. (It may have other meanings as well, such as ‘utterance’ or ‘language’.) This does not hold for most of the other terms used in grammatical description, including many of the terms we encountered in the previous chapter, such as phone, phoneme, syllable and so on.

Speakers of English generally have a good feel for how an utterance can be divided into words. This may seem trivial: surely words are the things that are separated by largish white spaces in writing. But this does not always work smoothly. Bookcase and bookshelf would be words by this criterion, and this probably agrees with your intuitions. On the other hand there is no apparent motivation for writing church mouse as two words, churchman as one. And of course, you could not use this criterion at all for an unwritten language. In speech we find no corresponding pauses between the words – recall Figure 2.1, the sound wave for The farmer kissed the duckling. Nevertheless, no speaker of English would have any doubt that there are five words in this sentence, the, farmer, kissed, the (again) and duckling. No one would say that there is a word boundary between farm ([fɑːm]) and erkissed ([əkʰɪst]), or between kiss ([kʰɪs]) and ed ([t]). Although no pauses occur between the words of our sentence, you could potentially pause at any of the word boundaries in uttering the sentence, or put an um or er in between the words, as you might do in hesitating while trying to think of the right word. You could say, for instance, The farmer . . . kissed the duckling (where . . . represents a pause), or The um: farmer kissed um the duckling. But you can’t put pauses within the words: you wouldn’t say Th . . . e ([ð . . . ə]) farmer kissed the duckling or The farm . . . er kissed the duckling or The farmer kissed the duck-um-ling. In the last case, you would, rather, say duck um duckling, saying the full form rather than part of it. We can extend this observation. First, notice that each word in the example sentence can be separated from its neighbour by another word: The hairy farmer always kisses all the little ducklings. Second, each word can stand alone as an utterance. 
For instance, if a non-native speaker of English had said De farmer kisses de duckling, a native speaker might possibly correct them by saying just the ([ðə]); the same can be done for the other words. But only full words would be corrected in this way: no native speaker would correct The farmer kissed the duckring with just ling or l – they would repeat the whole word, duckling (with perhaps extra emphasis on the final syllable). Words are in this sense minimal free forms, with a degree of independence from other words in the sentence. First, they can be separated from the other words (this gives us the ‘free’ bit). Second, they cannot be divided into smaller parts each of which has such freedom (giving the ‘minimal’ bit). Observe that farmer is a minimal free form, since in our sentence farm cannot be separated from er, which does not have freedom of occurrence.1 (Of course, in other sentences – e.g. The farm is where the duckling was kissed – farm may be a minimal free form.)

Returning to our earlier problem examples, church mouse and churchman, can you now decide whether they are each single words or two words?

The structure of words

We begin by drawing a distinction between simple words like farm, kiss and duck that have no internal structure, and complex words like farmer, kissed and duckling that do have internal structure. Complex words can be divided into smaller meaningful pieces: farmer into farm and -er;
kissed into kiss and -ed; and duckling into duck and -ling. By contrast the simple words farm, kiss and duck can’t be divided up further into meaningful pieces: no smaller part of these words has a meaning. Nor can we divide -er, -ed or -ling into smaller meaningful pieces. The ‘pieces’ we have been talking about are minimal linguistic signs: they have a form and a meaning, and cannot be divided into smaller linguistic signs. Such pieces are morphemes. Morphemes are in a sense atomic signs: they can’t be split up further. Simple words consist of a single morpheme; complex words of more than one morpheme. Languages differ vastly in terms of the word-complexity they permit. English words are generally made up of relatively few morphemes. By comparison, words in Yup’ik (Eskimo-Aleut, Alaska) tend to be more complex, and may even correspond to full sentences in English. Thus the single word kaipiallrulliniuk means ‘the two of them were apparently really hungry’, and is made up of six morphemes:

(3-1) kai-pia-llru-llini-u-k
      be:hungry-really-past-apparently-statement-they:two
      ‘The two of them were apparently really hungry.’
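The morpheme-by-morpheme analysis in (3-1) can be lined up mechanically, as in the following Python sketch. It is only an illustration: the variable names are hypothetical, and the hyphen is assumed to be the morpheme separator in both the form line and the gloss line.

```python
# Example (3-1): segmented form and its morpheme-by-morpheme gloss.
FORM = "kai-pia-llru-llini-u-k"
GLOSS = "be:hungry-really-past-apparently-statement-they:two"

morphemes = FORM.split("-")
glosses = GLOSS.split("-")

# Each morpheme should have exactly one gloss.
aligned = list(zip(morphemes, glosses))
for morpheme, gloss in aligned:
    print(f"{morpheme:<6}{gloss}")

# Removing the hyphens gives back the unsegmented word.
word = "".join(morphemes)
print(word, len(morphemes))
```

A check of this kind is a quick way to catch a gloss line that has drifted out of step with the segmentation.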

We return to the issue of the morphological complexity of languages in Chapter 15. Morphemes are linguistic signs, not mere phonological forms. The word duct /dʌkt/ (referring to a type of tube) contains the phoneme string /dʌk/ found at the beginning of duckling, which we separated off as a morpheme, as well as /t/, which is found in /fɪnɪʃt/ and means ‘past time’. But duct is a simple, not complex, word and does not mean anything like ‘a duck in the past’, or ‘a onetime duck’.2 To identify a morpheme requires that we identify a repeated form–meaning correlation, not just a repeated form (or meaning).

3.2 Morphemes, allomorphs and morphs

Morphemes sometimes come in different phonological shapes. For instance, we identified a morpheme with the shape /t/ in kissed, which indicates that the event happened in the past. For pat the corresponding form ends instead in /əd/, and for goggle, it ends in /d/. These variant forms are called allomorphs. Other allomorphs in English are /s/, /z/ and /əz/, which are variant forms of the morpheme that attaches to verbs and indicates present time and that a single person other than the speaker or hearer did the act, as illustrated by pats /pʰæts/, goggles /gɒgɫz/ and kisses /kɪsəz/, respectively. In Yingkarta (Pama-Nyungan, Australia) future time is indicated in some verbs by adding -ku, in others by adding -wu, and in others by -lku: karnkaya-ku ‘will call out’, nyina-wu ‘will sit’ and kampa-lku ‘will cook’. The three forms -ku, -wu and -lku are allomorphs of the future morpheme. Allomorphs, like allophones, may be in either complementary distribution or free variation. The allomorphs /ə/ and /æn/ of the English indefinite article (these are not the only allomorphs) – written a and an, respectively – are in complementary distribution: the former occurs when the following word begins with a consonant, the latter when it begins with a vowel. Free variation is
illustrated by alternative realizations of the word exit as /ɛgzɪt/ and /ɛksɪt/, economics as /ikənɒmɪks/ and /ɛkənɒmɪks/, and off as /ɔːf/ and /ɒf/. All forms of each pair are found in the speech of many speakers of English, and are phonemically distinct: elsewhere [gz] and [ks] (eggs and ex – as in my ex – are minimal pairs), [iː] and [ɛ], and [ɔː] and [ɒ] contrast phonemically. Speakers may of course tend to use one of the alternative forms in preference to the other. The term morph is sometimes (albeit rather infrequently) used in reference to any minimal meaningful element in a language. For example, three morphs can be identified in the English word unwisely: /ʌn/, realizing a negative morpheme, /waɪz/, realizing the morpheme meaning ‘wise’, and /liː/, realizing a morpheme meaning ‘manner’. In English we have three morphs with the same phonological form /z/, one going on nouns and specifying plural (‘more than one’), as in dogs /dɔgz/, another going on nouns and indicating a possessor, as in dog’s /dɔgz/, and yet another going on verbs, and indicating ‘he, she or it is doing something’. There are also three morphs with the phonological shape /s/, and three with the shape /əz/ (in my dialect, Australian English). This gives nine morphs altogether, forming three sets of allomorphs of three morphemes.
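The choice among allomorphs discussed in §3.2 can be sketched in Python. The conditioning for the indefinite article is stated in the text; the conditioning assumed here for the past and third person singular endings (the final sound of the stem) is the standard textbook account of English and is an addition of this sketch, as are the function names and the simplified phoneme sets. Stems are given in a simplified phonemic spelling.

```python
# Simplified phoneme classes (assumptions of this sketch, not a full inventory).
VOWELS = set("iɪeɛæɑɒɔʊuʌəɜ")
VOICELESS = set("ptkfθsʃʧ")
SIBILANTS = set("szʃʒʧʤ")

def past_tense(stem):
    """Add the past allomorph: /əd/ after /t, d/, /t/ after other voiceless
    sounds, /d/ elsewhere."""
    final = stem[-1]
    if final in ("t", "d"):
        return stem + "əd"
    if final in VOICELESS:
        return stem + "t"
    return stem + "d"

def third_singular(stem):
    """Add the third person singular allomorph: /əz/ after sibilants,
    /s/ after other voiceless sounds, /z/ elsewhere."""
    final = stem[-1]
    if final in SIBILANTS:
        return stem + "əz"
    if final in VOICELESS:
        return stem + "s"
    return stem + "z"

def indefinite_article(next_word):
    """/æn/ before a vowel, /ə/ before a consonant."""
    return "æn" if next_word[0] in VOWELS else "ə"

for stem in ("kɪs", "pæt", "gɒgl"):
    print(stem, past_tense(stem), third_singular(stem))
print(indefinite_article("æpl"), indefinite_article("dʌk"))
```

Because each function returns exactly one form for a given stem, the sketch models complementary distribution only; free variation, as in the two pronunciations of exit, would need a set of acceptable outputs instead.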

3.3 Main types of morphemes

Types according to occurrence

Free morphemes

Words, as we have already seen, are free forms. A simple word consists of a single morpheme, and so is a free morpheme, a morpheme with the potential for independent occurrence. In The farmer kissed the duckling the free morphemes are the, farm, kiss and duck. It is important to notice here that (in this sentence) not all of these free morphemes are words in the sense of minimal free forms – farm and duck are cases in point.

Bound morphemes

Bound morphemes, by contrast, require the presence of another morpheme to make up a word; they can’t occur independently. The morphs -er, -ed and -ling in our example sentence are bound morphemes; all the other morphemes are free. Yingkarta -ku, -wu and -lku (see §3.2) are also bound morphemes. Bound morphemes which, like those discussed in the previous paragraph, go onto the ends of words, are called suffixes (see below for some qualifications). Another type of bound morpheme is a prefix, which precedes the morpheme to which it is attached. The bound morphemes un- and re- in English are prefixes, as in un-happy and re-constitute. A third type of bound morpheme is an infix, that goes inside another morpheme, as in Tagalog (Austronesian, Philippines) -in- ‘past’ in ib-in-igay ‘gave’, which occurs within the morpheme ibigay ‘give’. Collectively, suffixes, prefixes and infixes are called affixes.

It is important to stress that infixes are affixes that occur within other morphemes, and not between them. Thus -er (/ə/) is not an infix in farmers, where it occurs between the two morphemes farm and -s, not within either of them; -er remains a suffix. The closest thing to infixation in English is the incorporation of expletives into words, as in abso-bloody-lutely and fan-fucking-tastic.

The distinction between bound and free morphemes is not always completely clear-cut. A morpheme can have both free and bound allomorphs. For example, not in English has a free form /nɒt/, and a bound form /nt/: he is not going has the free form, while he isn’t going has the bound form. There are also words like deride (/dəɹaɪd/) that have a bound form /dəɹɪʒ/, as in derision; this form is different from the free form and must be used when the word is followed by the morpheme -ion, and cannot be used in other circumstances. The free and bound forms of the negative word are in free variation, whereas the free and bound forms of deride are in complementary distribution.

Types of morpheme according to function or use

A different classification can be arrived at if we consider the usage of a morpheme, including the type of meaning it conveys, instead of its distributional possibilities. This is the basis for the distinction between lexical and grammatical morphemes.

Lexical morphemes

Lexical morphemes are those like farm, kiss, happy, constitute, book in English, and ibigay ‘give’ in Tagalog, that convey the major ‘content’ of a message, specifying the things, qualities and events spoken about. The lexical morphemes of a language form a large set, which allows new members – new ones are constantly being introduced into languages in response to the changing worlds in which speakers live (see Chapter 4). Lexical morphemes, that is, form an open set. Farm, kiss, happy, constitute, book and so on are free lexical morphemes, or free roots, which may serve as bases to which bound morphemes can be attached. Lexical morphemes can also be bound; there are bound lexical roots and derivational affixes.

Bound roots Recall that some lexical roots have bound and free allomorphs. Sometimes lexical roots are bound in all of their manifestations. Nyulnyul has around fifty bound roots, mainly terms for parts of the body, that must take a prefix indicating the owner of the part. For example, there is no free lexical morpheme ‘hand’. The root is the bound form -marl that has to have a prefix, as in nga-marl ‘my hand’, nyi-marl ‘your hand’, ni-marl ‘his or her hand’, irr-marl ‘their hands’ and so on. Nyulnyul has in addition some hundreds of other bound roots, including -jid ‘go’, -m ‘put’ and -j ‘do, say’, that have to take a prefix indicating the doer of the action, and may in addition take a range of other prefixes and suffixes indicating the time of the event, whether it occurred or not, and so forth.

Derivational affixes These are affixes that attach to a lexical root and result in a new word, a complex item called a stem. The suffix -er /ə/ in English is a derivational suffix: adding it to a lexical root gives a stem with a related meaning. Attaching this suffix to bake gives baker, to boil gives boiler and so on. Other derivational suffixes in English include -ish as in childish, -ic as in alcoholic, -ful as in tearful and -ly as in precisely, among others. Notice that these suffixes do not only change the meaning of the morpheme they are attached to, they also (normally) change its part-of-speech (see §4.1). Thus from nouns like child, alcohol and tear, we get adjectives like childish, alcoholic and tearful. Not all derivational morphemes change the part-of-speech of a root. For example, -hood normally attaches to a noun, giving another noun: childhood, priesthood and sisterhood. Many derivational prefixes in English are like this, and the derived form is the same part-of-speech as the root. Nyulnyul has a derivational morpheme -id – with a meaning similar to English -er – that can be attached to a lexical morpheme of virtually any type to give a noun (so it may or may not change a morpheme’s part-of-speech): yaward-id (horse-er) ‘horseman’, -alm-id (head-er) ‘hat’, majanbin-id (to:jump-er) ‘jumper’ and junk-id (run-er) ‘runner’. Stems can often be further derived to give yet more complex stems. For instance, from the root help we can derive helpful, which can be further derived to unhelpful; this new word can be further derived by suffixing -ness, giving unhelpfulness, or -ly, giving unhelpfully.

Grammatical morphemes Whereas lexical morphemes give the major meaning content of an utterance, grammatical morphemes mainly give information about the grammatical structure of the utterance, about how to put the content together to form a coherent whole. Grammatical morphemes are generally demanded by the grammar, and contribute relatively abstract schematic meanings concerning the functions of the lexical items. For this reason they are sometimes called function morphemes. Like lexical morphemes, they can be either free or bound.

Free grammatical morphemes Free grammatical morphemes in English include words like and, but, by, in, on, not, the, a, that, it, me and so forth. Languages only rarely acquire new grammatical morphemes, and the grammatical morphemes in a language can be regarded as effectively forming a closed class. The most frequent words in English are free grammatical morphemes. This is confirmed by investigations of large corpora: see Table 9.1 for listings of the ten most frequent words in three corpora. Although there is some disagreement among the corpora, the, be, of, and, a, to, in appear among the ten highest frequency words in each of the listings. In fact, almost all of the forty most frequent words in a corpus of a bit over two million words of both speech and writing, made up of three publicly available corpora,3 turned out to be grammatical words. The three or four that are not all have grammatical uses as well as lexical uses.


Bound grammatical morphemes Inflectional affixes Inflectional affixes are bound morphemes that give grammatical information relevant to the interpretation of a word in the sentence in which it occurs. They do not give rise to new lexical words, but to different forms of a single lexical word, different forms that are appropriate for the use of that word in the sentence. Consider the following Latin sentence: (3-2)

serv-ī cōnsul-em audi-unt
slave-PL:SUB consul-SG:OBJ hear-they:PRS
‘The slaves hear the consul.’

This example shows the standard way of laying out example sentences, and should be adhered to. The first line shows the words in the language, divided into morphemes separated by hyphens; this line is given in italics. Below is a line giving a gloss (simple translation) for each morpheme. For grammatical morphemes the gloss is usually given as an abbreviation in capitals: in the above example, PL stands for ‘plural’, PRS for ‘present’ (i.e. an event going on now), OBJ for ‘object’, SG for ‘singular’ and SUB for ‘subject’. When more than one word is used in the gloss for a single morpheme in the language line, the words are separated by a colon (:). The third line shows a free translation, a translation that indicates the meaning of the entire sentence. This is enclosed in single quotation marks. For fuller details of recommended conventions, see The Leipzig glossing rules: conventions for interlinear morpheme-by-morpheme glosses, available at http://www.eva.mpg.de/lingua/resources/glossing-rules.php.
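The three-line layout can be sketched in a few lines of Python. This is only an illustration, not part of the glossing conventions themselves: the function name interlinear, the two-space column gap and the error messages are assumptions made for the example. The sketch pads each word and its gloss to a common width, and checks that a word and its gloss contain the same number of hyphen-separated morphemes.

```python
def interlinear(words, glosses, translation):
    """Return a three-line interlinear example as one string (hypothetical helper)."""
    if len(words) != len(glosses):
        raise ValueError("each word needs exactly one gloss")
    for word, gloss in zip(words, glosses):
        # Hyphens mark morpheme boundaries on both lines, so the counts must agree.
        if word.count("-") != gloss.count("-"):
            raise ValueError(f"morpheme mismatch: {word} / {gloss}")
    # Each column is as wide as the longer of the word and its gloss.
    widths = [max(len(w), len(g)) for w, g in zip(words, glosses)]
    word_line = "  ".join(w.ljust(n) for w, n in zip(words, widths)).rstrip()
    gloss_line = "  ".join(g.ljust(n) for g, n in zip(glosses, widths)).rstrip()
    # The free translation goes on the third line in single quotation marks.
    return "\n".join([word_line, gloss_line, f"‘{translation}’"])


example = interlinear(
    ["serv-ī", "cōnsul-em", "audi-unt"],
    ["slave-PL:SUB", "consul-SG:OBJ", "hear-they:PRS"],
    "The slaves hear the consul.",
)
```

The colon inside a gloss such as PL:SUB is left untouched, since it joins several gloss words for a single morpheme rather than marking a boundary.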

The suffix -ī on serv ‘slave’ indicates that this word is the subject of the sentence, and that more than one slave is involved – that is, that the word is plural in number; the -em suffix on the second word indicates that cōnsul ‘consul’ is the object and is singular in number. (See §5.4 on the notions of subject and object.) The suffixes -ī and -em are inflectional suffixes that give information about how the words they are attached to are incorporated into the grammar of the sentence. Attaching them to the roots serv ‘slave’ and cōnsul ‘consul’ does not give rise to new words, but to forms of the same words that are appropriate to their grammatical environments. If instead the sentence was to express the meaning that the consul heard the slaves, the word cōnsul ‘consul’ would be used as is – that is, without a suffix – while the suffix -ōs would be attached to serv ‘slave’. The suffix -unt added to audi ‘hear’ is also an inflectional suffix, giving a form of this word that should be used when the subject is plural. If the subject had been singular, the suffix -t would have been used instead. This suffix also indicates that the event is going on at the present time. The lexical words serv ‘slave’, cōnsul ‘consul’ and audi ‘hear’ come in different forms according to the grammatical features of the sentence they occur in. Thus compare (3-2) with (3-3).

(3-3)

cōnsul serv-ōs audi-t
consul:SG:SUB slave-PL:OBJ hear-he:PRS
‘The consul hears the slaves.’

Inflections on nouns that indicate the grammatical role of the noun in the sentence are called cases. It is usual to call the subject case (as in Latin) the nominative (abbreviated NOM), and the object case accusative (abbreviated ACC). The inflections on the verb are called agreement inflections: they agree with the subject in terms of whether it is ‘I’, ‘we’, ‘you’, ‘they’ and so on. English also has inflectional suffixes, including a regular plural suffix with allomorphs /s/ ~ /z/ ~ /əz/ – where the tilde ~ separates alternating allomorphs – as in magistrates, slaves and churches, respectively. (Some nouns have irregular plurals – for example, the plural of mouse is mice, and of child is children.) The suffix /s/ ~ /z/ ~ /əz/ on kisses, hears and passes is also an inflectional suffix, giving the form of these words appropriate to sentences with a singular subject excluding the speaker or hearer (i.e. ‘he’, ‘she’ or ‘it’) and present time of the event.

Clitics Not all bound grammatical morphemes are inflectional affixes. Consider the bound form of the free grammatical morpheme not, written n’t. When this morpheme is attached to an auxiliary verb such as have as in They haven’t broken in, it does not result in a new form of that auxiliary verb: the word haven’t is not a distinct form of have that has been chosen because of the grammar. Nor is it a new lexical stem. Bound grammatical morphemes like this, which behave grammatically as separate words, but are phonologically part of the preceding word, are called enclitics. If they are part of a following word, they are called proclitics; the term clitic is a generic term covering both types. Not all clitics are like -n’t in having corresponding free form words. The English possessive morpheme written ’s – with allomorphs /s/ ~ /z/ ~ /əz/ – is an example. This morpheme does not make a variant form of the word to which it is attached, a form that is appropriate to a particular grammatical environment. This is because it is attached not to the end of a word, but to the end of a phrase, a group of words that go together (§5.3) as a single unit. Although in the king’s crown it might seem that the ’s gives the form of the noun used when it is a possessor, this is not always the case. In the king of England’s crown it is still the king who is the possessor, not England; similarly he remains the possessor in examples such as the king who beheaded him’s palace, the king we saw’s palace and the king they knocked the crown from’s palace. There is no free allomorph of the possessive ’s.

Differences between derivational affixes, inflectional affixes and clitics Three main types of bound morphemes were introduced in the previous sections: derivational affixes, inflectional affixes and clitics. Derivational affixes give rise to new lexical words, while inflectional affixes give different forms of the word to which they are attached, forms that are appropriate to the grammatical environment of the sentence.


There are several other differences between derivational and inflectional affixes, that are tendencies rather than absolute differences. These differences underline the status of derivational morphemes as lexical rather than grammatical in nature.

● Concept distinctiveness. Attaching a derivational affix to a word generally gives rise to a new concept, that could in principle be expressed by a simple lexical root; attaching an inflectional affix does not result in a new concept. In Warrwa the derived word burr-kurru ‘a thing associated with a brumming noise’ means ‘car’; the alternative form mudika (borrowed from English) is a root expressing the same meaning.

● Degree of abstraction. Derivational affixes tend to have more concrete meanings than inflectional affixes: their meaning is more like that of a lexical word, while inflectional affixes have meanings more like grammatical words. For example, the English derivational suffix -ess has a quite concrete, lexical-like meaning, ‘female’.

● Relevance. The meaning of a derivational affix is usually relevant to the meaning of the root; the meaning of an inflectional affix need not necessarily be very relevant to the meaning of the word. Thus compare the meaning of the derivational affix -ess, which highlights a characteristic of the person or animal denoted, with that of a case inflection, which does not.

● Replaceability. A word with a derivational affix attached to it can usually be replaced by a single simple word; this is not normally possible for a word with an inflectional affix, which will usually have to be replaced by another inflected form.

● Regularity. Derived words often have irregular or not entirely predictable meanings; the meanings of inflected words are normally completely regular and predictable. The lack of regularity in meanings of derived words can be illustrated by the derivational suffix -ize, which has rather different effects in publicize ‘draw to public attention, make well known’, romanticize ‘portray in a romantic fashion’, vaporize ‘to cause something to become a vapour’, and winterize ‘prepare something for use in winter’.

● Productivity. Derivational affixes usually have limited applicability; inflectional affixes usually apply to all words of a particular part-of-speech (with perhaps a small class of irregularities). The English derivational suffix -ess occurs on heiress, authoress, waitress, lioness, goddess and so on. But there are arbitrary restrictions on the suffix: jokes aside, we do not say elephantess for ‘female elephant’, personess for ‘woman’, workeress for ‘female worker’, or professoress for ‘female professor’.

Clitics do not give a new form of the word to which they are attached, and nor do they result in new lexical words. The forms haven’t and isn’t are not inflected forms of have and be, and nor are they separate lexical items that need to be listed individually in a dictionary. Differences between clitics and affixes include the following, which are again tendencies rather than absolute differences.

● Freedom of position. Clitics often – though not always – have a degree of freedom of movement in a sentence, a feature not shared by affixes. Gooniyandi has a question clitic -mi that can go on any word of a sentence, regardless of its position. No affix has this degree of freedom.

● Selectivity. Clitics tend to be relatively free in terms of the range of lexical items they can be attached to (as we saw for English ’s); affixes are generally more particular about the company they keep.

● Allomorphic variation. Clitics generally show few allomorphs, and those they do have can usually be explained by the phonological environment, as is the case for the allomorphs of the possessive ’s. Allomorphs of affixes can have peculiarities that are not explicable by the phonological environment, as in the case of the plural allomorph -en, which is found unpredictably on just a few nouns.

● Predictability of meaning. There are rarely, if ever, semantic idiosyncrasies in clitics. Regardless of what word they happen to be attached to, the meaning is generally consistent.

● Prosodic integration. Clitics are not necessarily prosodically integrated into the words they are attached to, whereas affixes are often integrated. In some languages inflected or derived words are stressed like roots; words and their clitics need not be stressed like roots, and the clitic may show prosodic features of a word. For instance, Gooniyandi has a set of clitic pronominals that attach to verbs, as in wardba-ngarra (bring-1SG.OBL) ‘bring it to/for me’; these are stressed as though they were separate words.

Figure 3.1 attempts to bring out by analogy the difference between the three types of bound morphemes. If bound morphemes of all three types can be attached on the same side of a given word in a language, it is derivational affixes that normally go next to the root, giving a stem; then come inflectional affixes, and finally clitics, at the greatest distance from the root. Thus in English the plural inflection of nouns follows derivational affixes such as -er: teach-er-s, farm-er-s and so on. Similarly in energ-iz-ed the derivational affix is next to the root, and is followed by the inflection. And in the robot we energ-iz-ed-’s co-processor the possessive enclitic -’s comes word finally.

Summary of morpheme types We can summarize the classification of morphemes given in this section as shown in Table 3.1.

3.4 Allomorphs and allomorph conditioning We have now introduced the main morphological units found in languages, and their types. In this section we will look at the relations between allomorphs: how allomorphs of a morpheme resemble one another phonologically; and the factors that condition the choices among them.

Types of allomorph So far, with one exception, the morphemes we have discussed have allomorphs that are phonologically similar. The three English plural allomorphs /s/ ~ /z/ ~ /əz/, like the formally identical possessive allomorphs /s/ ~ /z/ ~ /əz/, are obviously very similar phonologically. These are called phonological allomorphs. But allomorphs can be quite different phonologically. The derived comparative and superlative forms of good are better and best, with the regular derivational suffixes -er and -(e)st. However, these suffixes are not attached to good to give gooder and goodest (although these can be found in the speech of children, and in jocular, playful or foreigner speech of adults). The derivational morphemes are instead attached to bet- and be- (which are phonological allomorphs), allomorphs of good that are not phonologically related to it. Variants like good and bet- ~ be- are said to be suppletive allomorphs. Goemai (Afro-Asiatic, Nigeria) has a fair amount of suppletion in its lexicon. Compare f’yer and nan, the singular and plural verbs ‘become big’, and mat and sharap, the singular and plural forms of the noun ‘woman’. Suppletion can also be found in grammatical morphemes. For instance, in Yawuru (Nyulnyulan, Australia) verbs take prefixes indicating whether the subject is ‘I’, ‘we’, ‘you’, ‘he/she/it’, ‘they’ and so on. The form of the prefix for subject ‘you (one individual)’ is usually /mi/, as in /minaɲanda/ ‘you caught it’ and /miɲɟurkuɲ/ ‘you were cutting hair’. In reference to a future event, however, the form of this prefix is instead usually /wal/: /walaɲa/ ‘you will catch it’ and /walɟurku/ ‘you will cut hair’. But for a few verbs it is /ŋa/, as in /ŋaɟali/ ‘you will return’. The three forms /mi/, /wal/ and /ŋa/ are suppletive allomorphs of the second person singular subject prefix.

Figure 3.1 A conceptual representation of the differences between the three main types of bound morpheme. (1) Inflection gives different forms of a single item. For example, (a) and (b) are different forms a letterbox can be found in; (c), (d) and (e) are different shapes a hand can take. (2) Derivation gives rise to something new, an item different from the one it is derived from, and not a different form of the item. The distinctive markings on the car in (a) mark it as a police vehicle, and give rise to a vehicle used in particular ways. Similarly, wearing a uniform in (b) specifies the individual as discharging a specific social role. (3) Clitics are items that lean on or depend on other items, like ticks on a cat. The thing that they depend on is their host; the clitic is like a parasite. The presence of a tick does not give rise to a new form of a cat, nor does it derive a new type of animal. © 2009 William B. McGregor and his licensors. All rights reserved.

Table 3.1 Summary of major morpheme types

              Free                                 Bound
Lexical       Lexical words (roots and stems)      Bound lexical roots and stems
              English: man, dog, big, love, run    Nyulnyul: -alm ‘head’, -marl ‘hand’ (must have a prefix)
                                                   Derivational morphemes
                                                   English: -er, -ion
Grammatical   Grammatical words                    Clitics
              English: of, the, not, we, be        English: -’s, -n’t
                                                   Inflectional morphemes
                                                   English: -ed (past), -s (on noun, plural)

Types of conditioning factor Conditioning factors are the factors that determine which allomorph of a morpheme you use. In phonological conditioning different allomorphs are selected according to the phonological environment. The choice between the allomorphs /ə/ and /æn/ of the English indefinite article is phonologically conditioned by the following phoneme, whether it is a vowel or a consonant. The three possessive allomorphs /s/ ~ /z/ ~ /əz/ are also phonologically conditioned, though in this instance by the preceding phoneme. Sometimes allomorphs show lexical conditioning – that is, the choice of allomorph depends on the particular word the morpheme is attached to. A specific instance of this is the plural suffix involving /n/ found on the irregular plurals children and oxen. So also is the choice between -en (phonological form /n/) and -ed (/t/ ~ /d/ ~ /əd/) in the past participle, the form of the verb used after have and had. Thus we have on the one hand (have) given, (have) eaten, (have) broken, and on the other (have) finished, (have) grabbed and (have) wanted. A third possibility is morphological conditioning. Here it is the grammatical rather than lexical morphemes that condition the presence of the allomorph. We saw an example in the previous section, with the distribution of the suppletive allomorphs of the ‘you subject’ prefix in Yawuru (which depend in part on whether the event is in the future or not). All three types of conditioning factor can be relevant to the choice among allomorphs of inflectional and derivational affixes, including combinations of factors. For allomorphs of clitics, conditioning is almost always phonological.
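Phonological conditioning can be pictured as a function from the neighbouring phoneme to an allomorph. The Python sketch below is a hedged illustration only. The passage states that the indefinite article depends on whether the following phoneme is a vowel or a consonant, and that the possessive depends on the preceding phoneme; the specific classes used for the possessive (sibilants, other voiceless consonants, everything else) are the usual textbook statement and are supplied here as an assumption, as are the function names and the small symbol inventories.

```python
# Illustrative, incomplete symbol inventories (assumptions for this sketch).
VOWEL_SYMBOLS = set("iɪeɛæɑɒɔʊuʌəɜa")
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS_NON_SIBILANTS = {"p", "t", "k", "f", "θ"}


def indefinite_article(following_segment):
    """Conditioned by the FOLLOWING phoneme: /æn/ before a vowel, /ə/ before a consonant."""
    return "æn" if following_segment[0] in VOWEL_SYMBOLS else "ə"


def possessive(preceding_segment):
    """Conditioned by the PRECEDING phoneme (classes assumed, see lead-in)."""
    if preceding_segment in SIBILANTS:
        return "əz"
    if preceding_segment in VOICELESS_NON_SIBILANTS:
        return "s"
    return "z"
```

For example, indefinite_article("æ") selects /æn/ (as in an apple), and possessive("ŋ") selects /z/ (as in the king’s crown). Lexical and morphological conditioning cannot be written this way, because the choice then depends on the identity of the word or of another morpheme rather than on an adjacent sound.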


Morphological rules In §2.6 we saw that phonemes are abstract forms that are realized by phones, and a phoneme can be regarded as a set of allophones. Likewise, in morphology it can be descriptively and conceptually useful to identify abstract forms for morphemes that are realized by different phonological allomorphs. Thus the regular past suffix in English has three phonological allomorphs, /t/ ~ /d/ ~ /əd/, which are in complementary distribution. We could hypothesize that they are alternative realizations of a more abstract form of the morpheme: the phonemes would thus relate to the abstract form in the same way as phones relate to phonemes. Such abstract forms are sometimes called morphophonemes; the specification of a morpheme in terms of these units is its morphophonemic (or underlying) form. Rules are needed to get from the morphophonemic form of a morpheme to its phonological forms, its allomorphs – and then another set of rules is necessary to get from that to the phonetic form. We could presume that the regular allomorphs of the past suffix in English have the underlying form {d}, using braces in order to maintain a distinction from phonemic and phonetic levels. The three allomorphs could be accounted for by two rules:

1 Insert a schwa (/ə/) following a verb stem ending in an alveolar stop.
2 {d} is realized (a) by /t/ when the preceding segment is voiceless; otherwise it is realized (b) by /d/.

Thus, the past of the regular verb wish is /wɪʃt/, which can be derived from {wɪʃ-d} by rule 2a; the past form of the regular open is /əʊpn̩d/, which derives from {əʊpn̩-d} by 2b (which indicates to do nothing). What about verbs ending in an alveolar stop, like debate? This can be accounted for by first applying rule 1 (of schwa insertion), giving {dəbeɪt-əd}, then rule 2b, giving /dəbeɪtəd/. Notice that the rules must be applied in this order. Otherwise, you would get the incorrect form /dəbeɪtət/, by first realizing the {d} as /t/ and then inserting the schwa after the first /t/. You should verify these rules for a selection of regular verbs (e.g. love, bow, hoe, pitch, wish, want, raid, grab) to ensure they give the correct result.

There are more formal ways of writing morphophonemic rules, allowing for more succinct statements of the rules. These follow basically the same conventions as phonemic rules (see box on pp. 46–7), with the addition of the hyphen for a morpheme boundary. Thus, for example, the above rule for the realization of the English regular past morpheme could be expressed as follows:

1 insert /ə/ / alveolar stop - _
2 {d} → /t/ / voiceless segment - _
      → /d/ otherwise
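The two rules for the English regular past, and the effect of their ordering, can be checked mechanically. The Python sketch below is an illustration under stated assumptions, not a definitive implementation: stems are given as lists of phoneme symbols, the set of voiceless segments is a small illustrative inventory, and the names insert_schwa, realize_d and past are hypothetical.

```python
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ", "h"}  # illustrative inventory
ALVEOLAR_STOPS = {"t", "d"}


def insert_schwa(segments):
    """Rule 1: insert /ə/ before the suffix if the stem ends in an alveolar stop."""
    stem, suffix = segments[:-1], segments[-1:]
    if stem[-1] in ALVEOLAR_STOPS:
        return stem + ["ə"] + suffix
    return segments


def realize_d(segments):
    """Rule 2: {d} is /t/ after a voiceless segment (2a), otherwise /d/ (2b)."""
    preceding = segments[-2]
    return segments[:-1] + (["t"] if preceding in VOICELESS else ["d"])


def past(stem, rules=(insert_schwa, realize_d)):
    """Attach underlying {d} to the stem and apply the rules in the given order."""
    form = list(stem) + ["{d}"]
    for rule in rules:
        form = rule(form)
    return "".join(form)


wish = ["w", "ɪ", "ʃ"]
debate = ["d", "ə", "b", "eɪ", "t"]
correct = past(debate)                                      # rule 1, then rule 2
wrong_order = past(debate, rules=(realize_d, insert_schwa))  # rule 2, then rule 1
```

With the rules in the stated order, debate comes out as dəbeɪtəd; with the order reversed it comes out as the incorrect dəbeɪtət, because {d} has already been realized as /t/ before the schwa is inserted.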


3.5 Morphological description To describe the morphology of a language it is necessary to: a. identify the grammatical morphemes, their allomorphs and conditioning factors; and b. specify the possible shape of words as combinations of morphemes of various types. We give below some illustrations of how aspects of the morphology of a language can be described in these terms.

Case morphology in Warumungu As in Latin, cases in Warumungu (Pama-Nyungan, Australia) are marked by inflectional suffixes to nouns. The basic shape of a noun is stem (a root, or a root plus derivational suffix) plus an optional case suffix. Here we ignore the complexities of Warumungu case-marking morphology, and deal with just five case suffixes, and their forms with roots of two syllables. They are dative (roughly ‘for’), allative (roughly ‘towards’), locative (‘at’), ablative (‘from’) and adversative (‘lest’). The ablative and adversative each have a single allomorph: /ŋaɹa/ and /kaːɟi/, respectively. Some examples illustrating the ablative are /kaʈiŋaɹa/ ‘from the man’, /ŋapaŋaɹa/ ‘from water’ and /manuŋaɹa/ ‘from the country’. The other three case-marking inflections each have three allomorphs: /ki/, /ka/ and /ku/ for the dative; /kina/, /kana/ and /kuna/ for the allative; and /ŋki/, /ŋka/ and /ŋku/ for the locative. The three allomorphs are chosen in the same way:

● If the root ends in /i/, use /ki/, /kina/ or /ŋki/, as in /kaʈiki/ ‘for the man’.

● If the root ends in /a/, use /ka/, /kana/ or /ŋka/, as in /ŋapaŋka/ ‘in the water’.

● If the root ends in /u/, use /ku/, /kuna/ or /ŋku/, as in /manukuna/ ‘to the country’.

These allomorphs are phonological and phonologically conditioned. Underlying forms can be suggested for the three morphemes: {ki}, {kina} and {ŋki}, together with a rule whereby the first {i} is realized as /a/ when the preceding vowel (the final vowel of the stem) is /a/, and /u/ when it is /u/. The fact that the first vowel of the ablative and adversative, /a/, remains unchanged suggests that the first vowel can’t be /a/ in the dative, allative and locative. It could of course equally be /u/ as /i/, and there is no way of making an informed decision on the basis of the little evidence we have; to avoid making a decision, you might suggest instead that the underlying form is VH, for ‘high vowel’.
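The Warumungu analysis can be tried out in code. The following Python sketch assumes the underlying forms {ki}, {kina} and {ŋki} with /i/ as the first vowel (the text notes that /u/ or an unspecified high vowel would serve equally well), and it is restricted to the two-syllable, vowel-final roots discussed here. The names UNDERLYING and attach_case are hypothetical.

```python
VOWELS = set("aiu")

# Assumed underlying forms; the ablative and adversative have /a/ as first vowel.
UNDERLYING = {
    "dative": "ki",
    "allative": "kina",
    "locative": "ŋki",
    "ablative": "ŋaɹa",
    "adversative": "kaːɟi",
}


def attach_case(root, case):
    """Attach a case suffix, copying root-final /a/ or /u/ onto a suffix-initial {i}."""
    suffix = UNDERLYING[case]
    final_vowel = root[-1]
    # Locate the first vowel of the suffix; only an {i} in that slot alternates.
    first = next(i for i, ch in enumerate(suffix) if ch in VOWELS)
    if suffix[first] == "i" and final_vowel in ("a", "u"):
        suffix = suffix[:first] + final_vowel + suffix[first + 1:]
    return root + suffix
```

Because the rule looks only at the first vowel of the suffix, the ablative and adversative pass through unchanged, which matches the observation that their first vowel /a/ never alternates.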

Morphology of locative case marking in Turkish Case marking in Turkish (Altaic, Turkey) is also by means of suffixes to nouns. The suffixes follow the plural inflectional suffix /ler/ ~ /lar/, if present.

The locative suffix has four allomorphs: /da/ ~ /de/ ~ /ta/ ~ /te/. They are chosen as follows, where IPA symbols are used, rather than the Turkish orthography:

● /da/ is used when following a voiced segment, and the first vowel preceding the suffix is a non-front vowel. Examples are: /binada/ ‘in the building’, /kapɨda/ ‘at the door’ and /pulda/ ‘on the stamp’.

● /de/ is used following a voiced phone, and when the first vowel before the morpheme is front. Examples are /evde/ ‘in the house’, /evlerde/ ‘at the houses’ and /kedide/ ‘at the cat’.

● /ta/ is used following a voiceless consonant, and when the preceding vowel is non-front: /kitapta/ ‘on the book’, /tarafta/ ‘on the side’ and /balɨkta/ ‘on the fish’.

● /te/ occurs when the previous phone is voiceless, and the immediately prior vowel is front: /køpekte/ ‘on the dog’, /gyneʃte/ ‘on/in the sun’.

The locative case allomorphs are phonological, and are phonologically conditioned. As an exercise, suggest underlying forms for the locative, and suggest a rule that would give the occurring forms.
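The four conditions listed above can be encoded directly as a lookup, which is a convenient way of checking them against the examples. The Python sketch below deliberately restates the conditions as given and does not propose an underlying form, so the exercise remains open. The function name turkish_locative and the segment inventories are assumptions for the illustration, and stems are written with one character per phoneme.

```python
FRONT_VOWELS = set("ieøy")
NON_FRONT_VOWELS = set("aɨou")
VOICELESS = set("ptkfsʃh")  # illustrative subset of voiceless consonants

# (final segment voiced?, last vowel front?) -> allomorph, as in the four bullets.
LOCATIVE = {
    (True, False): "da",
    (True, True): "de",
    (False, False): "ta",
    (False, True): "te",
}


def turkish_locative(stem):
    """Select the locative allomorph from the stem's final segment and last vowel."""
    vowels = [ch for ch in stem if ch in FRONT_VOWELS | NON_FRONT_VOWELS]
    if not vowels:
        raise ValueError("stem contains no vowel")
    voiced = stem[-1] not in VOICELESS
    front = vowels[-1] in FRONT_VOWELS
    return stem + LOCATIVE[(voiced, front)]
```

Note that a plural stem such as /evler/ is handled in the same way as a bare root, since only the final segment and the nearest preceding vowel matter.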

Locative case in Meryam Mir Case in Meryam Mir (Papuan, Murray Island) is indicated by suffixes attached to noun roots and stems. The locative suffix has five allomorphs, /e/ ~ /ge/ ~ /ɟdoge/ ~ /doge/ ~ /idoge/, which are chosen as follows:

● /e/ is used on ordinary nouns ending in a velar stop: /mayke/ ‘with/at the widow’.

● /ge/ is used on other ordinary nouns, not ending in a velar stop: /metage/ ‘at the house’, /utebge/ ‘at the place’ and /kimyarge/ ‘at/with the married man’.

● /ɟdoge/ is used on names of persons and places when the name ends in a vowel: /eydyanaɟdoge/ ‘at Eidiana’s place’.

● /doge/ is used on names of persons and places which end in a /ɟ/: /pomoɟdoge/ ‘at Pomoy’s place’.

● /idoge/ is used on other names of persons and places: /lezidoge/ ‘at Les’s place’, /opnoridoge/ ‘at/on the Barrier Reef’.

The allomorphs fall into two sets of phonological allomorphs: /ge/ ~ /e/; and /idoge/ ~ /ɟdoge/ ~ /doge/. The factors conditioning choice among the members of the two sets are phonological, and concern the final segment of the word to which they are attached. The conditioning factors for the choice between the two sets are lexical, depending on whether the word is a proper name or not. How are the members of the two sets related? Notice that the three members of the second set each involve the first allomorph of the first set attached to a connecting form, whose shape is phonologically conditioned. It seems that the locative forms of proper names require attachment of a perhaps meaningless connecting form before the ordinary locative suffix is added.
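The interaction of lexical and phonological conditioning in Meryam Mir can be sketched as a two-step choice: first between the two sets (proper name or not), then within the set (final segment). The Python sketch below follows the analysis suggested above, in which proper names take a connecting form followed by the ordinary suffix /ge/. The function name, the is_name flag and the segment sets are assumptions made for the illustration.

```python
VELAR_STOPS = {"k", "g"}
VOWELS = set("aeiou")


def meryam_locative(word, is_name):
    """Return the locative form; is_name marks names of persons and places."""
    final = word[-1]
    if not is_name:
        # Ordinary nouns: phonological choice between /e/ and /ge/.
        return word + ("e" if final in VELAR_STOPS else "ge")
    # Proper names: a phonologically conditioned connecting form, then /ge/.
    if final in VOWELS:
        connector = "ɟdo"
    elif final == "ɟ":
        connector = "do"
    else:
        connector = "ido"
    return word + connector + "ge"
```

The lexical condition has to be passed in as a separate argument because nothing in the phonological shape of a word reveals whether it is a proper name.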


3.6 Morphological analysis Sample morphological analysis: Hungarian verbs Listed below are a small selection of verb forms in Hungarian, given in phonemic representation using IPA symbols. Before reading further, you should attempt to identify the morphemes, their allomorphs and the order they occur in.

/futok/ ‘I am running’
/ugɔtott/ ‘it barked’
/futott/ ‘he/she ran’
/ɔdok/ ‘I give’
/futottɔm/ ‘I ran’
/laːtok/ ‘I see’
/laːt/ ‘he/she sees’
/olvɔʃtɔm/ ‘I read (before)’
/olvɔʃ/ ‘he/she is reading’
/ɔdott/ ‘he/she gave’
/fut/ ‘he/she is running’
/hɔlɔdott/ ‘he/she/it moved forward’
/ɔd/ ‘he/she is giving’
/vaːr/ ‘he/she is waiting’
/vaːrok/ ‘I am waiting’
/mɔrɔtt/ ‘he/she/it remained’
/olvɔʃok/ ‘I am reading’
/mɔrɔdok/ ‘I remain’
/ugɔt/ ‘it barks’
/hɔlɔdok/ ‘I move forward’
/mɔrɔd/ ‘he remains’
/hɔlɔd/ ‘he/she/it is moving forward’

Grammatical morphemes From the list, it is easy to see that the phoneme sequences /ok/ and /ɔm/ are always associated with the subject ‘I’ – that is, with first person singular subject. These are suppletive allomorphs. Examining their contexts of occurrence reveals that /ok/ occurs when the verb refers to an event going on right now, and /ɔm/ to one that happened previously. The allomorphs are morphologically conditioned by the time of occurrence of the event. It will also be noticed that whereas the ‘I’ form has /ok/ or /ɔm/, there is nothing in the corresponding form for subjects ‘he’, ‘she’ or ‘it’ – that is, for third person singular subjects. When the verb describes an event in past time there is always either a /t/ or an /ott/; this is absent when the event is happening now. These are allomorphs of a morpheme that specifies that the event happened before, ‘past time’. The allomorph /ott/ is found following an alveolar consonant, /t/, /r/, /d/ or /l/; following /ʃ/ we find the allomorph /t/. The conditioning is thus phonological: the first allomorph follows an alveolar segment, the second, a palatal segment. A grammatical morpheme that specifies the time of occurrence of an event is called a tense marker. The basic tenses are present, for events happening now; past for events that happened before; and future for events expected to happen later (not represented in the Hungarian data above).

Lexical morphemes Looking at the list we find that the following invariant phonological segments are associated with the lexical meanings indicated:

/fut/ ‘run’
/ɔd/ ‘give’
/hɔlɔd/ ‘move forward’
/laːt/ ‘see’
/ugɔt/ ‘bark’
/vaːr/ ‘wait’
/mɔrɔd/ ‘remain’
/olvɔʃ/ ‘read’

These are identical with the forms for third person singular subject and present tense; the other forms can be considered to be based on these invariant forms.

Structure of the verb As a first attempt, we could give the following description of the structure of the above Hungarian verbs (recall that the round brackets indicate optional material): (3-4)

Lexical verb root + (Tense suffix) + (Subject suffix)

The tense suffix is there if the verb is in the past, but absent if in the present; the subject suffix is there for first person singular subject, but absent for a third person singular subject. The absence of anything in the place where the past tense suffix goes tells us that it is present tense; the absence of anything in the place where the first person singular subject suffix goes tells us that it is third person singular. An alternative way of describing the structure is possible if we identify not two grammatical morphemes, but four:

first person singular subject    /ok/ ~ /ɔm/
third person singular subject    ø
past                             /t/ ~ /ott/
present                          ø

Here ø represents a zero form – that is, the morpheme expressing the particular meaning has no phonological form. Such morphemes, lacking phonological form, but having a meaning, are called zero morphemes. The usefulness of zero morphemes is first that they allow us to say that for both tenses, and both types of subject, there is a grammatical morpheme marking that category. And secondly, they allow us to state the structure of the verb as follows:

(3-5) Lexical verb root + Tense suffix + Subject suffix

This formula provides a characterization of the Hungarian verb that is in some ways preferable to the previous formula. There are no options in the structure, and we have a more general way of speaking about tense and subject marking. Furthermore, it implies that a form such as /fut/ can be interpreted as either of two things: the plain verb root fut ‘run’, or the inflected form of this verb, fut-ø-ø ‘he/she is running’. These forms are phonologically identical, but morphologically different.
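The description in (3-5) can be made concrete with a short sketch in Python. It is offered only as an illustration of the generalizations stated above, not as a full account of Hungarian: the function names are invented for the example, and the tense rule covers only the two contexts mentioned in the text (a root-final alveolar, or /ʃ/).

```python
# Toy model of formula (3-5): Lexical verb root + Tense suffix + Subject suffix.
# The allomorph rules restate the generalizations given in the text; every
# name below is an illustrative assumption, not established terminology.

ALVEOLARS = {"t", "r", "d", "l"}
ZERO = ""  # a zero morpheme contributes no phonological material


def tense_suffix(root: str, tense: str) -> str:
    """Pick the tense allomorph; the past allomorphs are phonologically conditioned."""
    if tense == "present":
        return ZERO
    if tense == "past":
        final = root[-1]
        if final in ALVEOLARS:
            return "ott"
        if final == "ʃ":
            return "t"
        # Other root-final segments are not covered by the stated rule.
        raise ValueError(f"no rule stated for roots ending in /{final}/")
    raise ValueError(f"unknown tense: {tense}")


def subject_suffix(subject: str, tense: str) -> str:
    """Pick the subject allomorph; the 1sg allomorphs are conditioned by tense."""
    if subject == "3sg":
        return ZERO
    if subject == "1sg":
        return "ok" if tense == "present" else "ɔm"
    raise ValueError(f"unknown subject: {subject}")


def build_verb(root: str, tense: str, subject: str) -> tuple[str, str]:
    """Return the phonological form and its morpheme-by-morpheme segmentation."""
    t = tense_suffix(root, tense)
    s = subject_suffix(subject, tense)
    segmented = "-".join([root, t or "ø", s or "ø"])
    return root + t + s, segmented
```

Calling build_verb("fut", "present", "3sg") returns the form fut together with the segmentation fut-ø-ø: the output is phonologically identical to the bare root, while the segmentation records that both suffix positions are filled by zero morphemes.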

Zero morphemes are controversial, and not all linguists are happy with them. It is always necessary to have reasons for identifying them. Generally, absence of phonological form should be interpreted as absence of a morpheme. You need no reason to make this presumption, that nothing in the phonological form corresponds to nothing in the morphology. (As William Haas 1957: 53 aptly observed, ‘If some men in civilian clothes are soldiers, this is no reason for suggesting that they wear zero-uniforms.’) But you do need good reasons to suppose that there is a zero morpheme in the absence of a phonological form!

Morphological analysis by speakers There is reason to believe that it is not just linguists who engage in morphological analysis, but also language learners and speakers. The child learning her first language begins with words or larger units as unanalysed wholes. Each word is a separate sign in itself, bearing no perceived relation to any other word. But by four years of age, the normal child has heard many millions of words in the speech of those around her, and has a vocabulary of signs that amounts to over a thousand words. By this time the number of signs has become so large that it is impractical for them to be all treated as separate, isolated entities. The child must engage in analysis of the units; she must treat them no longer as separate and unrelated items, but divide them into smaller meaningful, morpheme-sized components. Evidence that the child does indeed do this comes from the fact that she begins to overuse regular morphemes, extending them to irregular words. Thus whereas at an early age the child acquiring English has both foot and feet, see and saw, and go and went, morphologically irregular forms, at the age of about four, she will most likely begin using regular forms like foots, seed and goed (possibly alongside of the irregular forms). (See further §12.1.) These are unlikely to have been learnt from the speech of those communicating with the child. This indicates that the child has separated out and identified morphemes like the regular plural -s and the regular past tense -ed – forms that were probably previously not separately identified, being treated as unanalysed parts of lexical forms. Speakers continue to apply morphological analysis into adulthood, extracting apparently meaningful parts from words, sometimes even extracting meaningless parts and imbuing them with meaning. An example is provided by the perhaps incipient bound morpheme -gate which indicates ‘a (political) scandal associated with a place’. 
This has nothing to do with the free word gate, as in an entrance in a fence. Rather, its source is Watergate – the name of a building in which certain now infamous events occurred in the early 1970s – from which has been extracted gate, which is treated as though it is a meaningful unit. Now this form can be attached quite productively to placenames, as in Iran-gate and Korea-gate, and more recently to other nouns, as in planet-gate and envelope-gate, or even parts of words, as in fornigate. (If you haven’t encountered these words, look them up on the internet to see how they fit with the meaning given above.)

Many languages are morphologically much more complex than English, to the extent that it is unlikely that every form of every lexical item could be stored as a separate item in the speaker’s mind. There are languages in which each verb has many hundreds of distinct inflectional forms. The more frequent forms of high frequency verbs may well be stored in the mind as separate items. But many forms are liable to be very infrequent, and likely to be created on the spur of the moment, using allomorph shapes and descriptions of word shapes such as given in the previous section. The verb form meaning ‘I’ll hit you’ is likely to be of the first type since (like it or not) it describes a common threat in everyday life in most (if not all) societies, while the form meaning ‘they two could have bamboozled us three excluding you’ is less likely to be stored as a whole.
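The contrast between stored and constructed forms can be sketched as a lookup that falls back on a regular rule. The Python below is only a rough illustration: spellings stand in for phonological forms, the function names and the contents of the store are assumptions made for the example, and treating the child's regularized forms as the result of a missing stored entry is a simplification, not a claim about how words are actually processed.

```python
# Sketch of storage versus construction: a whole form that is stored is simply
# retrieved; anything else is built by the regular rule. All names and the
# example store are illustrative assumptions.

def regular_plural(noun: str) -> str:
    """Regular plural, in spelling: add -s."""
    return noun + "s"


def regular_past(verb: str) -> str:
    """Regular past, in spelling: add -ed (just -d after a final e)."""
    return verb + "d" if verb.endswith("e") else verb + "ed"


RULES = {"plural": regular_plural, "past": regular_past}


def inflect(word: str, category: str, stored: dict) -> str:
    """Retrieve a stored whole form if there is one; otherwise construct it."""
    key = (word, category)
    if key in stored:
        return stored[key]
    return RULES[category](word)


# Irregular forms have to be stored, since no rule predicts them.
STORE_WITH_IRREGULARS = {
    ("foot", "plural"): "feet",
    ("see", "past"): "saw",
    ("go", "past"): "went",
}
```

With the irregular forms in the store, inflect("go", "past", STORE_WITH_IRREGULARS) retrieves went; with an empty store the same call constructs goed, the kind of regularized form described above.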

Summing up A word is a minimal free form. Some words are structurally complex, and are made up of smaller meaningful atoms, morphemes. There are regularities in the way morphemes combine together in a language, and morphology studies these regularities. The words of a language can be described as sequences of morphemes. Usually there are restrictions on the ordering of morphemes in a word, so that it is possible to give a general description of the morphological shape of words of particular types, such as the nouns of English. Languages differ considerably in their morphological structures and complexity; English is morphologically rather simple. Morphemes may be free or bound, and lexical or grammatical. The main types of bound morphemes are affixes (suffixes, prefixes and infixes) and clitics (enclitics and proclitics). Lexical morphemes include roots and derivational affixes, which create new stems from roots. Grammatical morphemes include inflectional affixes, which convey information about the grammatical function or category of the lexical morpheme to which they are attached; examples are case and number inflections on nouns and/or pronouns, and tense inflections on verbs. Grammatical morphemes can also be free words. Like phonemes, morphemes often come in different shapes; these are called allomorphs. Allomorphs that resemble one another in form are phonological allomorphs; otherwise they are suppletive allomorphs. The circumstances governing the choice of allomorphs are conditioning factors, which may be phonological, lexical or morphological. One way of providing a more general account of phonological allomorphs is to identify a single abstract morphophonemic representation of the morpheme, plus rules that derive phonological representations from it. Speakers and language learners also engage in morphological analysis, and show awareness of the components of complex words. 
While irregular forms of frequent words are learnt early by the child, around the age of four children begin to construct regular forms. This suggests that the child has learnt and internalized the regular morphological patterns of the language, and is not merely repeating forms they have heard. Adults also sometimes analyse out meaningless parts of words and interpret them as though they were morphemes, as in the case of the emergent morpheme -gate.

Guide to further reading Most introductions to linguistics (see the list at the end of Chapter 1) treat the basic notions of morphology, and include sample problems in morphological analysis. For a comprehensive description of English morphology, see Bauer (1983). This chapter adopts an approach to morphology focusing on the syntagmatic dimension (recall §1.2), the ‘item-arrangement’ approach (Hockett 1954), that deals with items (morphemes) and their arrangements into strings. Most morphologists today favour a ‘word-paradigm’ approach, which takes the word as the central concept and adopts a paradigmatic orientation. Robins (1959) is the classic paper arguing in favour of the word-paradigm approach. Spencer (2017) presents a good overview of the field, and identifies a range of problems in morphological theory and description. Bauer (2004) provides clear explanations of the essential terminology of morphology. Chapter 2 of Pavey (2010) is a good introduction to morphology and morphological analysis. For more advanced textbook treatments, see Bauer (2003), Lieber (2010), Haspelmath and Sims (2010) and Matthews (1972, 1974). Spencer and Zwicky (2001/1998) contains papers on a range of topics in, and approaches to, morphology; it is, however, quite technical and is most useful to advanced students. Those interested in the morphology of a particular language should begin by consulting a grammar of that language. Series such as the Mouton Grammar Library and Pacific Linguistics include many good grammars of minority languages.

Issues for further thought and exercises 1 Divide the following passage into morphs, list the morphs, and label each according to whether it is free or bound, lexical or grammatical. You should encounter some problems in identifying morphs: dubious cases where the status of a form as a meaningful element is not entirely certain, and where it is difficult to decide precisely where the morpheme division occurs. Identify and discuss these difficulties. The city wasn’t pretty. Most of its builders had gone in for gaudiness. Maybe they had been successful at first. Since then the smelters whose brick stacks stuck up tall against a gloomy mountain to the south had yellow-smoked everything into uniform dinginess. The result was an ugly city of forty thousand people set in an ugly notch between two ugly mountains that had been all dirtied up by mining. (Hammett 2003/1929: 1–2)

2 What are the conditioning factors for the three allomorphs of the possessive enclitic in English? Are they identical with the conditioning factors for the regular plural morpheme? What grammatical differences can you find between the two morphemes? (Use the properties of the various morpheme types mentioned in the text to help you find differences. You could also think about their ordering.) What happens when the enclitic is attached to plural nouns (regular and irregular)?

3 The past tense suffix for regular verbs in English has three allomorphs the shapes of which are analogous to the shapes of the plural noun suffix and the possessive enclitic. What are they? Are they phonological allomorphs? Are the conditioning factors for the past tense allomorphs the same as the conditioning factors for the plural noun suffix and possessive clitic? If not, can you identify anything common between them?

4 Based on the following data from Gumbaynggirr, what are the allomorphs of the lexical and grammatical (case) morphemes? (Note that ergative is the name for the case that marks the subject of a transitive clause (like John sees Mary) but not the subject of an intransitive clause (such as John ran).) Are they phonological or suppletive allomorphs? What are their conditioning factors? For the phonological allomorphs can you suggest a morphophonemic representation, and rules of phonological realization?

                    Ergative       Locative          Dative         Ablative
‘man’               /niːgadu/      /niːgada/         /niːgargu/     /niːgana/
‘small’             /ɟunujɟu/      /ɟunujɟa/         /ɟunujgu/      /ɟunujɲar/
‘father’            /baːbagu/      /baːbaŋumbala/    /baːbaŋu/      /baːbaŋumbajŋa/
‘flood’             /duːlgambu/    /duːlgamba/       /duːlgamgu/    /duːlgamɲar/
‘tail’              /ɟuːndu/       /ɟuːnda/          /ɟuːngu/       /ɟuːnɲar/
‘pademelon’         /gulɟuːdu/     /gulɟuːda/        /gulɟuːgu/     /gulɟuːɲar/
‘mother’            /miːmigu/      /miːmiŋumbala/    /miːmiŋu/      /miːmiŋumbajŋa/
‘mosquito’          /guɹaːdu/      /guɹaːda/         /guɹaːgu/      /guɹaːɲar/
‘brother-in-law’    /ŋaɟiːgu/      /ŋaɟiːŋumbala/    /ŋaɟiːŋu/      /ŋaɟiːŋumbajŋa/
‘cattle’            /bulaŋgu/      /bulaŋga/         /bulaŋgu/      /bulaŋɲar/
‘brother’           /gagugu/       /gaguŋumbala/     /gaguŋu/       /gaguŋumbajŋa/
‘magpie’            /ŋaːmbulu/     /ŋaːmbula/        /ŋaːmbulgu/    /ŋaːmbulɲar/
‘whiting’           /ɟuruwiɲɟu/    /ɟuruwiɲɟa/       /ɟuruwiɲgu/    /ɟuruwiɲɲar/

5 Below are some verb forms in Saliba (Austronesian, Sariba and Rogeia Islands). Describe the morphology of the verb, and identify the lexical and grammatical morphemes; suggest a meaning for each morpheme.

/selaoko/       ‘they went already’
/jelaoma/       ‘he came this way’
/sedeuli/       ‘they washed it’
/jeseseko/      ‘it is already swollen’
/sekeno/        ‘they slept’
/jalaowako/     ‘I already went away’
/jeligadi/      ‘she cooked them’
/jadeuli/       ‘I washed it’
/jeheloiwa/     ‘he ran away’
/sekitagau/     ‘they saw me’
/jeligako/      ‘she cooked it already’
/jakitadiko/    ‘I saw them already’
/selageko/      ‘they arrived already’
/sepesama/      ‘they came out here’

6 It was mentioned that English have has, according to some linguists, both grammatical and lexical uses. Do you think it is preferable to consider these to represent different uses of a single lexical word, or two homophonous words, one lexical, one grammatical? Explain your reasoning.

7 Examine the following sentences in Northern Sotho (Niger-Congo, South Africa), written phonemically. Identify the morphemes, stating their phonological form and their meanings, as revealed by these examples. Describe the morphological structure of words.

a. /mpʃa elomilɛ ŋwana/               ‘The dog bit a child’
b. /basadi barɛka diaparɔ/            ‘The women buy clothes’
c. /bana batla/                       ‘The children come’
d. /mosadi orɛkilɛ nama/              ‘The woman bought meat’
e. /dimpʃa dilomilɛ bana/             ‘The dogs bit the children’
f. /monaŋ olomilɛ mmutla/             ‘The mosquito bit a hare’
g. /ŋkwe ebɔna dintlo/                ‘The leopard sees the huts’
h. /ŋwana otlilɛ/                     ‘The child came’
i. /banna barɔbilɛ selɛpɛ ntlɔng/     ‘The men broke an axe by the hut’
j. /monna obona setimɛla/             ‘The man sees a train’

8 English nouns mark plurals regularly by the morpheme /s/ ~ /z/ ~ /əz/, and irregularly by a variety of means. The singular never has any phonological marking. Is there sufficient evidence to suppose that there is a zero suffix marking the singular? Discuss the pros and cons of identifying a zero morpheme. (In answering this question, consider the consequences of this analysis for nouns like fish and sheep.)

9 Analyse the following Warrwa verb forms and identify the morphemes that correspond to the English pronouns; what are the allomorphs and their conditioning factors? How is information about the time of the event expressed? (Note that there is no direct representation of ‘it’ as object.) How would you describe the structure of the verb? (The forms are given in the IPA, not the recommended orthography for the language.)

      ‘looked’          ‘pierced (it)’    ‘was cooking (it)’
a.    /ŋamuɹuŋuɲ/       /ŋanaɹaɲ/         /ŋanamarana/          ‘I’
b.    /mimuɹuŋuɲ/       /minaɹaɲ/         /minamarana/          ‘you’
c.    /ŋarmuɹuŋuɲ/      /ŋaraɹaɲ/         /ŋaramarana/          ‘we two (me and him/her)’
d.    /jamuɹuŋuɲ/       /janaɹaɲ/         /janamarana/          ‘we two (me and you)’
e.    /muɹuŋuɲ/         /naɹaɲ/           /namarana/            ‘he’
f.    /jarmuɹuŋuɲ/      /jaraɹaɲ/         /jaramarana/          ‘we (me, you and one or more others)’
g.    /gurmuɹuŋuɲ/      /guraɹaɲ/         /guramarana/          ‘you plural’
h.    /ŋirmuɹuŋuɲ/      /ŋiraɹaɲ/         /ŋiramarana/          ‘they’

10 Compare the root and progressive (indicating that the event is in progress) forms of Babungo (Niger-Congo, Cameroon) verbs below. How is the progressive formed? (Tone is not indicated.)

a. /faʔ/      ‘work’        /fɨfaʔ/      ‘be working’
b. /təə/      ‘dig’         /tɨtəə/      ‘be digging’
c. /baj/      ‘be red’      /bɨbaj/      ‘be becoming red’
d. /zasə/     ‘sick’        /zɨzasə/     ‘be sick’
e. /fesə/     ‘frighten’    /fɨfesə/     ‘be frightening’
f. /bʷəj/     ‘live’        /bɨbʷəj/     ‘be living’
g. /kuːnə/    ‘return’      /kɨkuːnə/    ‘be returning’

11 Examine the following noun forms in Kuot, which inflect regularly for number. These can be singular or non-singular (i.e. one or more than one) for inanimates, or singular, non-singular and dual for animates and some inanimates. Describe number formation and identify the number markers; account for the distribution of allomorphs.

      singular     non-singular    dual (2)
a.    /ie/         /iep/                            ‘knife’
b.    /ŋof/        /ŋofup/                          ‘nostril’
c.    /alaŋ/       /alaŋip/        /alaŋipien/      ‘road’
d.    /nur/        /nurup/                          ‘coconut’
e.    /kuala/      /kualap/        /kualapien/      ‘wife’
f.    /kobeŋ/      /kobeŋip/       /kobeŋipien/     ‘bird’
g.    /iakur/      /iakurup/       /iakurupien/     ‘vine’
h.    /nəp/        /nəpup/                          ‘part’
i.    /pas/        /pasip/                          ‘stick’
j.    /kakok/      /kakokup/                        ‘snake’

Not all nouns in Kuot form numbers in this way. How are the following inflected forms constructed? How would you account for the two different patterns?

k.    /irəma/      /irəp/          /irəpien/        ‘eye’
l.    /dədema/     /dədep/         /dədepien/       ‘word’
m.    /karaima/    /karaip/                         ‘nail, claw’
n.    /muana/      /muap/          /muapien/        ‘reason’
o.    /tabuna/     /tabup/                          ‘door’

Research project Describe the case-marking morphology of one of the following languages: Basque, Gooniyandi, Kalkatungu (Pama-Nyungan, Australia), Kwaza (isolate, South-western Amazon), Lezgian (Nakh-Daghestanian, Dagestan), Hungarian, Tamil or Turkish. You should use a good descriptive grammar of the language (some are available on the internet; for some languages you may need to visit your library) or an article dealing specifically with case marking. Your description should include the following:
a. A listing of the cases distinguished in the language, together with basic information on their forms. What type of morphemes mark the cases? What types of word take case markers? Note that it may be necessary to distinguish different case systems for different types of word, such as nouns and pronouns.
b. For each case you should identify the case morphemes, their main allomorphs and conditioning factors. Can you suggest morphophonemic forms for the morphemes and rules of realization for them?
c. A description of the main meanings and uses of the cases, along with examples.

4 Lexicon

In the previous chapter we discussed the notion of the word, and explored how words can be analysed into smaller meaningful components; we also distinguished lexical words and morphemes from grammatical ones. In this chapter we focus on lexical items. We begin by discussing how they can be grouped together into types on the basis of shared grammatical behaviour; this gives the notion of parts-of-speech or word classes. We then turn to some of the means available in languages for making new lexemes. We conclude with a brief discussion of two other topics: fixed expressions involving lexemes, and the attitudinal values of words.

Chapter contents
Goals
Key terms
4.1 The lexicon
4.2 Ways of making new words
4.3 Ways of using old forms to get new meanings
4.4 Fixed expressions
4.5 What’s in a word?
Summing up
Guide to further reading
Issues for further thought and exercises
Research project

Goals The goals of the chapter are to:
● introduce two primary concepts, the lexicon and parts-of-speech;
● outline criteria for distinguishing parts-of-speech, and identify the main types found in human languages;
● identify and exemplify some of the main ways of creating new words;
● touch briefly on idioms as more or less fixed strings of words forming single lexemes; and
● show that words are not always neutral, but can evoke emotional responses in speakers of a language and be used to soften or make harsher unpleasant realities.

Key terms acronyming, adjective, adverb, auxiliary, backformation, binomial, blending, borrowing, calquing, clipping, coinage, compounding, conjunction, derivation, dysphemism, euphemism, extension of meaning, ideophone, idiom, inflecting verb, interjection, lexeme, lexicon, narrowing of meaning, noun, particle, part-of-speech, phonaesthesia, postposition, preposition, preverb, pronoun, reduplication, taboo word, verb

4.1 The lexicon Nature of the lexicon As a literate speaker of English, you doubtless expect that the words of a language can be listed in a dictionary. You perhaps imagine this as an alphabetical list of entries that tells you how to pronounce each word, and gives a description of its meaning; you might also expect information about the type of word it is, whether it is a noun, verb or whatever. This sort of information is crucial if you want to use an unknown word correctly.

It seems reasonable to believe also that speakers of a language have internalized mental dictionaries, from which they make selections in constructing utterances. Experimental evidence shows that these mental dictionaries are not organized in the same way as ordinary printed dictionaries, as alphabetical lists. Indeed, the entries are not likely to be recorded in the format of a list. On the other hand, they are highly structured, with multiple links both between the phonological forms and the meanings of the items: they are more like web-documents than printed ones. Instead of the everyday term dictionary, linguists use the technical term lexicon for such a construct. Let us leave aside for now the question of how the lexicon is structured, and enquire into what should go into it. To begin with, we have assumed that it contains the words of a language. It would have to include all of the root morphemes; it must also contain the derived stems, since (as seen in the previous chapter) their meaning is not usually entirely predictable, and therefore must be separately recorded. For English the lexicon will have farmer as well as farm. On the other hand, forms like farmers and farmer’s need not be listed, since knowledge of the morphology of English is sufficient to permit full understanding of these words from the meaning of the component morphemes. This is so provided that the bound morphemes are also listed, including inflectional affixes as well as clitics. For completeness, the derivational affixes should also be included, in addition to the totality of derived forms they are employed in. Other things you would expect to find in a lexicon are irregular inflectional forms of words, such as the irregular forms of be, including are and is, since these can’t be predicted from the morphology (although their meaning can be). Morphologically complex words like blackboard, strawberry, penknife and so on (see §4.3) will also need to be in the list. 
You will probably agree that a comprehensive lexicon should also list longer expressions such as kick the bucket, know by heart, grasp the nettle, chew your heart out, kill time and so on. Expressions like these, the meaning of which can’t be guessed from the meanings of the component words and the grammar, are called idioms (see §4.4). In short, the lexicon of a language should contain all signs whose meaning is not predictable, regardless of whether they are single morphemes, words, or combinations of words. To reiterate, the lexicon includes grammatical items as well as non-grammatical items, which express content meaning. The latter are called the lexemes or lexical items of a language. Lexemes include morphemes, as well as certain combinations of morphemes, such as idioms. Anything that has a predictable meaning – like an ordinary non-idiomatic sentence such as John kicked the bucket down the street – does not need to be included in the lexicon, and is not a lexeme.
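The division of labour between what is listed and what is worked out can be illustrated with a small sketch in Python. It is a toy under several assumptions: ordinary spelling stands in for phonological form, the handful of entries and their labels are chosen only to mirror the examples in the text, and the function name is invented for the purpose.

```python
# Toy lexicon: only items whose form or meaning is not predictable are listed.
# The entries echo examples from the text; labels are informal descriptions.
LEXICON = {
    "farm": "root",
    "farmer": "derived stem",
    "-s": "inflectional suffix (plural)",
    "'s": "possessive enclitic",
    "are": "irregular form of be",
    "blackboard": "morphologically complex word",
    "kick the bucket": "idiom",
}

# Spelled shape of each bound grammatical morpheme -> its lexicon entry.
# The longer ending is tried first so that "'s" is not mistaken for "-s".
ENDINGS = [("'s", "'s"), ("s", "-s")]


def analyse(form: str):
    """Return the listed items that make up `form`, or None if it cannot be analysed."""
    if form in LEXICON:
        return [form]  # listed as a whole: nothing to work out
    for spelling, entry in ENDINGS:
        stem = form[: -len(spelling)]
        if form.endswith(spelling) and stem in LEXICON:
            return [stem, entry]  # predictable from listed parts
    return None
```

Here analyse("farmers") yields farmer plus the plural suffix without farmers itself being listed, whereas analyse("kick the bucket") returns the idiom as a single listed item.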

Beware of a potential confusion in terminology! In classifying morphemes the term lexical is used in contrast with grammatical to indicate morphemes that convey content meaning. But the lexicon of a language, as it is usually conceived, includes all morphemes, regardless of whether they are lexical or grammatical; but although grammatical morphemes are included in the lexicon, they are not lexemes.

Openness The lexicon of a language is not fixed; it does not remain constant forever. Indeed, lexicons change quite rapidly. You are probably aware of changes that have happened during your own lifetime, as new words come into use, and old ones lose popularity, and eventually disappear. Some changes are due to social and technological changes: new terms are required for new inventions, and old words are forgotten as the items go out of use. But other factors can be relevant, including multilingualism (ability to speak more than one language) and tabooing of words (see §4.5). In sections 4.2 and 4.3 we examine some ways languages add to their lexicons. Even the grammatical morphemes of a language change over time, although they are more stable than lexical morphemes. Over long periods of time new grammatical morphemes are created, often out of existing lexical items, and old ones wear out, so to say, and disappear from use. When we said (p. 61) that grammatical morphemes form closed classes, this is to be understood in a relative sense: they are not absolutely closed classes, but much less likely to be added to or taken from than lexical morphemes.

Parts-of-speech Main categories The notion of parts-of-speech or word classes is the idea that the words in the lexicon of a language can be put into different classes. In school grammars these classes are usually defined intuitively, in terms of the type of meaning expressed. But in modern linguistics grammatical behaviour is the primary consideration, although meaning does play a role, and serves as the basis for labelling the classes. The idea is, on the one hand, that not all words show the same grammatical behaviour, and, on the other, there are sufficient commonalities among some groups of words to allow us to make generalizations about them. It is not the case that each word behaves in its own idiosyncratic manner. Below is a list of some of the main parts-of-speech found in the world’s languages, with some brief remarks indicating the typical semantic content of the members of each class, and some characteristics of the part-of-speech in English. (Different criteria may be required for other languages.)
● Nouns are words that typically specify things or entities (people, animals, objects, places, abstract ideas).1 Grammatical characteristics of English nouns include the ability to inflect for number (except for a small set of irregular nouns), and that they can be preceded by modifiers such as the, a and adjectives. In many languages nouns distinguish inflectional categories of number (§3.3), case (§3.3) and gender (see §7.2).
● Adjectives indicate qualities or properties of things, such as age, colour, size, speed and shape. In English adjectives usually go before nouns; they can usually be preceded by an intensifier like very or too; they can sometimes be negated by the prefix un-; and they can be used in making comparisons by adding the suffix -er or the word more. They also admit the superlative suffix -est or modification by the word most.
● Pronouns are words like I, me, you and they that are used instead of nouns to refer to persons and things, especially known and identifiable ones. Pronouns are grammatical morphemes, and normally form closed classes that can’t be added to. Pronouns in English make case distinctions (see §3.3) that are not made for nouns.
● Verbs generally designate events (actions, states, processes, happenings, mental and bodily activities). In English, verbs can be distinguished by the fact that they make past tense forms regularly by the suffix -ed (and irregularly by other means, including suppletion); many take the agentive derivational suffix -er to form nouns (thinker, walker, caller, lover). Verbs in many languages inflect for tense, and person and number of the subject.
● Auxiliaries are verbs that express grammatical rather than lexical information, and are used along with lexical verbs denoting events. Auxiliaries in English include do, be and have, as in Does the duckling love the farmer?, The duckling is quacking and The farmer has kissed the duckling.
● Adverbs indicate qualities and properties of events (e.g. quickly, happily, specifying the manner of performance), or indicate the degree or intensity of a quality (like very in a very slow train). In English, many adverbs show the derivational affix -ly, and, as in many other languages, adverbs do not take inflections.
● Prepositions are grammatical words like at, in, to, by and from, that go with nouns to specify how they are related to the rest of the sentence (e.g. by locating the event in space or time). Some languages have postpositions, which are words that do the same work as prepositions, but follow the noun rather than precede it. The closest thing to a postposition in English is the possessive clitic ’s.
● Conjunctions are grammatical words like and, or, but, if and the like, that join words or groups of words together. In English they admit no morphological modification, and usually occur in front of the last item of the list of words joined together (salt and pepper; Tom, Dick and Harry).
● Interjections are words like hey!, yuk!, strewth!, erk! which mostly express the speaker’s emotional attitude, or call for attention. Important characteristics of these words are that they can stand alone as full utterances, and do not allow any morphological modification.

Criteria As mentioned above, in modern linguistics parts-of-speech are defined by grammatical behaviour, not meaning. Thus a word such as seem can hardly be interpreted as denoting an event, although its grammatical behaviour in English groups it with verbs: it takes the regular past tense suffix -ed (here /d/), and third person singular present -s (here /z/) and occurs in the same position in English sentences as event-denoting verbs like bite, walk and so on. The grammatical features characterizing the parts-of-speech vary from language to language. In some languages it is relatively easy to set up nouns and verbs as distinct parts-of-speech by morphological behaviour. Thus in Pitta-Pitta (Pama-Nyungan, Australia) nouns take case-marking suffixes, and verbs tense suffixes. This criterion works very well, giving distinct, almost disjoint, classes – just two words are known to take both sets of suffixes.

But in many languages things are more complex: simple morphological criteria like the ability to take certain morphemes lead to parts-of-speech with considerable overlapping of members. This is the case in English. Linguists have different opinions as to what to do in such circumstances. Some are not bothered by massive overlapping; others look for ways to reduce overlap. There are also differences of opinion on at least two other issues. One is whether we should be satisfied with defining parts-of-speech in a language-specific way, or whether we should be seeking universally valid criteria. The other is whether to use morphological criteria (as in Pitta-Pitta), or syntactic criteria – that is, to assign words to parts-of-speech according to the way the words are used in sentences – or a mixture of both.
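The idea of a morphological criterion, and of the overlap it may produce, can be sketched in a few lines of Python. This is an illustration only: the words w1 to w5 and the suffix types listed for them are invented placeholders, not data from Pitta-Pitta, English or any other language, and the function names are likewise assumptions.

```python
# Sketch: assign words to classes by the kinds of suffix they can take, then
# see which words end up in both classes. The data are invented placeholders.

def classify(takes: dict) -> tuple[set, set]:
    """Group words by a purely morphological criterion."""
    nouns = {word for word, kinds in takes.items() if "case" in kinds}
    verbs = {word for word, kinds in takes.items() if "tense" in kinds}
    return nouns, verbs


def overlap(nouns: set, verbs: set) -> set:
    """Words that satisfy both criteria, so that the classes are not disjoint."""
    return nouns & verbs


# Hypothetical mini-language: each word is listed with the suffix types it takes.
TOY_DATA = {
    "w1": {"case"},
    "w2": {"case"},
    "w3": {"tense"},
    "w4": {"tense"},
    "w5": {"case", "tense"},  # takes both sets of suffixes
}

nouns, verbs = classify(TOY_DATA)
shared = overlap(nouns, verbs)
```

When the shared set is tiny relative to the two classes, the criterion gives nearly disjoint classes; when it is large, the criterion alone does not separate the parts-of-speech, which is the situation that leads linguists to consider syntactic criteria as well.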

Parts-of-speech across languages Not all languages distinguish all of the parts-of-speech listed above. A fair number of languages do not recognize a distinct class of adjectives. This is so in a fair number of Australian languages, where words translating as adjectives in English belong together with nouns in a single part-of-speech. By contrast, in Mandarin Chinese words translating into English as adjectives usually belong with verbs. Probably the majority of languages distinguish at least the two major lexical parts-of-speech, nouns and verbs. But even this distinction is perhaps not universal. Some languages of Native North America – including many Salishan languages – have been claimed to lack this distinction. (There are differences of opinion among specialists.) Samoan is another such language. It seems that in principle any lexical word can behave either like a noun or like a verb; English translation equivalents of a single word can be either a noun or a verb, as illustrated by lā ‘sun’, ‘suns, be sunny’ in (4-1) and (4-2).2 There is thus, it has been argued (e.g. by Hengeveld et al. 2004), no reason to distinguish nouns from verbs as different parts-of-speech in Samoan.

(4-1) ‘ua mālosi le lā                                    Samoan
      perfective strong article sun
      ‘The sun is strong.’ More literally, ‘The sun strongs’.

(4-2) ‘ua lā le aso                                       Samoan
      perfective sun article day
      ‘The day is sunny.’ More literally, ‘The day suns.’

While some of the parts-of-speech listed on pp. 84–5 above may not exist in a language, additional categories are not infrequently distinguished. In many northern Australian languages words corresponding to verbs in English belong to two different parts-of-speech: members of one group are morphologically complex, and take inflectional affixes; members of the other group are morphologically simple, and admit no inflection and just a little derivation. The first group is a closed set with between about ten and two hundred members, while the second group is open, with many hundreds of members. In the following Warrwa sentence, -wani- ‘be’ belongs to the class of morphologically complex verbs, sometimes called inflecting verbs, and takes the prefix ngirr-, which indicates subject ‘they’, the suffix -n indicating present time, and the suffix -bili indicating that the persons are two in number. (These glosses are rough, and the morpheme analysis is incomplete.) The word nganka is a member of the second part-of-speech, preverbs; the suffix -ngkaya, indicating ongoing or continuous action, is the only bound morpheme that can be attached to it. (Words like nganka do not admit inflections; -ngkaya is a derivational suffix.)

(4-3) kujarra nganka-ngkaya ngirr-wani-n-bili    Warrwa
      two talk-continuous they-be-present-two
      ‘The two of them are speaking together.’

Many languages of Australia and other parts of the world also distinguish a class of particles. These are words which, like interjections, typically occur without any morphological modification. In contrast with interjections, however, they tend to be more integrated into the utterance. For instance, the negative word, and words indicating probability (e.g. ‘likely’) are particles in Warrwa and Gooniyandi. It should be noted that grouping words together into parts-of-speech categories does not imply that every word in each class patterns in precisely the same way. Rather, they are sufficiently similar to make it reasonable to group them together. Sometimes one can identify subclasses of major parts-of-speech because of shared minor differences. For instance, numerals are sometimes identifiable as a separate subclass of nouns (or nominals). On the other hand, it is not unusual to find a small number of lexical items that have their own unique patterning, only partly shared with other words.

A considerable number of languages from Africa, Asia and the Americas have words called ideophones. These are expressive words that depict sensory experiences such as sounds made by various things (for instance, onomatopoeic representations of the calls of animals or noises made by objects) or events (e.g. woosh! and bang!), and, in some languages, non-auditory features such as manner of motion (e.g. gbadara-gbadara ‘a drunkard’s wobbling gait’ in Siwu (Niger-Congo, Ghana)), shape (e.g. tíghí-tìghì-tíghí ‘twisted’ in Edo (Niger-Congo, Nigeria)), texture (e.g. lip rip ‘smooth, flat’ in Hausa), visual appearance (e.g. kirakira ‘glitter’ in Japanese), taste and smell (e.g. thuu ‘smelling horribly’ in Venda (Niger-Congo, South Africa)), and/or feelings and sensations (e.g. zokuzoku ‘thrilled’ in Japanese).

Ideophones often show distinctive properties, phonological, morphological and/or syntactic. In a number of languages they have distinctive phonological features, including the employment of phones not found in other words. In Mundang (Niger-Congo, Chad and northern Cameroon) the labio-dental flap [ⱱ] occurs only in ideophones, while the fricative [v] occurs only in ideophones and words borrowed from other languages. In Ewe, ideophones sometimes show unusual syllable types, including vowelless syllables, as in gbrrrr ‘sound of thunder’, where the repeated r’s indicate repetition of that trill.

As the examples of the previous paragraph indicate, it is not uncommon for ideophones to comprise reduplication of an otherwise meaningless form. Ideophones are generally relatively inert morphologically: they rarely admit many morphological modifications. However, in various languages they may be repeated two, three or more times in sequence. (Note that this is repetition, not reduplication, as described in §4.3 below.)


There are differences among languages concerning the position of ideophones in the lexicon. In some languages ideophones represent their own part-of-speech. This seems to be the case in Setswana (Niger-Congo, South Africa and Botswana), where they show grammatical properties that distinguish them as a class of their own. In other languages they do not. In Ewe, for instance, ideophones can belong to any part-of-speech.

4.2 Ways of making new words

Limitations on formation of new words

In this section and the next we look at some of the means by which languages expand their lexicons; we will be concerned with lexical roots and stems, and ignore grammatical items and idioms. This section discusses ways of expanding the lexicon by making new word forms, sometimes to express new meanings, sometimes to express existing meanings. The following section identifies ways in which existing forms can be used to make new lexical signs expressing new meanings.

The formation of new words is constrained in many ways. New forms must normally satisfy the phonological system of the language. A new word with final /ŋ/, e.g. /fɛŋ/, would be possible in English, though one with initial /ŋ/, e.g. /ŋɛf/, would be impossible. There are also meaning constraints: the meaning must be one that speakers are likely to want to make. The meaning ‘quark’ is useful only in a language spoken in a scientifically oriented society. The lexical item quark entered the English language in the twentieth century, when the meaning was needed, not in 1063!

Aside from these constraints, human inventiveness is not unlimited, and it is unusual for a new lexeme to be totally original in both form and meaning. More usually, we put together bits and pieces of old forms and meanings according to relatively well-established patterns (sometimes grammatical, sometimes not), to come up with new lexemes.

Etymology is the study of the origin of words. It is often difficult to be sure how and when a word entered a language. This is particularly true of languages which, like most human languages, are unwritten or just recently written. For a language with a long tradition of writing, at least, old records may be preserved, permitting one to conclude that a certain word was in use by a particular time. However, it doesn’t follow that the earliest written record remaining is the first use of the word – relevant written documents may well not have survived, and it is likely that the word was used in speech before it was ever written. (Writing is generally more conservative than speech.) Moreover, the basis on which the word was formed may not be apparent, or there may be two or three quite different though equally likely explanations. Dictionary makers these days keep a careful eye – and ear – open for new words entering the major European languages like English, French, German and Spanish. But for the vast majority of languages there are no written dictionaries, let alone sufficient human and financial resources to monitor the acquisition of new words.


Clipping

Clipping is the shortening of an existing word of more than one syllable, generally to a single syllable. Common examples are pub (from public house), fan (in one sense from fantastic, in another from fanatic), fax (from facsimile), ad (from advertisement), condo (from condominium) and flu (from influenza). Personal names are often clipped in English – Mike, Ron, Rob, Sue and Liz.

Over time, clippings may become more frequent than the longer forms. This has happened with pub, pram and fan (in the ‘devoted follower’ sense), for which links with the source-words are not recognized by all speakers. Sometimes the long and short forms take on different senses, or become associated with different contexts of use (e.g. formal vs. casual speech). This is the case for short and long forms of names in English, where short forms tend to be used in more informal and intimate contexts, long forms in formal contexts. In the case of fax and facsimile the former has come to apply specifically to a document sent via the telephone network (now an outmoded technology), while facsimile is normally restricted to an exact replica of a document, preserving its original written or printed form.

A variant on clipping that is common in Australian English is hypocorism. This involves first clipping a word down to a closed monosyllable. Next the suffix -y ~ -ie (/i/) is attached to the clipped form. Some examples are Aussie ‘Australian’, brekky ‘breakfast’, bickie ‘biscuit’, barbie ‘barbeque’ and telly ‘television’. The same suffix can be added to clipped personal names (e.g. Mickey, Robby, Lizzie); but there is no suffixless brek or bick corresponding to brekky and bickie.

Acronyming

Acronyms are words formed from the first letters of a string of written words. There are two types: word acronyms and spelling acronyms.

Word acronyms are pronounced as single words, following the spelling-to-pronunciation rules of the language. Examples are RAM (random access memory), ROM (read only memory), NASA (National Aeronautics and Space Administration), UNESCO (United Nations Educational, Scientific, and Cultural Organization) and AIDS (Acquired Immune-Deficiency Syndrome). Acronyms are often written with capital letters, giving away their status; but many well-established acronyms are written as ordinary words: laser (light amplification by stimulated emission of radiation), scuba (self-contained underwater breathing apparatus) and radar (radio detecting and ranging). Few people realize that these words are acronyms. A famous example is snafu, an acronym dating from the Second World War that stands for situation normal all fucked (or fouled) up.

Spelling acronyms are pronounced as sequences of the names of the letters used, rather than as words (often because the sequences of letters are unpronounceable). Examples are EU (European Union), PR (public relations), VCR (video cassette recorder) and CD (compact disk). The widespread OK, which has been borrowed into many languages as an expression meaning ‘all right, satisfactory, acceptable’, is said by some to have begun life as a jocular acronym for orl korrect, coined in 1839. Many other origins have been proposed, however.

Acronyming is a popular way of forming new terms in modern English and many other European languages, especially names for organizations. Acronyms are often chosen so as to be evocative of the function of the organization, as in Ian Fleming’s SPECTRE (Special Executive for Counterintelligence, Terrorism, Revenge and Extortion), WAR (women against rape), NOW (National Organization for Women) and MADD (mothers against drunk driving).

Acronyming is restricted to languages with established traditions of alphabetic writing and widespread literacy. It is not universally employed, and is dependent on the visual medium. To the best of my knowledge, speakers never construct words from the initial phonemes of strings of spoken words.
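The letter-picking step behind acronyms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name initials and the list of skipped function words are invented here, and the text itself states no rule about which words are passed over. Hyphenated words are treated as contributing one letter per part, as in scuba.

```python
import re

# Assumption: short function words are skipped when picking initials.
# This stop list is illustrative; the text gives no such rule.
SKIPPED = {"and", "by", "for", "of", "the"}


def initials(phrase: str) -> str:
    """Return the first letters of the content words of a phrase."""
    # Split on anything that is not a letter, so that a hyphenated word
    # such as "self-contained" contributes one letter per part.
    words = re.findall(r"[A-Za-z]+", phrase)
    return "".join(w[0] for w in words if w.lower() not in SKIPPED)


print(initials("National Aeronautics and Space Administration").upper())
print(initials("light amplification by stimulated emission of radiation"))
print(initials("self-contained underwater breathing apparatus"))
```

Not every acronym cited above fits so simple a procedure: radar takes two letters from radio and one from and, and SPECTRE takes two letters from Special. Real acronyms are often adjusted by hand to give a pronounceable or evocative result.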

Blending

Blends involve the combination of parts of two separate words to form a single word. Usually it is the first part (often a syllable) of one word together with the second part of the other word (either a syllable or a single final consonant), which occur in that sequence. The word motel is a blending of motor and hotel; smog is a blending of smoke and fog; and bit is a blending of binary and digit. Other examples are Chunnel (the tunnel under the channel between England and France) from channel and tunnel, refolution ‘a peaceful revolution’ from reform and revolution, and names for various mixes of languages such as Franglais, which blends français (French) and anglais (English), and Japlish, a blend of Japanese and English. While speakers undoubtedly realize the status of some of these words as blends, others are not so obvious. Occasionally it is the first parts of both words that are combined, as in modem, a blend of modulator and demodulator.
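The two blending patterns just described can be made concrete with a small Python sketch. The function names blend and initial_blend and the letter counts passed to them are illustrative assumptions; the sketch also works on spelling rather than on sounds, which is a simplification.

```python
def blend(first: str, second: str, head: int, tail: int) -> str:
    """Join the first `head` letters of `first` to the last `tail` letters of `second`."""
    return first[:head] + second[len(second) - tail:]


def initial_blend(first: str, second: str, head1: int, head2: int) -> str:
    """Join the first `head1` letters of `first` to the first `head2` letters of `second`."""
    return first[:head1] + second[:head2]


print(blend("motor", "hotel", 2, 3))              # mo + tel
print(blend("smoke", "fog", 2, 2))                # sm + og
print(initial_blend("modulator", "demodulator", 2, 3))  # mo + dem
```

The cut points have to be supplied by hand: nothing in the source words themselves determines where a blend divides them, and several divisions often yield the same result (motel could equally be m + otel).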

Borrowing

Major features of word borrowing

Borrowing, the process of incorporating into one language words from another, is perhaps the most common source of new words. Words that have been borrowed are called loanwords. Loanwords are normally adapted to the phonological (and phonetic) patterns of the language they are borrowed into, although if the source (or loaning) language is well known to speakers of the borrowing language, this adaptation does not always occur.

An example is the word kangaroo, a borrowing from Guugu Yimithirr (Pama-Nyungan, Australia). The Guugu Yimithirr word is /gaŋuru/, with stress on the first syllable. But this is not a possible word in English, and it has been regularized to follow the phonemic patterns of the borrowing language: the velar nasal beginning the second syllable has been replaced by the nasal-stop cluster /ŋg/ (which is a possible sequence), and the /r/ – a tap or trill [r] (phonemically distinct from the continuant [ɹ] in Guugu Yimithirr) – has been replaced by the English rhotic ([ɹ] in most dialects). In addition, stress has shifted to the third syllable, and the vowel of the second syllable has been reduced from /u/ to /ə/. The other two vowels also show different qualities.

This example illustrates another characteristic of loanwords: their meaning need not be identical with the meaning of the word in the source language. The Guugu Yimithirr word refers to a particular type of macropod, not to kangaroos in general. According to one story, speakers of the language did not recognize the word as one of their own when pronounced in the English fashion because the English speaker was pointing to the wrong type of macropod!3

Sometimes borrowed words are misidentified. A number of borrowings from Arabic include, along with the lexical item, the definite article al or a part of it. For example, lute comes from Arabic al ūd, the morpheme boundary being wrongly placed after the initial vowel; algebra (from al jabr ‘the reunion of broken parts’) and algorithm (from al Khwārizmī, ‘the man from Khwārizm (Khiva)’), on the other hand, preserve the entire definite article as a part of the root.

Loan translations or calques are a special type of borrowing in which the morphemes composing the source word are translated item by item. Examples are English power politics from German Machtpolitik and Mandarin Chinese 男朋友 nán péngyǒu (male friend) from English boyfriend. Similar to calques are loanblends, in which one of the morphemes, usually the lexical root, is borrowed, and the other is native, as in Pennsylvanian German bassig ‘bossy’, with borrowed stem and native suffix, -ig, a German morpheme corresponding to the English -y suffix.

Another example of calquing comes from Mandarin Chinese, where 水门事件 shuǐmén shìjiàn ‘watergate event’, referring to the Watergate scandal of 1972, was calqued on Watergate scandal. Subsequently, 门事件 mén shìjiàn ‘gate event’ has come to be used in reference to scandals in a similar way to the recently invented -gate suffix of English (see p. 73 above). Examples are: 监听门事件 jiāntīng mén shìjiàn ‘monitor gate event’ (which refers to a number of scandals in which private communications were intercepted by authorities such as the FBI), 艳照门事件 yànzhào mén shìjiàn ‘pornographic photograph gate event’ (referring to events in China in 2008 in which compromising photographs of a number of film stars were made public) and 虐猫门事件 nüè māo mén shìjiàn ‘mistreat cat gate event’ (various events in which cats were mistreated). The form is sometimes clipped to just 门 mén ‘gate’, as in 诈捐门 zhà juān mén ‘fraudulent donation scandal’ (referring to a scandal involving the actress Zhang Ziyi, who donated less than she claimed to the Wenchuan earthquake relief fund). This usage is presumably borrowed from English; however, as usual, the meaning is not identical with the meaning of the English suffix. The -gate suffix has been borrowed without translation into other languages, including French and German.

Some borrowings into English

Over the last millennium or so English has borrowed a vast number of words. Indeed, it has been estimated that over 60 per cent of the vocabulary of the average text in the modern language has been borrowed since 500 CE (Williams 1975). Knowledge of the time and place of these borrowings gives something of a picture of the history of the language and its speakers. Below is an outline of some of the major sources of borrowings.

Some centuries after the invasion of Britain by West Germanic tribes in the fifth century there occurred another invasion of Germanic tribes from Denmark. Around 900–1000 CE numerous Danish words were borrowed into English; many remain, including a number of very basic words, among them sky, sister, egg, both and thing.

The Norman invasion of England in 1066 resulted in French becoming the second language of many inhabitants. Numerous French loanwords date from this period, including many terms relating to the political and economic sphere such as duke, cost, labour, rent and calendar, as well as some everyday lexemes such as uncle, aunt and easy.

From about the tenth to the sixteenth centuries English borrowed a considerable number of Latin and Greek words. The loanwords dating from those times are mainly from the scientific and philosophical domains: these developing intellectual pursuits required terminologies, and Latin and Greek were good sources because they were known by the educated elite. Examples include solar, gravity, telescope, history and legal.

Following the colonization of America, numerous words were borrowed into English – as well as the other languages of major colonial powers including Spanish, French and Portuguese (Romance, Portugal) – from Native American languages. Many were names of places (e.g. Michigan, Chicago, Texas), and terms for plants (maize, tomato, tobacco) and animals (moose, skunk, caribou) that were unfamiliar to the European invaders. A number were subsequently borrowed into other European languages.

The post-1788 colonization of Australia brought another rash of loanwords, again primarily lexemes for places, plants and animals. Some 200 words have been borrowed from Australian languages into Australian English, and subsequently into other dialects. These include kangaroo (see pp. 90–1), boobook ‘an owl type’ and dingo ‘wild dog’ (from the language of Port Jackson), and koala (from the language of the Sydney area). Some of these words, including kangaroo, emu and boomerang, have since been borrowed into other languages, including other Australian Aboriginal languages.

Some borrowings from English

English has also contributed loans to numerous languages, especially in the colonial and postcolonial periods. Thus the languages of North America and Australia borrowed numerous words from English, again often words for previously unknown objects (e.g. cattle, sheep, horses, guns, carriages and, later, cars) and activities (e.g. branding, mustering and working). With the increasing globalization of English during recent decades, and its status as the international language of business, science and technology, many languages have borrowed, and continue to borrow, considerable numbers of English specialized terms from these domains.

Danish has borrowed numerous terms from the technological domain including computer, video, radio, internet, harddisk, rom, pc and cd. To log off my internet bank account I go to the button labelled log af, a loanblend from log off. When I’ve finished working on the computer, I click luk computeren (literally ‘close the computer’), at which point I have the choice of going to standby (a straight borrowing from English), luk ‘shut down’ (indigenous Danish) or genstart (literally ‘again-start’, a loanblend from restart). Examples of calques in Danish are fjernsyn (fjern ‘distant’ and syn ‘sight’) ‘television’ (in which the tele bit has its source in the Greek word tēle ‘far off’) and fjernstyring (fjern ‘distant’ and styring ‘control’) ‘remote control’.


Coinage

Very rarely, a word is completely novel, an entirely creative invention; such words are called coinages. Coinages are always restricted to the limits imposed by the phonology of the language. Possible examples in English are nerd, chunder (‘to vomit’, primarily in Australian English), barf (‘to vomit’, mainly in American English), naff ‘unfashionable, worthless, faulty’, razoo (usually preceded by brass, and referring to an imaginary coin of no value in Australian and New Zealand English, as in I haven’t got a brass razoo ‘I’m completely out of money’), and brand names, like Kodak, xerox, Vegemite and Exxon.

More usually the degree of creativity of a new word is limited, and speakers exploit existing words and word patterns. Thus, according to The New Shorter Oxford English Dictionary the word rayon, which refers to fabric made from fibres and filaments of a certain type, is an invented word that could be suggestive of the now rare rayon ‘ray of light’, and/or the noun ray with the ending -on (which is not a morpheme), motivated by cotton. Other invented words like nylon and teflon might further exploit this meaningless ending.

We have already remarked (§1.2) on the iconicity of some words, in particular of onomatopoeic words that represent the sound characteristic of some object, animal or event. A fair number of Australian languages have invented an onomatopoeic word for the characteristic vocalization of a domestic cat. This in turn has come to be used in reference to the animal itself – thus minyawoo in Gooniyandi, minyawu in Warrwa and mijawu in Nyulnyul.

Sometimes certain phonemes or combinations of phonemes are felt by speakers to be evocative of certain meanings. For instance, in many languages the high front vowel /i/ conveys a suggestion of smallness or closeness in contrast with /a/ or /u/. Compare, for instance, English ding and dong – which of these do you feel best describes the noise of a large bell?4 If I was a betting man I would bet you chose dong, and that you would go for ding for the sound of a small bell. Interestingly, in Australian English I have heard ding in reference to a very small dent in the body of a car. And in English (among other languages) the lateral l has a tendency to suggest liquids and fluid or uncontrolled movements. These are instances of sound symbolism, where the phoneme iconically represents a component of the meaning.

The term phonaesthesia includes not just iconic motivation such as this, but effectively any recurrent association between a phoneme and a meaning in the lexicon of a language. For instance, many English words with initial gl- have to do with brightness, as in glisten, gleam, glitter; and many beginning with sl- tend to be associated with uncontrolled, liquid-like movements, as in slip, slide and slither. Phonaesthesia can be, and often is, exploited at least to some extent by speakers in coining new words. Indeed, coinages that display some degree of phonaesthesia are more likely to gain acceptance than those that are totally arbitrary.

A special type of coinage is nonsense words, forms that could be words in the language, but are not. Lewis Carroll was a master inventor of nonsense words, which he used in many poems in Carroll (1899). These poems (sometimes called nonsense verse) use the usual grammatical morphemes of English, but replace some of the lexical words by nonsense words, sometimes with stunning effect. Here is the first verse of Carroll’s poem Jabberwocky (Carroll 1899: 28):

’Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.

Carroll also went on to explain, in a dialogue between Humpty Dumpty and Alice, the meanings of the nonsense words, and their motivations. It is worth repeating the explanation of the first two lines here not just because of its sheer brilliance, but also because it alludes to some of the processes we have identified.

‘ “Brillig” means four o’clock in the afternoon – the time when you begin broiling things for dinner.’
‘That’ll do very well,’ said Alice: ‘and “slithy”?’
‘Well, “slithy” means “lithe and slimy”. “Lithe” is the same as “active”. You see it’s a portmanteau – there are two meanings packed up into one word.’
‘I see it now,’ Alice remarked thoughtfully: ‘and what are “toves”?’
‘Well, “toves” are something like badgers – they’re something like lizards – and they’re something like corkscrews.’
‘They must be very curious-looking creatures.’
‘They are that,’ said Humpty Dumpty; ‘also they make their nests under sun-dials – also they live on cheese.’
‘And what’s to “gyre” and to “gimble”?’
‘To “gyre” is to go round and round like a gyroscope. To “gimble” is to make holes like a gimlet.’
‘And “the wabe” is the grass-plot around a sun-dial, I suppose?’ said Alice, surprised at her own ingenuity.
‘Of course it is. It’s called “wabe” you know, because it goes a long way before it, and a long way behind it–’
‘And a long way beyond it on each side,’ Alice added.

Carroll 1899: 125–7

4.3 Ways of using old forms to get new meanings

In this section we focus attention on new meanings instead of new forms. Of course, there is overlap with the strategies for making new forms discussed in §4.2, most of which also give rise to new meanings. Conversely, many of the processes we discuss here result in new forms; but, in contrast to the processes discussed in the previous section, these new forms are more systematically related to existing forms.


Derivation

Derivation is the process of forming new words by use of derivational morphemes, morphemes that create new lexical stems (see §3.3). Derivation is commonly used in English to form new words in science, medicine and technology. Linguistics is typical, and numerous derived forms can be found in this book. Dating from the twentieth century are deverbal, denominal and deadjectival, referring to derivational processes by which verbs, nouns and adjectives lose their original part-of-speech membership.

Compounding

Two separate words are sometimes joined together to form a single word, a new word with a new meaning of its own, a meaning that is not entirely predictable from the component words. This process is called compounding. An example is loanword, a single word made up of two independent words loan and word. Notice that although a loanword is a type of word, one that has been borrowed into one language from another, the meaning of loanword is not entirely predictable from the meaning of the individual words making it up. Why is this? There are two main reasons.

First, a loanword is not borrowed in the way that things are usually borrowed – the word loan suggests a temporary change of possession (otherwise it is surely a gift), which does not apply in this case: the word remains in both donor language and borrowing language. (I would be happy to loan you any amount of money if I knew that I would keep the same amount in my possession!) Second, as described earlier, the term loanword applies to the specific case in which the word is incorporated into the lexicon of another language, which usually means that it adapts to the phonological structure of the borrowing language. A loanword is not just any word of Danish that I as a native speaker of English living in Denmark might insert into my spoken English, although nothing in the meaning of the words making up the compound precludes words of this type.

Compounding is quite heavily used in English, German, Dutch and Danish as a means of forming new words. Kiowa (Kiowa-Tanoan, USA) also makes extensive use of compounding: t’ɔ̀-á (ear-stick) ‘earring’, mɔ́n-kʰɔ́y (hand-cloth) ‘glove’, mɔ́n-sóˑdè (hand-hook:on) ‘bracelet’ and mɔ́n-pà-tò (hand-against-hold) ‘weapon’. It is less frequent in Romance languages such as French, Spanish and Romanian, and almost non-existent in Mosetén (Mosetenan, Bolivia).

Reduplication

Many languages form new words by repeating an existing word either in full or in part. This is called reduplication. Oceanic languages usually make a good deal of use of reduplication to construct new words. For example, in Saliba (Austronesian, Papua New Guinea) nouns can be formed from verbs by reduplication, as in kuya-kuya (sweep-sweep) ‘broom’, and lau-lau (go-go) ‘way, method’. A verb can also be reduplicated to form a word expressing a quality: dou-dou (cry-cry) ‘crying (person)’. If the word has more than two syllables, reduplication is partial, the first two syllables only being repeated, and attached to the front of the word, as in tago-tagodu (break-break) ‘broken (thing)’ and hede-hedede (talk-talk) ‘word, talking’.

Reduplication is often iconic. Reduplication of verbs generally conveys the idea of a frequent and repeated event, or one that is habitual or characteristic of something, as in the Saliba examples just cited. Reduplication of a noun often indicates numerosity or multiplicity, or intensity. In many Australian languages reduplication of nouns results in a new noun indicating a multiplicity of things. This can be illustrated by the Jaru (Pama-Nyungan) reduplications maluga-maluga (old:man-old:man) ‘many old men’, manga-manga (girl-girl) ‘many girls’, and guju-guju (baby:animal-baby:animal) ‘many baby animals’. Reduplication of nouns in Chichewa (Niger-Congo, Malawi) indicates intensification: m-kází-kazi (m-woman-woman)5 ‘cute and cultured woman’ and munthu-múnthu (person-person) ‘a real (i.e. humane) person’.

Reduplication is often considered marginal in English, although this is questionable: over 2,000 words are formed by this process according to Burridge (2004: 47). Sometimes the whole stem is repeated, as in fifty-fifty, hush-hush and never-never; more often there is a slight phonological change in the repeated element, as in helter-skelter, dilly-dally, higgledy-piggledy, teeny-weeny, hanky-panky and shilly-shally.
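The Saliba pattern described above (full reduplication for words of up to two syllables, otherwise a copy of the first two syllables placed in front of the word) can be sketched in Python. The sketch rests on a loud assumption that is not from the text: every vowel letter is counted as the nucleus of its own syllable, and any preceding consonants are its onset. The names VOWELS, syllables and reduplicate are likewise invented for illustration.

```python
import re

# Assumption: five plain vowel letters, each forming its own syllable nucleus.
VOWELS = "aeiou"


def syllables(word: str) -> list[str]:
    """Split a word into (C)V chunks: optional consonants followed by one vowel."""
    return re.findall(rf"[^{VOWELS}]*[{VOWELS}]", word)


def reduplicate(word: str) -> str:
    """Prefix a copy of the whole word, or of its first two syllables if it is longer."""
    parts = syllables(word)
    copy = word if len(parts) <= 2 else "".join(parts[:2])
    return f"{copy}-{word}"


for verb in ["kuya", "lau", "tagodu", "hedede"]:
    print(verb, "->", reduplicate(verb))
```

Whether lau and dou count as one syllable or two makes no difference here, since both analyses give full reduplication. The sketch covers only the form; the change of part-of-speech and of meaning is not modelled.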

Backformation

In backformation a shorter word is created from a longer one by removing a part that is wrongly taken to be an existing morpheme. From the noun television the verb televise was backformed by analogy with other pairs such as revise ~ revision, and incise ~ incision, that involve the nominal derivational suffix -ion (-ʒn̩). Another example is burger, which, as most speakers of English know, comes from hamburger. But the term was originally Hamburg-er, with the derivational suffix -er attached to Hamburg, the name of a city in northern Germany. Speakers reanalysed the word as ham-burger, doubtless interpreting the ham as the word for a type of meat. By analogy with pairs like ham sandwich ~ sandwich, the pair hamburger ~ burger is expected. Other examples are the verbs edit from editor, emote from emotion, and babysit from babysitter.

Meaning extension

Our final two processes do not involve changes to the form of a word. The first is meaning extension, which is the process of extending the meaning of an existing word, broadening it to embrace new senses. This is a quite common way of forming new words – new words because the meaning associated with the old form is a new one, and not fully predictable from the old sense. This can be exemplified by the word for ‘policeman’ in many Australian languages: in Walmajarri it is limpa ‘a type of biting fly that swoops down on person’; in Bardi (Nyulnyulan) it is liinyj ‘sour’. On the opposite side of the continent, Dhurga (Pama-Nyungan) has extended the word jungga ‘octopus’ (the eight arms of the law!), while Djabugay (Pama-Nyungan) has extended jun.gi ‘freshwater crayfish type’. Money, in Australian languages (not a traditional artefact), is often designated by the word for ‘rock, stone’, an extension based on the resemblance of coins to stones; these days the term shows, in many languages, a further extension to include paper money as well.

Examples of meaning extensions are not difficult to find in English either. The word holiday, for instance, comes from the compound holy-day, a day on which one did not work. It has extended to cover any day designated as work-free (as in today is a holiday, which happens to have been true at the time I first composed this sentence), and thence to the travel and other enjoyable activities one might perform on a work-free day (as in I will go for a holiday next month). A number of product brand names have extended their meaning to embrace any artefact of their type. An example is hoover; indeed, this word has extended further to the activity one typically performs with the instrument, namely vacuum-cleaning the floor.

Meaning narrowing

Meaning narrowing is the reverse of meaning extension: a word’s sense becomes restricted. For example, doctor in everyday spoken English shows narrowing from ‘person holding a doctorate degree’ to ‘person holding a doctorate in medicine’. To express the former meaning, and avoid confusion, one would normally use a more specific expression such as doctor of philosophy, doctor of science or PhD. (Doctor has subsequently extended its meaning to ‘qualified medical practitioner’, relaxing the requirement of having a doctorate in medicine.) The word doctor has perhaps narrowed due to the lack of a suitable term for the professional in the medical field: medical practitioner has a formal air, while terms like physician and surgeon have more specific senses. Not all narrowings are motivated by lack of a suitable term. In seventeenth-century English meat meant ‘food’, and flesh meant ‘meat’. The meaning of meat has since narrowed to one particular type of food, that derived from animals; flesh has at the same time shifted its meaning to refer to any soft part of an animal’s or person’s body, regardless of whether or not it is to be eaten. And food took the place left by meat.

4.4 Fixed expressions

Idioms

Recall that idioms are more or less fixed expressions like kick the bucket ‘die’, the meanings of which are not predictable from the component words or grammar (see p. 83), and which need to be listed in the lexicon. Some idioms are virtually unchangeable, like the Australian English Don’t come the raw prawn with me, meaning something like ‘don’t try putting that behaviour over me’ (also idiomatic). You can’t change this to the positive form *Come the raw prawn with me, or into a statement in past time *He didn’t come the raw prawn with me, or with different persons *I won’t come the raw prawn with you. Nor can you replace raw prawn by cooked prawn, raw meat, raw lobster or any other such expression. Most idioms are not so fixed, and allow at least some grammatical modifications. For example, give (someone) a piece of your mind can be modified in various ways, according to time, purpose of
the utterance and the persons involved: She gave me a piece of her mind, (Don’t) give her a piece of your mind and so on. Although you can modify the structure of this idiom somewhat (even to a limited extent rephrasing it – for example, A piece of my mind is what I intend to give her), you can’t do so as freely as for an ordinary non-idiomatic sentence like She gave the postman the wrongly addressed letter. Whereas you can rephrase the latter sentence as The wrongly addressed letter was given to the postman, you can’t rephrase the idiom as *A piece of your mind was given to her, or *She was given a piece of your mind. (The latter sentences invoke the non-idiomatic readings that something was handed over.) More interestingly, perhaps, idioms can often be exaggerated by either the addition of elaborating material, or replacement of a salient word, injecting life back into tired and worn-out expressions. Examples are take with a generous pinch of salt, take with a ton of salt, I’ll eat my hat . . . and shoes as well! and up shit creek in a barbed-wire canoe without a paddle – as if being up shit creek wasn’t bad enough! Even the invariant Don’t come the raw prawn with me can be modified by lexical replacement to Don’t come the uncooked crustacean with me! (Yes, I have actually heard this variant, which was initially uttered in a deliberate attempt to be humorous, and subsequently caught on for a while among a small in-group.) Similarly, Hold your horses! can appear as Hold your ponies! Like grammars – as Edward Sapir famously observed – idioms leak. Although the meaning of an idiom is not predictable from the words that make it up, in many cases it is possible to attribute some motivation for it. Idioms like don’t look a gift horse in the mouth, throw your weight around, I’ll eat my hat, hold your horses, get it off your chest and get off your high horse are not entirely arbitrary. 
Thus, you can imagine if you examine too carefully a horse given to you that you might find something wrong with it, and this idea is obviously behind the idiom. To be sure, some remain puzzling, if not downright inexplicable, like don’t come the raw prawn with me and kick the bucket. (Don’t ask me to explain either of these!) Presumably all languages have idioms. One domain where idioms are frequently found is in the expression of emotions. These idiomatic expressions often make reference to parts of the human body, as in the following examples:

(4-4)  ngiya-mad-ju barij wi    Ngarinyin (Worrorran, Australia)
       my-kidney-to rise it:is
       ‘I’m happy.’ (Literally, ‘My kidney is rising.’)

(4-5)  tôi đau lòng    Vietnamese
       I sick intestines
       ‘I’m broken-hearted.’ (Literally, ‘I’m sick in the intestines.’)

(4-6)  ti-n miīs    Paamese (Austronesian, Vanuatu)
       intestines-he he:cries
       ‘He/she feels sorry.’ (Literally, ‘His intestines cry.’)

(4-7)  ti-n tīsa    Paamese
       intestines-he he:is:bad
       ‘He/she is angry.’ (Literally, ‘His intestines are bad.’)

To wind up this section consider (4-8), a common idiomatic expression in Romanian. With a little thought you should be able to understand its motivation.

(4-8)  m-ai lovit în pălărie    Romanian
       me-have:you hit in hat
       ‘What you are saying (or doing) is so stupid that you fail to hit the target.’ (Literally: ‘You hit me in the hat.’)

Binomials

Some more or less fixed expressions have meanings that are fully predictable from the component words. Examples include so-called binomials, pairs of words typically linked by a conjunction, such as salt and pepper, pen and paper, up and down, cup and saucer, dead or alive. The words of these binomials come in a relatively fixed order; thus you normally say Pass the salt and pepper and I looked up and down the street, not Pass the pepper and salt or I looked down and up the street. Of course, you can say Pass the pepper and salt and I looked down and up the street, but that is not the way speakers of English normally phrase things. These binomials are said to be irreversible, in contrast to reversible binomials such as ice and snow, in which both orders are fairly common. A quick glance at the COCA corpus of American English (see p. 212 below) revealed 680 instances of snow and ice and 365 of ice and snow. By contrast, for salt and pepper the frequency was 5,890 against just 86 instances of pepper and salt, while there were 121 instances of cup and saucer but none of saucer and cup. As these figures suggest, the reversibility of a binomial is typically gradable rather than a simple yes/no matter. Irreversible binomials may take on meanings that are not completely predictable, and thus become idiomatic. This is illustrated by the following small selection: more or less ‘approximately’, give or take ‘approximately’, give and take ‘mutual concessions and compromises’, by and large ‘generally, typically’, sooner or later ‘eventually’, the short and curlies ‘pubic hair’, and from rags to riches ‘someone’s rise from a state of extreme poverty to one of great wealth’. (You should think of examples illustrating these meanings.) In all of these cases, however, there is clear motivation for the meaning of the binomial.

In some cases motivation in the meaning of a binomial is absent. This is the case, for instance, in spick and span ‘neat and clean’, odds and sods ‘various kinds of things, typically small and unimportant’, the quick and the dead ‘the living as well as the dead’. In some cases there is a historical explanation: for instance, the quick used to mean ‘the living’; and kith and kin ‘kinsfolk’ involves the now non-existent word kith, meaning ‘friend’ in this context.
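
The COCA counts quoted in this section lend themselves to a small worked calculation. The sketch below is offered as an illustration rather than as an established measure: it expresses the strength of a binomial's preferred order as the share of all attestations that follow the more frequent order. The function name dominance and the choice of this particular ratio are assumptions made here; only the counts come from the text.

```python
# Counts cited in the text from COCA: (first-listed order, reverse order).
COCA_COUNTS = {
    ("snow", "ice"): (680, 365),
    ("salt", "pepper"): (5890, 86),
    ("cup", "saucer"): (121, 0),
}


def dominance(count_a: int, count_b: int) -> float:
    """Share of attestations using the more frequent order (between 0.5 and 1.0)."""
    total = count_a + count_b
    if total == 0:
        raise ValueError("binomial is not attested in either order")
    return max(count_a, count_b) / total


for (first, second), (forward, reverse) in COCA_COUNTS.items():
    print(f"{first} and {second}: {dominance(forward, reverse):.3f}")
```

On these figures snow and ice comes out at roughly 0.65, salt and pepper at about 0.99 and cup and saucer at 1.0, which is one way of seeing that reversibility is a matter of degree rather than a yes/no property.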

4.5 What’s in a word?

Word taboos

‘A rose by any other name would smell as sweet’, said Juliet. True this may be, but not all words smell equally sweet, at least to speakers of a language. Words are not neutral; they carry emotional
overtones. Whether you use the word policeman, cop, bobby, fuzz, the filth, (a) John or pig you may be referring to an officer of the law; whether you say strike breaker, blackleg or scab you may be referring to someone who takes the place of workers who are on strike. But these words have different overtones, and your choice among them conveys your attitude. For the words we’ve just discussed, the overtones are associated with the words in particular uses: scab doesn’t have such negative overtones when used to describe the hard dried blood of a sore. But some words have particularly strong affective values; for example, shit, fuck and cunt are among the most highly charged words in English. They are often called ‘dirty’ words or ‘filthy’ words, although there is nothing intrinsically dirty (or for that matter, clean) about them, and there is nothing at all unpleasant about them phonetically or phonologically. (Fuck surely sounds no better or worse than duck, luck, fun or fuddle.) Nevertheless, these three words are felt to be in some sense ‘bad’ by speakers of English, who tend to avoid their use in ‘polite company’, and, for instance, when speaking to their mother, to a prospective employer or on a televised quiz-show. Indeed, in the USA it was not until 1926 that fuck was first printed openly; its use in printed sources has increased considerably in recent years. (As a foreigner in Denmark I have often been surprised by seeing fuck written in public places where its use would be considered offensive in Australia, Britain or America.) Words like fuck are called taboo words – taboo is a borrowing from Tongan (Austronesian, Tonga) tabu, which refers to actions or things that are prohibited by social or religious convention. In English and various other European languages many words relating to sexual activity, the genitals and some bodily functions and exuviae (liquid and solid exudations) are among the most tabooed words. 
In English, of course, there are more acceptable words for these activities (e.g. sexual intercourse) and body parts (e.g. penis) and products (e.g. urine, faeces, semen). Interestingly, it is the words inherited from Anglo-Saxon that are taboo; those borrowed from Latin (like faeces and vagina) tend to be accepted as ‘clean’ terms. Certain words with religious connotations are also tabooed in many cultures when used outside of the appropriate religious context. This is the case for words like God, and Christ – as a child I was often told ‘Don’t take the Lord’s name in vain’. There are other quite different types of word taboos. For example, in recent years terms with negative connotations for members of what are perceived as disadvantaged or oppressed groups have become increasingly tabooed. Examples include sexist, racist and religionist terms. Moving away from contemporary Western societies, terms for game animals are often taboo to hunters. In many Indigenous societies in New Guinea and Australia, there is a taboo on uttering the name of a recently dead person. In many Australian languages this taboo extends to a word that sounds like the name of the deceased. Thus, when a man named Djäyila died at Yirrkala (North-East Arnhem Land, Australia) in 1975 the verb djäl- ‘to want’ was tabooed, and replaced by duktuk-; after a few years djäl- started reappearing.

Euphemisms

Euphemisms are indirect or evasive expressions used to avoid direct mention of unpleasant or taboo ideas; euphemisms provide ways of avoiding being offensive by being evasive. A few examples are: pass away and go to sleep for ‘die’; bathroom (American English) and loo and, more recently,
washroom (Australian English) for ‘toilet, lavatory’; smalls and unmentionables for ‘underclothing’; and girl, working girl and woman of the street for ‘prostitute’. The word undertaker, which originally meant ‘odd-job man’, was used as a euphemism for someone whose job is to bury the dead. Its meaning narrowed to this sense alone, as often happens with euphemisms. A new euphemism is now replacing it – funeral director. The unpleasantness of touchy events or things is felt to be lessened by use of an indirect term, because it reminds one of something more pleasant. Euphemisms are commonly found in the domains around which taboos are often found, including sexual activity, sex organs, bodily functions and products, death, killing and other violent acts, and stigmatized social groups. But they are not restricted to these domains, and can be found for any sort of unpleasant reality: for example, honorariums instead of bribes, campaign contributions instead of graft, make redundant instead of sack, and tactical withdrawal instead of retreat.

Dysphemisms

Dysphemisms are the inverse of euphemisms: a euphemistic or neutral expression is replaced by a particularly direct or harsh term, with offensive overtones, often a taboo term. Examples of dysphemisms include: shithouse and boghouse for toilet (compare the euphemisms bathroom and loo); use of tabooed terms like fuckwit, cunt and shithead and terms for animals such as pig, sow, cow, snake, monkey and bitch in insults. A slightly different illustration of dysphemism is the use in many languages of terms like ‘rubbish’ and ‘worthless’ for the most sacred or important ideas or items – in this circumstance not necessarily with offensive overtones. In Nyulnyul the word riib ‘rubbish, no good’ could be used dysphemistically in reference to the most sacred religious objects. It is as though by using this term the powerful and potentially hazardous is trivialized, thereby losing some of its harmful potential. In this example the effect is more like the effect of a euphemism, while the form is dysphemistic.

Dysphemistic terms are not only used to insult or offend people. A lot depends on the context in which they are uttered. The strongest dysphemisms can, in certain contexts, be used as terms of intimacy and endearment. This is the case in the institution of mateship among Australian males, where the stronger the term of abuse the greater the intimacy expressed. Lovers sometimes employ dysphemistic terms for one another in the most intimate contexts.

Summing up

The lexicon of a language is a listing of its unpredictable signs, including all its lexemes, simple and complex (e.g. idioms), as well as the entirety of its grammatical items. This listing ideally provides information about the form and meaning of each. Speakers of a language arguably have mental lexicons in which these and other types of information are stored.

Words and morphemes are classified into parts-of-speech according to their grammatical behaviour, which varies from language to language. Widely found parts-of-speech include nouns, verbs, pronouns, adjectives, adverbs, prepositions and/or postpositions, conjunctions and interjections. Not all of these categories are found in all languages; indeed, it is not certain that any are universal. The lexicon of a living language is open, and new words are regularly added while old words may be lost. New lexemes can be constructed by inventing novel forms via processes such as clipping, acronyming, blending, borrowing and coinage. They can also be constructed by reusing old forms and processes, including derivation, compounding, reduplication and backformation. New lexical items can also be formed by extension and narrowing of the meanings of existing words. Words are sometimes attitudinally charged. Some are prohibited in particular circumstances; these are taboo words. Other illustrations of the affective values of words come from euphemisms and dysphemisms.

Guide to further reading

Parts-of-speech systems are dealt with in most introductions to linguistics, and in most grammars of particular languages. For more detailed discussion, see Evans (2000), Hengeveld et al. (2004), Rijkhoff (2007) and Schachter and Shopen (2007). Numerous books treat word formation in English. Among them Bauer (1983) and Marchand (1969) are recommended. Fuller details on the etymology of OK (including the debunking of a number of myths) can be found in Chapter 1 of Wilton (2009). This book provides a fascinating story of many urban myths concerning the etymologies of English words, and is well worth reading. On the history of English, see Bragg (2003), Pyles and Algeo (1993), Williams (1975), Singh (2005) and Trudgill (2023). Burridge (2004) is an accessible and entertaining book dealing with all of the topics mentioned in this chapter in relation to English. I recommend Wescott (1980) for a fascinating discussion of phonaesthesia (as well as a variety of unusual and often ignored linguistic phenomena), albeit mainly in English. Although sound symbolism is relatively marginal in mainstream linguistics, many distinguished linguists have studied it, including Edward Sapir (1929) and Roman Jakobson (1978; Jakobson and Waugh 1979). Hinton et al. (1994) contains articles on sound symbolism in a variety of languages. A good treatment of idioms is Fernando (1996). The reader is also referred to dictionaries of English idioms such as Kirkpatrick and Schwarz (1995), Speake (2002) and Spears (1990). Allan and Burridge (1991, 2006) are recommended as detailed but accessible accounts of euphemisms and dysphemisms. Bolinger (1980) also deals with euphemisms and dysphemisms, among other things, and is well worth reading. Sheidlower (1995) provides much information on one English taboo word, and words based on it.

Issues for further thought and exercises

1 What word-formation processes are illustrated by the following English words? Classify them according to the schemes of §4.2 and §4.3. Try making an educated guess first, then look up the word in a good dictionary.

   typo, teens, porn, asap, Reaganomics, wordsmith, galoot, peddle, doodad, karaoke, boatel, AC/DC, carpeteria, gargantuan, sandwich, brolga, Darwinian, alcohol, la-di-da, Frigidaire

2 Find out about the meaning and origin of the word googol. What sort of word formation process does it illustrate? Why do you think this word caught on? How would you account for googolplex?

3 A number of English words for large numbers are constructed with the ending -llion. What is the basis of these formations? Find as many such words as you can, and state their meanings. Are there any additional motivations for any of these terms?

4 It was mentioned in §4.3 that the word for ‘policeman’ in Walmajarri is limpa, the word for a certain type of fly. In the local dialect of Aboriginal English this same fly is referred to as a bolijman blai (policeman fly). What sort of word-formation processes does this illustrate?

5 List some idioms in a language you know, along with their meanings; determine what modifications (including exaggerations) they allow. Try to account for the meaning of the idiom. To what extent is the idiom motivated?

6 Make a list of as many binomial expressions as you can. Can you see any patterns in the ordering of the words, in which one word typically goes first? Can you find any trinomials: that is, expressions involving three lexical items typically connected by and?

7 Find a newspaper or magazine article reporting on a war. List the expressions referring to events involving the killing of people; classify the expressions as euphemisms, dysphemisms or neutral expressions. Are there any differences in the expressions – or their frequency – that are used for the killing of people on different sides? If there are differences, what do they reveal? What other euphemistic or dysphemistic expressions can you find in the article?

8 Slang is a somewhat imprecise term used for colloquial, informal or non-standard language. What are some examples of slang terms used by people in your generation? See what you can find about the slang of your parents’ generation. What similarities and differences do you find; are there any shared terms? How would you classify the expressions you collected in terms of the processes discussed in §4.2 and §4.3 above?

9 What word-initial phonemes or phoneme sequences do you think are phonaesthesic in English, or your mother tongue? What meaning do you intuitively feel is associated with them? Make up a list of words beginning with the sequences that support your intuitions.

10 Here are a few relatively recent scientific lexemes mostly culled from Scientific American. Do you know them? If not, can you guess their meanings? Check the internet to verify your understandings or hypotheses. What processes of word formation do they exemplify? To what extent is their meaning arbitrary or motivated?

   exoplanet, D-GPS, COVID-19, puffy planet, zeptoliter, hot Jupiter, geolocation, wiki, ADDL, Higgs boson, zoonotic, pre-bang universe, SPIs, picokelvin

11 Recent years have seen a spate of words ending in -aholic ~ -oholic, as in workaholic. Find as many of these words as you can (look on the internet). What sort of word-formation process is involved? What does -aholic ~ -oholic mean, and where does it come from?

Research project

Some languages, including Mandarin Chinese, Cantonese, Japanese, Korean, Lao and Vietnamese, have a set of numeral classifiers, sometimes called noun classifiers or just classifiers. (The latter terms, however, are used more broadly to refer to other types of classifier than numeral classifiers.) What are these words or morphemes, and what are their main functions? What motivates the label? Find a good grammar (and/or some other reference) of one of the above languages and write an essay describing the numeral classifiers in that language. Incorporate answers to the above questions in the introduction. The body of your description should discuss such matters as when and where the classifiers are used, criteria for identifying them, how many there are (and whether they form an open or closed class), and the meanings of the classifiers. Can you make any generalizations about the range of meanings expressed by these morphemes?

5 Structure of Sentences: Syntax

In the previous two chapters we examined the internal make-up of words and their classification into parts-of-speech. We turn now to the ways words can be put together to form sentences, and examine the structures and types of these units. Sentence structure in all human languages is complex, and (like morphology) varies considerably from language to language. However, in all languages two things are recurrent: the existence of units intermediate in size between words and sentences, and the existence of grammatical relations. Differences in these units and relations permit us to distinguish different sentence types within and across languages.

Chapter contents

Goals
Key terms
5.1 What is syntax?
5.2 Hierarchical structure in sentences
5.3 Syntactic units
5.4 The structure of clauses
Summing up
Guide to further reading
Issues for further thought and exercises
Research project

Goals

The goals of the chapter are to:

● introduce and explain three fundamental concepts of syntax: openness, grammaticality and hierarchical structure;
● present the fundamental syntactic units, and give criteria for their identification;
● show how the syntactic structures of sentences can be represented in tree diagrams;
● explain the need to identify grammatical relations;
● identify some of the major types of grammatical relation;
● illustrate by example some differences in the structure of sentences of the world’s languages; and
● remark on similarities and differences between morphology and syntax.

Key terms

Actor
adjectival phrase
clause
constituent analysis
embedding
experiential role
Event
grammaticality
grammatical relation
hierarchical structure
interpersonal role
noun phrase
Object
openness
phrase
postpositional phrase
prepositional phrase
sentence
Subject
textural role
Theme
Undergoer
verbal phrase

5.1 What is syntax?

Openness

In all human languages words can be put together in sequences to express meanings for which no separate words exist. This is because the range of complexities and nuances of meanings that a speaker might want to express – and distinguish from other possible meanings – is much larger than can be expressed by the lexical and morphological resources of any language. For instance, no human language would have a single word to express a meaning like that expressed by the previous sentence. Words and morphology alone are insufficient to make all the complex meanings and meaning distinctions people regularly need to make in thought and communication.

Syntax is concerned with the means available in languages for putting words together in sequences. Sometimes the term grammar is used instead of syntax, though more usually grammar is considered to cover not just syntax but also morphology; indeed, phonology and semantics are often included as well. In this way of using the terms, syntax is grammar above the level of the word. It should be clear from the previous two chapters that morphology and lexicon are not completely closed systems: languages can acquire new lexemes, even new grammatical morphemes (see §16.5). But these resources are somewhat limited, even in languages that are morphologically much more complex than English, such as Yup’ik, in which single words can express meanings that require full sentences in English, as in example (3-1), repeated as (5-1). Not all English sentences, however, can be expressed as single words in Yup’ik: as (5-2) shows, sentences in this language sometimes consist of more than one word (the glosses have been simplified somewhat).

(5-1)  kai-pia-llru-llini-u-k    Yup’ik
       be:hungry-really-past-apparently-statement-they:two
       ‘The two of them were apparently really hungry.’

(5-2)  tauku-t atsa-t tegu-k-ai    Yup’ik
       that-PL fruit-PL take:in:hand-PART-he→them
       ‘He took those pears.’

Syntax provides additional means of ‘opening up’ the grammatical system for the expression of new meanings, nuances of meanings, precision in meaning and links between ideas; it provides means for speakers to go beyond the limitations of the morphology and lexicon. Syntax enhances the creativity of expression in language. In terms of openness, the difference between syntax and the other domains is one of degree rather than kind. All grammatical systems, phonological, lexical, morphological and syntactic, are to some extent open; openness is most salient in syntax.

The notion of sentence

The sentence, as it is usually conceived in linguistics, is the largest linguistic unit showing grammatical structure, the largest unit over which grammatical rules or patterns apply; it is at the opposite end of the scale of grammatical items from the morpheme, the smallest grammatical unit. This understanding of the sentence goes back to the American linguist Leonard Bloomfield (1887–1949) who proposed that a sentence is a string of words not included in any larger linguistic form by virtue of grammatical structure. According to this criterion, (5-3) consists of two sentences, since the two components, (5-4) and (5-5), are grammatically independent of one another.

(5-3)  The fisherman hung the net on the fence. I saw her.
(5-4)  The fisherman hung the net on the fence.
(5-5)  I saw her.

To be sure, there are connections between the parts of the separate sentences: her in (5-3) is naturally interpreted as referring to the same person as the fisherman. But this is not by grammatical rule, and it is possible that the person seen was someone else.1 Nor does the fact that the person seen is the same person as the one who hung the net out force the speaker to use the third person pronoun her (or him): I saw the poor woman (still referring to the fisherman, although in this instance it is more likely that the person seen was someone else). Notice the difference from the situation for (5-6), where the grammatical form of the material following the comma is dependent on the preceding string of words: it can be only didn’t (s)he, did (s)he or didn’t they (if you want to be gender neutral). Thus you can’t use a different verb, such as isn’t, saw or hung, and preserve the structure, as you can in the case of separate sentences, as in (5-3). Nor can she in (5-6) refer to anyone other than the fisherman.

(5-6)  The fisherman hung the net on the fence, didn’t she?

The openness of syntax referred to in the previous subsection can be understood as the openness of the set of sentences of a language. The syntax of a language provides a ready-made system of principles for the construction (production by a speaker) and interpretation (understanding or interpretation by a hearer) of novel sentences – sentences that have never previously been uttered in the language, or at least have never been uttered or heard by the current speaker or hearer. These include sentences that express new meanings and sentences that express old meanings in new ways. This can be referred to as creativity in sentence formation – a somewhat limited sense of creativity, to be sure, but creativity nonetheless. Of course, not every sentence one utters or hears is novel; but speakers do fairly frequently produce novel sentences, most of which hearers find quite unremarkable. The invention of new words is a much less common phenomenon, and is more likely to strike hearers as unusual, humorous or smart.

Grammaticality

Not all possible strings of words in a language form grammatically acceptable sentences. While The fisherman hung the net on the fence does represent a grammatical sentence, the same words in a different order – for example, The the hung fisherman fence net on the – is clearly ungrammatical. It does not follow the grammar of English, and a speaker might well retort that the sentence makes no sense, or perhaps that the words come in the wrong order. Such strings of words are ungrammatical. It is standard practice to put a star before ungrammatical strings of words: *The the hung fisherman fence net on the. The notion of grammaticality should not be confused with meaningfulness or interpretability. Noam Chomsky’s famous Colourless green ideas sleep furiously is a fully grammatical sentence of English, although it makes little sense and is self-contradictory; it could hardly designate any ongoing situation in the real world.2 And Lewis Carroll’s Jabberwocky (§4.2) consists of fully grammatical sentences, though it is ‘nonsense verse’. By contrast, Fisherman hanged net on fence is not a grammatical sentence, although no speaker of English would have the slightest difficulty understanding it.

The notion of ungrammatical sentences is useful for revealing things about the syntax of a language. What is grammatical needs to be seen in the context of what is not, if one is to produce a revealing and complete description of the syntax of a language. As we find elsewhere in linguistics (indeed in science generally) paying attention to where things go wrong can reveal insights about situations in which they go right, insights that might not be perceived if attention had been directed exclusively to what is normal.

Caution must be observed in using ungrammatical sentences in syntactic arguments. The borderline between what is grammatical and what is ungrammatical is not always clear-cut, and a linguist who relies exclusively on their own intuitions is liable to be misled by their own presuppositions, or by failure to properly interpret a string of words. Here is an example from personal experience. Standard accounts of English grammar say that tag questions can be added to statements (as in The fisherman hung the net on the fence, didn’t she?) and commands (as in Hang the net on the fence, will you?). One infers from this that tags can’t be added to questions (if it is not said, then it is not so); indeed, some linguists have explicitly claimed that such sentences are ungrammatical, and star sentences like *Are you going now, are you? In the early 1990s I began to notice examples of this type in the speech around me, and over the next few years collected some hundreds of instances. Clearly these sentences were grammatical, and not errors. Some grammarians responded ‘not in my dialect’: this was a peculiarity of Australian English, they said, that did not occur in British English. Nevertheless, British television programmes such as The Bill revealed examples in British English that showed the same properties as their Australian counterparts.

5.2 Hierarchical structure in sentences

Grouping

We now have three types of grammatical unit at our disposal for describing the syntactic structure of a language: sentences, words and morphemes. Are these sufficient? Can we provide a complete account of the syntax of sentences as strings of words and/or morphemes coming one after the other? Evidence suggests not, that we need to recognize other units intermediate in size. Consider the English sentence (5-7).

(5-7) The train chugged along the line through the mountains.

Some morphemes and/or words seem to belong together: for instance, the first the is naturally interpreted as belonging with train rather than with chugged or along. At minimum, it seems reasonable to identify three groups of words or morphemes in (5-7): the train, chugged and along the line through the mountains. Within the third group another group can be recognized, the line through the mountains, within which in turn through the mountains forms yet another word group. Such descriptions in everyday English quickly become cumbersome and difficult to understand, and it is useful to represent groupings of morphemes/words in diagrammatic form. (5-8) shows one form of representation. In such figures the groups are indicated by sets of vertical lines that are connected by a horizontal line.

(5-8)

Other styles of figure are also used. (5-9) illustrates the most frequently used type, which represents the groups by connecting them together with slanted lines, thus ∧ instead of ⊓. In this representation, just two branches meet at any node; this means that more word groups are recognized than in (5-8), and the structure is more hierarchical. You can of course always redraw (5-8) to represent the same hierarchical structure as (5-9), and vice-versa – try it!

(5-9)

Representations like (5-8) and (5-9) are called tree diagrams, or simply trees. Tree diagrams enjoy a prominent place in syntax. Figures like (5-9), using mostly two-way branches, were popular in American linguistics during the 1930s–50s, and are associated with an approach called Immediate Constituent Analysis (or IC Analysis). Modified versions still enjoy considerable popularity, particularly within formal syntax (see §1.4). Trees like (5-8), with more rake-like (and less hierarchical) structures, tend to be used in functional grammars; they are sometimes said to represent string constituent analysis. As we will see later in the chapter, there are reasons to prefer the string constituent type analysis.
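The groupings described for (5-7) can also be sketched as nested lists, which may help when checking a tree by hand. This is only an illustration: the nested-list encoding and the helper names leaves and groups are assumptions of the sketch, not notation used in this book, and the bracketing follows the prose description of (5-7) rather than the exact shape of (5-8) or (5-9).

```python
# Word groups for (5-7), following the prose description: three top-level
# groups, with two further groups nested inside the third.
tree = [
    ["the", "train"],
    ["chugged"],
    ["along", ["the", "line", ["through", "the", "mountains"]]],
]


def leaves(node):
    """Return the words under a node in left-to-right order."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node:
        words.extend(leaves(child))
    return words


def groups(node):
    """Return every bracketed group under a node, each as a string of words."""
    if isinstance(node, str):
        return []
    found = [" ".join(leaves(node))]
    for child in node:
        found.extend(groups(child))
    return found


for group in groups(tree):
    print(group)
```

The first group printed is the whole sentence; a string such as chugged along never appears, because no bracket in the tree encloses exactly those two words.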

Evidence for groupings of words

Grammarians are not satisfied with grouping words together merely on intuitive grounds. They demand evidence from the language. Three main considerations – if you like, three main ‘tests’ – provide evidence for groupings: movability, contractibility and structural ambiguity. We deal with these in turn.

Movability

The idea behind movability is that if certain words always move about together in a sentence they constitute a single group: since they hang together and can’t be split apart, they presumably belong together. That is, if we compare a sentence with similar-meaning sentences involving the same lexical items in different orders, and find that certain words always cluster together in the same way, this suggests that they form a word-group. Returning to (5-4), repeated as (5-10), compare sentences (5-11)–(5-15):

(5-10) The fisherman hung the net on the fence.
(5-11) On the fence the fisherman hung the net.
(5-12) It was on the fence that the fisherman hung the net.
(5-13) The net was hung on the fence by the fisherman.
(5-14) It was the fisherman who hung the net on the fence.
(5-15) It was the net that was hung on the fence by the fisherman.

Examples (5-11) and (5-12) show that on the fence behaves as a single unit; (5-13)–(5-15) show that the fisherman and the net each separately form a single grouping of words. By contrast, the words the and fisherman always go together; they can’t be shifted around independently of one another, and separated by other words from the same sentence. Thus the ungrammaticality of *The hung fisherman the net on the fence. This criterion is a good, though imperfect guide to word groupings; grammatical criteria (or tests) are rarely perfect. Sometimes word-groups can be split up, as illustrated by (5-16), which shows that on can be separated from the fence.

(5-16) It was the fence that the fisherman hung the net on.

What you do not find, however, is that words that do not form a group together always move around in concert. For instance, the three words the net on do not behave in this way, as revealed by the unacceptability of the following:

(5-17) *The net on was hung the fence by the fisherman
(5-18) *It was the net on that was hung the fence by the fisherman

Contractibility

Contractibility is the potential for a string of words to be replaced by a single word. In (5-10) we can replace the fisherman by she, the net by it, and on the fence by out or up: She hung it out.

The idea behind this is that if the string can be replaced by a single word it behaves as a single word, which we know is a grammatical element. Thus the string behaves like a single grammatical item, and so the component words together form a single syntactic group. Again this criterion is imperfect: in (5-7) it is not clear that a single word could replace through the mountains, or indeed the line through the mountains (it perhaps works marginally – The train chugged along it). Nevertheless, replacement of non-groupings of words is not possible. You can’t replace chugged along the by a single word.

Meaning differences

A single stretch of speech or writing sometimes has two or more distinct meanings – like bank ‘side of a watercourse’ and bank ‘a financial institution’. This is called ambiguity. In some cases a string of words admits two or more interpretations that can be explained by different groupings of the morphemes or words. For example, we could explain the different interpretations of The policeman shot the man with a rifle in this way. In one interpretation the man with a rifle forms a single word-group, specifying a man carrying a rifle. In another interpretation the rifle was used to shoot the man, in which case the man with a rifle is not a single word-group, but two. (5-19) and (5-20) show tree diagrams for the two different analyses, respectively.

(5-19)

(5-20)
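The two analyses of The policeman shot the man with a rifle can be compared mechanically. The sketch below encodes each reading as nested lists and collects the word groups each one recognizes. The encoding and the names reading_a, reading_b, words and word_groups are assumptions made for illustration, and the bracketings follow the prose description rather than the exact shape of the tree diagrams.

```python
# Reading A: "the man with a rifle" is one word group (the man carries the rifle).
reading_a = [["the", "policeman"], ["shot"], ["the", "man", ["with", "a", "rifle"]]]

# Reading B: "the man" and "with a rifle" are separate groups (the rifle is the instrument).
reading_b = [["the", "policeman"], ["shot"], ["the", "man"], ["with", "a", "rifle"]]


def words(node):
    """Flatten a node into its words, left to right."""
    if isinstance(node, str):
        return [node]
    return [w for child in node for w in words(child)]


def word_groups(node):
    """Collect every bracketed group under a node as a set of strings."""
    if isinstance(node, str):
        return set()
    result = {" ".join(words(node))}
    for child in node:
        result |= word_groups(child)
    return result


# Same string of words, different groupings.
print(words(reading_a) == words(reading_b))
print(sorted(word_groups(reading_a) - word_groups(reading_b)))
print(sorted(word_groups(reading_b) - word_groups(reading_a)))
```

The word strings are identical, so the difference in meaning can only be located in the groupings: each reading recognizes exactly one group that the other lacks.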

5.3 Syntactic units

Syntactic units are grammatical items showing unified behaviour – that is, items that behave as indivisible wholes. Words, morphemes and sentences are syntactic units. So are the intermediate word-groups discussed in the previous section. In this section we say a bit more about these intermediate units, distinguishing types according to their size. Units of two intermediate sizes exist between words and sentences: phrases and clauses. This gives us a hierarchy of units according to increasing size: morpheme, word, phrase, clause and sentence.

Clauses

Sentences come in a variety of types ranging from the utmost simplicity of single morphemes (for instance, interjections such as hey! and yuck! – see p. 85) to complex syntactic configurations. Sentences like (5-7) and (5-10) above are what could be called simple sentences; they contain just one verb, and specify a single event. Simple sentences can be joined together to form complex sentences like (5-21) and (5-22), which refer to combinations of events; often, as in these examples, words like when and and are used to connect the two parts.

(5-21) The car skidded when it hit the oil slick.
(5-22) The fisherman hung the net on the fence, and the farmer pulled the plough into the shed.

Sometimes the simple sentences that are put together to form a complex sentence need to be modified in some way. We can, for instance, combine (5-10) with (5-23) to get (5-24), but adjustments are necessary to (5-23): the fisherman must be replaced by she, he or they; the fence should be omitted; the word that can be used to connect the sentences, although it is optional.

(5-23) The fisherman made the fence last year.
(5-24) The fisherman hung the net on the fence that she made last year.

A string of words that is either a simple sentence or a modified form of a simple sentence is called a clause. In many languages, clauses come in two main types. First, we have a simple single-morpheme type, consisting of a word used as an interjection. Such clauses, sometimes called minor clauses, have the simplest structure – effectively none. The second type has (when complete, and nothing is omitted) a verb and accompanying nouns, and refers to an event in the real world, or some imaginary world. These latter, which can be called major clauses, are either independent (i.e. they can stand alone as independent sentences), or dependent (can’t stand alone as independent sentences, but correspond to clauses that can).
In some ways the clause is the most fundamental unit of grammar; moreover, it displays many intriguing syntactic properties; we examine its structure in §5.4 below.

Phrases

Nature of phrases

In §5.2 we argued for what can now be seen as units intermediate in size between words and clauses. Such intermediate units are called phrases. Phrases are groupings of words that do not normally constitute complete clauses, just parts of clauses. In (5-25) the phrase-sized units we identified in example (5-7) are labelled by Ps, the clause by C.

(5-25)

Consider now the following example: (5-26) The trains chugged slowly along the line through the mountains. The tree structure for this example is shown in (5-27). Comparing this with (5-25), it seems reasonable to suggest that the single word chugged in the latter is actually a reduced phrase, in the same way train is both a morpheme and a word. Putting things the other way around, we can say that single words can be recognized as phrases provided that they can be expanded into larger units comprising more than one word. This observation permits some useful syntactic generalizations that we could not otherwise make. In particular, clauses are made up of phrases, that are in turn made up of one or more words. (5-27)

Types of phrase Phrases can be grouped together into different types according to their internal structure. In the next two subsections we deal with two important phrase types that are found in many languages, noun phrases and verb phrases. Then we briefly mention a few other phrase types that are less widespread across languages. Given that nouns and verbs are not separate parts-of-speech in all languages it is possible that noun phrases and verb phrases might not be distinct in all languages. We do not address this issue, on which there are differences of opinion; instead we content ourselves with those languages that do draw the distinction.

Noun phrases

Noun phrases – henceforth NPs – are phrases like the train, the line through the mountains, the farmer and so on. These are made up of a noun, which is usually the most important word in the phrase, possibly together with one or more other words or morphemes. An NP refers to some entity, concrete (like a person, animal, tree) or abstract (perhaps an emotion or idea), in a real or imaginary world. NP structure in English, Swedish, Mandarin Chinese or any language is usually far from simple, and it would be impossible in an introductory book such as this to provide a comprehensive description of the complexities in all languages. Instead, let us look at a few simple examples in Māori (or te reo ‘the language’, Austronesian, New Zealand), and see how they can be described as sequences of words of particular parts-of-speech. Here are the examples:

(5-28) Māori
wahine pai nei        ‘this good woman’
te tuuru roa          ‘the tall chair’
tooku tuuru pai na    ‘that good chair of mine’
tooku wahine nei      ‘this woman of mine’
te wahine pai         ‘the good woman’

Based on these few examples (the reality is not so simple!), it appears that an NP in Māori can consist of up to four words, including a noun (wahine ‘woman’ and tuuru ‘chair’). The noun can be preceded by a determiner (te ‘the’) or possessive pronoun (tooku ‘my’), and can be followed by either an adjective (pai ‘good’ or roa ‘tall’), or a demonstrative (nei ‘this’ and na ‘that’). A full NP would look like one or the other of the following structures: (5-29)
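The description just given can be tried out against the data in (5-28). The sketch below is a rough check under stated assumptions: the part-of-speech labels come from the paragraph above, but the single pattern (determiner or possessive) noun (adjective) (demonstrative) is inferred from the five examples alone and may not match (5-29) exactly; the names LEXICON, NP_PATTERN and is_np are invented for the illustration.

```python
import re

# Parts of speech as identified in the text for the words in (5-28).
LEXICON = {
    "wahine": "N", "tuuru": "N",
    "te": "DET", "tooku": "POSS",
    "pai": "ADJ", "roa": "ADJ",
    "nei": "DEM", "na": "DEM",
}

# Assumed pattern: optional determiner or possessive, the noun,
# then an optional adjective and an optional demonstrative.
NP_PATTERN = re.compile(r"(DET |POSS )?N( ADJ)?( DEM)?")


def is_np(phrase):
    """Check whether a string of known words fits the assumed NP pattern."""
    tags = " ".join(LEXICON[word] for word in phrase.split())
    return NP_PATTERN.fullmatch(tags) is not None


examples = [
    "wahine pai nei",
    "te tuuru roa",
    "tooku tuuru pai na",
    "tooku wahine nei",
    "te wahine pai",
]
for phrase in examples:
    print(phrase, is_np(phrase))
```

All five attested phrases fit the pattern, while a reordered string such as pai wahine does not. A word outside the small lexicon raises a KeyError, which is a reminder of how little of the language the fragment covers.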

Verb phrases

Verb phrases (VPs) are groups of words and morphemes like chugged, was chugging, might chug, might have been chugging slowly and so on. VPs contain a lexical verb, which conveys the most important lexical information, usually along with other morphemes, grammatical and/or lexical, bound and/or free. Whereas NPs refer to entities, VPs refer to the events these entities are involved in; these events are specified by the central item, the lexical verb. Again we illustrate how VP syntax can be described by examining a small fragment of Northern Sotho. Below are our examples:

(5-30) Northern Sotho
o-rêk-ilê             ‘he/she bought it (e.g. meat)’
o-tlô-rêk-a           ‘he/she will buy it’
o-bê a-rêk-a          ‘he/she was buying it’
o-rêk-a               ‘he/she buys it’
o-tlô-ba a-rêk-a      ‘he/she will be buying it’
o-bê a-rêk-ilê        ‘he/she had bought it’
o-tlô-ba a-rêk-ilê    ‘he/she will have bought it’

It is obvious that the lexical verb ‘buy’ is rêk. Assuming it is a typical verb, we can hypothesize that VPs in plain tenses are simple inflected verbs involving one or two prefixes and a suffix. By comparing the forms we can identify o- and a- as prefixes, though it is not possible to determine whether they are allomorphs or distinct morphemes; nor can we assign a meaning to them. In fact, these prefixes provide information about the identity of the subject. There are also affixes indicating tense – a prefix and suffix for the future, and a suffix for present and past. The two morphological structures can be specified as SUB-V-PRS/PST and SUB-FUT-V-PRS, where SUB stands for the prefix indexing the subject. The verbs expressing the more complex relative tenses – where the time of the event is specified in relation to a reference point of time other than the time of speaking, either in the past or future – involve another word that inflects rather like the verb ‘buy’: it takes a subject prefix and the future prefix, but has apparently irregular root forms in the past and future. It is presumably an auxiliary verb. Assuming that o- and a- are allomorphs selected by the main verb rêk ‘buy’ and the auxiliary, respectively, an approximate description of the VP would be:

(5-31) (SUB-(FUT)-AUX) SUB-(FUT)-V-PRS/PST

This formula is imprecise: impossible combinations, such as the FUT prefix with the PST suffix, are not excluded. Nevertheless, it shows a syntactic pattern that all acceptable VPs in our data follow. (As an exercise, try to devise a formula that does exclude the non-occurring forms.)
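Formula (5-31) can be written as a regular expression over morpheme labels, which makes both its coverage and its imprecision easy to see. The sketch below rests on assumptions: the label strings assigned to the seven forms in (5-30) are a reading of the analysis above (with -a taken as PRS and -ilê as PST), and the names VP_FORMULA, ATTESTED and fits_formula are invented for the illustration.

```python
import re

# (5-31): (SUB-(FUT)-AUX) SUB-(FUT)-V-PRS/PST, with optional parts marked by "?".
VP_FORMULA = re.compile(r"(SUB-(FUT-)?AUX )?SUB-(FUT-)?V-(PRS|PST)")

# Assumed label strings for the forms in (5-30).
ATTESTED = {
    "o-rêk-ilê": "SUB-V-PST",
    "o-tlô-rêk-a": "SUB-FUT-V-PRS",
    "o-bê a-rêk-a": "SUB-AUX SUB-V-PRS",
    "o-rêk-a": "SUB-V-PRS",
    "o-tlô-ba a-rêk-a": "SUB-FUT-AUX SUB-V-PRS",
    "o-bê a-rêk-ilê": "SUB-AUX SUB-V-PST",
    "o-tlô-ba a-rêk-ilê": "SUB-FUT-AUX SUB-V-PST",
}


def fits_formula(labels):
    """Check whether a string of morpheme labels is licensed by (5-31)."""
    return VP_FORMULA.fullmatch(labels) is not None


# Every form in the data is licensed by the formula.
print(all(fits_formula(labels) for labels in ATTESTED.values()))

# The formula also licenses a combination that does not occur in the data.
print(fits_formula("SUB-FUT-V-PST"))
```

Both checks succeed, which is exactly the imprecision noted above: the formula covers all the data but also admits the FUT prefix together with the PST suffix.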

The reader should be aware that in many theories of syntax, especially formal theories (see §1.4), VPs include not just the verb and its closely associated auxiliaries and the like, but also many of the accompanying NPs – in fact, often everything bar the subject. According to this type of analysis, hung the net on the fence would be a VP in (5-10), rather than just hung. This analytical difference is also reflected in the different types of tree diagrams employed, whether string types like (5-8), or IC type trees like (5-9). The arguments are too complex to deal with in an introductory text. Suffice it to say that in favour of the string constituent analysis adopted here is the observation that a meaningful grammatical role can be associated with the VP in examples such as (5-7) and (5-10) – it specifies an event (see further p. 122). No such meaningful role is associated with the larger VP containing everything other than the subject. (A possible contender for a role for this VP would be the role Predicate; however, it is not easy to understand this as anything but a purely formal role: what might it mean?)

Other phrase types

In example (5-7) we have one VP (chugged) and three NPs (what are they?). There remain two phrases that are neither NPs nor VPs, along the line through the mountains and the included through the mountains. Both are clearly made up of a preposition and an NP. Phrases like this are called prepositional phrases (abbreviated PPs). Not all languages have PPs, for the simple reason that not all languages have prepositions. Some languages (e.g. Hungarian, Japanese and Ngarinyin) have postpositions instead, in which case we can speak of postpositional phrases. Some languages have both prepositional and postpositional phrases, while others have neither part-of-speech, so neither phrase type. Two other types of phrase found in some languages are adjectival phrases (AdjPs) and adverbial phrases (AdvPs). AdjPs in English have an adjective and a modifier indicating degree or intensity as in very tall, quite rich and somewhat stupid. AdvPs have an adverb and a modifier again indicating degree, as in very badly and excessively well.

Complications NPs and VPs sometimes have more complex structures than is accounted for in the preceding discussion. For example, our PP along the line through the mountains involves a PP within a larger PP. This is called embedding: the PP through the mountains is embedded in the larger PP. Embedding of phrases within other phrases is quite common in English and many other languages. In English, if a PP is embedded in an NP or PP it usually comes at the end of the phrase, as in the house on the hill, the man on the moon, the end of the universe and so on. An NP indicating a possessor can also be embedded in another NP, as in: the old woman’s three cats, the new film’s boring ending, the new president’s flight to the Arctic and so on. Example (5-32) shows the structure of the last example. (Note that here PP indicates both postpositional phrase – recall that the possessive -’s of English is a phrasal enclitic, thus effectively a bound postposition – and prepositional phrase.) (5-32)

A second complication is that phrases can be conjoined by conjunctions (see p. 85) such as and and or to form more complex structures, as in a word and a number, an instruction booklet and the necessary cables and from the cities and from the towns. Within phrases words can also be conjoined, as in boys and girls, salt and pepper, big and little people, swam and played and might have been tarring and feathering. Example (5-33) shows two possible structures of old men and women, according to the two possible interpretations, depending on whether it is NPs that are conjoined (as in (a), where women forms a full NP that could be filled out with, for example, young) or it is words that are conjoined (as in (b)). In the former case old applies only to men; in the latter, old applies to both nouns men and women. (5-33)

5.4 The structure of clauses

Fundamentals

Description of the clause in terms of phrases

The grammatical notions developed in the previous section permit us to describe clauses as sequences of phrases of various types, in a similar way to our descriptions of NPs and VPs as sequences of words. In this way we can capture similarities among a range of different clauses. Thus the tree diagram in (5-34) captures the structure of both (5-7) and (5-26).

(5-34)

It also characterizes innumerable other sentences, depending on the choice of NP, VP and PP. For example:

(5-35) The dog was running.
(5-36) The little child squealed with joy.
(5-37) The train goes in the morning.
(5-38) The door squeaked on its hinges.
(5-39) The little child listened carefully to the story.

Not every English clause, of course, satisfies (5-34). Here are just a few additional patterns, with a single example of each (it is left to you to draw the tree diagrams):

● VP NP PP
  (5-40) Is the locomotive in the shed?
● PP VP NP
  (5-41) On the corner stands a statue.
● NP VP NP
  (5-42) Marlowe embraced his assailant.
● VP NP NP
  (5-43) Is the president Bill Clinton?
● NP VP NP NP
  (5-44) The teacher will give his wife a gift of considerable value.
● INTER VP NP (where INTER stands for interrogative or WH-word, e.g. what, who)
  (5-45) What is that thing?
● INTER VP NP PP
  (5-46) When was the locomotive on the line through the mountains?



This is a very small selection from the range of patterns available in English – for instance, one or more additional PPs can be added to many of the above, and we have not yet brought AdjPs or AdvPs into the picture. Think of some additional patterns yourself, give examples and draw tree diagrams.

Problems

It is not difficult to find clauses that don’t lend themselves so readily to descriptions of this type. How, for instance, would you give a general description of the syntactic patterns shown by the following very simple clauses?

(5-47) Should we go tomorrow?
(5-48) Does the teacher like strong chilli?
(5-49) Will the teacher give her husband/wife a valuable gift?
(5-50) When did the train travel on the line through the mountains?

Evidently you need to look into the structure of some of the phrases. It is impossible to account for clause patterns in English entirely by reference to phrase-sized units, ignoring their internal composition. Specifically, the auxiliary verb in each is separated from the main verb by an NP. The additional clause patterns can be easily listed: AUX NP V PP, AUX NP V NP, AUX NP V NP NP and INTER AUX NP V PP. This fails, of course, to show that the AUX and V belong together as part of the same VP. One way of dealing with this would be to use labelled brackets, as in: [AUX]VP NP [V]VP PP. It will be obvious by now that this approach will result in a very long list of different structures. It is also obvious that important generalizations will be missed if the types are merely listed. Thus, the examples so far show that when a clause begins with an INTER, the first NP always follows the first word of the VP, which will be either the main verb (if it is be) or an auxiliary (otherwise). Recognition of this as a grammatical rule would lead us to predict that some patterns – for example, INTER NP VP – are impossible. We can then search for examples to test whether or not this is so, giving us a more powerful method of investigation than searching randomly for new patterns. (Can you find grammatical clauses satisfying this pattern predicted to be ungrammatical?)
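The proposed rule can be stated as a small check over patterns written with the word-level labels used above (INTER, AUX, V, NP, PP). This is a sketch under assumptions: the rule is read as requiring only that some word of the VP come before the first NP in a clause that begins with INTER, and the names VERBAL and obeys_inter_rule are invented for the illustration.

```python
VERBAL = {"AUX", "V"}  # labels for the words that make up a VP


def obeys_inter_rule(pattern):
    """Return True unless an INTER-initial pattern has its first NP
    before any word of the VP."""
    labels = pattern.split()
    if labels[0] != "INTER" or "NP" not in labels:
        return True  # the rule only constrains INTER-initial clauses with an NP
    first_np = labels.index("NP")
    return any(label in VERBAL for label in labels[:first_np])


for pattern in ["INTER AUX NP V PP", "INTER V NP", "INTER NP V", "AUX NP V NP"]:
    print(pattern, obeys_inter_rule(pattern))
```

Stating the rule this way turns the prediction into something testable: any grammatical clause whose pattern fails the check would count as evidence against the rule, which is the search the exercise above invites.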

A better way of thinking about clause structure than the diagrams we used above is to imagine the trees as three-dimensional objects like real trees, rather than two-dimensional ones. To remain closer to figures like (5-34), they can be thought of as mobiles such as are sometimes found above an infant’s cot. Units like NPs and VPs can be imagined as rods of a mobile, together with strings and attached objects. The grammatical rules can then be regarded as ways of projecting the three-dimensional mobile onto a two-dimensional representation – as rules taking the abstract units and putting the elements (the words) in the correct sequence. (Pursuing our analogy, this can be likened to shining a light from a particular position to get a shadow on the wall; shining it from different positions will give different shadows.) This allows us to operate with a more general, though slightly weakened, criterion of movability than the one given in §5.2: the bits must move around in concert, although they need not necessarily stick together in the projection onto word sequences.

Grammatical relations An even more serious problem with the type of description outlined in the previous subsection is that, although it captures generalizations about the possible forms of clauses, it fails to reveal anything about their meanings. It leaves completely out of account the systematic similarities and differences in meaning among the clauses. The syntactic patterns in that mode of description are no more than specifications of possible formal shapes, related only by virtue of the fact that they involve similar component units. By recognizing grammatical roles or relations associated with the formal elements it is possible not just to account for differences of meaning expressed by formally related sentences, but also to describe clausal syntax in a way that goes beyond a mere listing of alternatives. In what follows we identify three different types of grammatical relation, which express fundamentally different types of meaning.

Experiential roles Consider clauses (5-51)–(5-53). These exhibit three different syntactic patterns in terms of units and their combinations: each has the same three types of unit (an NP, a VP and a PP), but in different orders. (5-51) The train is leaving from platform two. (5-52) Is the train leaving from platform two? (5-53) On platform two the train is leaving. Each clause describes an ongoing situation, and the NP the train specifies the thing that is engaged in it, the thing that is moving or about to move. By contrast, in (5-54) and (5-55) – which are identical with (5-51) and (5-52) in terms of their patterns of phrases – the NP refers to something that is acted on, rather than something that does or performs an activity. (5-54) The train was shunted from platform two. (5-55) Was the train shunted from platform two? We can account for these similarities and differences with the notion that the same NP the train serves in two different grammatical roles or relations – also called functions – in the two sets of sentences. In (5-51)–(5-53) the NPs are Actors: their function in the clause is to indicate the doer of the event. In (5-54) and (5-55) they are Undergoers: they designate the patient or sufferer of the event, something the event happened to or impinged on.

The terms Actor and Undergoer are not just intuitively meaningful labels; they are labels for grammatical roles – elements of the grammatical structure of English clauses. The terms are given with initial capitals for this reason, to make it clear that we are dealing with grammatical phenomena, not merely with intuitively identified meanings. You can’t just call an NP an Actor or Undergoer by inspection of isolated examples. For instance, in the famous linguist died intuition suggests that the linguist (the person, not the phrase!) was more of an undergoer than an actor. But in the grammar of English the famous linguist (now the phrase) serves in the same role as it does in the famous linguist climbed the mountain. The actual meanings of the grammatical roles Actor and Undergoer are not to be confused with the meanings of the corresponding lexical items actor and undergoer. They are to be found by studying clauses with the roles, not dictionary meanings of the terms. Perhaps this all appears very abstract, even fanciful: the grammatical role is not something you can tell by inspection of the linguistic form, and its meaning is difficult to pin down precisely. But even if we don’t have direct indication of the roles, there is indirect evidence for them. We have identified them using an argument reminiscent of the ambiguity of meaning test for units (§5.2), where the idea was that a single ambiguous string of words might allow different divisions into units – different structures. Here the same style of argument has been reused, but at the level of phrase patterns. The same pattern of phrases has different meanings: examples (5-51)–(5-53) and (5-54)–(5-55) all show the pattern NP VP PP, but they differ systematically in meaning – and therefore show different structures. The structural difference cannot be in terms of division into units (they are precisely the same); it has to be something else: the functions of those units within the clause. 
In some languages the situation is more obvious, and the roles are overt rather than covert. In Acehnese (Austronesian, Sumatra) the roles of Actor and Undergoer are distinguished morphologically, and their meanings are closer to the senses suggested by the labels. Actors are distinguished by an agreeing prefix to the verb, as shown by (5-56); Undergoers optionally have an agreeing suffix, as shown by (5-56) and (5-57).

(5-56) Acehnese
gopnyan geu-mat lôn
(s)he (s)he-hold me
‘(S)he holds me.’

(5-57) Acehnese
gopnyan geu-mat-lôn
(s)he (s)he-hold-me
‘(S)he holds me.’

A clause describing controlled movement has an Actor, as in (5-58); if the movement is uncontrolled, it has an Undergoer, as in (5-59).

(5-58) Acehnese
geu-jak gopnyan
(s)he-go (s)he
‘(S)he goes.’

(5-59) Acehnese
lôn rhët(-lön)
me fall(-me)
‘I fall.’

The two grammatical roles Actor and Undergoer are fundamental in many languages, perhaps even universal. In most languages the majority of clauses have at least one of them: that is, at least one is obligatory.3 Also obligatory is a VP. In clauses like most of those discussed above, this refers to an event; associated with the VP is the grammatical role Event. These three roles then give us a handle on the clause in terms of the way our world of experience is interpreted and construed. The clause is structured so as to express this general type of meaning, called experiential or representational meaning. Roles like Actor, Undergoer and Event are accordingly experiential roles.

Subject and object It is unlikely that just these three grammatical relations – Actor, Undergoer and Event – are sufficient to describe the syntax of any language, let alone all languages. At least in some languages – for instance, many languages of Europe – Subject, and perhaps also Object, are also required. Comparing (5-60) and (5-61) we see that the tourist is Undergoer in each clause. But the tourist in (5-61) also shares some grammatical behaviour with the Actor NP the sniper in (5-60). First, they occur in initial place in the clause, immediately preceding the verb. Second, the verb in each sentence agrees, to a limited extent, with this NP. Third, both NPs could be replaced by nominative pronouns he or she, not him or her. And finally, if a tag is added, its pronoun picks out these NPs – we could add didn’t she? to (5-60) and wasn’t she? to (5-61). These commonalities in behaviour motivate identifying Subject as a grammatical relation in English, distinct from Actor. (5-60) The sniper shot the tourist. (5-61) The tourist was shot by the sniper. There has been much debate in linguistics about the need for, and nature of, Subject as a grammatical relation. Some deny its universality, while accepting its existence in certain languages; others deny it for all languages. Many grammarians consider Subject as a purely formal grammatical role associated with an NP in a particular structural position in the clause. Others maintain that, like Actor and Undergoer, Subject is also a meaningful grammatical relation. A number of related notions have been suggested in recent decades that begin to make sense of Subject as a meaningful grammatical relation. Michael Halliday suggests (1985: 76) that the Subject represents the thing in reference to which the truth of the proposition can be affirmed or denied. Thus one would argue about or evaluate (5-60) in relation to the sniper, (5-61) in relation to the tourist. 
A rather similar suggestion was put forward by Simon Dik, who proposed (1989: 212ff.) that it provides the perspective from which the clause is presented, the vantage point from which it is viewed. Thus example (5-60) presents things from the perspective of the sniper, while (5-61) presents them from the perspective of the tourist. And Ronald Langacker proposes (1991: 304–29) that Subject relates to cognitive prominence; he has refined this idea in more recent work (1999) to the notion of event profiling: the event is profiled from the perspective of the Subject. Similar suggestions have been made by others.

It is more difficult to interpret Object – the role of the tourist in (5-60) – as a meaningful grammatical relation. Nevertheless, both Dik (1989) and Langacker (1990: 225) suggest that the Object represents a secondary vantage point from which the clause is perspectivized. Thus the difference between The teacher will give the pupil a gift and The teacher will give a gift to the pupil concerns, they suggest, whether the pupil or the gift is taken as the secondary vantage point. According to these views, Subject and Object have nothing to do with the construal of the world of experience; they are not concerned with experiential meaning. They are concerned with the selection of positions for perspectivizing the situation: with the angle from which the speaker chooses to view it and present it to the hearer. This sets the stage for the hearer to adopt the same angle, the same viewpoint. Meaning of this type is interpersonal – the term comes from Halliday, who was the first to suggest Subject expresses this type of meaning: it is concerned with the interactive dimension of language, with the establishment of a shared perspective.

Theme

In many languages the initial NP or PP of a clause serves an important role. Consider the following German examples:

(5-62) Der Priester traf den Bischof in Hamburg am nächsten Tag.
       the:MAS:NOM priest meet:PST the:MAS:ACC bishop in Hamburg on:the:MAS:DAT next day
       ‘The priest met the bishop in Hamburg the following day.’
(5-63) Den Bischof traf der Priester in Hamburg am nächsten Tag.
       ‘The bishop the priest met (him) in Hamburg the following day.’
(5-64) Am nächsten Tag traf der Priester den Bischof in Hamburg.
       ‘The following day the priest met the bishop in Hamburg.’
(5-65) In Hamburg traf der Priester den Bischof am nächsten Tag.
       ‘In Hamburg the priest met the bishop the following day.’

These clauses all describe the same situation, with traf ‘met’ as Event, der Priester ‘the priest’ as Actor, and den Bischof ‘the bishop’ as Undergoer. They also present it from the same perspectives (as per the previous section): der Priester is Subject and the one from whose perspective the clause is presented; and den Bischof is perhaps Object and provides a secondary vantage point for viewing the clause. Thus the four clauses express the same experiential and interpersonal meanings, and are made up of the same NPs serving in the same experiential and interpersonal roles.

Nevertheless, the clauses differ subtly in meaning. Example (5-62) ostensibly presents a message about the priest, saying what he did; (5-63) by contrast seems to be about the bishop, presenting information about him. The first NP specifies what the clause is about; it serves in the grammatical role Theme, sometimes called Topic.

If the first NP is the Theme, the Themes of (5-64) and (5-65) should be am nächsten Tag ‘on the following day’ and in Hamburg ‘in Hamburg’, respectively. But it seems somewhat implausible to say that these clauses are saying something about a time (the following day) and a place (Hamburg), respectively. In these cases the initial PP instead apparently serves to establish a setting (temporal or spatial) within which the event occurred. (Note that this accounts for only a part of the meaning difference between the four examples; we cannot go into other differences here.) So a Theme can either be what the clause is about, or establish a setting for it. There is something common to both: the Theme anchors the message down, providing a fixed point from which the message can be expanded. The type of meaning conveyed by the Theme is textural: it serves to give texture to the clause, distinguishing it from an arbitrary string of words.

Constructing a clause is a bit like putting an Ikea bookshelf together. You start with a particular piece, and build up from it. The first piece is like the Theme: the other pieces are anchored to it. Although the instruction kit gives a sequence of putting the bits together, it is not necessarily the only way – though it might be in some sense the best, or most natural. Likewise in syntax, one choice of Theme is often the most natural: in the case of our German examples, it is the choice in (5-62). Other choices are less natural, and less common in language use.

Morphology and syntax

Both morphology and syntax deal with arrangements of grammatical items. But there are differences that underline the need to distinguish them as different levels of grammar.

To begin with, only some of morphology is conveniently viewed in arrangement terms. Some aspects (especially of inflectional morphology in highly inflecting languages) are better viewed in word-paradigm rather than item-arrangement terms (see p. 75) – that is, in terms of paradigmatic contrasts among words, rather than as morphemes in sequence. In syntax item-arrangement description always works, even though (as we have seen) it may demand recognition of other things (in particular, grammatical relations) in addition.

Another difference is that in morphology the arrangements of the items are usually more or less fixed. Little variation in order is permitted. A single structural formula specifying the ordering of the morpheme types can normally be given that accounts for the morphological shape of nouns and verbs. The complex situation we encountered for English clauses in §5.4 does not arise.

Lastly, while units serve grammatical relations in syntax, in morphology they do not. We can describe morphology without bringing roles of morphemes into consideration; description in terms of form is adequate. As we have seen, the same NP the farmer occurs in the two clauses the farmer kissed the duckling and the duckling kissed the farmer, though it serves a different grammatical role. This situation does not arise in morphology. Although the same phonological form /z/ is found in the verb form /bɪd-z/ and the noun form /bɛd-z/, we do not have one morpheme /z/ serving in different roles; rather we recognize two distinct morphemes with different meanings (respectively, present time and third person singular doer, and plural) that happen to share the same phonological shape.

Summing up

The lexical and morphological resources of a language are insufficient to permit expression of the full range of meanings people need to make; to get around this limitation, words are combined together into larger units. These units are structured according to patterns that differ from language to language, and define the syntax of a language; this is the most open grammatical system of any language.

Fundamental to syntax is the sentence, the largest unit in a language that shows grammatical patterning. A sentence made up of a string of words that observe the syntactic patterns of a language is grammatical; otherwise it is an ungrammatical string. Study of ungrammatical strings, and comparison with grammatical sentences, can yield insights into the syntax of a language.

The structure of sentences is hierarchical. Words in a sentence go together to form groups of intermediate sizes – clauses and phrases – identified by criteria of movability, contractability and ambiguity. Clauses are effectively simple sentences, that can be combined together to form complex sentences. Clauses are constituted by phrases, which fall into different types, corresponding to the main parts-of-speech of a language. The hierarchical structure of sentences into clauses, phrases, words and morphemes can be represented in tree diagrams, the nodes of which are labelled according to the type of unit.

Sentences cannot be adequately described as strings of units of various sizes and types. It is necessary to also recognize the grammatical relations or roles borne by the component units. These are characterized in terms of both form (e.g. word order, verbal agreement, case-marking affixes or adpositions) and meaning. Grammatical roles fall into three general types according to the type of meaning they express. Experiential roles express meanings concerning the construal of the world of experience, and include Actor, Undergoer and Event.
Interpersonal roles are concerned with meanings relating to the interactive dimension of language, including perspective taking. Subject, and perhaps also Object, is, according to some linguists, an interpersonal role. Textural roles are concerned with giving texture to syntactic units, with providing the glue that binds sentences together. Theme is a textural role.

Guide to further reading

Almost all introductory textbooks adopt a more formal approach to syntax than adopted in this chapter. An exception is Finch (2003), which devotes one section of the syntax chapter (Chapter 4) to each of formal and functional approaches. Van Valin (2017) gives a brief, but fair overview of functional theories. Lockwood (2002) is a good textbook on functional syntax, and includes numerous examples and exercises from diverse languages. Van Valin (2001) is one of the best nonpartisan introductory textbooks on syntax; although functionally oriented, its final chapter discusses mainstream formal theories. Chapter 3 of Pavey (2010) is a good introduction to basic syntactic analysis using examples from a number of languages.

The approach adopted in this chapter to the structure of the clause is largely inspired by Michael Halliday’s thought, a comprehensive account of which can be found in his 1985 book, An Introduction to Functional Grammar; the latest edition is Halliday and Matthiessen (2014). For a textbook introduction, see Eggins (1994). The leading figure in mainstream formal syntax since the late 1950s is Noam Chomsky, whose ideas have had an enormous impact on almost every branch of linguistics. In syntax his ideas have spawned not only an array of formal theories – which generally go under the umbrella term generative grammar – but also several functional theories. Those interested in finding out more about generative grammar could begin with Baker (2017) and Wasow (2017). Anyone serious about syntax should read not just about syntactic theories, but about the syntax of particular languages. Among the myriad grammars of English, Huddleston (1984) is recommended for its careful argumentation and insights. More comprehensive is Huddleston and Pullum (2002); Huddleston, Pullum and Reynolds (2022) is a shortened student’s introduction. More or less detailed treatments of the syntax of other languages can be found in reference grammars in series such as the Mouton Grammar Library, Lingua Descriptive Studies/Croom Helm Descriptive Grammars/Routledge Descriptive Grammars, Pacific Linguistics, and Cambridge Grammatical Descriptions.

Issues for further thought and exercises

1 Below is a tree analysis of (5-10) showing structure down to the level of the word, ignoring the division of words into morphemes; the nodes are labelled according to the category of unit. (DET stands for determiner.)

Draw similar tree diagrams for the English clauses below. In some cases the clauses are ambiguous; give separate diagrams appropriate to the different interpretations. Can you justify each of your groupings? Comment on any cases where you have difficulty deciding on the appropriate analysis.

a. The farmer will kiss the duckling in the woodshed.
b. Who is the man in the shed?
c. They followed his dripping blood until nightfall.
d. The old men and women are on holidays in the Alps.
e. The hungry cat ate the tiny raw mouse.
f. The hungry mountaineer ate the tiny mouse raw.
g. The hungry mountaineer didn’t eat the tiny mouse raw.
h. The slithy toves did gyre and gimble in the wabe.
i. What are slithy toves?
j. What gyred and gimbled in the wabe?
k. Mary gave John the recipe for Thai curry.

2 Draw tree diagrams for the two complex sentence examples (5-21) and (5-22) in §5.3. Suggest a tree diagram for The fisherman who hung the net on the fence saw the farmer.

3 Examples (5-54), (5-55) and (5-61) illustrate the passive voice in English, and correspond to active voice forms in which the Undergoer serves as Object, and the by PP (if there is one) corresponds to the Actor (also Subject) of the active – compare (5-61) with (5-60). What are the passive voice forms of the following?

a. The farmer kissed the duckling.
b. The hungry mountaineer ate the tiny mouse.
c. They will follow his dripping blood until nightfall.
d. The fisherman may have been hanging the net on the fence.
e. Marlowe could have embraced the assassin.

Answer the following two questions. (i) Does inclusion of a by PP seem equally good in all examples, or is it awkward in some cases? If some examples seem awkward, can you specify in which conditions? (To answer this you should construct further examples of passive constructions yourself.) (ii) How would you describe the structure of the passive in terms of syntactic units and their arrangement? What formal features indicate the passive voice?

4 Below are some examples of acceptable and unacceptable English NPs. (Check that you agree with my intuitions!) List the acceptable and unacceptable NP structures that these examples reveal. What do you conclude from the distribution of units of different types? (Three hints: (a) it may be useful to think of other examples in answering this question; (b) review §5.4; and (c) what conclusions can we draw from complementary distribution?)

a. the hairy fisherman              *the fisherman hairy
b. the fisherman who is hairy       *the who is hairy fisherman
c. the bird on the fence            *the on the fence bird
d. the bird hanging on the tree     *the hanging on the tree bird
e. the tove with no ears            *the with no ears tove
f. the earless tove                 *the tove earless
g. the distant star                 *the star distant
h. the star in the distance         *the in the distance star

5 Below are some NPs in Saliba with word and morpheme divisions indicated. List each morpheme, and give it an English gloss, and tentative part-of-speech classification; for the grammatical morphemes also explain their function. Comment on any uncertainties. Describe the structure of NPs as sequences of morphemes of various types.

a. tenem nogi-ne hauhau-na-ne       ‘that new grass skirt’
b. tobwa leiyaha                    ‘pandanus leaf basket’
c. tenem tobwa-ne hauhau-na-ne      ‘that new basket’
d. mwauyope buina-na                ‘a ripe pawpaw’
e. numa gagili                      ‘a toilet’
f. tenem numa-ne                    ‘that house’
g. mwaedo gagili-na                 ‘a small eel’
h. mwaedo gagili-di                 ‘small eels’
i. mwauyope yo baela buina-di       ‘ripe pawpaws and bananas’

6 The examples below illustrate some simple NPs in Indonesian. List the morphemes and give them glosses. How would you describe the words orang, buah, ékor, seorang, sebuah and seékor – when do you use them, and how do you choose between them? Give a description of the structure of NPs according to this data.

a. guru ini                 ‘this teacher’
b. tujuh orang guru         ‘seven teachers’
c. lima orang guru ini      ‘these five teachers’
d. bayi itu                 ‘that baby’
e. tiga orang bayi          ‘three babies’
f. enam orang bayi ini      ‘these six babies’
g. buku                     ‘a book’
h. dua buah buku            ‘two books’
i. sebuah buku              ‘one book’
j. prahoto ini              ‘this truck’
k. sebuah prahoto           ‘one truck’
l. tiga buah prahoto        ‘three trucks’
m. lima ékor kucing         ‘eight cats’
n. seékor kucing            ‘one cat’
o. kera ini                 ‘this monkey’
p. tiga ékor kera ini       ‘these three monkeys’

7 The following sentences allow different interpretations, though not all are ambiguous. What are the different interpretations each allows? Which are ambiguous, and what type of ambiguity do they involve (i.e. lexical or structural – see p. 112)? Comment on any cases where you think that the different interpretations would or could be resolved in speech by different prosodies.

a. Be careful of my glasses.
b. Criminal lawyers can be dangerous.
c. They’ll hang the prisoner in the yard.
d. She hates her husband.
e. The pen has fallen down.
f. The kangaroo is ready to eat.
g. Don’t lie around here.
h. You can see the man in the park with binoculars.
i. Smoking pipes will not be tolerated in this office.
j. His photograph appears on page two.

8 Below are some simple Malagasy (Austronesian, Madagascar) clauses with free translations into English. Identify each lexical word with its English gloss, and identify as many morphemes as you can. Describe the sentences first in item-arrangement terms, and then in terms of experiential roles (Actor, Undergoer and Event).

a. Namaky boky zaza               ‘A child read a book’
b. Nahita boky amboa              ‘A dog saw a book’
c. Nisasa zaza vehivavy           ‘A woman washed a child’
d. Nankany anjaridaina amboa      ‘A dog went to the park’
e. Nankany antrano vehivavy       ‘A woman went to a house’
f. Nahita trano zaza              ‘A child saw a house’
g. Natory amboa                   ‘A dog slept’

9 Below are some simple clauses in Warao (language isolate, Suriname) with English translations. List the words and give each an English gloss; identify any grammatical morphemes you can. Describe the structure of the clauses in item-arrangement terms, and in terms of experiential roles.

a. Noboto nakae                   ‘The child fell’
b. Tira wabae                     ‘The woman died’
c. Tira hube abuae                ‘A snake bit the woman’
d. Hube anibak ahikomo tate       ‘The young girl might hit a snake’
e. Noboto wabakomo tate           ‘The child might die’
f. Ma noboto ahiae                ‘The child hit me’
g. Anibak nakaera                 ‘Did the young girl fall?’
h. Sina nakaera                   ‘Who fell?’
i. Kasikaha noboto abuaera        ‘What bit the child?’
j. Sina ma ahiaera                ‘Who hit me?’

10 Below are some sentences in Archi (North Caucasian, Daghestan) with English translations. Identify as many morphemes as you can, and give each a suitable gloss and explanation of its use in the case of grammatical morphemes. Comment on any for which you are uncertain, and explain why. Give descriptions of the syntax of Archi in terms of items and their arrangements and grammatical roles.

a. diya verkurshi vi              ‘The father is falling down’
b. hoɪn h’oti irkkurshi bi        ‘The cow is seeking the grass’
c. boshor baba dirkkurshi vi      ‘The man is seeking the aunt’
d. shusha erkurshi i              ‘The bottle is falling down’
e. hoɪn borcirshi bi              ‘The cow is standing’
f. diyamu buva dark’arshi di      ‘The mother is left by the father’
g. buvamu dogi birkkurshi bi      ‘The donkey is sought by the mother’
h. dadamu h’oti irkkurshi i       ‘The grass is sought by the uncle’
i. lo orcirshi i                  ‘The child is standing’

Research project

The approach to syntax adopted in this chapter recognizes structure between the levels of word and clause. Not all theories of syntax agree. Dependency theories generally do not. One fairly well-known dependency theory is Richard Hudson’s Word Grammar (also called Daughter Dependency Grammar). Find out about this approach to grammar, identifying its main characteristics, and Hudson’s main arguments in favour of it. What sorts of syntactic units and relations are identified, and how are the latter diagrammed? Compare the dependency approach to grammar with that adopted in this chapter. Draw dependency diagrams for clauses such as The train chugged along the line through the mountains and The policeman shot the man with the rifle, and compare them with the diagrams given above. (You will find information on this theory on the internet; a brief account can be found in Hudson and Van Langendonck (1991).)

6 Meaning

Running throughout the previous chapters, suffusing our discussions at every turn, is meaning. Yet we have said virtually nothing about it. It is high time we remedied this situation, and explicitly discussed the notion of meaning. In this chapter we set up basic frameworks for investigating meaning. First, we deal with meanings encoded by words and sentences, meanings that belong to the language system. Second, we discuss meanings that speakers intend their utterances to express in particular instances of speech, and/or that hearers infer from them.

Chapter contents

Goals
Key terms
6.1 What is meaning?
6.2 Semantics
6.3 Pragmatics: the meaning of utterances
Summing up
Guide to further reading
Issues for further thought and exercises
Research project

Goals

The goals of the chapter are to:

● distinguish among different types of meaning, including between literal and nonliteral (figurative) meanings;
● explain the difference between sentence meaning and utterance meaning;
● introduce the study of lexical semantics through discussion of the main semantic relations between words;
● convey some feeling for the considerable differences in lexical semantics among different languages;
● demonstrate one way of specifying lexical semantics;
● introduce four key concepts in pragmatics: speech acts, reference, presuppositions and the cooperative principle; and
● reveal the role of context in utterance meaning.

Key terms

collocate
componential analysis
compositionality
connotation
contextual meaning
cooperative principle
deictic expressions
explicit performative
felicity conditions
figurative meaning
Gricean maxims
homophony
illocutionary force
intension
literal meaning
metaphor
non-literal meaning
polysemy
pragmatics
presupposition
reference
semantics
sense
speech act
synonymy
vagueness

6.1 What is meaning?

The notion of meaning in linguistics concerns that which is expressed by sentences, utterances and their components. Meaning is the content conveyed in communication by language, the message or thought in the mind of a speaker that is encoded in language and sent to a hearer who decodes it (recall the speech chain model, §2.1). This is admittedly an imprecise and simplistic characterization. But rather than attempt to give a precise definition of meaning – which would be impossible – it seems preferable to proceed indirectly, and draw some distinctions that will hopefully clarify the concept.

Reference and sense

In saying My computer crashed I am talking about something that happened to an object in the real world, something that sits on my desk. The NP my computer refers to this material artefact, and the
relationship between the NP and this object is called reference. Reference is more general than this, however, and covers the relationship between an NP and imaginary and intangible ‘things’ existing in possible worlds of human imagination. Thus we speak of reference in relation to my dream, Archimedes and Sherlock Holmes. Reference is a different thing to the ‘meaning’ or ‘concept’ component of the Saussurean sign (see Figure 1.1). On the one hand, words like hello, eh, in and and can’t be used to refer to anything at all, although they are certainly not meaningless. Signs always have some component of meaning, though some are never used in reference. On the other hand, the Morning Star and the Evening Star both refer to the same material object, Venus (observed in different circumstances), though the NPs surely have different meanings. The term sense is sometimes used for this type of meaning. The sense of a linguistic sign derives in part from its relations to other signs in the language. The sense of the lexeme hand is defined in part by the existence of the lexeme arm. But Indonesian and Savosavo (Papuan, Solomon Islands) have a single term corresponding to both of the English words hand and arm. On the other hand, Jahai (Austro-Asiatic, Malaysian peninsular) has three terms, bling ‘upper arm’, prbér ‘lower arm’ and cjas ‘hand’. The sense of each of the terms in Indonesian, Savosavo and Jahai is different to that of the English terms. The same point can be made for grammatical categories. As Saussure observed, whereas French has a singular vs. plural contrast for nouns, Sanskrit had a three-way contrast between singular, dual and plural. The sense of the plural is different in French and Sanskrit. This aspect of sense – the part derived from the contrasts with other members of the language system – is what Saussure called value. In one approach the sense of a linguistic sign can be characterized in terms of defining properties that must be satisfied. 
The lexeme sheep might in this approach be understood to include properties such as ‘animal’, ‘mammal’, ‘feeds by grazing’, ‘ruminant’, ‘has hooves’, ‘quadruped’ and so on. These properties that define words are the intension of the sign.
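The idea of an intension as a set of defining properties can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the property list for sheep is the partial list given above, while the function name satisfies_intension and the two sample entities are hypothetical, invented for the example.

```python
# Defining properties of the lexeme 'sheep', as listed in the text
# (the list is partial: the text ends it with "and so on").
SHEEP_INTENSION = frozenset({
    "animal", "mammal", "feeds by grazing",
    "ruminant", "has hooves", "quadruped",
})

def satisfies_intension(intension, properties):
    """Return True if every defining property is among the entity's properties."""
    return intension <= set(properties)

# Hypothetical entities, each described by the properties it has.
dolly = {"animal", "mammal", "feeds by grazing", "ruminant",
         "has hooves", "quadruped", "woolly"}
rex = {"animal", "mammal", "quadruped", "barks"}

print(satisfies_intension(SHEEP_INTENSION, dolly))  # True
print(satisfies_intension(SHEEP_INTENSION, rex))    # False
```

Additional properties such as 'woolly' make no difference to the outcome; on this view all that matters is that every defining property is satisfied.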

Not everyone agrees that intensional definitions are necessary, useful, or even possible for most lexemes. According to the prototype approach, meanings are identified by characteristic instances of the categories of objects, events or whatever, denoted by a word. Thus we usually think of carrots and potatoes as having more of the major characteristics of vegetables than say aubergines, Brussels sprouts and cabbages. A carrot or potato would be a prototype, or a prototypical instance, of a vegetable; cabbages would be non-prototypical, but not as peripheral as aubergines or okras. According to this theory, the meaning of vegetable will be specified (at least to some extent) in terms of its prototypes: carrots and potatoes, and other things that share some of their characteristics, that are more or less like them.

Sense and connotations

Words often have connotations, more or less unstable meaning associations such as emotional overtones (see §4.5). Unlike the sense of a word, which is an essential part of the sign, connotations are not always present. Connotations can differ according to a person’s attitudes. For example, the word mathematical might have quite different connotations depending on a speaker’s experience with the subject at school; that’s a very mathematical way of looking at it could express either a positive or a negative evaluation. Connotations also differ according to the linguistic or speech context. For example, if I used the term mathematical of someone’s approach to life or social relations a negative evaluation would probably be attached; but it could express a positive rating in a description of a piece of baroque music or of Escher’s art.

Connotations can be important in language learning and change; over time a connotation can become so firmly attached to a sign that it becomes a part of its sense, in the process perhaps replacing aspects of the earlier sense. For instance, for many speakers of English the word dork has just the sense ‘stupid or contemptible person’, with an implicit negative appraisal. The word first appeared as a slang term for ‘penis’; the attitudinal component was a connotation that came to stick, ousting the original meaning.

Literal and figurative meaning

We do not always use an expression in its literal sense, the meaning actually encoded by its component lexical and grammatical signs. Clear illustration is provided by idioms (§4.4) such as He kicked the bucket which can mean either ‘he hit the bucket with his foot’ or ‘he died’. The first interpretation is the literal meaning, the second, a non-literal or figurative meaning. The figurative meaning can be considered to be an extension of the literal meaning (see §4.3). Traditional rhetoric distinguishes a number of different processes of meaning extension; three kinds most relevant to language are:

Metaphor – in which the sense of an expression is extended to another concept on the basis of a resemblance. For instance, in Belgian drivers are cowboys the noun cowboy is not used in its literal sense ‘person who tends cattle’, but rather invokes the notion ‘person who behaves like a cowboy’; it is left up to the hearer to figure out the basis on which the comparison is made.

Metonymy – here the sense is extended to another concept via a typical or habitual association. The literal sense of university is ‘educational institution’; in I’ll go to the university tomorrow the word is used in the sense of ‘building in which the educational institution is housed’. In He’s fond of the bottle, the bottle is used metonymically to refer to the alcoholic beverage typically contained in bottles. Governments are not infrequently referred to by their location, as in London, Washington, Paris, the Kremlin.

Synecdoche – where the sense is extended via a part–whole relation. For instance, the term wheels is sometimes used to refer to one’s car. And in the speech of hospital staff, patients might be referred to by their problematic body part. Thus the kidney acquires the sense ‘person suffering from some kidney complaint’.

It can be difficult to draw a line between literal and figurative senses, and some linguists reject the distinction. Cognitive Linguistics, associated with George Lakoff, Ronald Langacker, Eve Sweetser and others, takes this view. According to this approach, metaphor plays a central role in language and thought, and is pervasive in ordinary language. Metaphor is not seen as figurative use of language, but rather as a cognitive strategy allowing people to understand one experiential domain (the ‘target domain’) in terms of another (the ‘source domain’). For example, many domains of experience are understood in terms of space, and are expressed linguistically via spatial relations. Time is an example in English and many other languages. In Summer came early this year the early occurrence of summer is expressed in terms of motion in space, with the verb come; in I arrived well ahead of time the spatial relation ahead is used to specify the temporal relation before. (Note that the word before has its origins in a spatial term, and is still sometimes used in this sense.) In some languages, including Russian, the target domain of possession is understood in terms of the source domain of space; ‘I have a cat’ is expressed as ‘at me (is) (a) cat’.

Sentence and utterance meaning

Consider the simple sentence The car broke down yesterday. This describes a situation, the failure of a car. You can easily picture the event and invoke a conceptualization of it in your mind. How do we get this meaning? According to the (admittedly fragmentary) grammar developed in the previous chapters, this sentence is a clause made up of signs, including morphemes, words, phrases and grammatical relations. These signs all have meanings, concepts associated with their forms. Supposing we know all of these meanings, we could expect that putting them together will give a good indication of the meaning of the whole sentence. We get a good way towards this goal by putting the meaning of the car together with the meaning of the grammatical role Actor (see §5.4), the meaning of break down with that of Event, and of yesterday with the meaning of the grammatical role it serves, let’s say Temporal Location. We also need to bring into the picture the meaning of the inflected past tense form broke of the lexical root break. This gives the meaning of the sentence in the abstract – that is, as an expression in the English language.

Our sentence can be uttered in many different circumstances. Let’s consider just two.

(6-1) Carol: What’s been happening while I’ve been away?
      Barry: The car broke down yesterday.
(6-2) Carol: Do you feel like going out tonight?
      Barry: The car broke down yesterday.

The literal meaning of the sentence remains constant in the two contexts of occurrence: the same conceptual event is construed. But, depending on context, different meanings are conveyed by uttering the sentence; the meanings of the utterances differ. Example (6-1) could be from a conversation between friends who have not seen one another for some time due to Carol’s absence abroad. Barry is making a plain statement of fact, giving a direct answer to Carol’s question. Example (6-2) might also occur in a conversation between friends, but here what Carol says could be an invitation to Barry to go out with her. Barry’s response could constitute a polite refusal. It might alternatively express willingness, simultaneously requesting that Carol pick him up. The sentence meaning is invariant and remains regardless of the context in which the sentence is used; however, the utterance meaning is different in the two circumstances.

The investigation of sentence meaning – and the meanings of the various signs making up sentences – is called semantics. Semantics deals with the meaning of expressions taken in isolation, with the meaning they have within the system of the language. The study of utterance meaning is called pragmatics. Pragmatics deals with the specific meaning of actual instances of language use – that is, with the meaning conveyed by a linguistic expression in a particular context of speech. It is concerned with the uses made of signs belonging to the language system in interactions among human beings. There is a system to these uses – they are not arbitrary, but follow regular patterns, though patterns that do not belong to grammar or lexicon as such. Pragmatics is about meaning in relation to speakers and hearers in context, and thus belongs to the system of speech (to be interpreted generally to include writing and signing) rather than of language.

The distinction between sentence and utterance can be understood in terms of the logical notions of type and token, where a type is a general category, an abstraction, and a token is a specific instance of the category. Thus in boys will be boys there are four word tokens, but just three types: one type, boys, occurs in two tokens. A sentence is a linguistic type; an utterance is a token. Semantics is concerned with the meaning of linguistic types, pragmatics with the meaning of tokens.
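The counting of word types and tokens described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and it assumes that words are separated by spaces and that differences of capitalization are ignored.

```python
from collections import Counter


def count_tokens_and_types(text: str) -> tuple[int, int]:
    """Return (number of word tokens, number of word types) in text.

    Assumption: words are whitespace-separated and case is ignored.
    """
    tokens = text.lower().split()   # every occurrence is a token
    types = Counter(tokens)         # each distinct word form is a type
    return len(tokens), len(types)


if __name__ == "__main__":
    n_tokens, n_types = count_tokens_and_types("boys will be boys")
    print(n_tokens, n_types)  # 4 3
```

The four-word example has four tokens but three types because one type occurs twice.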

Overview of types of meaning in language Figure 6.1 puts the distinctions made in the previous sections together in a single diagram to show the sorts of meaning that are linguistically relevant. This all may seem quite cut and dried. But, as usual in linguistics, things turn out to be somewhat fuzzy in reality. It is not always obvious where the line between pragmatics and semantics falls, and linguists disagree about where the border falls. Some linguists, such as Charles Fillmore, Michael Halliday, Ronald Langacker and Peter Matthews, are dubious about, or even reject, the division of labour into semantics and pragmatics. However, aside from the fact that it seems conceptually useful to make the distinction, there are clear-cut cases as shown by (6-1) and (6-2). The line we take in this book is that the two types of meaning are in principle (though not necessarily easily in practice) distinguishable. Nor are they unrelated; indeed, semantics and pragmatics go hand in hand, to the extent that neither can be investigated in the absence of the other. They also go together in language change and learning.

6.2 Semantics The bulk of this section discusses the semantics of lexical items, which you will recall from §4.1 are those things that need to be listed separately in the lexicon of a language. These are of course signs, and our focus is on their senses. Three key issues in lexical semantics concern: (a) pinning down and identifying the meanings of lexical items; (b) the relationships among the meanings of lexical items in a language; and (c) the specification of the meanings of items. These concerns are clearly interrelated. Before you represent the sense of an item you have to identify it; you also need to know how it relates to other items in the language, as the value of a sign is determined by the contrasts with other items in the language system.

Figure 6.1 Aspects of linguistic meaning. In the semantic system, value and intension are the main components defining the sense of the sign. The sign the tree also has a non-literal, metaphoric, meaning ‘a branching diagram representing the structure of a syntagm’ (which is not invoked in this particular instance of use). Sentence meaning is very roughly represented as the sense of a complex combination of lexical and syntactic signs (the structure of which is not shown). This complex signifying construction points to the referent, the dead tree. This utterance conveys a pragmatic meaning: the hearer is being reprimanded for relaxing when there is an important job to be done. © 2009 William B. McGregor and his licensors. All rights reserved.

Homophony, polysemy and vagueness Two different lexemes sometimes accidentally share the same phonological form; this is called homophony or homonymy, and the words are said to be homophones. Some homophones in English are: boy (as in he is only a boy) and buoy (as in they marked the place with a buoy); port (as in I don’t usually drink port), port (i.e. ‘suitcase’, as in I put the luggage in my port) and port (as in Aarhus has a port); and bank (as in I have no money in the bank) and bank (as in the fisherman is asleep on the bank of the river). Word forms such as /bɔɪ/ and /pɔːt/ are ambiguous since they can be interpreted as the signifiers of more than one lexeme. Sometimes lexemes are partial homophones in the sense that some, though not all, of their forms share the same phonological shapes. For example, the verb bear (as in she agreed to bear the costs) and the noun bear (as in the bear attacked the tourist) share the same phonological shapes in some inflected forms (e.g. both have inflectional forms /bɛː/ and /bɛːz/), but only the verb has /bɔː/ (a phonological form that is incidentally also shared with the verb bore and noun bore). Homophony is sometimes exploited for humorous effect: (6-3)

‘How is bread made?’ ‘I know that!’ Alice cried eagerly. ‘You take some flour–’ ‘Where do you pick the flower?’ the White Queen asked: ‘In a garden or in the hedges?’ ‘Well, it isn’t picked at all,’ Alice explained: ‘it’s ground–’ ‘How many acres of ground?’ said the White Queen. ‘You mustn’t leave out so many things.’ (Carroll 1899: 184–5)

Polysemy is where identical forms have related meanings. For example, the meanings associated with ear in the following sentences seem related:

(6-4) a. I put cottonwool in my ear.
      b. He listened to their difficulties with an impatient ear.
      c. That phonetician has a good ear for tone.
      d. I tried to get her ear.

These examples reveal the following clearly related senses (there are others): (a) ‘organ of hearing of humans and animals’, (b) ‘attention to what is being said or to sounds’, (c) ‘ability at discriminating sounds’, and (d) ‘favourable attention directed to a person’. Most dictionaries recognize the distinction between homophony and polysemy by giving separate entries for the former and including the latter under the same entry. But the distinction is

not always easy to draw, because of the fuzziness of the distinction between different and related meanings. It is easy to see that the above senses of ear are related. Most dictionaries consider the word ear as illustrated by The ear withered on the corn plant to be a homophone of the lexeme ear of (6-4). Nevertheless, many speakers do see a connection, and imagine the ear of corn to resemble in some way an ear. In fact, lexicographers often do not take just meaning into account in their decisions, but also the history of words. In this case, the words come from two different sources: ear (as in the body part) comes from Old English ēare, whereas ear (as in the plant part) comes from Old English ēar. Few speakers of English see any semantic relation between the two senses of bank mentioned in the first paragraph of this section, and dictionary makers tend to agree, typically putting them under different headwords. But both, in fact, can be traced back ultimately to proto-Germanic *bangk- ‘ridge, mound, bordering slope’. You can appreciate the connection through the following plausible chains of meaning extensions: (a) ridge > bench > moneylender’s counter > moneylender’s shop > financial institution; and (b) ridge > slope > side of watercourse. Speakers do not perceive the connection between the two extreme concepts because the other senses barely survive in association with bank. Speakers perceive, quite reasonably, a closer semantic connection between the body-part and plant-part senses of ear (also supported by many similar connections, as in, for example, head of cabbage) than the geographical and institution senses of bank. Homophony and polysemy must also be distinguished from vagueness or generality – that is, lack of specificity of meaning. Earlier we identified four quite general specifications of the senses of ear that are involved in (6-4). 
Sense (a) ‘organ of hearing of humans and animals’ covers not just (6-4a), but also use of ear in the two sentences in (6-5).

(6-5) a. The teacher pulled the boy along by the ear.
      b. The dog scratched its ear.

But notice that the ‘meanings’ in the three cases – the mental concepts invoked in the mind of the speaker and hearer – are quite different: in (6-4a) we think of an orifice at the side of the human head; in (6-5a), of an appendage at the side of the human head; and in (6-5b), of an appendage at the side of a dog’s head (which does not look very much like the one on the side of the human head). We don’t usually think of these three meanings as polysemies of ear because the meanings are so closely related that they fall under a single general specification, something like ‘(part of the) organ of hearing of humans and animals’. Similarly for the meanings associated with wrong in It is wrong to speak with your mouth full, It was wrong to take Indigenous children from their mothers, and It is wrong to attribute that quote to Saussure. The first invokes the sense ‘improper’, the second ‘immoral’ or ‘reprehensible’, while the third just ‘incorrect’; it is not difficult to see that a single general sense covers each. The sentential context, our knowledge of the world and our knowledge of the speaker’s beliefs, can be brought into account to narrow down to the specific meaning invoked. The meanings that a word acquires from its contexts of use are called contextual meanings. As distinct from the sense of a lexeme, which remains invariant, contextual meanings are not fixed. Thus, It was wrong to take Indigenous children from their mothers does not necessarily invoke a

moral comment. For instance, an officer involved in removing Aboriginal children might consider his actions fully moral, and ‘wrong’ only in the sense ‘mistaken’: the intended results were not achieved. Like the other distinctions we have discussed, the line between vagueness and polysemy can be difficult to draw. Some linguists, the present author included, believe that lexemes have much vaguer senses than generally thought, and that polysemy is comparatively rare.

Lexical semantic relations The lexemes of a language relate to one another semantically in various ways, and form a highly structured system, the lexicon. As mentioned in §4.1, this is better thought of as a huge network of interrelated items rather than a mere listing, such as is provided by a printed dictionary. In what follows we discuss four types of semantic relation that give structure to the lexicon: synonymy, antonymy, hyponymy and meronymy.

Synonymy Synonymy is the relation of sameness or close similarity of meaning; lexemes related in this way are synonyms. Some examples of synonyms are: hide and conceal, small and little, rich and wealthy, mother and mum, car and automobile, truck and lorry, and dear and expensive. You will notice that the members of these pairs are not exact synonyms; indeed, exact identity of meaning is quite rare. Synonyms often belong to different registers or styles (see §7.3) of language such as formal, literary or colloquial. Bond concealed the automobile under a tarpaulin is more formal than Bond hid the car under the tarp. Synonyms sometimes belong to different dialects: togs, swimmers, swimming costume, cossies, bathing suit, swimsuit and trunks are words in different dialects of English for the item of clothing worn when swimming. (What do you call them in your dialect?) Synonyms may also differ in the lexical company they keep, in the words that are likely to be found nearby; to put things more technically, they differ in terms of the words they collocate with (see further §9.4). Strong and powerful are partial synonyms, and share some contexts: he has strong arms and he has powerful arms. But we speak of the strong arm of the law not *the powerful arm of the law, and a strong head for alcohol not *a powerful head for alcohol. Investigations of large corpora reveal that these two words are indeed very different in terms of the other words that habitually occur in their environments, and how frequently they do. (See pp. 226–7 for brief discussion.)

As the British linguist J. R. Firth famously observed, the collocations of a word form a part of our knowledge of a word: ‘You shall know a word by the company it keeps!’ (Firth 1962: 11). But Firth went further than this, and proposed that a component of the meaning of a word derives from the collocations it habitually enters into. Synonyms such as strong and powerful and happy and joyful may be similar in terms of their intensions, but differ markedly in their collocational meanings.

Antonymy Antonymy is the relation of opposite in meaning, and examples of antonyms include big and little or small, long and short, up and down, dead and alive and so on. Several different types of antonymy are commonly identified. Gradable antonyms allow intermediate degrees between the two opposite extremes, like big and little/small, fast and slow, and rich and poor. Gradable antonyms can thus be used in comparative and superlative constructions, like richer than and the poorest. They also readily admit modification, as in very big, quite small. And for gradable antonyms, the negative of one does not necessarily imply the positive of the other: not fast does not necessarily mean slow. Non-gradable antonyms are polar opposites, and allow no intermediate degrees. Examples are dead and alive, pass and fail, male and female, and true and false. For these, the negative of one does imply the positive of the other: not true implies false, not dead implies alive. Non-gradable antonyms do not normally enter into the comparative construction. However, the reality is very different to this idealization, and the distinction between gradable and non-gradable antonyms is one of degree rather than kind: it is gradable. So-called nongradables commonly do occur in comparative and superlative expressions (e.g. utterances such as I feel more dead than alive, Hobro is the deadest village in the universe are commonplace), and the binary opposition between male and female is of course highly contested in the human domain – and is dubious in the animal domain generally. Pairs like push and pull, come and go, and rise and fall, which contrast in direction of movement, can also be interpreted as being opposite in meaning. These are called reverses, as also are pairs like tie and untie, pack and unpack, and inflate and deflate where there is a reversal of the action sequence. 
Converses describe the same relation from contrasting viewpoints, as in own and belong to (he owns it, it belongs to him), like and please (I like it, it pleases me), give and receive (I gave money to the beggar, the beggar received money from me), and above and below (the red block is above the blue block, the blue block is below the red block).

Hyponymy In hyponymy the meaning of one lexeme includes the meaning of another. A hyponym includes the meaning of a more general word. Hammer, saw, chisel, screwdriver all include the meaning of tool – they all denote types of tool – and are hyponyms of tool; the four specific terms are co-hyponyms. The general term is called the superordinate (sometimes the terms hypernym or hyperonym are used instead). Dog and cat are co-hyponyms of animal; slap and punch are co-hyponyms of hit; and carrot is a hyponym of vegetable and a co-hyponym of potato, onion, cabbage and so on. Hyponymy is a ‘kind of’ relation: hyponyms are ‘kinds of’ the superordinate category, which in turn indicates the general type of the hyponym. Thus, relations of hyponymy associate meanings on taxonomic hierarchies. Certain semantic domains lend themselves well to this sort of analysis, including colour terms, kinship terms and terms for animals and plants. Figure 6.2 shows a very partial network for plant terms in English.

Figure 6.2 A small portion of a taxonomic hierarchy for plant in English.

Figure 6.3 Partial meronymic hierarchy for Gooniyandi body-part terms. Notice that three terms – moowooloo, birdi and thinga – are vague, and refer to both a body part and a part of that body part.

Meronymy Meronymy is the part–whole relation. Door and window are meronyms of room; wheel, handlebar and pedal are meronyms of bicycle; and hand and face are meronyms of clock. Meronymic relations in the lexicon can be represented in hierarchies similar to taxonomies, as shown in Figure 6.3 for body-part terms in Gooniyandi. There is an important difference between the relations of hyponymy and meronymy. Alsatian is a hyponym of dog, which is a hyponym of animal; Alsatian is also a hyponym of animal. This property often does not apply in meronymy. For example, nostril is a meronym of nose, but not of face: we do not say that one’s nostril is a part of one’s face! More technically, hyponymy is said to be a transitive relation, whereas meronymy is not transitive. It must be stressed that networks of both hyponymy and meronymy are lexical networks, not networks of relations among real-world entities. There are many conceptually different ways the animal kingdom can be taxonomized (the Linnean system of classification into species is just one of many possibilities), and many different ways that the human body can be divided into parts. It seems reasonable to believe that the lexical relations of hyponymy and meronymy reflect speakers’ conceptual categorizations of the world. Such ‘folk’ conceptualizations – and thus hyponymic and meronymic relations among lexemes – can be at variance with scientific conceptualizations. For

instance, whale can’t be presumed to be a hyponym of mammal or penguin of bird in English (or all varieties of English) simply because whales and penguins are mammals and birds, respectively, in the Linnean taxonomy. In a similar vein, the majority of speakers of English would consider mushroom to be a hyponym of vegetable and plant even though mushrooms do not belong to the biological kingdom of plants.
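The difference in transitivity between hyponymy and meronymy can be made concrete with a small sketch. The two dictionaries below are toy lexical networks built from the examples in this section. The function names are hypothetical, each item is given just one superordinate or whole for simplicity, and listing nose as a part of face is an assumption made for the sake of the illustration.

```python
# Toy lexical networks: each item maps to its immediate superordinate or whole.
HYPONYM_OF = {"Alsatian": "dog", "dog": "animal", "cat": "animal"}
MERONYM_OF = {"nostril": "nose", "nose": "face"}  # nose-face link is assumed


def is_hyponym(word: str, superordinate: str) -> bool:
    """Hyponymy is transitive: follow the chain of superordinates upwards."""
    current = HYPONYM_OF.get(word)
    while current is not None:
        if current == superordinate:
            return True
        current = HYPONYM_OF.get(current)
    return False


def is_meronym(part: str, whole: str) -> bool:
    """Meronymy is not treated as transitive: only directly listed pairs count."""
    return MERONYM_OF.get(part) == whole


if __name__ == "__main__":
    print(is_hyponym("Alsatian", "animal"))  # True
    print(is_meronym("nostril", "nose"))     # True
    print(is_meronym("nostril", "face"))     # False
```

The lookup for hyponymy climbs the hierarchy, whereas the lookup for meronymy deliberately stops after one step.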

Specifying lexical meanings How would you specify the meaning of mother? Perhaps the first thing you think of is a biological explanation, and you may think of using semantically related words such as woman, female, father, child, parent and so on. If you have taken on board the discussion of this section, you will try to think of other senses, and look for sentences using the word, such as The earth is mother of us all, She is my mother by adoption, and The Stamp Act is the mother of all mischiefs. You will need to decide whether the different meanings belong to different lexical items sharing the same form, whether they are polysemies of a single lexeme, or are separate contextual meanings of a single lexeme. These considerations are important in pinning down the sense of the word, and essential to giving an adequate description of its sense. There is no consensus among semanticists as to how descriptions of the meaning of lexical items are best formulated, and there are many different approaches in the literature. One approach, popular in the mid-twentieth century, is componential analysis. In this approach the semantic meaning of a lexeme is decomposed into small components, or atoms of meaning, each of which is recurrent in a range of lexemes. The standard componential approach identifies semantic features that differentiate words from one another. Consider the following small set of nouns: bull, cow, calf, woman, boy, girl, chair, man. Except for chair these words all have in common the concept ‘animate’. We could identify [animate] as a semantic feature with a value of either + for animate nouns, or − for inanimate nouns. (It is conventional to put semantic features in square brackets.) Continuing the comparison of the terms, we could also identify features [human], [male] and [adult]. Our eight words could be specified as follows:

         animate   human   adult   male
bull        +        −       +      +
cow         +        −       +      −
calf        +        −       −      ±
woman       +        +       +      −
boy         +        +       −      +
girl        +        +       −      −
chair       −        −       −      −
man         +        +       +      +

A feature value is given as ± if the word is not specific on that feature: calf is [±male] for this reason. Inanimates are given the value –, not ±, for the features [adult] and [male] because they can’t be adult or male.

There are also dependencies among the features. If a word is specified as [−animate], it must simultaneously be [−human], [−adult] and [−male]; if a word is [+human], it is also [+animate]. If a word is specified as [+adult], [+male] or [±male], it must also be [+animate]. (Notice that this conclusion does not follow from [−male], though it does from [±male].) There is no need to specify the predictable feature values, which can be simply left out from the matrix specification. Thus we could economize in the above specifications, representing the meanings as follows:

         animate   human   adult   male
bull                 −       +      +
cow                  −       +      −
calf                 −       −      ±
woman                +       +      −
boy                  +       −      +
girl                 +       −      −
chair       −
man                  +       +      +

It is important to realize that a dependency among a pair of features is not the same thing as a ± value for a feature. A ± value means either + or – is possible: the feature is not specified for. But leaving out the specification [+animate] for, for example, boy does not mean that either value is possible! Rather, it means that the + value is predictable. (In other cases, a – value is predictable.) The four features are sufficient to distinguish the eight words, and give at least a partial specification of their senses. Adding more features would allow them to be distinguished from other nouns (e.g. dog, table, river, whale, etc.), and permit more precise specifications of their meanings. For instance, we could add in [bovine], [canine], [feline] and so on. This approach has been criticized on many grounds. For instance, it adopts an intensional view of semantics, and is criticized on this basis by prototype semantics (see box on p. 133), which rejects intensional definitions. A perhaps more telling criticism is that the component features used to characterize the meanings of the terms above are more technical than the terms they describe: it appears that the simple is being defined in terms of the complex. Nevertheless componential analysis has been applied to a range of semantic domains in a number of languages. It seems most useful for the description of words belonging to relatively closed lexical sets such as terminologies for kinship, plants, animals and so on; it is also useful for the description of grammatical morphemes and words (e.g. pronouns and adpositions), which constitute the most closed classes in a language.
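The economized feature specifications, and the dependencies that make the omitted values predictable, can be sketched as a small Python program. This is an illustration under stated assumptions rather than a standard formalization: the dictionary layout and the function name are hypothetical, an ASCII hyphen stands in for the minus sign, and a feature missing from an entry is read as ‘predictable’, never as ‘either value possible’ (which is written ±).

```python
# Economized specifications: predictable feature values are left out.
REDUCED = {
    "bull":  {"human": "-", "adult": "+", "male": "+"},
    "cow":   {"human": "-", "adult": "+", "male": "-"},
    "calf":  {"human": "-", "adult": "-", "male": "±"},
    "woman": {"human": "+", "adult": "+", "male": "-"},
    "boy":   {"human": "+", "adult": "-", "male": "+"},
    "girl":  {"human": "+", "adult": "-", "male": "-"},
    "chair": {"animate": "-"},
    "man":   {"human": "+", "adult": "+", "male": "+"},
}


def expand(spec: dict[str, str]) -> dict[str, str]:
    """Fill in the predictable values using the dependencies among features."""
    full = dict(spec)
    if full.get("animate") == "-":
        # [-animate] implies [-human], [-adult] and [-male].
        for feature in ("human", "adult", "male"):
            full[feature] = "-"
    elif (
        full.get("human") == "+"
        or full.get("adult") == "+"
        or full.get("male") in ("+", "±")
    ):
        # [+human], [+adult], [+male] or [±male] implies [+animate].
        full["animate"] = "+"
    return full


if __name__ == "__main__":
    for word, spec in REDUCED.items():
        print(word, expand(spec))
```

Expanding every entry recovers the full four-feature matrix, and the eight expanded specifications remain distinct from one another.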

Sentence semantics The meaning of a sentence depends on the meanings of the component words and how they are syntactically combined; sentence semantics is largely compositional. The meaning of the sentence The fisherman hung the net on the fence is determined by the meanings of the component words, the meanings of their groupings into phrases, and the meanings of the grammatical relations such as Actor, Undergoer, Subject and Theme. Of course, a more comprehensive syntactic description than that developed in Chapter 5 is needed to provide a full description of the sentence semantics.

Another site for sentence meaning is the grammatical construction itself (for instance, a grammatical pattern such as the English passive). According to Construction Grammar, grammatical constructions are linguistic signs, and thus carry meanings; I would agree. These meanings also contribute to the compositional meaning of a sentence.

6.3 Pragmatics: the meaning of utterances In the previous section we dealt with the sense of lexical items and larger linguistic units – that is, the meaning that is actually encoded by linguistic forms. As has already been mentioned, this accounts for only part of the meaning of a stretch of speech. It is as though speakers specify just the bare outlines of the meaning they intend to convey, leaving it to the hearer to reconstruct the details in their full richness. This is something we human beings are good at doing. When you look at the figure below it is difficult not to see a white triangle laid over three black circles. But all that is actually depicted are three circle-segments arranged in a particular configuration; your mind fills in the lines that are not perceived by the sense organs.

In this section we deal first with two types of meaning that speakers and hearers fill in: (a) what the speaker intends to do with the utterance – why they spoke in the first place – and how the hearer infers these intentions; and (b) reference or referential meaning (see p. 133), in particular, of NPs in utterances. Lastly, we discuss a general principle that guides the inferences we draw.

Speech acts Speech is fundamentally about purposefully doing things with words; it is a social act of doing. Even now as I sit alone in my room typing these words – not an especially social environment – I am engaged in purposeful acts of using language. I want to inform you, the reader, about linguistics, to sway you to my way of thinking about the subject, and to convince you that linguistics is a fascinating thing to do. Speech acts are the actions speakers perform in uttering sentences, including informing, promising, requesting, questioning, commanding, warning, preaching, congratulating, laying bets, swearing and exclaiming. The type of action performed by the speaker in making an utterance is known as its illocutionary force.

Explicit performatives English has (presumably in common with all languages) a number of speech act verbs, verbs like inform, assert, promise, request, baptize and so on, that label types of speech act. Most can be used in sentences like the following, where they make explicit the speech act the speaker intends to perform:

(6-6)  I promise you I’ll chop down the tree.
(6-7)  I resign.
(6-8)  I apologize.
(6-9)  I dare you to go any closer.
(6-10) I pronounce you man and wife.
(6-11) I order you to leave the premises.

Sentences like the above that make explicit their illocutionary force by a speech act verb are called explicit performatives or explicit performative sentences.

Direct and indirect speech acts Most utterances, however, do not wear their illocutionary force on their sleeve. To the contrary, as examples (6-1) and (6-2) show, a sentence like The car broke down yesterday can be used with a range of different illocutionary forces: in the former context it has the force of a statement; in the latter it may be either a refusal or request. As a speaker of English you will doubtless feel that there are ‘natural’ associations between certain syntactic forms of sentences and particular illocutionary forces. Table 6.1 shows some of these typical associations. The second column gives the technical label for grammatical form of the sentence shown in the first column. Although we did not deal with this aspect of syntax in Chapter 5, it should be clear that the four sentences are syntactically different types, i.e. different constructions. Before reading further, you should attempt to describe each example in item-arrangement terms (as per pp. 118–9); this will give you an idea how the four syntactic forms are defined. The third column of the table indicates the illocutionary force typically associated with sentences of each syntactic type. If I were to say Can you pass the salt? to my neighbour at the table, I would not normally be asking them a question about their ability to pass me the salt, and a purely linguistic response like

Table 6.1 Syntactic forms and their typical illocutionary forces in English

Sentence                               Syntactic form   Illocutionary force
You are energetic this evening.        Declarative      Statement
Are you energetic this evening?        Interrogative    Question
Be energetic this evening!             Imperative       Command
How energetic you are this evening!    Exclamative      Exclamation

Yes would be judged inappropriate and inadequate. The interrogative form is being used here with the illocutionary force of a command or request. Examples like this, where a syntactic form is used with an illocutionary force other than the one typically associated with it, are called indirect speech acts; when the association is the typical or natural one, we speak of direct speech acts. Explicit performatives also count as direct speech acts, the difference being that the speech act type is specified lexically rather than grammatically. (Are they direct or indirect grammatically?) We often use indirect speech acts to be polite. The difference in politeness between Can you pass the salt? and direct speech acts such as Pass the salt!, Give me the salt! or I am ordering you to pass the salt is obvious. Speakers often phrase questions and commands in the declarative for similar reasons. Let us suppose we were sitting in the lunchroom at work on a warm day and you open the window for some cool air. Perhaps this has the undesirable consequence that street noise becomes very loud, so at some point I want you to shut the window again. A polite way of issuing the request would be with a declarative – for example, It’s very noisy in here. To say Shut the window! would be impolite. It also risks the possibility of being ignored, or worse, flatly refused – Shut it yourself. Even to say Please shut the window would sound somewhat insistent, and suggests that I am presuming authority over you.
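The distinction between direct and indirect speech acts amounts to a comparison between the force typically associated with a syntactic form and the force the speaker actually intends. A minimal sketch follows, assuming the four form-force pairings given for English in this section. The function name is hypothetical, requests are treated as a force distinct from commands, and explicit performatives, which are direct by virtue of their verb, are not modelled.

```python
# Typical pairings of syntactic form and illocutionary force in English.
TYPICAL_FORCE = {
    "declarative": "statement",
    "interrogative": "question",
    "imperative": "command",
    "exclamative": "exclamation",
}


def speech_act_kind(syntactic_form: str, intended_force: str) -> str:
    """Classify an utterance as a direct or an indirect speech act."""
    if TYPICAL_FORCE[syntactic_form] == intended_force:
        return "direct"
    return "indirect"


if __name__ == "__main__":
    # "Pass the salt!" used as a command.
    print(speech_act_kind("imperative", "command"))     # direct
    # "Can you pass the salt?" used as a request.
    print(speech_act_kind("interrogative", "request"))  # indirect
    # "It's very noisy in here." used as a request to shut the window.
    print(speech_act_kind("declarative", "request"))    # indirect
```

The intended force has to be supplied by hand here, since inferring it from context is exactly the part that the hearer contributes.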

Felicity conditions For a speech act to achieve its intended purpose, its illocutionary force, certain conditions must be satisfied; these are called felicity conditions. For instance, an explicit performative such as I pronounce you man and wife will only succeed in marrying a couple if the speaker is an authorized marriage celebrant, and only if it is uttered at a particular point in the context of a marriage ceremony. Failing these conditions, the speech act cannot achieve its intended ends, and it is infelicitous. Similarly for other speech acts: more than just an appropriate grammatical form is a requirement for the successful achievement of their purposes. Thus, a question such as Where are my glasses? will normally have as felicity conditions that the speaker doesn’t know where their glasses are, that they want to know this information, and that they believe the hearer may know this information. A request such as Please give me my glasses would have as its felicity conditions that the speaker does not have their glasses, but believes that the addressee does, that they are capable of handing the glasses over to the speaker, and that the speaker wants them.

Reference
As already indicated, reference is different from sense in that it is not what is inherently associated with linguistic forms such as morphemes and words. Words as such do not refer; rather speakers use them to refer. The claim on pp. 132–3 that NPs refer is to be interpreted in this way: that it is the specific instance of use of the NP by the speaker – the NP token (see box on p. 136) – that refers. How are these acts of reference achieved? All languages have words or morphemes that are used to help pin down the reference of a stretch of speech (including writing and signing), that facilitate the hearer’s identification of the intended referent. For instance, we can use proper nouns (e.g. for
animals and people Nim Chimpsky, Ferdinand de Saussure, Charles Darwin; and places Sydney, Uluru), and, in languages like English and many other languages of Europe, definite and indefinite articles (the man on the moon, a puppy, the government). In most cases these expressions do not identify unique individuals, except when used in specific contexts. There are, however, cases where an expression normally identifies a unique entity, and refers to something else only in restricted contexts. For example, the moon normally refers to the unique moon of the earth, although it might be used in a lecture or article on astronomy in reference to one of the moons of Jupiter. There is a particular class of words or morphemes that are used to assist identifying referents by linking them specifically to the context of the speech act; these are known as deictic expressions. Deictic expressions identify things by relating them to the social, linguistic, spatial or temporal context of an utterance, and include pronouns, demonstratives and adverbs of space and time. The reference of these items varies with each context in which they are used. Personal pronouns such as I, me, you, we, our are deictic expressions since their interpretation is always dependent on the speech context: their interpretation depends on who the speaker is and who the hearer is. As soon as the speaker changes, the interpretation of I and you changes. Third person pronouns are generally also deictic: they effectively point to someone or something other than the speaker or hearer. (There are exceptions, including use of it in It is clear that you are not listening to me and you in one interpretation of You should clean your teeth every morning.) Demonstratives such as this and that are also deictics, effectively specifying referents by indicating whether they are close to the speaker, or distant from them. 
Thus you might say this book to refer to the book you hold in your hands; changing speaker roles, I might then refer to the same book as that book. Languages differ in the number of demonstratives they have; for instance, in some languages there are three (occasionally more) rather than two. In Tongan, for instance, there are three demonstratives, eni ‘close to the speaker’, ena ‘close to the hearer’, and ito ‘distant from both speaker and hearer’. Demonstratives employ spatial deixis. Other spatial deictic elements are the adverbs here and there. Expressions of temporal deixis include words such as today, tomorrow, now, then, last week and so on, as well as morphemes such as tense markers, which situate the time with respect to the time of speaking, and change their interpretation with changes in the speech context. It is important to note that the deictic expressions discussed in this section also have senses. For instance, pronouns have senses relating to person, number, gender and case. Their full meaning, however, is only unleashed when they are used in context.
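
The dependence of deictic expressions on the speech context can be sketched in a few lines of Python. This is only an illustrative toy, not an analysis of English: the names SpeechContext, resolve_pronoun and demonstrative_for are invented here, only I, me and you are modelled, and the this/that contrast is reduced to the assumption that the referent is held by one of the two participants. Carol and Barry stand in for any two speakers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechContext:
    """Who is speaking and who is being addressed in one turn (hypothetical model)."""
    speaker: str
    hearer: str


def resolve_pronoun(pronoun: str, ctx: SpeechContext) -> str:
    """Return the participant that a first- or second-person pronoun picks out."""
    if pronoun in {"I", "me"}:
        return ctx.speaker
    if pronoun == "you":
        return ctx.hearer
    raise ValueError(f"pronoun not modelled in this sketch: {pronoun!r}")


def demonstrative_for(holder: str, ctx: SpeechContext) -> str:
    """Choose 'this' for something held by the speaker, 'that' otherwise.

    Assumption: closeness to the speaker is simplified to 'the speaker holds it'.
    """
    return "this" if holder == ctx.speaker else "that"


# The same two people, with the speaker and hearer roles exchanged.
turn_1 = SpeechContext(speaker="Carol", hearer="Barry")
turn_2 = SpeechContext(speaker="Barry", hearer="Carol")

for turn in (turn_1, turn_2):
    # The book stays in Carol's hands throughout; only the roles change.
    print(resolve_pronoun("I", turn), "says:", demonstrative_for("Carol", turn), "book")
```

Exchanging the speaker and hearer roles changes who I picks out and turns this book into that book, although nothing about the book itself has changed.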

The cooperative principle
Speakers and hearers generally communicate successfully: the utterance meaning intended by the speaker on any particular occasion usually corresponds well with the utterance meaning inferred by the hearer. Of course mismatches do occur; a hearer may take offence when none was intended, or fail to take offence when it was intended. But things normally work fairly smoothly. For this to happen, the speaker and hearer must share some procedures of interpretation: they must share ways of drawing the appropriate inferences from what is actually encoded.

The philosopher Paul Grice proposed that such an interpretative procedure was the cooperative principle. This he explained in the following way: ‘Make your contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged’ (Grice 1989: 26). According to Grice, the cooperative principle is constituted by four component maxims:

● Maxim of Quantity: Make your contribution as informative as required, but no more (or less) informative than required.
● Maxim of Quality: Try to make your contribution true; do not say that which you believe false or for which you lack adequate evidence.
● Maxim of Relevance: Be relevant.
● Maxim of Manner: Be perspicuous – avoid ambiguity, prolixity, disorderliness and obscurity.

These maxims are principles governing the inferences conversational partners draw; they are not rules that you have to follow to produce interactively or socially acceptable or correct utterances. Thus people often tell lies, often say things based on the flimsiest of evidence, and are not above formulating their utterances in obscure ways, intentionally or unintentionally. However, speakers flout the maxims for reasons, such as to achieve particular effects. For example, apart from pathological liars, people usually lie for a reason, to achieve some end. In this respect the maxims are unlike grammatical rules such as ‘an adjective must agree in gender and number with the noun it modifies’; if a language has this grammatical rule, speakers will consistently observe it (excluding speech errors). Speakers don’t decide to disobey a grammatical rule in order to achieve some effect. (There are exceptions, for instance, when a speaker produces an ungrammatical form to, say, mimic – and perhaps insult – an infant or a non-native speaker.) To illustrate how the Gricean maxims can be used to understand the pragmatic meanings of an utterance, consider the following (invented) conversational fragment involving Carol and Barry again:

(6-12) Carol: Did you see the new Spielberg movie on TV last night?
Barry: Is the Pope a Catholic?

Carol has asked Barry a yes/no question, using an interrogative clause; but Barry does not reply with either Yes or No – or anything in between, like Maybe or Some of it. Nevertheless Carol (and you) will immediately interpret Barry’s response as a resounding ‘yes’, even though it is in the form of an interrogative, which normally has the speech function of a question. The Gricean maxims can be used to explain how this meaning is inferred. By the Maxim of Relevance, Barry’s response, whatever it might be, is interpreted as being relevant to Carol’s question.
How could the religious affiliation of the Pope be relevant to the question of whether Barry saw the movie? Well, everyone (including Barry) knows that the Pope is a Catholic, so by the Maxim of Manner, Barry cannot be seriously asking for information. Moreover, to be orderly and relevant, Barry must be interpreted as answering Carol’s question – his ‘question’ must really be an answer, and the only way this can be so is for the blatantly obvious answer to Barry’s question to be that answer. Hence the inference that Barry did see the movie. But it can also be inferred that Barry means more than this – otherwise he would have just said Yes, the briefest and clearest expression
of affirmation. The particular roundabout response that he chose implies that things couldn’t have been otherwise: ‘yes, of course I saw it’. In applying the Gricean Maxims to (6-12) we had to appeal to background knowledge shared by the conversational participants, in this case information known generally to members of the speech community. In some cases the shared information is specific to the conversation. We also had to appeal to syntactic structure, specifically to the status of Barry’s response as an interrogative.

Presuppositions
A presupposition is something that must be assumed true in order for a sentence to be appropriately uttered. In each of the following examples, the a. sentence presupposes the b. sentence:

(6-13) a. The bus driver managed to stop in time.
b. The bus driver tried to stop.
(6-14) a. The baby has stopped crying.
b. The baby was crying previously.
(6-15) a. I regretted giving them the donation.
b. I gave them the donation.
(6-16) a. He realized that he had been tricked.
b. He was tricked.

If the driver didn’t try to stop, it would not be appropriate to utter (6-13a), that they managed to stop; if the baby had not been crying previously, it would not be appropriate to say that she had stopped crying, (6-14a); if the speaker had not given the donation, it would be inappropriate to say that they regretted doing so, (6-15a); and if he had not been tricked, he could not realize this, (6-16a). Thus in each case the b. sentence is presumed true in order for the a. sentence to be sensibly uttered. A good test for presuppositions is that they remain constant under negation: each of the b. sentences above remains true if the a. sentence is negated:

(6-13) c. The bus driver didn’t manage to stop in time.
(6-14) c. The baby hasn’t stopped crying.
(6-15) c. I didn’t regret giving them the donation.
(6-16) c. He didn’t realize that he had been tricked.

Words like another, again, more and the like also invoke presuppositions. The humour of the following passage from Alice’s Adventures in Wonderland is based on the Hatter’s claim that more does not presuppose some previous quantity.

(6-17) ‘Take some more tea,’ the March Hare said to Alice, very earnestly.
‘I’ve had nothing yet,’ Alice replied in an offended tone: ‘so I can’t take more.’
‘You mean you can’t take less,’ said the Hatter: ‘it’s very easy to take more than nothing.’ (Carroll 1927/1866: 101–2)

The negative test reveals that Alice is right: Take some more tea presupposes that the addressee has already had some – it remains true for Don’t take any more tea. In a sense presuppositions allow us to produce efficient discourse, as can be seen from (6-17), where use of presupposition-invoking more reduces considerably what needs to be said. In a similar way, (6-18) presupposes that France has a king, otherwise it seems a strange thing to say (and some philosophers have argued that without this presumption the sentence can’t be said to be either true or false).

(6-18) The present king of France is bald.

The examples we have just discussed would seem to suggest that presuppositions concern semantics rather than pragmatics, and there is disagreement among scholars as to which domain they belong to. I have discussed the phenomenon under pragmatics because presuppositions can be cancelled under certain conditions. For instance, the presupposition (6-13b) does not always hold for (6-13c) – as shown by (6-13d) – and nor does the corresponding presupposition hold for (6-19). (6-20) shows that the same cancellation is possible for more.

(6-13) d. The bus driver didn’t manage to stop in time, in fact he didn’t even try to.
(6-19) How can five students have managed to fail such an easy test?
(6-20) Alice didn’t have more tea, if indeed she had any.

Summing up
Meaning is that which is expressed by linguistic units and conveyed by the use of linguistic units in speech, writing and signing. It is a multifaceted phenomenon, embracing two primary domains, semantics and pragmatics. Semantics is concerned with the meanings expressed or encoded by linguistic forms – that is, with the meaning aspect of the linguistic sign. Pragmatics is concerned with meanings that are not encoded, but are inferred. Semantics is thus concerned with sentence meaning, pragmatics with utterance meaning. Sentence meaning is largely compositional, whereas utterance meaning is not. The major concern of semantics is with sense, which involves value and (according to some) intension. A linguistic item can be used either literally or figuratively, though the difference is not clear-cut. Metaphor, metonymy and synecdoche are examples of figurative meanings. We dealt with three issues in semantics. First was the relations between the senses of a lexical item: polysemy, vagueness and homophony. Second was the identification of the range of semantic relations among lexical items: synonymy, antonymy, hyponymy and meronymy. Third was how to specify the semantics of a linguistic unit; we outlined one approach, componential analysis, which factors the semantic meanings of lexical items into atomic components or features. We also dealt with four issues in pragmatics: speech acts, reference, the cooperative principle and presuppositions. Speech acts are what speakers do when they utter a sentence; speech acts have an illocutionary force. Some speech acts overtly specify their illocutionary force; these are explicit performatives. When the illocutionary force is directly indicated by linguistic form we
speak of direct speech acts; otherwise it is an indirect speech act. Direct and indirect speech acts differ largely in terms of politeness. Reference is concerned with the link between utterances and people, things, places and times that are being referred to. Deictic elements play an important role in establishing reference. The cooperative principle is a principle of interpretation and inferencing shared by speakers and hearers, permitting the utterance meaning intended by a speaker to be reliably inferred by the hearer. It comprises four maxims: Quantity, Quality, Relevance and Manner. Presuppositions are implicit assumptions invoked by certain sentences as required truths in order for utterance of the sentence to be appropriate or reasonable.

Guide to further reading
Good basic texts on semantics are Hurford, Heasley and Smith (2007) and Elbourne (2011). Riemer (2010) is a more detailed treatment of the subject; aside from lexical semantics (the focus of this chapter), it deals with sentence semantics and discusses cognitive approaches to semantics. Goddard (1998) is the only theoretically coherent introduction I know of; unlike most other semantics textbooks, it contains numerous illustrations drawn from languages other than English. Ruhl (1989) shows in a number of case studies how many (Ruhl would doubtless say ‘all’) apparent polysemies of lexical items can be accounted for as different contextualizations of a single abstract sense. Despite its now derogatory title, Malinowski (1936/1923) is well worth reading as a serious attempt by a brilliant anthropologist to understand the meanings of utterances in an ‘exotic’ language. Malinowski adopts a ‘meaning is use’ semantics, effectively rejecting the division between semantics and pragmatics. For more recent arguments against the division, see Matthews (1995). Kempson (2001) is a good place to begin reading on pragmatics. Good introductory textbooks are Levinson (1992), Blakemore (1992), Thomas (1995), Mey (2001) and Yule (1996). The cooperative principle and conversational maxims are dealt with in detail in Grice (1975, 1989). Stephen Levinson has developed Grice’s ideas in significant ways; the most accessible outline of his proposals is Levinson (1995); Levinson (1999) applies the framework to positional verbs (‘stand’, ‘sit’, ‘hang’) in the Papuan language Yélî Dnye.

Issues for further thought and exercises
1 What semantic relations are represented in the following pairs of lexemes?
a. maximum – minimum
b. left – right
c. east – west
d. mad – crazy
e. borrow – loan
f. brotherly – fraternal
g. parent – child
h. single – married
i. open – shut
j. converse – chat
k. learned – erudite
l. appear – disappear
m. mobile – cell phone
n. sane – insane

2 Find synonyms (try and find at least two for each) for the following English words: faithful, believe, stretch, break, ground, before, injustice and habit. Are your synonyms exact or approximate? In the case of approximate synonyms, explain the meaning differences, and comment on any differences in their syntactic behaviour.
3 English has a number of verbs relating to cooking, among them the following morphologically simple ones: cook, fry, boil, steam, bake, sear, grill, barbeque and toast. Suggest a set of semantic features that distinguish these verbs from one another, and provide a full feature description of each verb.
4 Suggest semantic features that will distinguish the following verbs of motion: walk, fly, go, jump, swim, hop, run, crawl, drive, roll and move. Give a full feature description for each verb.
5 List as many hyponyms as you can of furniture. Draw a hierarchical diagram showing the hyponymic relations among these words.
6 Make up a list of meronyms of car, and show the meronymic relationships among them on a hierarchical diagram. Are there any instances of transitivity in these terms? (As a test for meronymic relations, check whether the two terms X and Y can occur in the frames X has Y and the Y of X. If so, then Y is a meronym of X. Thus seat is a meronym of car because we can say a car has a seat, and the seat of the car.)
7 How would you explain the meaning of mouse? Make an attempt at writing an explicit definition. Now do the following:
a. Think of actual uses of the word in sentences – or check in a corpus if you have one readily available.
b. What other senses do you need to identify to account for your examples? Attempt to give sharp definitions of each of the senses you find.
c. Which senses would you identify to be polysemies of a single lexical item, and which would you suggest belong to another lexeme? Do you think any of your polysemies might be better treated as instances of vagueness? Why or why not?
Compare your treatment of mouse with the treatment in your dictionary.
8 Look up some word (for example, try, finish, game, etc.) in your dictionary. Find the lexical items in the definition (focus on the first sense if more than one is given) that are most closely related semantically to the word, and look up their definitions in the dictionary. Continue this process, and see how long it takes you to get back to your original word. Draw a diagram to show how the headwords are linked – for example, if try has attempt in its definition, draw a line connecting them.
9 Think of contexts in which the following sentences can be used with the illocutionary force indicated:
a. It’s cold in here – command/request for action
b. Do you know the way? – rejection of advice
c. The refrigerator is full – refusal of offer
d. The cat hasn’t been fed – denial of permission
e. Do you know what time it is? – complaint
f. Can you pass the salt? – yes/no question

10 A good way of testing for an explicit performative is to see whether hereby can be inserted and the resulting sentence makes sense. I resign is one by this criterion, since I hereby resign makes perfect sense. But I understand is not, since you would not say I hereby understand. Using this test, decide which of the following are explicit performatives.
a. I swear that I have never been out with her.
b. You are requested not to feed the animals.
c. I swore that I had never been out with her.
d. I welcome you all tonight.
e. You are nominated as head of the commission.
f. I promise to work harder in future.
g. We nominated him as head of the commission.
h. I dismiss the story as malicious gossip.
i. You know that I have never been out with her.

11 Using the Gricean maxims, explain the following: (a) how the tree in Figure 6.1 can have the utterance meaning indicated; and (b) why I said for animals and people on p. 148, when I might more economically have said for animals (thus satisfying the Maxim of Quantity).
12 Consider the following conversational fragment:
Carol: Did you see the new Spielberg movie on TV last night?
Barry: I’ve got an important exam today.
What is the pragmatic meaning of Barry’s reply? Can you explain this meaning by inferences governed by the Gricean maxims? Think of other utterances Barry could use to mean either ‘yes’ or ‘no’, and explain how that meaning can be inferred.
13 Identify at least two presuppositions of each of the following sentences:
a. Harry was surprised that the postman arrived so early.
b. Harry’s younger brother wanted more ice cream.
c. When Harry arrived he began to argue with his brother.
d. Those dogs are barking again.
e. When will Harry ever grow up?
f. The postman doesn’t like dogs either.
g. What’s happened to my glasses?
h. I hope we have another warm day before September.
i. Only Harry knows the combination to the safe.
j. He still regrets being married.
k. Harry has gone back to Stockholm, because I was speaking to him on the phone yesterday.

Research project
It was mentioned in §6.1 that time is often understood metaphorically in terms of the domain of space – sometimes encapsulated in the conceptual metaphor TIME IS SPACE. What are some examples of this metaphor in human languages? Write an account of the ways in which relations of time before and time after are expressed in English and other languages. What spatial axes are employed, and how do they represent time? You might consider extending your investigation to gestures in your language and in sign languages. You will find plenty of relevant material on the internet, as well as in many grammars. I draw your attention to one particularly relevant article, Boroditsky and Gaby (2010).

Part II Language in Use

7 Sociolinguistics: Language in Its Social Context

Up to now we have been viewing language primarily in terms of its internal organization, as a structured system of signs. In the three chapters of this part we further develop a theme introduced in the last section of the previous chapter, language in use. We begin in this chapter by considering language from the perspective of the uses speakers put it to in their social lives. We will be concerned, that is, with the social aspects of the meaning-making potential of the language system in its context of use. This area of investigation is called sociolinguistics.

Chapter contents
Goals
Key terms
7.1 Language as a social phenomenon
7.2 Social varieties and variation
7.3 Varieties and variation according to use
7.4 Language use in bilingual communities
7.5 Language shift and endangerment
Summing up
Guide to further reading
Issues for further thought and exercises
Research project

Goals
The goals of the chapter are to:
● describe how languages vary systematically according to social factors, and identify the main types of variation;
● show how speakers vary their ways of speaking – including the language they choose to speak – to construct personal identities and social roles for themselves in speech interactions;
● identify some of the factors relevant to language choice in bilingual communities;
● discuss how and why habits of language use can change over time, and possible consequences of these changes to the vitality of a language; and
● examine the increase in rate of language endangerment and extinction in recent centuries, and attempts by speakers and linguists to arrest these processes.

Key terms
accents
accommodation
bilingualism
code-switching
dialects
dialectal variation
gender variation
identity
isogloss
language choice
language endangerment/obsolescence/death
language maintenance/revival
language shift
register
registerial variation
respect varieties
secret varieties
social varieties
speech community
standard dialect
style

7.1 Language as a social phenomenon
Social domains of language use
All speech occurs in an interactive context in which participants – speakers and hearers – make choices from the language system. These include lexical and grammatical choices that express appropriate experiential meaning – that is, meaning concerned with the construal of the world of experience (see §5.4 if you have forgotten this term). This is only part of the story. As discussed in
§4.5, words are not always neutral signs, but often express attitudinal values, as for instance when one says pass away instead of die. This is not the only way that words can be charged with nonexperiential meaning. Words can also convey social information about the speaker. For instance, if an Australian is thanked for doing someone a favour, they would be likely to respond with No worries, while an American is likely to say You’re welcome. On one level these expressions mean the same thing, but choice of one rather than another is consistent with the norms – the typical speech patterns – of Australian English versus American English. A person’s membership in a social group – for example, the British community, a rural farming community, or an immigrant community in an urban area – will correlate with the use of certain linguistic forms and patterns of behaviour in preference to others. Some linguistic forms and behaviours represent part of the relatively stable aspect of a person’s social identity; they indicate something about who the speaker is. Here variation in a language is according to the speaker. But not all choices are like this. A speaker of Australian English might say Please take a seat or Grab a chair when offering the addressee a place to sit. These forms do not mark the speaker as being an Australian so much as correlate with the specific aspects of the immediate context of speech, and the temporary roles the speaker adopts. Imagine a university lecturer and student in a formal interview concerning the student’s progress in a course. After a greeting, the lecturer might invite the student to sit down with Please take a seat – which could well sound ominous to the student, and hint that something unpleasant was to follow. Later, the two may happen to meet in a bar; the lecturer might invite the student to join her with Grab a chair. 
Here the choice of different expressions has to do with the speech context, and the respective roles the interactants take on in it; it does not concern the speaker’s social identity in the sense of their group membership. This is variation according to use. These two social features and their linguistic correlates are summarized in the first two columns of Table 7.1. In the third column are indicated the most general social functions or macro-functions associated with the linguistic devices within their domains of use. The languages and social varieties one controls, as well as the varieties associated with uses, go together to construct a participant’s identity as a person: they concern who the person is, the dimension of ‘being’. This contrasts with the ‘doing’ dimension where the concern is with how the language system is used to accomplish things in speech. In this chapter we focus on the former dimension, ‘being things with words’, ignoring the latter, ‘doing things with words’, which is dealt with elsewhere, under pragmatics (§6.3) and discourse (Chapter 8). These rather terse observations will be elaborated more fully in the remainder of this chapter, beginning with the ‘being; the construction of personal identity’ macro-function.

Table 7.1 A model of the major phenomena relevant to language use
Social phenomenon | Linguistic manifestation | Social macro-function
Community | Languages and social varieties | Being; construction of personal identity
Interactive context | Varieties according to use | Being; construction of social role
Interactive event | Discourse | Doing; using language as a tool for action

Table 7.1, to be sure, gives an oversimplified picture: the distinction between social varieties and varieties according to use is not as clear-cut as a simple contrast between temporary social role and permanent personal identity (if there is such a thing!). Nor is the distinction between being and doing a sharp one: indeed, being is very much doing, actively constructing an identity. Nevertheless, the table provides a useful initial perspective on the complex phenomena of language variation and use. Before embarking on this enterprise, however, it is important to say a few words about the notion of speech community, since this plays a crucial role in the story.

The speech community
A speech community is a coherent group of people who share the same language or languages and more or less the same norms of language use. The members of a speech community form a network of interacting individuals who communicate linguistically with one another frequently, and more intensively than they engage with outsiders. The term ‘speech community’ is somewhat elastic, and may be used of groups of radically different sizes depending on one’s focus. From the broadest perspective, the speakers of English might be considered to form a single speech community, with overall more frequent in-group interactions than out-group interactions; they also share what is in some sense the same language, and use it in at least some common ways, even if there are some differences in how they use it in specific circumstances. So also might the speakers of British English, American English or Estuary English (the variety of English spoken along the Thames River and its estuary, including in London) be regarded as forming speech communities. What is required for a group of speakers to represent a speech community is a degree of unity and cohesiveness both on the level of the language system(s) and on the level of interpersonal interactions. A random selection of a million speakers of English drawn from the UK, the USA, New Zealand and India would fail to meet this condition, and does not form a speech community. Nor do the speakers of English and Cantonese together form a single speech community.

7.2 Social varieties and variation
Regional variation
No language with a reasonably large number of speakers spread over a relatively wide territory will be completely homogeneous, and differences in pronunciation, lexicon and/or grammar are likely to be associated with different regions. Such variation is called dialectal variation; varieties of a language with their own peculiarities of grammar, phonology, phonetics or lexicon that are associated with particular geographical regions are dialects. The term accent is used in reference to varieties that differ only or primarily phonetically or phonologically; the term ‘dialect’ is used more generally when there are differences in lexicon and grammar, and often in phonetics as well. The Austronesian language Taba, spoken by some 30,000–40,000 people living mainly on Makian Island, near the island of Halmahera in Indonesia, shows minor dialectal differences in each village. These include a small number of lexical differences, a phonological difference (in the speech of some villages /o/ is found where others have /a/), and a grammatical difference (in some dialects the singular/plural contrast is made only on human nouns, while in others it is made for all animate nouns).

The differences between neighbouring dialects of a language are insufficient to make speech in one dialect unintelligible to speakers of another; dialects are variant forms of a single language, not distinct languages (see §17.1 for further discussion). However, if a language is spread over a very large region, speakers from opposite extremes of the region may not be able to understand one another, or may experience difficulties in understanding one another, and misunderstandings might be frequent. Nevertheless, neighbouring varieties will likely be mutually intelligible, and the language can be seen as a chain of mutually intelligible dialects. Such situations are called dialect continuums. An example is the so-called Western Desert language (Pama-Nyungan) spoken over the vast desert region of Australia shown in Map 7.1. The named varieties in this map differ from one another in both lexicon and grammar. Geographically close varieties are similar enough to be mutually intelligible; distant ones such as Yulbarija in the far north-west and Kukata in the extreme south-east are more divergent, and not everything said in one would be understood by a speaker of the other. There are significant grammatical and lexical differences between them.

Map 7.1 Varieties of the Western Desert language.
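
The chain-like character of a dialect continuum can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: the five varieties are placeholders (V1 to V5) arranged along a line, and mutual intelligibility is simplified to a yes/no matter that holds only between immediate neighbours, whereas in real continuums understanding falls off gradually. What it shows is that mutual intelligibility need not be transitive: the two ends of the chain are not mutually intelligible, yet they are linked by a sequence of neighbouring varieties that are.

```python
from collections import deque

# Placeholder varieties, listed in geographical order (hypothetical data).
VARIETIES = ["V1", "V2", "V3", "V4", "V5"]

# Assumption: only varieties at most this many steps apart understand each other.
THRESHOLD = 1


def mutually_intelligible(a: str, b: str) -> bool:
    """True if the two varieties are close enough on the line to be understood."""
    return abs(VARIETIES.index(a) - VARIETIES.index(b)) <= THRESHOLD


def linked_by_chain(a: str, b: str) -> bool:
    """Breadth-first search: can b be reached from a via mutually intelligible steps?"""
    seen = {a}
    queue = deque([a])
    while queue:
        current = queue.popleft()
        if current == b:
            return True
        for other in VARIETIES:
            if other not in seen and mutually_intelligible(current, other):
                seen.add(other)
                queue.append(other)
    return False


print(mutually_intelligible("V1", "V2"))  # neighbouring varieties
print(mutually_intelligible("V1", "V5"))  # opposite ends of the region
print(linked_by_chain("V1", "V5"))        # still part of one chain
```

On this toy model the whole set counts as a single continuum because every variety is reachable from every other, even though speakers at the extremes would not understand one another directly.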


Mutually unintelligible forms of speech like Mandarin Chinese and Cantonese are thus separate languages; they are not dialects in the linguistic sense, contrary to popular usage and to the terminology in common use in Chinese linguistics in China.

Standard dialects Sometimes one dialect of a language will be recognized as the most important or standard dialect of the language. This is usually the most prestigious dialect, which is regarded as the most ‘correct’ form. For languages with longish traditions of writing such as English and French, the standard is the variety that is promoted in schools, and that children are usually taught to write in; it is also the variety most likely to be heard on national broadcasting networks. The standard is usually the variety that is codified in grammars, dictionaries and style guides. In the case of English, somewhat different standards have emerged in different countries, so we have Standard American English, Standard British English, Standard Australian English, Standard New Zealand English and so on. To the extent that a general Standard English can be identified, it would be something of an abstraction, characterized by features common to the national standards. Not all languages have standard dialects. The traditional languages of Australia did not have standard varieties; it is only in post-contact times that a few of them have acquired standard varieties. These are often the varieties that have, by a quirk of history, been the ones that missionaries have worked on, and perhaps produced Bible translations in, or that educators have happened to choose as the standard for literacy materials.

Notice that the linguistic usage of the term dialect differs from popular usage, where a dialect is understood to be a non-standard or substandard variety of a language and the standard variety is not regarded as a dialect. In linguistics, both standard and non-standard varieties are dialects, and neither is privileged over the other.

Isoglosses In dialectology, the study of dialects, it is standard practice to use isoglosses, lines drawn on a map to mark the boundaries of regions in which a particular feature is found, whether it is a particular lexical item, a characteristic feature of pronunciation, a grammatical feature or whatever. These are a bit like isobars on a weather map, which bound regions of the same barometric pressure. Map 7.2 shows the isogloss for the Danish stød,1 which runs in an east–west direction: dialects to the north of the isogloss have the stød, those south of it do not. (Note that this isogloss reflects the situation for speakers born before 1948; it is not certain what the situation in modern varieties is.) It also shows isoglosses for genders,2 which run in a north–south direction. Dialects to the east of the two-gender region distinguish three genders (masculine, feminine, neuter); those to the west of it make no gender distinction. As the map indicates, isoglosses do not always coincide. Generally, however, boundaries of major dialects are marked by bunching of isoglosses.


Map 7.2 Two isoglosses in Danish. (Based on Haberland 1994: 314, 315 and Goldshtein 2023: 25.)

Variation according to social group Many societies in today’s world are stratified according to socio-economic status. In industrialized Western societies stratification depends primarily on income, education and occupation. Sociolinguists commonly identify two classes according to these variables: working class (generally with lower levels of education and in manual or semi-skilled employment) and middle class (generally with higher levels of education, and working in non-manual professional jobs). Both of these can be further divided into upper, middle and lower. Sometimes lower and upper classes are also distinguished. These classes (in Western societies) form a scale of variation rather than a set of rigidly distinct and precisely delimited classes. One investigation, undertaken by the American sociolinguist William Labov in the late 1960s, studied social stratification in the speech of New York City residents according to a number of linguistic variables (Labov 1972). One was the phonetic realization of /θ/, which in New York City has three variants: [θ], [t͡θ] and [t̪]. Across various styles of speech, Labov found a consistent correlation between social class and the phonetic variable. For a given level of formality, the higher the speaker’s socioeconomic status the greater was the tendency to use the fricative allophone [θ], and the lower the speaker’s status, the more affricate and stop allophones they used. Moreover, there was a fairly large gap between lower- and working-class speakers on the one hand, and middle-class speakers on the other. Use of these linguistic variables is a matter of frequency; it is not an all-or-nothing affair. No social class in New York City is totally consistent in use of any of the allophones. Furthermore, for each class, use of the prestigious variant [θ] increases with the degree of formality of speech. The variation thus concerns the notion of style (see the box on p. 169).

Variation according to gender Men and women probably speak differently in all human societies. Some differences have a biological foundation: males tend to have larger vocal tracts and vocal folds than females, and thus the fundamental frequency (the frequency of vibration of the vocal folds) tends to be lower in the speech of males than females. However, biology does not fix even this, and the differences can be exaggerated, as is the case in Japanese where the fundamental frequency differences between the genders are more marked than in English, due to female Japanese speakers tending to use higher frequencies than English-speaking females. This has been confirmed experimentally by Y. Ohara (1997). Ohara recorded conversations and sentences read in Japanese and English by the same speakers, and found that the women used higher fundamental frequencies when speaking Japanese than English, while men used the same fundamental frequencies when speaking both languages. Differences in speech between the genders are often a matter of degree rather than kind, although in some languages there are features that are unique to one gender or the other. In English the situation is of the former type: that is, gender differences are a matter of degree rather than kind. A number of linguistic features tend to pattern differently for men and women. It is documented, for instance, that women tend to have, and habitually use, larger vocabularies of colour terms than men, including less frequent terms such as mauve, lavender, crimson, violet, beige and so on. Differences also exist in usage of non-standard grammatical forms such as so-called double negatives (as in I never did nothing), use of the /ən/ ~ /n/ allomorphs of the -ing verb suffix (as in the utterance of finishing as /ˈfɪnəʃn/), and non-standard past tense forms such as seen instead of saw (as in I seen it the other day). 
Numerous studies have shown these non-standard features to be more common in the speech of males than females. Similarly in Japanese there are differences in frequency of usage of morphemes such as sentence-final politeness particles in the speech of men and women. However, these differences in frequency of use are not associated exclusively with gender. They also depend on the social context. For instance, the frequency of usage of allomorphs of the -ing suffix in English, and the politeness markers in Japanese, depend on both gender and social circumstances. In some languages categorical differences are found in the speech of males and females, certain forms being peculiar to one gender. In Gros Ventre (Algonquian, USA) alveolar and palatal affricates in men’s speech correspond with velar stops in women’s speech. Sidamo (Afroasiatic, Ethiopia) has some lexical items peculiar to men’s and some to women’s speech. For example, the word for ‘four’ is rore in women’s speech, and ʃoole in men’s speech. In the Australian language Yanyuwa (Pama-Nyungan) there is an even more fundamental grammatical difference between male and female speech. In the variety spoken by females seven noun classes (see note 2) are distinguished, while in the variety of male speakers just six are distinguished. A contrast is made between male and masculine classes in women’s speech that is not made in men’s speech. (The nature of the difference between the male class and the masculine class in the variety spoken by women need not concern us.) This difference shows up in a number of places in the grammar of the two varieties, including in the bound pronouns (see Table 7.2). It also shows up in the gender prefixes to nouns and their modifiers: corresponding with the prefix ki- in the speech of males is either nya- (male class) or ji- (masculine class) in the speech of females.

Table 7.2 Bound third person pronouns in women’s and men’s varieties of Yanyuwa

              Women                 Men
              Male      Masculine   Male-Masculine
Nominative    ilu-      inju-       ilu-
Accusative    anya-     i-          ø-

You will doubtless have noticed that the above discussion assumes a binary division of humans into males and females. This is of course a gross oversimplification both in terms of biological sex and socially constructed gender. Biologically, humans show traits – including genitals and chromosomes – which make it impossible to divide them simply into two mutually exclusive sexes. In the social domain gender is not something that is inherent in a person, but rather is something that a person does; and gender identity, the felt identity of a person, does not always match that assigned at birth. The relevance of language here is that the different linguistic choices a person makes contribute to the construction of their gender identity. Their linguistic choices are a component of the doing of gender, and permit the construction of a range of femininities and masculinities in different socio-cultural contexts.

Other dimensions of variation Other social dimensions of variation include age, ethnicity and religion. Let us look briefly at each of these. Different generations of speakers often show differences in speech – for instance, in use of ‘slang’ terms such as buck ‘dollar’, wicked ‘good’, cool ‘good, up to date’, and dude ‘guy, man’. Some slang terms (e.g. buck) have long lives, and may end up as more or less standard lexemes; dwindle is an example: it was a slang term in Shakespeare’s time. Many do not survive long, and their use can be characteristic of a particular generation group, the youth of a certain time. Terms such as dag ‘a somewhat entertaining character’ and stoush ‘a fight’ in Australian English, popular in the early twentieth century, now sound quite dated. (I don’t think I have used either of these, which I associate with my (grand)parental generation, for decades.) Popular in the 1960s and 1970s was groovy, which seems to have suffered a similar fate.

Different ethnic groups in countries such as the USA, Britain and Australia often speak slightly different varieties of English, showing divergences in phonetics/phonology, lexical items and/or grammar. One of the best-studied ethnic varieties is the English of African Americans, generally now called African American Vernacular English (AAVE). This variety shows characteristics distinguishing it from Standard American English, the ethnic variety associated with those of European descent.
(a) The auxiliary be is usually absent where standard English has an unstressed be, as for instance in He fast in everything he do.
(b) The verb be is used to indicate habitual activity, as in He be late, which means ‘he is always late’; He late by contrast refers to a single instance.
(c) Word-final consonant clusters of Standard English are often absent in AAVE, the cluster being typically replaced by its first consonant, as in foun (found) and lef (left). Although this happens in casual speech in other varieties of English, it is more pervasive in AAVE.

Sometimes religious differences are associated with differences in language varieties. Hindi (spoken in India) and Urdu (spoken in Pakistan) are mutually intelligible varieties of a single language, often referred to as Hindi/Urdu. They differ somewhat in lexicon, and employ different writing systems, Devanagari for Hindi and a variety of the Arabic abjad for Urdu. But the contrast is based ultimately on religion: Hindi is associated with the Hindu religion, Urdu with Islam.

Accommodation Speakers often change the way they speak according to the person they are speaking with, adopting features of one another’s speech – or what they believe to be characteristics of one another’s speech. Thus they adjust the variety they use so as to be more like the variety of their addressee. This is called speech accommodation, and is a way of reducing the social distance between the interlocutors. Speakers of any dialect of English who reside for long periods of time in a region where a different dialect is spoken normally accommodate to the dialect of their region of residence; on return to their home region, they reaccommodate to their native dialect. Their speech tends to converge to the dialect spoken around them. I notice this in my own speech when returning to Australia every few years, and then on my subsequent returns to Denmark. When the sociolinguist Peter Trudgill examined his own speech in interviews with Norwich informants, he found that his use of some accent features closely resembled those of the accent of his informants (Trudgill 1986). A speaker can also choose to emphasize their social distance from an interlocutor by refusing to accommodate, by diverging from the patterns of the other’s variety. A person who speaks both a standard and a non-standard dialect of English might shift from speaking the standard to speaking the non-standard in order to signal social distance from their interlocutor – for instance, to underline a refusal to comply with a request.


7.3 Varieties and variation according to use Where variation in language depends on the more immediate context of the utterance rather than characteristics of the speaker, we speak of different registers or registerial variation. Registers thus do not construct the speaker’s personal identity, but rather their and their addressee’s role in that speech interaction. They are linguistic varieties according to use. According to Michael Halliday (e.g. 1978), three factors are relevant to the specification of registers:
● Field, the subject matter of the discourse. For instance, the field of this book is linguistics.
● Tenor, the relations among the interactants in the discourse. This includes, for example, the degree of distance or formality they adopt.
● Mode, the medium or channel employed. This can include the choice between speech and writing; it can also include the manner of speaking – for instance, speaking over the telephone rather than in person, and the role of other systems such as gesture.

Different values for these factors (according to Halliday) give rise to different registers or registerial variants.

Think, for instance, of the values of these factors for the present book and those for your lectures. Can you identify any corresponding differences in the language employed? Would the text of this book sound appropriate if spoken in your lectures, or the speech of your lecturer look suitable if written down?

Examples of different registers in English include legal, bureaucratic, scientific, religious and medical ‘Englishes’, which are characterized by numerous lexical peculiarities. Differences in the frequencies of use of grammatical constructions or categories may also exist: scientific English shows heavy use of nominal modes of expression and nominalizations (nominal stems derived from roots of other parts-of-speech – for example, variation from vary). The other two factors are also relevant: there will be differences according to the relation between the interactants and whether speech or writing is used. For instance, the register of a popular science book, a written piece, differs from the registers the author would use when writing for an audience of fellow scientists, and when lecturing to a lay audience. Other registers found in some languages include secret varieties, respect varieties, baby-talk and animal talk (speech directed to animals). In what follows we discuss the first two of these.

The notion of style overlaps with the notion of register. A style is a variety associated with a particular social context of use, and differs from other styles in degree of formality. Thus styles in a language range from the most informal and colloquial to the most formal.


Secret varieties Professional and occupational registers like those mentioned in the previous section serve gate-keeping functions: non-members of the group are excluded from full understanding of the message due to the technical terminology and possibly arcane modes of expression. In some cases this function comes to the fore, and a register’s motivation is principally to exclude outsiders and render the meaning obscure to them. Registers of this type are called ‘secret languages’ or ‘anti-languages’. An example is the secret register called kpélémέíyé used by young Kisi men in Liberia. It is based on Kisi (Niger-Congo, Sierra Leone); only males of a certain age use it, and no female speakers or non-Kisi speakers understand it. The words of this secret register are formed from ordinary Kisi words by a variety of somewhat obscure processes of modification, the most obvious of which is transposition of syllables. Examples illustrating the latter process include the secret variety lexeme ndòtúŋ ‘dog’ deriving from the ordinary term tùŋndó, and yòɲáá ‘cat’ coming from ɲààyó. There are also semantic and grammatical differences, including replacement of some items by their opposites, and reordering of words in clauses. Other examples of secret registers include Pig Latins, sometimes used by school children in Western societies; initiands’ and other ritual varieties of some Australian Aboriginal groups; and secret varieties used by criminals – for example, in West Bengal. A common characteristic of these registers is the replacement of a lexeme by a lexeme opposite or nearly opposite in meaning; this is also quite commonly employed in slangs, as in the use of wicked and sick for ‘good’. Also common is the reversal of the order of syllables.
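
The syllable transposition seen in the two Kisi examples can be sketched as a small procedure. The Python below is only an illustration under stated assumptions: the syllable divisions are supplied by hand, the function names are hypothetical, and the treatment of tone (the low and high tones stay in place while the consonants and vowels swap) is read off the two quoted forms alone. It is not a description of how words of the secret register are actually formed.

```python
import unicodedata

GRAVE, ACUTE = "\u0300", "\u0301"   # combining marks for low and high tone
VOWELS = set("aeiouɛɔ")             # assumed vowel letters, enough for these examples

def split_tone(syllable):
    """Return (toneless segments, tone mark) for one hand-divided syllable."""
    decomposed = unicodedata.normalize("NFD", syllable)
    tone = next((ch for ch in decomposed if ch in (GRAVE, ACUTE)), "")
    segments = "".join(ch for ch in decomposed if ch not in (GRAVE, ACUTE))
    return segments, tone

def apply_tone(segments, tone):
    """Write the tone mark on every vowel of a toneless syllable."""
    return "".join(ch + tone if ch in VOWELS else ch for ch in segments)

def transpose(syllables):
    """Reverse the order of the syllables' segments; tones keep their positions (assumption)."""
    parts = [split_tone(s) for s in syllables]
    segments = [seg for seg, _ in parts][::-1]
    tones = [tone for _, tone in parts]
    word = "".join(apply_tone(seg, tone) for seg, tone in zip(segments, tones))
    return unicodedata.normalize("NFC", word)

print(transpose(["tùŋ", "ndó"]))   # ordinary tùŋndó 'dog'
print(transpose(["ɲàà", "yó"]))    # ordinary ɲààyó 'cat'
```

Run on the two ordinary words, the sketch yields the secret forms quoted above, ndòtúŋ and yòɲáá. Two examples are far too few to show that the same rule holds for the rest of the vocabulary.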

Respect varieties Many, perhaps all, languages have means of showing respect, deference, distance and politeness by lexical and/or grammatical choices. For instance, it is common in the languages of Europe (and elsewhere) for a speaker to address a single hearer with the second person plural pronoun to indicate respect; in French, for instance, the plural vous is used in addressing a single person to show respect, distance or politeness. Japanese and Korean (isolate, Korea) have systems of honorifics, lexical and grammatical choices that mark respect. For instance, in Korean the ordinary word for ‘meal’ is pap; the corresponding honorific is cinci. Ordinary verbs in Korean can be made honorific by adding the infix si, as in o-si-ta ‘to come’, corresponding to ordinary ota ‘to come’. Traditional Australian Aboriginal societies were egalitarian, and respect was shown to an individual not because of social rank, but rather according to the kin relationship between that person and the speaker. Usually this applied to individuals related as mother-in-law to son-in-law (sometimes brothers-in-law): such in-laws should not engage in familiar or intimate interactions, and should be circumspect in their interactions with one another. In many cases special speech varieties are used among interlocutors so related, and sometimes also when speaking about the in-law. These varieties are used as a sign of social distance and respect, and are thus called respect varieties. (They are also called avoidance styles and mother-in-law languages.)


Respect varieties generally have the phonology and grammar of the everyday language – though there can be minor divergences – and differ mainly in lexicon. Often the vocabulary of the respect variety is quite small, sometimes covering only a limited range of meanings; the lexemes are typically vague in meaning compared with everyday words. For example, Bunuba and Gooniyandi respect varieties have just over a hundred words. Some respect words have a more general sense than their everyday counterparts, so that one avoidance term corresponds to a few different everyday terms. In the Bunuba variety jayirriminyi covers the meanings of the ordinary words thangani ‘mouth, language, speech, story’ and yingi ‘name’, while jalimanggurru covers three distinct boomerang types, referred to in ordinary speech as baljarrangi ‘returning boomerang type’, gali ‘returning boomerang type’, and mandi ‘non-returning boomerang type used for hunting’. However, only a fraction of the everyday lexicon has corresponding respect terms: notable absences are terms for genitals and sexual activity, topics inconsistent with respect and distance! Generally, an utterance in the respect variety contains just a single respect lexeme, as illustrated by Gooniyandi example (7-1), which employs the respect verb malab- ‘make’ instead of the ordinary verb wirrij- ‘dig’. The other word goorgoo ‘hole’ is an ordinary everyday nominal.

(7-1)  malab-mi              goorrgoo
       make-he:effected:it   hole
       ‘He dug a hole.’

Some respect varieties have somewhat larger lexicons than the Bunuba and Gooniyandi ones, some fewer. At one extreme is the Dyirbal (Pama-Nyungan) respect variety, which apparently had lexemes covering the entire range of semantic domains, though with less precision than the everyday lexemes. At the other extreme are respect varieties with just a single characteristic lexeme, as in the case of Jaru, where it is luwarn-, identical in form with the ordinary verb meaning ‘shoot’. This verb replaces every verb of everyday speech, and is completely general in meaning. Respectful utterances are formed in Jaru by replacing the verb by luwarn-, as illustrated by (7-2), which may be compared with the near minimal pair in everyday Jaru, (7-3).

(7-2)  maliyi          ngalu      luwarnan   murla-ngka      Jaru respect variety
       mother:in:law   they:are   be:doing   here-at
       ‘Mother-in-law is sitting here.’

(7-3)  ngawiyi   nga     nyinan       murla-ngka             Everyday Jaru
       father    he:is   be:sitting   here-at
       ‘Father is sitting here.’

Respect varieties often show differences in manner of delivery, being spoken more slowly or softly than normal, and without eye contact. Use of pronouns is often different: the ‘you-plural’ form is normally used for a singular addressee, the ‘they’ form in reference to a single avoidance relative. Furthermore, respect speech is typically vaguer than ordinary speech; it is rare for speakers to elaborate on vague avoidance utterances to make the meaning more precise.


7.4 Language use in bilingual communities A speech community is not always made up solely of speakers of a single language. Many speech communities consist of individuals who share two or more languages. Here I use the term bilingualism to refer to such situations, allowing that more than two languages may be involved; sometimes the term multilingualism is used instead as the cover term. Many speech communities in Indigenous Australia were traditionally, and still are, bilingual. Almost everyone in the Gooniyandi speech community traditionally spoke, in addition to Gooniyandi, at least one of the following: Bunuba, Kija (Jarrakan), Nyikina (Nyulnyulan) and Walmajarri; some gifted individuals spoke other languages as well. In more recent times, Kriol (a creole – see §17.4) has been added to the typical inventory. The Danish speech community is also a bilingual one, with English and to a lesser extent German among the languages shared by many Danes. Speakers in bilingual speech communities must choose between two or more languages on any occasion of speaking. The choice of language is probably never entirely arbitrary, and like lexical and grammatical choices, typically conveys meaning. We deal first with the most general level of language choice, the level of the speech interaction. Then we look at choices made at the level of utterances, and the ways in which, and reasons why, speakers adopt now one language, now another at different points in the speech interaction. The fundamental idea underlying the discussion is that languages express aspects of speakers’ social identity (the ‘being’ macro-function in Table 7.1). In some cases a speech community uses two distinct forms of one language, one learnt via education, the other learnt in informal situations as the first language. The variety learnt at school, the ‘high’ (H) variety, is usually used in more formal contexts such as in church, on the radio, in serious literature and so on.
The other variety, the ‘low’ (L) variety, is associated with less formal contexts, such as family conversations. This situation is known as diglossia. The German-speaking community in Switzerland is diglossic. Standard German is the H variety, learnt at school; Swiss German is the L variety, learnt in the home. Comparable situations in which different languages are involved, as in the case of Spanish (H) and the Tupian language Guaraní (L) in Paraguay, are also referred to as diglossic.

Language choice In bilingual communities, speakers tend to speak each language in particular interactive contexts, depending on who they are talking to, the topic of conversation and so on. The clusters of contextual factors that influence the habitual choice of language are called domains. Examples of domains are the domestic domain, the educational domain, the administrative domain and so on. The association between a language and a domain is a tendency not a rule: certain choices of language correlate statistically with certain domains. Bilingual speakers can and often do vary their language within a single discourse, or across discourses of the same type (see next subsection). It has been proposed that broad patterns of language choice in many African countries correlate with social domains (Myers-Scotton 1993). In urban regions in Kenya many people are trilingual in their own mother tongue, Swahili and English. Mostly they use their mother tongue in the home, and with members of their own ethnic group. At work, again, speakers may use their mother tongue with others from their own ethnic group, and otherwise Swahili or English (especially in white-collar occupations). Outside of the workplace, English and Swahili are also used with people from other ethnic groups, with English associated with more formal and public interactions. Another trilingual speech community is Sauris, a small community in the Carnian Alps in north-eastern Italy. Here a dialect of German is used in the home; Italian (Romance) is the language of education and organized religion; and Friulian (a Rhaetian Romance language) is used by men in the local bars.

Code-switching Code-switching is the phenomenon, common in bilingual speech communities, in which speakers switch from one language to another within the same conversation. Indeed, code-switching often occurs within the same utterance, as in (7-4) – quite unremarkable in casual conversation – from a bilingual speaker of Malay (Austronesian, Malaysian peninsula and many nearby islands) and English. (Malay words are bolded.)

(7-4)  This morning I hantar my baby tu dekat babysitter tu lah
       ‘This morning I took my baby to the babysitter.’

In many bilingual situations the languages in a speaker’s repertoire include one or more local or minority languages associated with local ethnic groups, and a majority language that has no local associations, such as a national language or international language like Swahili and English in Kenya. Broadly speaking, in such bilingual situations, choice of the local language underlines solidarity between the conversational partners, while choice of the national language serves a distancing function, emphasizing social distance. By making choices among the available languages within a conversation, speakers strategically manipulate solidarity and distance to more effectively serve their goals at that point in the interaction. Susan Gal (1979) found that bilingual speakers of Hungarian and German in the Austrian village of Oberwart might switch to German in an argument conducted largely in Hungarian to add extra force to a particular point. It is not that German is always chosen to help win an argument. Rather, at certain points in an interaction it can be used in a bid to achieve this communicative purpose; elsewhere it might be used to achieve different ends. Code-switching is common in Australian Aboriginal communities, though only a few careful investigations have been undertaken. One notable early example is Patrick McConvell’s (1985) classic study of code-switching in an interactive event in which a small group of men from Daguragu, a small community in the Northern Territory, are butchering a bullock. The men spoke ‘standard’ Gurindji (Pama-Nyungan), as well as a local regional variety such as Wanyjirra (Pama-Nyungan), and Kriol (see p. 439). Within this interaction the men constantly switch between their local variety, standard Gurindji, and Kriol. They do not do this at random, however. McConvell shows that the choice depends to a large extent on which social group(s) the speaker wishes to stress membership of at different points in the interaction. Choice of the local variety Wanyjirra highlights the interlocutors’ membership of a small local group: using this variety a speaker can declare their social proximity to the addressees, that they are co-members of a small speech community. This might pave the way for a request. By contrast, choice of Kriol would serve to downplay the alliances among the interactants, indicating no more than that they are all members of the large Kriol speech community. Choice of Kriol could reinforce denial of a favour, or stress wider community needs over the needs of an individual. The speaker as it were smooths the way for such problematic speech acts as denials and refusals by distancing themselves from the addressee. This is illustrated by (7-5), a short excerpt of three speech turns by two of the butchers. (Here the vertical line | indicates switch of language; capitals indicate Kriol words; bolding indicates words specific to Standard Eastern Gurindji; small capitals mark specifically Wanyjirra forms; and plain italics indicates forms common to Gurindji and Wanyjirra.)

(7-5)
G:  MINE | pampirla | THERE AGAIN, OLD MAN | pampirla,  waku nyarra? | kankurla-pala-nginyi  ngu-yi-n    | kuma-wu
           shoulder                          shoulder   which way      above-across-from     will-me-you   cut-will
J:  | laja     | -ma      ngartji   ma-ni     W-rlu
      shoulder   -topic   choose    get-did   W-by
G:  | nganinga | -ma
      my         -topic

G: ‘MINE | the shoulder | THERE AGAIN, OLD MAN | the shoulder, or what? | From across the top you have to for me | to cut it.’
J: | ‘the shoulder | W- picked it out.’
G: | ‘mine | (it is).’

McConvell comments on the code-switching in this interaction as follows:

G begins in Kriol, but switches to Wanyjirra to emphasise the close local bond between himself and J, in relation to J’s giving him the shoulder, and the cutting action which will provide G with the shoulder. J however responds by shifting back to the wider community arena by using SEG [Standard Eastern Gurindji], and emphasising the rights of a non-Wanyjirra community member. G reasserts his claim within the narrower arena by using the W [Wanyjirra] term for ‘mine’. (McConvell 1985: 111)

7.5 Language shift and endangerment Languages do not remain constant for long: indeed they change rapidly. In later chapters (especially Chapters 16 and 17) we deal with changes that happen over time to the lexicons and grammars of languages. Sociolinguistic patterns are not immune to change either, as societies and technologies change and languages are put to new uses. New styles of speech or writing emerge for use in new social interactions and purposes. The wide availability of email, instant messaging, SMS and the World Wide Web has resulted in new patterns of use of many languages (see §14.5).

Nor are things static in the domain of linguistic varieties and their social-identity values. New dialects emerge when populations move into new regions and countries, as happened to English in America, Australia and New Zealand; in some circumstances new languages eventually emerge (see §17.4). Moreover, over time people are likely to change their habits of choosing between the languages and varieties at their disposal in the speech community, and thus the social values associated with these varieties change. When changes in habits of language use become particularly pronounced, and one language or language variety comes to be used in a significantly smaller or wider range of circumstances in a speech community we speak of language shift. In extreme cases, what was once the major language of a community – the language used as the primary vehicle of communication and the mother tongue of most community members – may be replaced by another language. When this process affects the entire speech community of a language, we speak of language endangerment or obsolescence; when it reaches the point where no speakers remain, we refer to language death.

Rate of language shift, endangerment and death The rate of language shift or death varies considerably from case to case. In cases of gradual shift the domains in which one language is used contract gradually, and it may take many generations before it is replaced by another language (if indeed the replacement is ever complete). The replacement of Scots Gaelic or Gàidhlig (Indo-European, Scotland) by English has been ongoing for hundreds of years, and remains incomplete. At the opposite extreme, a language can completely disappear within a generation or less. Such cases of sudden death are rare, and are often associated with the death of all speakers within a short period of time. In 1226 the Xixia or Tangut population of Western China, speakers of a Tibeto-Burman language, were annihilated by the Mongolian emperor Genghis Khan. But perhaps the clearest example of sudden death is that of Tambora (Papuan), spoken on the Indonesian island of Sumbawa. All speakers of this language were wiped out in a volcanic eruption in 1815, the largest in recorded human history. Sometimes political circumstances can give rise to sudden death of a language without the death of the entire speech community. Following a massacre of thousands of Indians in El Salvador in 1932, the survivors abandoned their traditional languages to avoid identification as Indians.

Causes of language shift

Language shift and death can happen for many reasons. Usually it is not possible to isolate a single cause for a particular case of language shift; rather, a number of factors typically conspire, including the wider social circumstances. Across diverse cases certain factors tend to recur, including the following.

Disruption of the speech community – physical or social separation of speakers so that there are fewer opportunities for interaction among them – is a common factor in language shift. This can come about in many different ways: decimation of the speech community; enforced resettlement together with others who do not share the language; widespread dispersal of the community for employment and other reasons; influx of significant numbers of immigrants; and separation of children from adults (e.g. by segregation in dormitories). The Nyulnyul speech community was disrupted in almost all of these ways during the first sixty or seventy years of contact with Europeans. First, it was significantly reduced in the late nineteenth and early twentieth centuries through killings by unscrupulous Europeans and the diseases they brought with them. With the establishment of the Beagle Bay Mission in Nyulnyul territory in 1890 began influxes of Aborigines from outside, few of whom spoke the language. When dormitories were established on the mission in the early twentieth century, Nyulnyul children were separated from their parents, whom they saw only at weekends; use of Nyulnyul in the dormitories was forbidden. From the first decades of the twentieth century, many mission-educated Aborigines of Nyulnyul descent were sent to employment outside of the mission.

Numbers of speakers and their patterns of interaction, including marriage, are other relevant factors. The larger the speech community of a language, the better chance it will have of survival, other things being equal. But other things are not always equal, and some languages have survived for a long time without large speech communities, while others appear vulnerable even with many thousands of speakers. If marriages tend to be outside of a smallish community of speakers, fragmentation of the community may well result. This consideration was also relevant in the case of Nyulnyul: in the early decades of the twentieth century missionaries strongly encouraged marriage between local Nyulnyul men and women from outside, the majority of whom had been forcibly taken to the mission as young children.

Attitudes to the languages can also be decisive. 
Speakers might shift their speech habits in favour of a language enjoying higher status in the community or in the national domain, especially if it is politically or economically advantageous to do so. Attitudes can be relevant in other ways as well. In some Australian Aboriginal communities the traditional languages have come to be regarded by speakers as too difficult for children, and suitable only for adults. And in some cases last speakers have withheld their language from younger generations because they fear it will not be adequately valued. The symbolic value of a language can also have a bearing on language shift. In some instances the language of the colonizers is associated with the modern world and desirable commodities, while the traditional language might be associated with old ways of life no longer practised. An association with traditional culture can, on the other hand, sometimes be supportive, giving the language at least one domain in which its survival is enhanced. The Nyulnyul situation is interesting in this regard: as a result of missionary translations of religious materials, it seems that the association between Nyulnyul and traditional cultural practices was weakened, so that no longer was the language identified with traditional practices. As a consequence, by the mid-twentieth century (if not earlier), Nyulnyul was left with no positive symbolic value.

Structural changes accompanying language shift and endangerment

In language endangerment situations, especially when shift is gradual, simplifications of grammar and lexicon such as regularizations and losses often occur. For instance, in the late twentieth century the Gurindji of 5–8-year-old children in the Daguragu and Kalkaringi communities in the Northern Territory showed evidence of simplification in various grammatical features, and loss of infrequent words.3 Bound pronouns were lost entirely, and the allomorphy of some case suffixes was reduced, as can be seen from the two case inflections presented in Table 7.3. (For explanation of the term ergative, see §15.3.)

Table 7.3 Some allomorphs of two case suffixes in Gurindji (after Dalton et al. 1995: 90)

| Case | Children’s Gurindji | Traditional Gurindji |
| --- | --- | --- |
| ergative | -ngku after a vowel | -ngku after a vowel in words of 2 syllables; -lu after a vowel in words of more than 2 syllables |
| ergative | -tu after a consonant | -tu after an alveolar consonant; -ju after a palatal consonant; and other allomorphs |
| locative | -ngka after a vowel | -ngka after a vowel in words of 2 syllables; -la after a vowel in words of more than 2 syllables |
| locative | -ta after a consonant | -ta after an alveolar consonant; -ja after a palatal consonant; and other allomorphs |

As mentioned in §3.3, Nyulnyul has some fifty bound nouns indicating parts of the body that require a prefix indicating the owner of the part. By the last decades of the twentieth century, only one speaker still used this system. The others (most of whom did not speak the language fluently) used the third person singular form of the noun as the root form; the system of prefixes had been lost entirely, and possession was indicated by a free possessive pronoun. Thus whereas in traditional Nyulnyul one would say nga-marl ‘my hand’, in modern speech ‘my hand’ is expressed as jan nimarl, literally ‘my his/her/its:hand’.

Intriguingly, this system of pronoun prefixes to nouns was not entirely absent from late-twentieth-century Nyulnyul. Some speakers retained it on the one or two exceptional prefixing nouns that do not denote body parts. Thus it was retained in the speech of some on -mungk ‘belief, knowledge’, as in nyi-mungk ‘your belief/knowledge’ and nga-mungk ‘my belief/knowledge’. One guesses that preservation of the feature for this lexical item may have been supported by the fact that -mungk expresses a meaning closer to that of a verb than a noun; however, it was not actually re-analysed as a verb, and given verbal inflections.

With decreasing use of a language in specialized social domains and disappearance of social domains such as ritual, registers can be lost, and along with them lexical items peculiar to them. For instance, in the late twentieth century speakers of Nyulnyul appear to have known few terms for secret-sacred law and ritual objects. 
These words almost certainly disappeared with the generation who were adolescents in the 1890s: this was the last generation to undergo initiation, a prerequisite to acquisition of sacred religious knowledge.
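
As a rough illustration of the conditioning summarized in Table 7.3, the following Python sketch picks an ergative or locative allomorph for a stem under the traditional and the children's systems. It rests on several assumptions that are not in the source: the stems are invented strings rather than attested Gurindji words, syllables are counted as runs of the vowels a, i, u, and the sorting of final consonants into alveolar and palatal follows a guessed orthography. The 'other allomorphs' mentioned in the table are not modelled; the function simply returns None for them.

```python
import re

VOWELS = ("a", "i", "u")  # assumption: three-vowel orthography

# Assumed orthographic classes for stem-final consonants. Longer spellings
# come first so that digraphs such as "rn" are matched before "n".
FINAL_CONSONANT_PLACE = [
    ("ny", "palatal"), ("ly", "palatal"),
    ("ng", "other"), ("rn", "other"), ("rl", "other"), ("rt", "other"),
    ("rr", "alveolar"),
    ("n", "alveolar"), ("l", "alveolar"), ("t", "alveolar"),
    ("j", "palatal"), ("y", "palatal"),
]

# Allomorphs from Table 7.3, in the order:
# (after vowel, 2 syllables; after vowel, >2 syllables; after alveolar; after palatal)
FORMS = {
    "ergative": ("-ngku", "-lu", "-tu", "-ju"),
    "locative": ("-ngka", "-la", "-ta", "-ja"),
}


def count_syllables(stem):
    """Count syllables as maximal runs of vowels (a simplifying assumption)."""
    return len(re.findall(r"[aiu]+", stem))


def final_place(stem):
    """Classify the stem-final consonant; anything unlisted counts as 'other'."""
    for spelling, place in FINAL_CONSONANT_PLACE:
        if stem.endswith(spelling):
            return place
    return "other"


def traditional(stem, case):
    """Traditional system; None where Table 7.3 gives no specific allomorph."""
    v2, v3plus, alveolar, palatal = FORMS[case]
    if stem.endswith(VOWELS):
        syllables = count_syllables(stem)
        if syllables == 2:
            return v2
        if syllables > 2:
            return v3plus
        return None  # monosyllables are not covered by the table
    return {"alveolar": alveolar, "palatal": palatal}.get(final_place(stem))


def childrens(stem, case):
    """Children's system: one form after vowels, one after consonants."""
    v2, _, alveolar, _ = FORMS[case]
    return v2 if stem.endswith(VOWELS) else alveolar


# Invented stems, used only to exercise each condition.
for stem in ["tata", "tatata", "tatan", "tataj", "tatang"]:
    print(stem, traditional(stem, "ergative"), childrens(stem, "ergative"))
```

Comparing the two functions shows the reduction described above: the children's system needs only the vowel/consonant distinction, whereas the traditional system also refers to syllable count and to the place of articulation of the final consonant.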

Language maintenance and revival

Language endangerment and death have always occurred. Recall that Sumerian became extinct in antiquity, around 2000 BCE (see p. 16 above). The rate at which languages are becoming endangered and dying has, however, been steadily accelerating over the past few centuries. Many languages of Africa, Australia and the Americas have become seriously endangered in post-colonial times. In Australia, for example, no more than twenty traditional languages are presently being learnt as a mother tongue of children, or have a thousand or more speakers. This represents less than a tenth of the number of languages that were spoken by viable populations of speakers on the continent at first colonization in 1788, although a number even then perhaps had fewer than a thousand speakers. Some linguists have predicted that if present trends continue unabated as many as 90 per cent of the world’s presently spoken languages will either become extinct, or at least endangered, within the next century. Opinions differ, however, and the reality is that linguists’ prognoses have often been wide of the mark (Vakhtin 2002). Many speakers of endangered languages and many linguists are concerned about this situation, and efforts have been proposed or adopted to arrest the processes of shift in communities around the globe. These efforts are referred to by a range of terms, including language maintenance and revival (other terms are also used; sometimes the terms are used to refer to different things, sometimes as synonyms). For instance, in Australia a number of Aboriginal-controlled language centres have emerged since the mid-1980s that are concerned with determining community attitudes to the traditional languages, and how best to serve them. Many communities have expressed determination that their traditional languages survive, or that a previously spoken traditional language be reintroduced. 
Slightly earlier, in New Zealand, ‘language nests’, or kōhanga reo, were established by the Māori community in an attempt to promote the learning of Māori by children. In these language nests older Māori-speaking adults, typically from the generation grandparental to the children, worked as voluntary caretakers speaking Māori to the children. (This strategy has subsequently been tried elsewhere, with mixed degrees of success.) Unfortunately, it is difficult to determine which strategies are likely to succeed either in general or in particular cases, and few attempts have enjoyed much success. Widely regarded as the most successful is the revival of Hebrew – which had not been used as a medium of everyday communication for over a thousand years – in the late nineteenth and early twentieth centuries. (See, however, Zuckermann [2006] for a different view.)

Summing up Any language with a viable speech community is heterogeneous, showing varieties and variation in phonetics, phonology, lexicon and/or grammar associated with differences among speakers along geographical and/or social dimensions. Languages are often divided into different dialects and accents according to region. They also show dialectal variation across regions, which variation sometimes cuts across dialect boundaries.

Dialectal variation is represented by isoglosses on a map. Social dimensions that language variation and varieties may be associated with include social class, age, gender, ethnicity and religion. The language variety spoken by a person serves as a badge of group membership. Speakers tend to accommodate to the variety of their interlocutor, reinforcing social ties with them. Languages also vary according to the use speakers put them to, different forms of speech being associated with different functions of language in interaction. This gives us registers and registerial variation, which include legalese, secret languages, respect varieties and the like. Styles are similar to registers, but the term is usually used for varieties differing in terms of formality. Many speech communities are bilingual. In such communities the choice of language can express a speaker’s social identity. In many bilingual communities language choice is at least partly motivated by domain; but domains do not usually determine the language spoken. In most bilingual communities code-switching occurs, often to strategically manipulate solidarity and distance. Speech communities change over time, sometimes radically: their language repertoire may change with the introduction of a new language, as may the habits of using them. Language shift happens when a language comes to be spoken in fewer domains, in a more restricted range of social circumstances. In extreme cases, a language can become endangered or obsolescent; ultimately we may have language death or extinction. These processes happen at vastly different rates. Language endangerment is often (though not always) accompanied by changes, usually simplifications, in the grammar and lexicon of the language. 
There is currently considerable concern among linguists and others, including speakers of endangered languages, about the loss of the world’s linguistic diversity; this has led to the development of language maintenance efforts in various countries.

Guide to further reading Two of the best textbooks on sociolinguistics are Mesthrie et al. (2009) and Coulmas (2013a). Also worth reading are Holmes and Wilson (2022), Chambers (2017) and Edwards (2013); for a rather different approach, see Halliday (1978). One type of sociolinguistic investigation we did not mention, the ethnography of communication, is concerned with how language is used in different cultures; Saville-Troike (2002) provides an excellent textbook introduction. For information on dialects of English in Britain, see Hughes et al. (2012), and on dialects and varieties in American English, Wolfram and Schilling-Estes (2006). Brief information on registers in Australian languages can be found in Dixon (2002: 91–5). Chapter 5 of Mithun (1999) deals with various speech registers in Indigenous languages of North America (though not under the term register); for fuller treatment, Silver and Miller (1997) is recommended. Finlayson (1995) discusses ‘women’s language of respect’ in Xhosa (Niger-Congo, South Africa); Bradley (1988) and Kirton (1988) deal with grammatical and lexical differences between men’s and women’s varieties in Yanyuwa.

There is a large literature on language, gender and sexuality (a topic we did not discuss in this chapter) in English and other languages, especially the major languages of Europe, North America and Asia. Kiesling (2019) is a very readable and instructive introductory textbook, while Ehrlich, Meyerhoff and Holmes (2014) is an excellent collection of articles. An extensive collection of articles on the topic is currently in preparation, Hall and Barrett (in preparation); many of the articles are currently available on the internet if your library has an appropriate subscription (https://doi.org/10.1093/oxfordhb/9780190212926.001.0001). Books dealing with social aspects of bilingualism and multilingualism include Myers-Scotton (1993) and Romaine (1995); see also Romaine (2017). A short overview of language shift and endangerment can be found in Chapter 8 of Mesthrie et al. (2009). For fuller treatments, see Grenoble and Whaley (1998), Tsunoda (2005) and especially Evans (2022). Abley (2003) presents a non-technical and entertaining travelogue of his journeys searching for endangered languages. However, be warned that Abley adopts an extreme Whorfian stance (see §11.1), and is linguistically naive. McGregor (2003) provides fuller details on the language situation of Nyulnyul. Grenoble and Whaley (2006) deals with language maintenance and revitalization. Walsh (2014) overviews language maintenance and revitalization in Australia.

Issues for further thought and exercises

1 Below are some words characteristic of different major dialects of English, including British, American, Australian and New Zealand. Identify which dialect(s) each belongs to. (Columns do not align with dialects.)
a. faucet, tap
b. dyke, toilet, bathroom, bog
c. truck, lorry
d. g’day, hi, hello
e. gas, petrol
f. drugstore, chemist
g. diaper, nappy

2 Which dialects do you think the following pronunciations represent?
a. [fɨʃ] ‘fish’
b. [mɔɹnɪŋ] ‘morning’
c. [səi] ‘see’
d. [ʧɨps] ‘chips’
e. [næɒ] ‘now’

3 List as many gender differences as you can in English or another language you speak. Classify the differences according to whether they are phonetic, phonological, intonational, lexical, grammatical, pragmatic or interactive (i.e. differences in the organization of speech interaction).

4 In one of his investigations Labov was interested in post-vocalic r as a sociolinguistic variable: in New York English it is a prestige feature. He visited three department stores in New York and asked the attendant a question that would elicit the answer fourth floor; for example, he might have asked Excuse me, where are women’s shoes? Both words fourth and floor could of course be pronounced with or without the rhotic following the vowel. The three department stores varied from lower to higher prices, which he expected would correlate with the socioeconomic status of the clientele. Labov pretended he did not hear the answer, and asked for a repetition. He found that there were more instances of post-vocalic r in floor than fourth. Why would this be? He also found more instances of post-vocalic r in the speech of attendants in the more expensive stores, and a higher frequency of this variable on the repetition. Labov interprets this as indicative of differences in the frequency of post-vocalic r across the social varieties of New York speech. Given that the attendants in all of the stores would presumably be working class, how would you account for his conclusions?

5 Compile a list of lexical items characteristic of some professional register (such as education, law, music, medicine). Give an explanation of each term in informal style. Do you think that use of informal style rather than the professional register would be helpful in making professional writing in these domains more accessible to the layman? Do you think that the professional register could be entirely replaced by an informal style: or to put things another way, is the only function of professional registers to exclude non-members of the profession? Explain your reasons.

6 What linguistic features – such as modes of delivery (i.e. phonetic properties of delivery of the message), lexicon and grammar – do you think would characterize the difference between the registers of spoken science and sports commentating? Listen to an example of each on television, and test your expectations. Be alert also for other differences than those you expected.

7 Below are examples of words in a Pig Latin variety of French called Verlan. Explain how Verlan words are formed from the corresponding French ones.

| | French | Verlan | English |
| --- | --- | --- | --- |
| a. | blouson /bluzõ/ | zomblou | ‘jacket’ |
| b. | bloquer /blɔke/ | québlo | ‘to block’ |
| c. | père /pɛːʀ/ | reupé | ‘father’ |
| d. | zonard /zonaːʀ/ | narzo | ‘person who lives in a suburb of Paris’ |
| e. | jeter /ʒ(ə)te/ | téjé | ‘to throw’ |
| f. | cresson /kʀɛsɔ̃/ | soncré | ‘watercress’ |
| g. | démon /dɛmɔ̃/ | mondé | ‘demon’ |

8 One finds differences of opinion among linguists on the issue of language endangerment. Write a comparison of the views expressed in Hale et al. (1992: 40) and Ladefoged (1992: 810–11). Include your own critical comments and overall evaluation of each piece.

Research project Find an example of an endangered language (you could begin with the references mentioned above; see also Glottolog (https://glottolog.org/), which provides a comprehensive listing of languages and their state of health) and write a description of its social and political situation. Your description should contain basic information about the language and its speakers, as well as discussion of the historical circumstances leading to the present language situation. If possible, also discuss speakers’ attitudes and any language maintenance programme in operation or planned.

8 Text and Discourse

This chapter continues the theme of language in use begun in the previous chapter, but addresses it from a different perspective, the larger structures in language use. We do not merely construct grammatical sentences in our own minds; rather, we normally create them within the context of social interactions with others, and use them to achieve interactive purposes. Sentences are thus not normally produced in isolation, but in interpersonal settings. In this chapter we are concerned with the ways sentences fit into these contexts. Our focus is on the linguistic context, on the ways sentences go together with other sentences. It is, however, impossible to ignore non-linguistic features; these are treated in passing, not because they are unimportant, but because of considerations of length.

Chapter contents
Goals
Key terms
8.1 Preliminaries
8.2 Text organization
8.3 Discourse: language in interactive use
Summing up
Guide to further reading
Issues for further thought and exercises
Research project

Goals

The goals of the chapter are to:
● show that structure exists beyond the level of the sentence (and utterance), and that this structure is distinct from grammatical structure;
● draw a distinction between texts and discourses in terms of the broad uses of language that are involved in each;
● identify two major text genres, narrative and exposition, and outline their global structures;
● distinguish between the coherence and cohesion of texts;
● identify the main linguistic devices used to create cohesion in texts;
● demonstrate that discourses are highly structured linguistically, and identify some of the dimensions of this structure;
● comment on the relation between structure of discourse and the ways it is used to further participants’ goals and purposes; and
● identify some strategies conversational partners use to manage the progress of interactive events, such as taking turns as speaker.

Key terms

adjacency pair
coherence
cohesion
cohesive tie
conjunction
continuer
Conversation Analysis
discourse
Discourse Analysis
ellipsis
exchange
exposition
genre
lexical cohesion
move
narrative
pre-sequence
reference
speech interaction
substitution
text
transaction
transition relevance place (TRP)
turn-taking

8.1 Preliminaries Structure beyond the sentence We have seen (p. 107) that the sentence is the largest linguistic unit with grammatical structure. This does not mean that patterning and structure in language ceases at the level of the sentence. Nor does it mean that grammar is irrelevant beyond sentences.

As to the first point, it is clear that larger linguistic phenomena are structured: this book – indeed, any book – is not a random collection of sentences. Sentences are put together in particular ways; other ways of putting them together would not make sense, make less sense, or convey different meanings. Putting the sentences of Agatha Christie’s The ABC Murders (Christie 1967/1936) in random order would result (in most instances) in an incomprehensible or at best ridiculous story. As to the second point, the relevance of grammar to organization beyond the sentence is clear from the way in which sentences are grammatically structured when they occur in text. A perfectly acceptable alternative grammatical organization for the second sentence of the initial paragraph of this section is That patterning and structure in language ceases at the level of the sentence is not what is meant by this. But this alternative does not read very well in that paragraph, and makes for a less coherent development of ideas. What is meant by the observation that grammatical structure stops at the sentence is that the patterning and structure at the ‘higher’ levels – such as book or story – are inherently different from patterning at the sentence level (and below): it is not grammatical in nature. This chapter identifies some of the ways in which these larger phenomena are structured, and their effects on grammatical and lexical choices.

Text and discourse In the previous subsection we spoke of linguistic items ‘larger’ than sentences, and made up of a number of sentences. What are these items? We mentioned books and stories; others include lectures and jokes. To constitute entities in their own right these items must be in some sense complete. It is intuitively clear that books, stories, lectures and jokes do indeed represent complete units that belong to some level above the sentence. I say ‘above’ because they are made up of sentences, and because they are complete in ways single sentences usually are not. The sense in which these larger items are complete is, broadly speaking, in terms of usage. They are unified instances of language in use – to be more precise, unified sequences of utterances or sentence tokens (see box on p. 136). (And this is of course how sentence should be interpreted in the previous subsection.) These sentence tokens cohere in terms of purposeful language use. As a speaker of a language and a member of its speech community (see p. 162) you have an understanding of these wider purposes, and how and when they are achieved. With this knowledge you are able to identify these units and anticipate their boundaries. You generally know when a lecture or story is complete or incomplete, not just by the time or your place in a book – although you may sometimes be wrong. It is useful to distinguish two main types among these larger units of usage, texts and discourses. Texts are units that are primarily concerned with structuring and conveying information, typically where this information is fairly sizeable in quantity and complex. This is the case for jokes and narratives: they usually convey too much, and too complex information to be structured as single sentences. Nevertheless, they carve out segments of the real world – or an imaginary world – that members of a culture perceive as forming a coherent set of circumstances and events. 
This is illustrated by the following short piece, my own telling of a famous piece of mathematical folklore:

(8-1)

Carl Friedrich Gauss was perhaps the greatest mathematician of all time. Even as a child he showed great aptitude for mathematics. One day, when he was just a young boy in primary school the schoolmaster gave the class the task of adding up the first 100 integers, thinking that this would be a good way to keep the class occupied for some time. But the problem had barely been given before Gauss, the youngest in the class, produced the answer: 5050. The other pupils laboured on for an hour or so, adding up the numbers. Gauss was right, while many of his classmates got the answer wrong. He realized that the first hundred integers can be put into 50 pairs whose sum is 101 (1+100, 2+99, . . .), giving a total of 5050.

(8-1) clearly presents a coherent chunk of reality, a coherent sequence of events, and expresses them by means of a structured sequence of sentences. It would be rather difficult to express this meaning as a single sentence (except if you resort to a trick like replacing the full stops by semicolons – try instead to express it as a single spoken sentence), and the result would be hardly comprehensible. Discourses, in contrast to texts, are units primarily concerned with doing things with words, with language as a form of action (recall §7.1). A discourse is the language component of a complete interactive event such as the purchase of meat at the butchers’, or of a car at a second-hand car lot, or a dinner-time conversation. A meat-purchase discourse, for instance, is a complex social act, the goal of which is to buy/sell meat. It is made up of a structured sequence of stages such as greetings, request of information about meat for sale, payment and so on. Each stage is oriented to a certain sub-goal, and the stages come in a particular order – for example, it would make no sense to pay before greeting the butcher, or before selecting the meat. The discourse is clearly much more than a mere collection of grammatically acceptable utterances.

The terms text and discourse are used in many different ways in linguistics. Sometimes they are used interchangeably, in reference to the same type of item. Perhaps more often, the term text is used to refer to written instances of language use, while discourse is used for spoken utterances. Related to this is the use of text in reference to the language component of discourse, which is construed as the entirety of a social interaction. The particular distinction drawn in this section, according to whether the item in question constructs a chunk of knowledge or attempts to achieve an interactive goal, is not usually made. Nevertheless, this distinction is important not just because of the differences in the uses of language, but also because the two types are associated with very different structures.

Given this understanding of texts and discourses as chunks of language-in-use, it can easily be seen that they need not necessarily be made up of more than one sentence token. In some circumstances a single sentence or even a smaller unit constitutes a complete text or discourse. Examples are notices such as No smoking, No loitering, Open, labels on packaged food such as Dansk honning ‘Danish honey’1 and so on. We ignore such minimal texts in what follows.

Text and Discourse

8.2 Text organization

Text types and structures

Narratives

The text given in (8-1) presents a short story, a version of a piece of mathematical lore that has been told (and written) in many different ways (although a number of themes are recurrent, as discussed in Hayes 2006). Texts like this, texts that present a story that unfolds over time, are called narratives. Narratives are tightly structured texts that do much more than present a sequence of events, even events belonging to the same ‘world’. This can be seen by comparing (8-1) with the following, which begins with the same three sentences, but refers to different events that might have occurred (and most of which are mentioned in at least one of the alternative versions):

(8-2)

Carl Friedrich Gauss was perhaps the greatest mathematician of all time. Even as a child he showed great aptitude for mathematics. One day, when he was just a young boy in primary school the schoolmaster gave the class the task of adding up the first 100 integers, thinking that this would be a good way to keep the class occupied for some time. Gauss wrote his answer on his slate, and placed it on the teacher’s desk. The other students kept writing on their slates. After an hour everyone was told to stop work. The mathematics class, which the schoolmaster did not like teaching, was finally over.

It is clear that (8-2) lacks something important that (8-1) possesses: a plot. Various models have been proposed for describing narrative plots, for the overall structure that narratives follow. These identify elements according to their overall function in the narrative, and state the order in which these elements normally appear. We will refer to these elements as stages, because they generally come in a fixed order. As a simple model of narrative structure we identify the stages shown in Table 8.1, where as usual brackets enclose optional elements.

Table 8.1 Stages in narrative structure

Stage: Features

(Orientation): Preliminary information indicating the topic of the narrative, and/or that a narrative is about to be told

Setting: Description of the time and place of the events

Events: Actions and happenings in the world of the narrative, which include (among other things):

    Complication: The main happening, an event that raises a problem in the narrative world that is pivotal in the unfolding of the drama

    (Turning point): An event that brings the chain of events following the complication to a head

    Resolution: The final outcome of the drama in which the complication is resolved

(Coda): Wrapping up the story, possibly drawing out a moral


This approach to narrative organization is reminiscent of the approach to sentence structure we took in Chapter 5, where sentences were analysed into units that serve grammatical roles. For this reason, some investigators speak of ‘story grammar’; one must be aware, however, that here the term grammar is being used in an extended sense, as texts are not structured grammatically, as discussed above.

To illustrate this structural scheme, let’s look at another narrative, (8-3). This is my version of a fairly well-known urban legend, which has appeared in many different forms.2 (8-3)

When I was a first year student, we had a professor who was notoriously tough on grading term papers; he rarely gave anything higher than a ‘D’. Then at last in one class he rewarded one student with a ‘B–’. Well this student hung onto her paper, and sold it to the highest bidder at the end of the semester. The buyer submitted it to the same professor in the next semester, getting a ‘B’. The following year, this student again sold the prized paper to the highest bidder, who submitted it to the same teacher. He received a ‘B+’. Finally, yet another student submits the paper for a fourth time and is awarded an ‘A’. The paper is returned to the student with a written comment from the professor: ‘I’ve read this paper four times now, and I like it better each time.’

(8-3) launches straight into the narrative, giving a setting for the events to follow. This is immediately followed by a complication, that the professor normally gave the lowest marks for term papers. Eventually he gives a higher mark for one student’s paper. There follow a number of events describing how the same essay is marked better and better in subsequent semesters. Finally there is a resolution, in that the professor gives out the highest mark for the paper – this is a resolution because it potentially ends the narrative. It is a somewhat unusual resolution, however, in as much as this is the same paper that he first gave a lower mark to. We are forced to conclude that the professor is rather foolish, and at this point construct the narrative as a classic story of the absent-minded and unworldly professor. This resolution is then challenged by the final event, in which the professor reveals that he is not so absent-minded after all. Table 8.2 puts the above remarks more explicitly into the stages identified in Table 8.1. Subtypes of narrative include narratives of personal experience, myths, urban legends, historical narratives and so on. Each subtype shows its own peculiarities, including what counts as a particular stage of the narrative. For instance, a popular genre of narrative is the crime story, which comes in variants including the crime novel, the crime short story, and crime videos or movies. These are regularly and conventionally structured, regardless of whether written or acted out. Almost all have a complication that occurs early on, the committing of a crime, and a resolution that occurs towards the end, the uncovering of the criminal. Little else could count as either complication or resolution. Between the complication and resolution is a series of events that typically leads up to the discovery of the criminal. 
Readers or viewers do not expect or even want to know the identity of the criminal before the crime is committed, or before the due processes of investigation have been carried out.


Table 8.2 Structural analysis of the urban legend of the tough professor

Stage: Realization in (8-3)

Orientation: Absent (recall that this is an optional stage)

Setting: When I was a first year student,

Events:

    Complication: we had a professor who was notoriously tough on grading term papers; he rarely gave anything higher than a ‘D’

    Turning point: Then at last in one class he rewarded one student with a ‘B–’.

    Well this student hung onto her paper, and sold it to the highest bidder at the end of the semester. The buyer submitted it to the same professor in the next semester, getting a ‘B’. The following year, this student again sold the prized paper to the highest bidder, who submitted it to the same teacher. He received a ‘B+’.

    Resolution: Finally, yet another student submits the paper for a fourth time and is awarded an ‘A’.

Coda: The paper is returned to the student with a written comment from the professor: ‘I’ve read this paper four times now, and I like it better each time.’
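The stage order of Table 8.1 can be treated as a simple ordering constraint, against which an analysis such as the one in Table 8.2 can be checked. What follows is only a minimal sketch in Python. It assumes, which the tables do not state explicitly, that ordinary events may intervene between the complication, the turning point and the resolution; the one-letter codes and the function name are hypothetical conveniences rather than part of the model.

```python
import re

# One-letter codes for the stages of Table 8.1 (the codes are illustrative only).
STAGE_CODES = {
    "orientation": "O",
    "setting": "S",
    "event": "E",          # any other action or happening
    "complication": "C",
    "turning point": "T",
    "resolution": "R",
    "coda": "D",
}

# Optional stages are followed by '?'; other events ('E*') may intervene,
# which is an assumption of this sketch.
NARRATIVE_PATTERN = re.compile(r"O?SE*CE*(TE*)?RE*D?")


def follows_narrative_model(stages):
    """Return True if the sequence of stage labels fits the ordering above."""
    codes = "".join(STAGE_CODES[stage] for stage in stages)
    return NARRATIVE_PATTERN.fullmatch(codes) is not None


# Stage sequence of (8-3) as analysed in Table 8.2.
tough_professor = [
    "setting", "complication", "turning point", "event", "resolution", "coda",
]
# A sequence with a setting and events but no complication or resolution.
plotless = ["setting", "event", "event", "event"]

print(follows_narrative_model(tough_professor))
print(follows_narrative_model(plotless))
```

The check concerns only the order of stage labels; deciding which stretch of a text realizes which stage remains a matter of interpretation, as the discussion of (8-3) shows.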

Expository texts

Narratives are just one of a range of text types or genres. Many of the texts you encounter as a student are not narratives – they do not relate stories – but rather are expositions. They explain or describe something. Most scientific writing is constituted mainly by expository texts; narratives play a less prominent role and are often dismissed as mere anecdotes. This chapter can be seen as a set of related expositions treating a number of topics, including the nature and structure of text and discourse. You will most likely be expected to write expository essays in some of the courses you take at university. If I asked you to explain some linguistic concept in a test, I would expect your answer to be structured as a short exposition. More concretely, the model answer to the phonological problem given in Chapter 2 of the website for this book is an example of an exposition.

Like narratives, expositions have internal structure. This structure is, however, quite different for the two genres. The model answer to the phonological problem just referred to does not begin with a setting; nor are any events referred to, and there is no complication-resolution organization. Rather, it begins with the statement of a claim; this is followed by an argument for the claim. Finally, the claim is restated in slightly different words. These components can again be seen as stages in the exposition, which can be formalized as in Table 8.3.

Table 8.3 Structure of the model answer exposition

Stage: Realization in the model answer

Introduction: Statement of thesis (the segments are phonemically distinct)

Argument: Evidence for thesis
    Claim 1 (the segments are suspicious pairs)
    Reason
    Claim 2 (the segments contrast)
    Exemplification

Conclusion: Restatement of thesis

This text might be referred to as an argumentative exposition. Other types of exposition exist, and they show different structures. For instance, descriptive expositions explain ideas or things by mentioning details and listing features; an example would be the description of a language you are asked to write in the Research project, Chapter 17, p. 439. Another type of exposition focuses on comparison of ideas or things, relating them to one another via similarities and differences, such as a text discussing the differences between formal and functional theories of syntax.

Other genres

Narratives and expositions are not the only types of text. Other genres include procedural texts (which specify procedures for doing certain things – for instance, how to connect your computer to the internet, how to cook beef rendang), recounts (which recount sequences of events, but lack the complication-resolution components of narratives), biographies (which relate life experiences, but are not organized as narratives), lectures, sermons and so on. All of these types are characterized by different structures, correlating with differences in the type of knowledge they construe. They also show linguistic differences. For instance, different genres differ in terms of the patterns and frequency of lexical choice and use of grammatical categories and constructions. See Chapter 9 for some discussion.

Coherence and cohesion

Coherence

In §8.1 we mentioned the property of texts that they represent more or less coherent chunks of knowledge of real or imaginary ‘worlds’ – chunks that hold together in the eyes of members of a culture. To the extent that it does this, we can attribute the property of coherence to a text. A coherent text is one for which we can establish a mapping from the sentences to a ‘world’ that makes sense, and is constituted by things and events that belong together. A coherent text will have a theme, a main idea that it is ‘about’, that encapsulates the ‘world’ it describes. The property of coherence does not just depend on linguistic features of a text. Consider (8-4). Is it a coherent text?


(8-4)

The procedure is actually quite simple. First you arrange things into different groups depending on their makeup. Of course, one pile may be sufficient depending on how much there is to do. If you have to go somewhere else due to lack of facilities, that is the next step; otherwise you are pretty well set. It is better to do too few things at once than too many. Remember, mistakes can be expensive. At first the whole procedure will seem quite complicated. Soon, however, it will become just another fact of life. (From Bransford and Johnson 1972, cited in Whitney 1998: 236)

Quite likely you find this incomprehensible, and the experimental study by Bransford and Johnson (1972) confirmed that it is very difficult for subjects to remember it. However, if you re-read it knowing that it is about washing clothes, it immediately becomes comprehensible (and easy to remember). Knowledge of the theme permits you to construct a coherent ‘world’ for the text. The coherence of the text therefore cannot lie just in the language, since it is unchanged; it must also depend on knowledge of what the text is about. This possible interpretation permits you to bring other knowledge to bear on the problem of interpreting the text – your knowledge of how to wash clothes in a washing machine. Notice that I did not claim that the language of the text is irrelevant to its coherence. There are linguistic features that facilitate textual coherence. We now turn to these.

Cohesion

Let’s begin by comparing (8-4) with (8-5).

(8-5)

The farmer kisses the duckling. Remember, mistakes can be expensive. They followed his dripping blood until nightfall. The other pupils laboured on for an hour or so, adding up the numbers. When I was a first year student, we had a professor who was notoriously tough on grading term papers; he rarely gave anything higher than a ‘D’.

This collection of sentences is incoherent not just because you can’t figure out what the theme is. (As far as I can see, there is no theme whatever.) It is also incoherent because the sentences have no obvious links to one another. (8-5) appears to be a collection of independent and unrelated sentences. By contrast, you will notice that there are a number of links among the sentences of (8-4) that contribute to its hanging together. For instance, in the third sentence one pile clearly links to different groups in the second sentence, selecting as it were one of the groups constructed by the latter noun phrase. By contrast, in (8-5) there is nothing in the first three sentences that the noun phrase the other pupils in the fourth sentence can be linked to: no group of pupils has been set up in the previous text. And although they in the third sentence might refer to the mistakes mentioned in the second sentence, it is clear that this interpretation would make no sense. Linguistic ‘devices’ that help to establish links among the sentences of a text are called cohesive devices; the types of link that these devices construct are called cohesive links or ties. Following the pioneering work of Michael Halliday and Ruqaiya Hasan (1976), five types of cohesive devices and links are usually identified: reference, conjunction, substitution, ellipsis and lexical cohesion. We discuss each of these briefly in the following sections.


Reference

Reference devices include items like one in one pile in (8-4), which is interpreted via pile and group; one selects one of the piles that make up the groups. The interpretation of one will be different in different contexts. For instance, the fourth sentence of (8-1) – But the problem had barely been given before Gauss, the youngest in the class, produced the answer: 5050 – might be replaced by But the problem had barely been given before one boy, the youngest in the class, produced the answer: 5050. In this case one must be interpreted in relation to boy and class. Items like one don’t have full lexical meanings of their own, at least not when they are used as cohesive devices.

The words that are perhaps most commonly used as reference devices are personal pronouns and demonstratives. In (8-1) the third person pronouns he and his link back to Carl Friedrich Gauss. In the same text the demonstrative this is used to link to the event constructed in the previous clause, namely the schoolmaster’s giving the addition problem to the class. (8-4) also uses personal pronouns and demonstratives to construct a cohesive text. For instance, in the final sentence it links back to the noun phrase the whole procedure in the previous sentence. Aside from one, pronouns and demonstratives, reference devices include words like some, other, same, and different. (8-1) illustrates cohesive use of other in the noun phrase the other pupils, which refers to the entire class with the exception of Gauss.

In the above examples the reference item points back to a referent that has already been established. This is called anaphoric reference. Sometimes reference items point forward to a referent that is established in a later sentence. (8-1) might have begun with He rather than Carl Friedrich Gauss, not identifying the person by name until say the third sentence, as shown by (8-6). This is a perfectly possible though marked alternative to (8-1), and might be used to create tension – for instance, to make you wonder ‘who is this person?’. (You have doubtless encountered this device in literature.) This sort of reference is called cataphoric.

(8-6)

He was perhaps the greatest mathematician of all time. Even as a child he showed great aptitude for mathematics. One day, when he was just a young boy in primary school the schoolmaster gave the class the task of adding up the first 100 integers, thinking that this would be a good way to keep the class occupied for some time. But the problem had barely been given before he, Carl Friedrich Gauss, the youngest in the class, produced the answer: 5050. The other pupils laboured on for an hour or so, adding up the numbers. Gauss was right, while many of his classmates got the answer wrong. He realized that the first hundred integers can be put into 50 pairs whose sum is 101 (1+100, 2+99, . . .), giving a total of 5050.

Personal pronouns and demonstratives serve functions other than forging cohesive ties within a text. Thus in example (8-4) the second person pronoun you can be interpreted as the reader or as any arbitrary person (in which case it could be replaced by one). Either way, it establishes a link from the text to the wider context of the world out there; at the same time, the second and subsequent instances of you link back to the first instance. The type of reference where the link goes directly to the referent is sometimes called exophoric, in contrast to endophoric, which is achieved via ties within the text itself; the above discussion focuses on endophoric reference.

Conjunction

Adjacent sentences in a coherent text will normally be related to one another in some way, corresponding to the relations among the events they refer to. For instance, in a narrative the events described by a sentence usually follow the events described by the previous sentence. The events referred to in the fourth sentence of (8-3) follow those referred to in the third sentence; this is made explicit by the phrase in the next semester in the fourth sentence. Another way in which the relation among sentences can be explicitly indicated is by means of a connective, a linguistic item that links a sentence to a previous one by indicating the nature of the relation between the events or situations described. For instance, then serves as a connective in They followed his dripping blood until nightfall. Then they made camp. The type of cohesion achieved by connectives is referred to as conjunction. Various types of linguistic item are used in conjunction, including:

Conjunctions (see p. 85 and p. 117) including and, or and but:
(8-7) At first the whole procedure will seem quite complicated. But with a little experience it will become just another fact of life.

Words of various types such as then, nevertheless, furthermore, alternatively and however:
(8-8) At first the whole procedure will seem quite complicated. However, it will soon become just another fact of life.

Prepositional phrases such as in spite of, by the way and to sum up:
(8-9) At first the whole procedure will seem quite complicated. In spite of this, it will soon become just another fact of life.

The basic types of conjunctive relation include: addition (e.g. expressed by and); alternation (e.g. indicated by or); contrast (e.g. indicated by but and yet); temporal (e.g. indicated by then); and causal (e.g. marked by because and therefore).
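The pairing of connectives with relation types just listed can be written out as a small lookup table. The following Python sketch is illustrative only: it covers just the connectives named above, looks only at the first word of a sentence (a simplification, since connectives are not restricted to that position), and uses a hypothetical function name.

```python
import re

# Connectives and relation types as listed in the text; the list is not exhaustive.
RELATION_OF_CONNECTIVE = {
    "and": "addition",
    "or": "alternation",
    "but": "contrast",
    "yet": "contrast",
    "then": "temporal",
    "because": "causal",
    "therefore": "causal",
}


def conjunctive_relation(sentence):
    """Return the relation signalled by a sentence-initial connective, or None."""
    match = re.match(r"[A-Za-z]+", sentence.strip())
    if match is None:
        return None
    return RELATION_OF_CONNECTIVE.get(match.group(0).lower())


print(conjunctive_relation("But with a little experience it will become just another fact of life."))
print(conjunctive_relation("Then they made camp."))
```

Connectives such as however in (8-8) return None here, because the list above does not assign them to a relation type.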

Substitution

Substitution is a cohesive tie created by the use of a general word as a type of counter, replacing words that have already been used in the text. Words like one, do and so can be used in this way in English. In (8-10), for instance, the word one in the second sentence stands for the word pile.

(8-10) First you make one pile with the coloureds. Then you should make a new one with the whites.


(8-11) illustrates the use of do – actually an inflected form of this verb – as a substitute. Notice that does serves as a replacement for solves it, mentioned (with the verb in a different inflectional form) in the previous sentence.

(8-11) Although he worked all night on the problem, he was still unable to solve it. If he ever does, I will be surprised.

So frequently serves as a substitute for entire clauses, as in the following pair of sentences.

(8-12) Could the other pupils have solved the problem in the way Gauss did? I don’t think so.

The second clause in (8-12) I don’t think so might alternatively be expressed as in I think not. In this case the negative particle not is being used as a substitute. In (8-12), as in (8-11), it is not the precise grammatical form of the phrase or clause that is substituted for, but a variant suitable to the new sentential environment: a finite form of the verb phrase, or the corresponding declarative clause. The lexical component remains unchanged.

Ellipsis

Perhaps counterintuitively, omission of something that is required by the grammar can serve as a cohesive device. Consider the rather laborious rephrasing of the second last sentence of (8-1):

(8-13) Gauss was right. But some of his classmates got the answer wrong. A few _ got it right.

In the position indicated by the underline _ the words of his classmates or of them are left out. The gap, comprising material that is missing where you expect it, effectively forces you to look back in the text for something to fill in what is missing. In this way missing material can be cohesive. This type of cohesion is called ellipsis. Some languages use ellipsis as a cohesive device much more extensively and frequently than English. This is the case, for instance, in many Australian Aboriginal languages, where it is common to omit explicit mention of a character in a narrative after it has been introduced.

Ellipsis is not restricted to noun phrases. In the following example there is missing material in two places in the second sentence: in the subject noun phrase of his classmates (or of them) is missing, while in the verb phrase get the answer right (or get it right) is clearly missing.

(8-14) Some of his classmates got the answer right. Most _ didn’t _.

Ellipsis can be thought of as substitution by zero: what is missing effectively serves as a counter standing for the words that have already been mentioned.

Lexical cohesion

Texts concern coherent portions of real or imaginary worlds, and hence normally involve a number of sentences that concern the same or similar things, circumstances, props and so forth. For this reason it is only to be expected that the common elements will be referred to again and again using identical or related lexical items. (8-1), for instance, is about Gauss, and he is mentioned by name on two occasions subsequent to his introduction by full name in the first sentence. The use of such related lexical items contributes to the cohesiveness of a text. This phenomenon is referred to as lexical cohesion. Lexical cohesion is saliently absent from (8-5), consistent with the fact that there is no apparent coherent interpretation for the sentences.

The clearest instances of lexical cohesion involve the repetition of a lexical item, as in the just-mentioned case of repetition of the proper noun Gauss. (8-3) also involves two repetitions of the common noun professor, as well as five repetitions of student. In the repetitions of Gauss and professor the same individual is being referred to. This is not so in the repetitions of student. But the repeated instances of this word do denote individuals of the same category, which plays a crucial role in this urban legend. Repetition of words of other parts-of-speech also contributes to the cohesiveness of a text – for instance, the repetition of add up in (8-1).

Instead of repeating the lexical item, a synonymous lexeme might be used. Thus in (8-1) we find the roughly synonymous pupil and classmate, and integer and number. In addition, lexical cohesion can be achieved by lexical items related by any of the other semantic relations identified in §6.2. Antonymy is illustrated by right and wrong in (8-1); these are (so-called) non-gradable antonyms. In (8-15) and (8-16) we see another type of antonymy in break (down) and be repaired: these are of course reverses. Hyponymy is exemplified by boy and child in (8-1) and car and vehicle in (8-15), and meronymy by front axle and car in (8-16).

(8-15) We did a tour around Denmark in an old car. At one point the vehicle broke down, and it took a week before it could be repaired.

(8-16) We did a tour around Denmark in an old car. At one point the front axle broke, and it took a week before it could be repaired.

Lexical cohesion can involve other types of semantic relation. For instance, (8-1) shows a range of them, including between: mathematician and mathematics; primary school, schoolmaster and pupil; add up, number, integer and sum; and task and answer.
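Repetition, the simplest kind of lexical cohesion, lends itself to mechanical counting. The Python sketch below is a rough illustration rather than a method of analysis: it counts how often each word form of (8-3) recurs after its first mention. The function name is hypothetical, the quotation marks and dashes of (8-3) are simplified to plain ASCII, and no attempt is made to filter out function words or to group inflected forms such as paper and papers.

```python
import re
from collections import Counter

# Text of (8-3), with quotation marks and dashes simplified to plain ASCII.
TEXT_8_3 = (
    "When I was a first year student, we had a professor who was notoriously "
    "tough on grading term papers; he rarely gave anything higher than a 'D'. "
    "Then at last in one class he rewarded one student with a 'B-'. "
    "Well this student hung onto her paper, and sold it to the highest bidder "
    "at the end of the semester. The buyer submitted it to the same professor "
    "in the next semester, getting a 'B'. The following year, this student "
    "again sold the prized paper to the highest bidder, who submitted it to "
    "the same teacher. He received a 'B+'. Finally, yet another student "
    "submits the paper for a fourth time and is awarded an 'A'. The paper is "
    "returned to the student with a written comment from the professor: "
    "'I've read this paper four times now, and I like it better each time.'"
)


def repetitions(text):
    """Map each word form used more than once to its number of repetitions."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    # A word used n times is repeated n - 1 times after its first mention.
    return {word: n - 1 for word, n in counts.items() if n > 1}


reps = repetitions(TEXT_8_3)
print(reps["professor"], reps["student"])
```

Counting of this kind picks up only identical word forms; ties based on synonymy, antonymy, hyponymy or meronymy would require lexical knowledge that the sketch does not have.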

8.3 Discourse: language in interactive use

In Chapter 7 (see Table 7.1) we deployed the notion ‘being things with words’ to account for the existence of varieties and variation in languages. Among other things, people use lexical and grammatical choices as well as choices of varieties to construct social identities for themselves, and to achieve things by positioning themselves in social space. We also identified another socially relevant dimension to language use, ‘doing things with words’. We now adopt this perspective on language in context, and examine how language is used to do things, to achieve interactive goals. This invokes a somewhat different slant on ‘doing things with words’ to pragmatics (see §6.3). And our earlier attention (in Chapter 7) to choices in the linguistic system and their social meanings is replaced by a focus on speech interactions and their organization.

Hierarchical organization of interactions

In §8.1 we introduced the term discourse in reference to the spoken or written component of an interaction, the largest unit of social and interpersonal action. A discourse might be the language component of a buying and selling event, which is marked off by the arrival of a buyer at a particular location and their subsequent departure from that location.

As usual, there are difficulties. It is not always this easy to determine precisely where a discourse begins or ends, and arrival at or departure from a location need not signal a discourse boundary – for instance, one might need to wait one’s turn in a busy butchery. Nor is it necessarily warranted to treat discourse as being made up exclusively of linguistic phenomena. For example, eye-gaze and gesture are also integral parts of spoken discourse, as is the exchange of material objects in a buying and selling event.

Discourses are made up of utterances, the acts of producing and using sentences to do things – speech acts. We normally think of an utterance as being produced by a single speaker. But this is not always so. It is not unusual for an utterance to be made up of contributions from two speakers acting in concert, jointly constructing it. This is illustrated in the following example, where E completes the utterance begun by B. (8-17)

B: An’ there – there wz at least ten mi:les of traffic bumper tuh bumper
E: – because a’that
(from Jefferson 1973)

It has been suggested by Jennifer Coates – based on an investigation of a large body of informal talk between British women friends – that utterances in women’s talk are frequently jointly constructed. This is illustrated in the following brief extract in which D, C and A together construct a single utterance over the first three and fifth lines (from Coates 1994: 181). (Here = on successive lines indicates that there is no perceptible pause between the end of one speaker’s contribution and the beginning of the next speaker’s.) (8-18)

D: it’s sort of pleasure
C: a perverse pleasure=
A: =in their
C: =yeah
A: downfall=

This highly cooperative sort of talk is, according to Coates, more characteristic of females than males, who tend not to jointly construct utterances so often (Coates 1997). There is, however, more to the structure of discourse than the sequence of utterances; other units of intermediate size can be recognized. We distinguish three additional types of unit, forming the hierarchy: discourse, transaction, exchange, move and utterance.

Moves

In discourse a speaker might utter a single sentence (possibly abbreviated) or a sequence of sentences that cohere together in terms of their speech act value, representing the speaker’s contribution to the discourse at that point. Such coherent sequences of utterances – including the minimal case of single utterances – are called moves. Moves correspond fairly well to speakers’ turns in conversation. The term move is used here instead of turn because the correspondence is imperfect: sometimes a speaker’s turn is made up of more than one move, as for example when the other participants in the discourse fail to take their own turn when available, or when a speaker is telling a joke or story. For example, consider (8-19), from an argument between spouses Molly and Ben about who should be making popcorn and who should be minding the child (cited in Tannen 2003: 195). Each speaker’s turn is made up of two or more moves. For instance, Molly’s turn is made up of a refusal followed by a justification (her moves are each single utterances in this case). How would you analyse Ben’s turn?

(8-19)

Ben: Molly! Mol! Let’s switch. You take care of her. I’ll do whatever you’re doing.
Molly: I’m making popcorn. You always burn it.

This is reminiscent of the situation in chess where the rule of en passant permits a player to make two moves in a single turn. (Compare castling, which is a single turn made up of one complex move.) Another place where moves and turns do not coincide is in continuers, those small words like mhm, yeah, right and the like that interactants use to signal that they are attending to what is being said. The use of such items represents a minimal response, and constitutes a supporting move by one participant, reinforcing the speaking participant’s turn. But arguably speaker and hearer roles are not exchanged, and a turn has not been taken by the person who produces the continuer. (For instance, they could hardly be accused of interrupting.)

Exchanges

Exchanges are sequences of moves by different speakers that go together as complementary in speech act value. These include such pairs as questions and answers, offers and acceptances, commands and compliances, and so forth. The term exchange captures the idea that the roles of speaker and hearer are exchanged in these sequences of moves; the interactants are engaged in turn-taking. These sequences are also called adjacency pairs, since they are often made up of pairs of moves, as in (8-20). However, some exchanges consist of three essential component moves. This is typical of teacher-student interaction, in which the pupil’s response to a teacher’s question is almost invariably followed by a feedback move by the teacher, as illustrated by (8-21). This need not necessarily be verbal: a nod might suffice. When the third move is absent – that is, the teacher gives no response – this is usually interpreted as indicating that the answer is wrong.

(8-20)
P: It’s a really clear lake isn’t it?
L: It’s wonderful
(Hutchby and Wooffitt 2008: 47)

(8-21)
T: Those letters have special names. Do you know what it is? What is one name that we give to these letters?
P: Vowels.
T: They’re vowels, aren’t they?
(Coulthard 1985: 125)

An exchange can be enclosed within another exchange, as in the following example, where a question-answer exchange (B’s first move together with A’s response) is embedded within a request-refusal exchange (A’s first move and B’s second move):

(8-22)
A: Can I have a bottle of Mich?
B: Are you over twenty-one?
A: No.
B: No.
(Levinson 1992/1983: 304)
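The embedding in (8-22) can be pictured as a nested data structure. What follows is only a minimal sketch in Python: the class names Move and Exchange, and the method moves_in_order, are invented here for illustration and do not come from any standard tool. Each exchange holds a first and a second move, plus any exchanges enclosed between them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Move:
    speaker: str
    text: str


@dataclass
class Exchange:
    label: str
    first: Move
    second: Move
    # Exchanges enclosed between the first and second move (hypothetical field).
    embedded: List["Exchange"] = field(default_factory=list)

    def moves_in_order(self) -> List[Move]:
        """Return the moves in the order in which they were spoken."""
        ordered = [self.first]
        for inner in self.embedded:
            ordered.extend(inner.moves_in_order())
        ordered.append(self.second)
        return ordered


# Example (8-22): a question-answer exchange inside a request-refusal exchange.
question_answer = Exchange(
    label="question-answer",
    first=Move("B", "Are you over twenty-one?"),
    second=Move("A", "No."),
)
request_refusal = Exchange(
    label="request-refusal",
    first=Move("A", "Can I have a bottle of Mich?"),
    second=Move("B", "No."),
    embedded=[question_answer],
)

for move in request_refusal.moves_in_order():
    print(f"{move.speaker}: {move.text}")
```

Reading the structure from the outside in recovers the spoken order A, B, A, B, even though the two moves of the outer exchange are not adjacent in the talk.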

Transactions

A discourse is typically oriented to the achievement of some interactive goal, such as the purchase of goods, conveying of information (e.g. in a class), or oiling the wheels of interaction (which the anthropologist Bronislaw Malinowski called ‘phatic communion’). Often, especially if they are nontrivial, these goals are achieved in stages, rather than all at once. These stages are called transactions. Transactions are thus sequences of exchanges that go together to form coherent phases or stages of a discourse, component parts that are oriented to the same intermediate ends. For example, in a classic study of buying and selling interactions in Cyrenaica (a region in North Africa now part of Libya), T. F. Mitchell (1975/1957) distinguished, in certain types of economic encounter, five transaction types: salutation; enquiry as to object of sale; investigation of object on sale; bargaining; and conclusion.

Boundaries of transactions are often marked by framing words such as OK, well, right, now and the like, often called discourse particles. In classroom interactions, teachers often use these words, followed by a short pause, to mark the beginning of a topic-focused transaction. In casual conversation, they are often used to mark the end of a transaction, to close it down.

Transactions tend to be consistent in register (see §7.3), more so than entire discourses. For example, my lectures normally begin with a greeting transaction and end with a farewell transaction; these will typically be in an informal non-academic register. In between will be a number of informing transactions characterized by an academic linguistic register. During the breaks there may be other transaction types, which may or may not be in the academic linguistic register. (As an exercise, you should observe the types of transactions that occur during the breaks in your lectures, and take note of characteristics of the language employed in them.)

Summary of discourse components

Table 8.4 provides a summary of the major features of the five types of discourse unit we have distinguished. Box 8.1 shows the major outlines of a discourse involving two Chinese graduate students (P1 and P2) temporarily residing in the USA. It is an informal dinner invitation, conducted in the Beijing dialect of Mandarin Chinese. The invitation is divided into three transactions, and the content and speech act value of each speaker’s turn are summarized briefly in English. The actual spoken utterances included many interjections; in addition there were numerous head movements and facial expressions, only a few of the most relevant of which are indicated.

Table 8.4 Hierarchy of discourse units

| Discourse unit | Alternative terms | Characteristics |
|---|---|---|
| Discourse | Conversation, Presentation | Stretch of interaction characterized by a common ultimate goal (macro-goal), and usually same participants, environment, etc. Structured as a staged sequence of components oriented to various subtasks and topics. |
| Transaction | Stage, Topic sequence | Stretch of talk within a discourse made up of sequences of exchanges and coordinated to the achievement of shared sub-goals. Consistent register choice; boundaries may be marked by discourse particles. |
| Exchange | Adjacency pair | Tightly linked sequences of acts by different speakers with complementary speech act functions. |
| Move | Turn (roughly) | A coherent contribution to the discourse representing a single step that is usually produced by a single interactant; may correspond to a turn of speaking. |
| Utterance | Sentence, Locution, Speech act | Smallest component pieces of speech action by a participant; realized by a sentence, and thus showing lexical and grammatical structure. |

Box 8.1: An informal dinner invitation between Chinese graduate students (adapted from Saville-Troike 2002: 139–40)

Transaction 1: Opening
P1: Greeting
P2: Acceptance of greeting
    Offer of seat
    Return of greeting

Transaction 2: Invitation
P1: Hints that he will ask P2 to do something
    [Pauses to look for P2’s reaction, observing facial expression]
    Offers invitation to dinner at his home
P2: Refuses the invitation [surprised expression, then frown]
P1: Insists on acceptance
P2: Accepts indirectly [facial expression indicates he has no other alternative]
P1: Reassures P2 of sincerity of invitation; sets definite time
P2: Agrees on time; expresses thanks
P1: Reassures P2 it will be informal

Transaction 3: Closing
P1: Confirms the time
    Makes an excuse for leave-taking
P2: Thanks P1 again
    Closing salutation
P1: Closing salutation

Managing interactions

We conclude our discussion of the organization of speech interactions with a brief glance at just two of the many strategies interactants use to manage the progress of discourse. We first discuss ways turn-taking is coordinated, then we look at how speakers prepare the ground, so to speak, for the accomplishment of their goals. Conversation Analysis is a discipline that focuses on such concerns. The related field of Discourse Analysis has a somewhat broader scope, and is concerned with all aspects of the structure of discourse.

Turn-taking

A fundamental feature of most types of discourse is that interactants alternate in taking on the roles of speaker and listener. (Even in the rather rarefied a-social ‘discourse’ environment in which I am writing this book I alternate between the roles of writer and reader. As a solitary reader you are, hopefully, engaged in dialogue with me (as a constructed author), perhaps uttering ‘yes’ or ‘no’ in reaction to some of the words I’ve written, underlining or highlighting passages, or inserting marginal comments.) In certain ceremonial contexts turns are laid down by convention: everyone knows when and what contribution they should make. But in spontaneous casual conversation there is no preordained order for speakers to take turns, or fixed duration of turn size. This raises the question of how speakers negotiate or manage the switches in speaker and hearer roles. How do interactants coordinate their contributions so that things flow smoothly?

Analyses of various forms of conversational interaction suggest that there is a tendency or ideal for precisely one person to speak at a time, and for there to be little gap or overlap between the utterances of two speakers. For this to be possible, there must be some mechanisms governing turn-taking, and participants must be continually monitoring what the others are saying, and projecting what they will soon be saying.

The turn-taking model is based on the notion that any turn of speech has transition relevance places (TRPs), points where an utterance is potentially complete (Sacks, Schegloff et al. 1974). TRPs include boundaries of grammatical units, as well as of intonation units; in face-to-face encounters non-verbal cues such as eye-gaze and gestures can also mark these points. Exchanges of speaker roles tend to occur at TRPs. Thus, one study of telephone conversations revealed that fully a third of turns were initiated less than 200 milliseconds (i.e. one fifth of a second) from the end of an intonation unit (Beattie and Barnard 1979).

Overlaps in speakers’ turns do, of course, occur. It has been shown that overlaps usually occur at potential TRPs – that is, at places where a TRP has been inferred. When this happens, one speaker usually rapidly relinquishes their turn. This is illustrated in the following excerpt, from Hutchby and Wooffitt (2008: 58). (The figures in brackets indicate pauses of the specified fraction of a second, (.) indicates a pause of less than 0.2 seconds, and bolding indicates phonetic prominence.)

(8-23)
M: We:ll? She doesn’t kno:w. .uhhh: huhh =huh-huhh-huh-huh-heh-heh=
L: = O h h m h y G h o: d, =
M: hhhhh Well it =was an-
L: =Are you watching Daktari:?
(0.2)
M: nNo:,
(.)
L: Oh my go:sh Officer Henry is (.) ul-locked in the ca:ge wi- (0.3) with a lion

Notice that in the third line M has interpreted L’s first utterance as a response to her own she doesn’t know, when in fact it is in response to something happening on the television programme Daktari that L had been watching when M phoned. But M gives up her turn very soon after L’s overlap, and allows L to take on the role of questioner, to which she (M) answers in the fifth line. The above turn-taking patterns were initially observed in telephone conversations in the USA. Studies of other social and cultural contexts have revealed somewhat different patterns. Thus, long segments of overlapping speech are common in certain socio-cultural contexts. For instance, it has been reported that public talk among villagers in Antigua is characterized by much simultaneous speech (Reisman 1974). Coates also argues that overlapping is more typical of women’s speech than men’s speech in British English; men’s speech follows the norm of ‘one speaker at a time’ more closely than does women’s speech (Coates 1994, 1997). Nor is the norm of filling virtually all available time with speech always adhered to. Some cultures allow much more silence in conversational interaction than do Westerners. Even in the West there are significant differences according to context. Face-to-face discourses among people who know each other well can show long periods of silence, much longer than what occurs in typical telephone conversations.

Pre-sequences

Pre-sequences are techniques speakers use to prepare the listener for what is to come, techniques to prepare the ground for the joint pursuance of a new discourse goal. They are, as it were, preparatory exchanges involving a proposal by one interactant for the ensuing discourse goal, to which another participant can concede or not. Pre-sequences are steps in negotiation of discourse orientation; they are typically motivated by the avoidance of loss of face, if for instance the other participant were to reject the new goal outright. Someone who wants to tell a joke or story that is likely to involve them as the speaker for some length of time might prepare the ground by beginning with a move like Have you heard the one about the Irish electrician? The response to the pre-question sets up an agenda that the two parties agree to follow, namely to tell or not to tell the joke or story. Pre-story sequences can be more indirect than this, as in the following, cited in Hutchby and Wooffitt (2008: 126):

(8-24)
(A telephones B, an employee at ‘Bullocks’ department store)
A: Well I thought I’d jus’ re- better report to you what’s happened at Bullocks today
B: What in the world’s happened?
A: Did you have the day off?
(.)
B: Yah?
A: Well I:- (.) got outta my car at fi:ve thirty . . . (story follows)

Pre-sequences are used in many other circumstances – for instance, to set up the grounds for making a request, as in (8-25); for offering an invitation, as in (8-26) and the beginning of the second transaction in the discourse of Box 8.1; for asking a question (e.g. Um, there’s one thing I wanted to ask you – yes mhm); for closing a conversation or transaction (e.g. well okay – okay); and so forth.

(8-25)
A: Hi. Do you have uh size C flashlight batteries?
B: Yes sir
A: I’ll have four please
B: (turns to get them)
(cited in Levinson 1992/1983: 346)

(8-26)
A: Whatcha doin’?
B: Nothin’
A: Wanna drink?
(cited in Levinson 1992/1983: 357)

Summing up

Linguistic patterning exists above the level of the individual sentence or utterance, although this is very different from the grammatical patterning found within sentences. This patterning is in terms of two main dimensions: text, which is concerned with the construal of complex chunks of knowledge; and discourse, which is concerned with the achievement of interpersonal goals.

Texts fall into different genres or types according to the type of knowledge they convey and how they construe the relations among the component pieces. Two primary text genres are narrative, which is concerned with the construction of coherent sequences of events; and exposition, which is concerned with the presentation of relationships among ideas. Texts of these two genres also show different structural organizations in terms of their component elements, stages. This macro-structure contributes to the coherence of a text. Another aspect of text coherence is found at the micro-level, and concerns linkages forged by the language of the text. These cohesive links are of five main types: reference, conjunction, substitution, ellipsis and lexical cohesion. Although both the macro- and micro-structure of a text contribute to its coherence, neither guarantees coherence.

Discourse or speech interaction, the spoken component of interpersonal interactions, is hierarchically structured. At the top of the hierarchy is the largest unit, the discourse, which corresponds to a complete interactive event. It is made up of a structured sequence of transactions, stages or phases in which interactants orient to sub-goals, e.g. greetings, farewells. Transactions are in turn made up of exchanges consisting of groups of complementary moves by different speakers, such as a question-answer sequence.

Discourse Analysis is the field that studies the structure of discourse. A sub-discipline is Conversation Analysis, which focuses on the management of the progress of discourse. One feature of this is the management of turn-taking, which is highly principled. In most types of discourse just one speaker holds the floor at a particular time; overlapping of speakers is normally resolved by one yielding the floor to the other. Turns tend to be exchanged at transition relevance places, points where a speaker’s utterance is potentially complete. Other phenomena of concern to Conversation Analysis are: use of continuers to signal to the speaker that the addressee is attending to what they are saying; and pre-sequences, exchanges that prepare the ground for joint pursuance of a new discourse goal. For instance, someone who wants to tell a story or joke might prepare the ground with the move Have you heard the one about . . .

Guide to further reading

There is an enormous literature dealing with narratives from a bewildering array of perspectives. The approach taken in this chapter is a structuralist one. Classic structuralist treatments of narrative include Propp (1968), Labov and Waletzky (1967) and Prince (1982). De Fina and Johnstone (2015) provides a comprehensible overview of the fundamentals of structuralist approaches to narrative organization. Exposition is less well studied, and I am aware of few references suitable for beginners. Chapters 1–3 of Martin (1985) and Martin and Peters (1985) could be consulted. The classic work on cohesion is Halliday and Hasan (1976); for simpler treatment, see Chapter 9 of Halliday (1985; the first edition of this book provides the most accessible treatment). Salkie (1995), a workbook on text and discourse analysis, is largely concerned with cohesion; it provides numerous exercises and examples. Good textbooks on Conversation Analysis are Hutchby and Wooffitt (2008) and Garcia (2023); Chapter 6 of Levinson (1992/1983) gives a more technical treatment. Coulthard (1985) and Stubbs (1983) are good introductory textbooks on discourse analysis. Schiffrin, Tannen and Hamilton (2001) is a rich resource of articles on a range of aspects of discourse and text analysis; few of these are suitable for beginners.

Issues for further thought and exercises

1 What is the structure of the narrative given in (8-1)? Identify the stages and their linguistic realizations.

2 What genre of text do you think (8-4) would be? What type of knowledge does it construct as a whole, and how would you say it is structured – that is, what stages do you consider should be identified?

3 In the following short text-segments identify the type of cohesive relation, if any, that each underlined word serves. What does it tie to? (Note that in many examples the ties are within single sentences. Do not exclude them for that reason.)
a. The same letters refer to the same muscles in all three figures; but the names are given of only the more important ones to which I shall have to allude. (Darwin 1898: 22–3)
b. During hunting the spears were usually hurled with a wommera or spear thrower, but some heavy ones made from hard wood were thrown directly from the hand by balancing them in the middle. (Thomas 2007: 62)
c. There is a great resemblance between the Victorian and Tasmanian legends of the origin of fire and the apotheosis of heroes. Thus, according to the Yarra blacks, Karakarook, a female, was the only one who could produce fire, and she is now the seven stars (the Pleiades presumably). (Mathew 1899: 20)
d. This naturally leads to the conclusion that one-dimensional scales have to be discarded in favour of multidimensional ones, which lend themselves to analysis by computational techniques designed for capturing similarities, such as multidimensional scaling. (Richards and Malchukov 2008: ix)
e. His teacher Master Büttner was amazed that Gauss could add all the whole numbers 1 to 100 in his head. Master Büttner didn’t believe Gauss could do it, so he made him show the class how he did it. (Cited in Hayes 2006: 203)

4 In the following passage identify as many cohesive ties as you can, and classify them according to the types identified in §8.2. (The best way of proceeding is to make a few copies of the text and indicate on each copy cohesive relations of just one type. You might, for instance, circle words related by ties of a particular type, and draw a line between them.)

It was a perfectly ordinary night at Christ’s high table, except that Hardy was dining as a guest. He had just returned to Cambridge as Sadlerian professor, and I had heard something of him from young Cambridge mathematicians. They were delighted to have him back: he was a real mathematician, they said, not like those Diracs and Bohrs the physicists were always talking about: he was also unorthodox, eccentric, radical, ready to talk about anything. This was 1931, and the phrase was not yet in English use, but in later days they would have said that in some indefinable way he had star quality. (C. P. Snow’s Foreword to Hardy 2006/1940: 9)

5 Find an example of a short expository text in a popular scientific magazine such as Scientific American. Identify its structural stages, and its overall type (is it argumentative, descriptive, or what?). To what extent does the structure of this exposition resemble that of the model answer exposition shown in Table 8.3?

6 In Chapter 5 (pp. 123–4) we introduced the notion of Theme, characterizing it as a clause-level grammatical role defined by initial position (there are complications, but it would take us too far from our present concerns to deal with these). It was observed that the Theme can either establish what the clause is about, or establish a setting for the event described. Granted this, we would expect Theme to be relevant to the coherence of a text. Identify the Themes of each of the clauses in the narratives of (8-1) and/or (8-3). How do they relate to one another, and do they contribute to the coherence of the texts? If so, how?

7 Michael Stubbs reports the following utterance from his recordings of secondary school interactions (Stubbs 1983: 40). It occurred at the beginning of an English class. The teacher had been talking to some pupils at the front of the classroom, then turned around and said to the class: Right! Fags out please! No one in the class was smoking. Stubbs interpreted this as a strategy of gaining the students’ attention, signifying that the class was to begin. Explain how this could be so.

8 Find out how one type of buying and selling encounter is conducted in your city by observing an example in a post office, supermarket, restaurant or some other place of your choice. (One way of doing this would be to get a friend to do the interaction, while you observe from nearby; another way is to do it yourself, and observe from the perspective of a participant.) Based on the observed encounter, how was the discourse structured in terms of transactions?

9 Shown in Box 8.1 is the structure of a discourse into transactions and speaker turns. Give a full analysis of the structure of this discourse in terms of exchanges and moves. Comment on any aspects of this invitation that seem atypical of the ways such an invitation would most likely be constructed in your own culture. Try observing a comparable invitation (or make one yourself with a co-student). How closely did it resemble your expectations?

10 Record with an audio or video recorder a short segment of casual conversation involving friends or family. (Make sure you obtain permission from the interactants to make the recording before you begin.) Transcribe a short segment of a few minutes in duration, indicating features such as overlap of turns, continuers, hesitations (e.g. um, aa and the like). Discuss turn-taking in this segment of the conversation, and the extent to which the norm of one speaker at a time is adhered to.

11 It was mentioned on p. 197 that in teacher-student interaction the absence of a confirming move by the teacher following a student’s answer is typically taken to mean that the answer is incorrect. Why might this be? Why might the teacher not provide an explicit negative response?

Research project

In §8.2 we mentioned procedural texts as a genre, but did not discuss their structure. Find at least a dozen examples of procedural texts, and compare them in terms of their overall structure. (You might like to restrict attention to one particular subtype – for example, recipes.) Can you suggest a structural description for this type of text in terms of stages and their sequencing? How does the structure of procedural texts differ from the structure of narratives as discussed in §8.2? Discuss any difficulties you encountered in deciding whether or not a text was procedural, and why you decided one way or another.


9 Investigating Language in Use: Corpus Linguistics

This chapter provides a third perspective on language in use: the issue of how to investigate language as it is actually used. Usage is important in functional theories of linguistics (recall §1.4), which take the view that language is shaped by the uses to which speakers put it. Usage can thus reveal things about the system of a language. To study usage we need a body of usage data, instances of language in use in various circumstances. Corpus linguistics is the branch of linguistics that is concerned with the gathering and analysis of bodies of such data, called corpora.

Chapter contents
Goals
Key terms
9.1 What is a corpus and what is corpus linguistics?
9.2 Types of corpora
9.3 Building a corpus of your own
9.4 Analysing a corpus
Summing up
Guide to further reading
Issues for further thought and exercises
Research project

Goals

The goals of the chapter are to:
● explain what a corpus is;
● provide an overview of the range of types of corpora that are available;
● describe how a corpus can be designed;
● illustrate some of the types of question that can be investigated via a corpus study, and how they can be addressed;
● raise some ethical issues; and
● mention some of the things that a corpus investigation can tell us about language and its usage.

Key terms

annotation
AntConc
balanced corpus
British National Corpus (BNC)
Child Language Data Exchange System (CHILDES)
cluster
Collins Birmingham University International Language Database (COBUILD)
Corpus of Contemporary American English (COCA)
collocation
concordance
corpus (plural: corpora, corpuses)
corpus linguistics
ethics
frequency lists
general corpus
historical corpus
International Corpus of English (ICE)
keyness
keywords
learner corpus
lemma
markup
multilingual corpus
parsed corpus
range
regular expression
representative corpus
Sketch Engine
specialized corpus
Wordsmith Tools

9.1 What is a corpus and what is corpus linguistics?

The notion of a corpus

In its most general sense in the field, a corpus – plural corpora or corpuses – is a compilation of material that has been prepared for the purpose of linguistic research. Linguists writing grammars of particular languages typically compile corpora of data including words, sentences and texts in a language, possibly along with translations into English (or the mother tongue of the linguist), which forms the primary database for their analysis. Typologists (see Chapter 15) may select a number of languages as their corpus, along with relevant data in these languages, for their investigations of a targeted grammatical phenomenon. Discourse analysts will typically assemble a corpus of discourses (see §8.1) that they base their analyses on. Historians of languages with long written traditions often compile corpora of texts representing different historical periods. Lexicographers have for a long time used corpora of sample texts in a language to understand the meaning of words as they are actually used.

These days, linguists generally think of a corpus as a digitized compilation of data in a language, perhaps available on the internet. However, the use of corpora in linguistic investigations goes back much further than the 1980s, when use of personal computers became widespread. For instance, from its conception in the 1850s the Oxford English Dictionary was to be based on actual usage of the language.1 The first edition was compiled from a corpus of examples of English usage from literature, newspapers and other written sources spanning some centuries. The examples were gathered by a large team of volunteers who sent them to the editor, James Murray, who amassed more than three million paper quotation slips, filing them in wooden pigeonholes.

A similar method was used by descriptive linguists up until the mid-1980s for writing grammars. Words and sentences in the language, together with their English translations, were written on index cards, which were laboriously sorted and searched through. The first corpus of Gooniyandi that I compiled was of this type.
It took me a few days to sort through the entire corpus, which was contained in shoeboxes covering most of the floor of my study.2 Similarly, the 1972 edition of Quirk et al.’s grammar of English was based on information recorded on ‘slips’ (see Figures 9.1 and 9.2) that were sorted and stored in filing cabinets. This corpus, amounting to around one million words of writing and speech, was subsequently transferred to electronic format and is known as the London-Lund Corpus (LLC).

Corpus linguistics and corpora

Personal computers, and later the internet, have facilitated the compilation of corpora and of course searching them for phenomena of interest. The term corpus linguistics has been used since the early 1980s to refer to the empirical investigation of language based on large computerized corpora of ‘real’ instances of usage. Correspondingly, within corpus linguistics, a corpus is understood to be a typically large electronic collection of natural instances of usage in the shape of written and/or spoken texts or discourses. An early corpus (in this sense of the term) was the Brown Corpus. Compiled in the 1960s, it comprised about a million words of written English from works published in the USA in 1961. Since then corpora have increased steadily in size. Dating from the 1990s, the British National Corpus (BNC) consisted of 100 million words in written and spoken British English; the Collins Birmingham University International Language Database (COBUILD corpus), begun in the 1980s, now consists of 4.5 billion words; and the Google Books corpus has 34 billion words.


Figure 9.1 A ‘slip’ (A6 size) from the corpus compiled by Randolph Quirk with an example of a noun phrase. Written text W12, subtext 4, slip 52. Each and every noun phrase in the corpus was manually underlined on a separate slip (as were other word classes, phrases, clauses, etc.), and filed away in a filing cabinet. The two starred lines show an overlap with the previous and following text fragments. © Survey of English Usage, UCL.

Figure 9.2 The Survey of English Usage research room, early 1970s. In the centre left you can see one of the drawers in which the slips were filed; these drawers were put into filing cabinets. © Survey of English Usage, UCL.

The primary concern of corpus linguistics is how people actually speak and write, how they use their language and its various components, including the lexicon, morphology, syntax and so on. It primarily addresses questions of usage. One of the most obvious questions is how often something is used – a question that cannot be answered convincingly through the use of one’s intuitions as a speaker of the language. Some years ago I carried out an investigation of constructions like there’s family and (there’s) family (see further pp. 224–5 below). My intuitions were that these were fairly uncommon in usage, but it required a corpus study to reveal just how infrequent they really are. One might also be interested in relative frequencies of linguistic phenomena. We have already mentioned (p. 61) one very limited corpus investigation that aimed to identify the most frequently used words in English.

The ultimate goal of corpus linguistics, as a number of commentators have observed, is not just counting and statistics, but to understand language. The really important questions concern why: one wants an explanation and interpretation of any statistical patterns observed. As Biber, Conrad and Reppen (1998: 9) put it:

. . . a crucial part of the corpus-based approach is going beyond the quantitative patterns to propose functional interpretations explaining why the patterns exist. As a result, a large amount of effort in corpus-based studies is devoted to explaining and exemplifying quantitative patterns.
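To make the question of how often something is used concrete, here is a minimal sketch in Python of a word frequency count. It rests on several assumptions: the two-sentence 'corpus' is invented for illustration, the function names tokenize and frequency_list are hypothetical, and the tokenizer simply treats any run of letters or apostrophes as a word, which is far cruder than what real corpus tools do.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lower-case word tokens (a deliberately crude rule)."""
    return re.findall(r"[a-z']+", text.lower())


def frequency_list(text):
    """Return (word, raw count, relative frequency) rows, most frequent first."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens)
    return [(word, n, n / total) for word, n in counts.most_common()]


# An invented toy 'corpus' of ten words; a real corpus would be vastly larger.
toy_corpus = "The cat sat on the mat. The dog sat too."

for word, n, share in frequency_list(toy_corpus):
    print(f"{word:<5} {n:>2}  {share:.2f}")
```

The raw count answers how often a word occurs in this sample; the relative frequency (count divided by the total number of tokens) is what allows comparison across corpora of different sizes. Neither figure explains why the pattern exists, which is the interpretive step the quotation above insists on.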

There are differences of opinion as to the nature of corpus linguistics and its place in the field of linguistics. Some consider corpus linguistics to be ‘a methodological basis for doing linguistic research’ (Leech 1992: 105). Others see it as more than just a methodology, and regard it as a separate discipline within linguistics, with its own goals, research questions, methods and approaches – as well as its own conferences and journals.

9.2 Types of corpora

Corpora come in many different types, designed for many different purposes. What follows is an overview of some of the main types, along with mention of a small selection of examples. You will find numerous other corpora of the various types, and in a range of languages, by searching the internet. There are also extensive lists of corpora at https://martinweisser.org/corpora_site/CBLLinks.html and the Corpus Resource Database. Many corpora on the internet are available for use, though most require registration; a number require a subscription to access their full potential. Your university may have an academic licence for some internet corpora.

General corpora

A general corpus is one that aims to provide a snapshot of a particular language or variety of a language as it is spoken at some point in time. The main purpose of a general corpus is to give a good coverage of the range of usages of the targeted language in a society, in roughly the proportions in which they occur. It aims to be representative, to cover the range of genres, discourse types and registers; it will also represent the modalities of the language (e.g. written and spoken), as well as varieties of speakers according to social variables. It also aims to be balanced, providing comparable amounts of material in each and avoiding overrepresentation or underrepresentation of any one of them. These days general corpora tend to be quite large, containing many millions or even billions of words. Other than the four corpora mentioned above – the LLC, the Brown Corpus, the BNC and COBUILD – there are the Corpus of Contemporary American English (COCA) comprising over a billion words, and the International Corpus of English (ICE), which consists of comparable subcorpora of around one million words in over twenty national dialects of English. There are general corpora for a number of other large European and Asian languages, including Arabic, Spanish, Portuguese, French, German, Danish, Swedish, Russian, Mandarin Chinese and Japanese, among others.

Specialized corpora A specialized corpus does not aim to represent an entire language, but rather consists of instances of usage in some restricted domain. For instance, a specialized corpus might be restricted to a single genre (such as newspaper reports, e.g. the English Language Newspapers Corpus (SiBol) and The Norwegian Newspaper Corpus), a single register (e.g. the Michigan Corpus of Academic Spoken English, the City University Corpus of Academic Spoken English in Hong Kong and the Wolverhampton Business English Corpus) or a single modality (e.g. spoken, such as the Spoken BNC2014, comprising 11.5 million words of speech). There are also specialized corpora comprising the works of particular individuals, such as the complete works of Shakespeare and Proust, as well as the works of an individual author satisfying some criterion, such as the complete Sherlock Holmes texts by Arthur Conan Doyle. Specialized corpora tend to be considerably smaller than general corpora. Spoken corpora tend to be small because of the time-consuming nature of transcription. About an hour is required for a reasonable transcription of a minute of speech by an expert transcriber; the more detailed the transcription, the longer it will take. The aims of representativeness and balance remain; sometimes completeness is possible.


Historical corpora Historical corpora present snapshots of a language at different points in time, different historical periods. There are, for example, historical corpora of English such as the Helsinki Corpus that covers the thousand-year period 750–1700 from Old English through Middle English to Early Modern English. There are corpora that provide snapshots of a language at specified intervals such as fifty years or a century throughout a particular time window: for example, ARCHER is a multigenre corpus that covers English from 1650 to 1990, divided into blocks of fifty years. The Corpus of Historical American English (COHA) is a corpus of almost half a billion words of American English covering the period 1820s–2010s balanced by genre and decade. Some historical corpora focus on one particular historical period, such as Middle English, or a specific genre of text falling within a specific time range. Because of the recentness of recording technology, historical corpora are overwhelmingly representative of written genres. Some, however, give at least some indication of the spoken language. An example is the Old Bailey Corpus, which provides transcripts of criminal trials (originally recorded in shorthand, later on stenotype machines) held in the Old Bailey over a period of nearly two centuries. Similarly, A Corpus of English Dialogues 1560–1760 (CED) comprises written records of speech in legal contexts as well as constructed dialogue in drama and fiction. Corpora of computer-mediated genres such as X (formerly Twitter), chatrooms and instant messaging can also represent snapshots of use at various points of time. However, the time windows are very small since the technologies and platforms are recent and typically have very short lives.

Parsed corpora For projects concerned with grammatical structures, searching for words and strings of words may not be a practical way of finding examples of usage. A parsed corpus permits you to search for particular structures and grammatical categories such as types of phrase and clause. The Lancaster Parsed Corpus is a corpus of over 100,000 words from the Lancaster Oslo-Bergen Corpus (LOB) of written British English, and the British component of the ICE is fully parsed and represents both spoken and written English. A number of historical corpora of English have also been parsed. Parsed corpora, as might be expected, are normally rather small: even if the parsing is automated, at least a subset of the parsings needs to be manually checked for accuracy. Furthermore, a parsed corpus will be useful to you only to the extent that the phenomenon you are looking for is included in the categories represented, and to the extent to which the categories of the corpus correspond with yours. (Recall, from Chapter 5, the very different senses in which verbal phrase (VP) and even noun phrase (NP) are employed.)

Learner corpora

The term learner corpora refers to corpora that provide samples of the language of those who are not yet entirely proficient users. These include corpora of the language used by children at various developmental stages of learning their first language (see §12.1). A notable corpus is the Child Language Data Exchange System (CHILDES). Established in 1984 by Brian MacWhinney and Catherine Snow, CHILDES now has corpora in more than twenty languages, mostly consisting of spontaneous interactions between the child and caregiver. Others are the Polytechnic of Wales Corpus (POW), a parsed corpus of English as spoken by children 6–12 years of age, and The Growth in Grammar Corpus, a corpus of written texts by school children in England. There is also a range of corpora of second language learners’ speech and writing (see §12.3). For instance, the International Corpus of Learner English (ICLE) comprises written English essays by mainly undergraduate students from many different first language backgrounds. The Louvain International Database of Spoken English Interlanguage (LINDSEI) contains transcripts of speech, produced in informal interviews and in response to picture prompts, by advanced learners of English from various first language backgrounds. Both corpora were developed at the Centre for English Corpus Linguistics in the University of Louvain.

Multilingual corpora

A multilingual corpus contains texts, ideally in equal amounts and in the same genres, from a number of different languages. An example of a multilingual corpus is the Aarhus corpus of Danish, French and English contract law, which consists of corpora of texts relating to contract law in each of the three languages, amounting to around three million words in total. Such a corpus, in which the subcorpora are matched by genre and other such features, is referred to as a comparable corpus. Another type of multilingual corpus is the parallel corpus. In a parallel corpus the texts in one language are translations of the texts in the other. Such corpora are often sentence-aligned, with tags indicating which sentences in each language correspond. Examples of parallel corpora include the English-Norwegian Parallel Corpus, the English-Swedish Parallel Corpus and InterCorp, a parallel corpus comprising over forty different languages. There are also parallel corpora of English and Mandarin Chinese. Somewhat different in conceptualization are the Multilingual Corpus of Annotated Spoken Texts (Multi-CAST), a corpus of spoken narratives in a typologically diverse selection of seventeen languages; and The Social Cognition Parallax Interview Corpus (SCOPIC), a corpus of texts in more than twenty languages, elicited through picture description tasks.

Other types of corpora There are corpora that don’t fit well into the broad types identified above, that are designed for a variety of purposes and according to a variety of criteria. The following are worthy of brief mention. First, there are multimedia corpora in which transcriptions are aligned and synchronized with audio and/or video recordings; the Santa Barbara Corpus of Spoken American English is an example. Such multimedia corpora have been constructed for a number of minority languages as part of efforts to document the linguistic diversity of the world; the DOBES language archive holds corpora in over one hundred endangered languages from around the world.


Second, there are a few corpora of intercultural interactions among speakers of different first languages who use English as a common language; the Vienna-Oxford International Corpus of English (VOICE) is such a corpus of non-scripted face-to-face interactions. Third, the web can be considered to be a corpus, and harvested intelligently for texts of a desired type. TenTen is a family of corpora of texts gathered from the internet in more than forty different languages.

9.3 Building a corpus of your own

Fundamental considerations

As we have seen, a wide range of corpora are available on the web in many languages, and their number is increasing. You might, however, find that there is none that is suitable for the questions you want to address, or that none is available to you (not all corpora are freely accessible), or that there is nothing in the language or variety you want to investigate. For example, you might be interested in the language of advertisements on billboards or on product labels, the use of a particular word or construction by a specific author, or the use of English swear words in Danish spoken by young people in Denmark. In such circumstances you will probably need to construct your own corpus. This can be a fairly easy task, or a quite difficult and time-consuming one: it depends on your questions and the target language. Thus the initial step is to clearly formulate your questions, and to ensure that a suitable corpus is not already available. The following is an overview of some additional considerations that you will need to take into account.

To begin with, you will need to think about what kind of data you need – for example, what kinds of text you should collect, and how to collect them. In the above project on swearing you will need to collect a body of speech representative of young Danes. There will be no point in collecting speech of older Danes or young speakers of English (except possibly for comparative purposes – but that would be a different project).

How big should the corpus be? There is no simple answer to this question. The size of the corpus will depend on many factors, including representativity and practicality. Your corpus must be large enough to provide an adequate representation of the features you are interested in. If you are interested in a particular author’s use of a construction, you might opt to include the entirety of their output. On the other hand, to study use of English swear words in spoken Danish, completeness will be impossible; you will need to give careful thought to how many instances you should aim for. This brings in the issue of practicality: no one has unlimited time or money to carry out their project, and you need to ensure that the collection of data can be completed within the time available.

To underline the point that it is impossible to give figures for the size of a corpus, it has been said that a million words should suffice for a grammatical analysis. A corpus of this size will certainly be adequate to reveal the major grammatical patterns in a language. However, some grammatical phenomena turn out to be quite infrequent, and may not occur in a one-million-word corpus. My own investigations of constructions like there’s family and (there’s) family (see p. 210 above) revealed only about one token per 16 million words (McGregor 2013: 146). Even seemingly more ordinary complex sentence constructions of the type I mistakenly thought you were my friend may not occur at all in a million-word written corpus.
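The arithmetic behind this point can be made explicit. The following is a minimal sketch, not a recommended sampling method: it takes the rate of one token per 16 million words reported above and asks how many tokens one would expect in corpora of different sizes. The Poisson model used for the chance of finding no tokens at all is a simplifying assumption (it treats tokens as independent and evenly spread through the corpus), and the function names are invented for illustration.

```python
import math

def expected_tokens(rate_per_million: float, corpus_size: int) -> float:
    """Expected number of tokens of an item in a corpus of the given size."""
    return rate_per_million * corpus_size / 1_000_000

def prob_no_tokens(rate_per_million: float, corpus_size: int) -> float:
    """Chance of finding zero tokens, under a simplifying Poisson assumption."""
    return math.exp(-expected_tokens(rate_per_million, corpus_size))

# One token per 16 million words, expressed as tokens per million words.
rate = 1 / 16

for size in (1_000_000, 16_000_000, 100_000_000):
    print(size, round(expected_tokens(rate, size), 2),
          round(prob_no_tokens(rate, size), 3))
```

On these assumptions a one-million-word corpus is expected to contain far fewer than one token of the construction, which is why its absence from such a corpus would tell you very little.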

Once you have decided what and how much material you need to collect, it is necessary to think about how to collect it. In some instances it will be necessary to get permission to collect the texts from the parties involved. For texts in the public domain this is not an issue, and permission is not generally required; but for copyrighted material even electronic collection for a private corpus may require permission.

Before beginning construction of your corpus you also need to give thought to how the texts will be stored (what file format, including the character coding – the software you plan to use may limit your choices). You will also need to think about what information should be included in the text files in addition to the texts themselves. This is so-called markup: codes inserted into the files that indicate the format (e.g. font style, paragraphing, punctuation) and other relevant features of a text. It makes sense to establish meaningful file-naming conventions that provide information relevant to your research: if you are interested in gendered language it might be sensible to include information about the author’s gender. Often a text file begins with a header that provides information about the file, such as demographic information about author or speaker, or information about the circumstances in which the text appeared. Markup of these types, of course, must be separated somehow from the text itself – for instance, by enclosing it in angle brackets (< >).

If your texts are spoken, a range of other issues arise relating to recording them (audio and/or video, and the permissions from speakers to make the recording), transcribing them (how narrow should your transcription be, and how will you do it – e.g. do you need to represent prosody, will English orthography suffice, and if not, what are the possibilities?), and linking the transcription with the audio or video file. Programs such as Praat and Elan permit linking of transcriptions with audio and/or video files.

It may be useful to annotate your corpus – for instance, by providing part-of-speech tags for each of the words (or perhaps a selection of words, such as the verbs), semantic information, or other information relevant to your study, such as the absence of a feature of interest. Part-of-speech annotations may permit more focused searches, e.g. separating May (the month name) from may (a type of shrub) and may (the modal auxiliary). Software tools are available for part-of-speech identification, but many other types of annotation must be done by hand.
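To make the idea of markup in angle brackets concrete, here is a minimal sketch of how a header might be read and how markup might be separated from the text. The header layout and the attribute names (speaker, gender, year) are invented for illustration; they are not a standard, and an established scheme may well suit a real project better.

```python
import re

# Hypothetical file content: the header format and attribute names are
# invented for illustration, not a standard.
raw = '<header speaker="S01" gender="F" year="2021">\n<p>Well, I may go in May.</p>\n'

TAG = re.compile(r"<[^<>]+>")            # anything enclosed in angle brackets
ATTR = re.compile(r'(\w+)="([^"]*)"')    # key="value" pairs inside the header

def read_header(text: str) -> dict[str, str]:
    """Return the key/value pairs of the first <header ...> tag, if any."""
    match = re.search(r"<header\b[^<>]*>", text)
    return dict(ATTR.findall(match.group(0))) if match else {}

def strip_markup(text: str) -> str:
    """Remove all angle-bracket markup, leaving only the text itself."""
    return TAG.sub("", text).strip()

print(read_header(raw))
print(strip_markup(raw))
```

Keeping the markup mechanically separable in this way means that frequency counts and concordances can be run on the text alone, while the header information remains available for grouping texts by speaker or writer characteristics.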

Ethics

It is important to be aware of ethical considerations in corpus construction, including copyright and necessary permissions to record and/or use materials. Speakers should give their informed consent to be recorded, and to have their recordings used in your corpus. Informed consent is also required from writers of private documents (e.g. letters, diaries, student essays). Another important ethical issue is privacy, the right of the individuals who produced the texts in your corpus to remain anonymous – for instance, things said in the texts could be embarrassing or even incriminating. The EU has privacy laws (GDPR) that lay down rules for the protection of personal data and its storage and manipulation. This means that (for private texts) your corpus should not contain information that identifies the author or speaker(s), or persons mentioned in the texts. They should be identified just by a dummy name or code, and no information may be kept linking these to a person. Many countries have similar laws. It is important to be aware of these laws and to respect them.
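The replacement of names by codes can be sketched as follows. This is an illustration only: the function name and the code prefix are invented, the list of names has to be supplied by you, and simple string replacement will not catch indirect identifiers (addresses, workplaces, nicknames), which still need to be checked by hand.

```python
import re

def pseudonymise(text: str, names: list[str], prefix: str = "SPK") -> str:
    """Replace each listed name with a code such as SPK1, SPK2, ...

    The name-to-code mapping exists only inside this call and is discarded
    on return, so nothing is kept that links a code back to a person.
    """
    if not names:
        return text
    codes = {name: f"{prefix}{i}" for i, name in enumerate(names, start=1)}
    # Longest names first, so that a full name is matched before a part of it.
    alternatives = "|".join(
        re.escape(name) for name in sorted(codes, key=len, reverse=True)
    )
    pattern = re.compile(r"\b(" + alternatives + r")\b")
    return pattern.sub(lambda m: codes[m.group(1)], text)

sample = "Anna: I told Jonas about it. Jonas: Anna did tell me."
print(pseudonymise(sample, ["Anna", "Jonas"]))
```

Because the same name always receives the same code within a text, it remains possible to follow who says what without knowing who the speakers are.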

9.4 Analysing a corpus

To be of any use, a corpus has to be analysed to extract linguistically significant information, and various software tools have been developed for this purpose. General corpora such as COCA, BNC and Spoken BNC2014 have their own built-in software interfaces. There are also software tools that can be used with a variety of corpora, including AntConc (freeware), Sketch Engine and WordSmith Tools. What follows is an outline of some of the most basic types of information one might want to extract from a corpus, and how to get it. Examples given in this section are mainly from the COCA corpus, which has a very user-friendly interface and permits searches of a wide range of types of information. My advice is to replicate the analyses below in the corpus of your choice. If you use COCA you should be aware that it is still evolving, and when you query it there may be different possibilities than those outlined below; furthermore you are likely to find different frequencies of items as the corpus is added to. (It is currently (2023) over twice the size it was when I first used it a bit over a decade ago.) It is also worth exploring the other possibilities offered by the software and corpora of your choice.

Frequency

Showing ranked frequency lists

Perhaps the most basic information one might want to extract from a corpus is frequency information. A list of word types can be generated for a particular corpus, along with the number of tokens of each, and the percentage of the entire set of word tokens that these represent. (See box on p. 136 on the notions of type and token.) Instead of percentages, the number of tokens per million words is sometimes indicated. The list can then be displayed in terms of the rank order of words, from most to least frequent. Table 9.1 shows the top ten most frequent word types in three corpora.

Table 9.1 shows that there is a good deal of similarity among the three corpora in terms of the most frequent words, despite their very different sizes. The definite determiner the is the most frequent word in all of the corpora, and shows similar frequencies (5–6 per cent). Also appearing in each list are a, and, of, in and to; these also occur in comparable rank orders in each of the corpora.


Table 9.1 Frequency list in three corpora

Rank  COCA  Freq        %     AmE06 (a)  Freq    %     BE06 (b)  Freq    %
1     the   50,033,612  5.00  the        60,056  5.90  the       59,163  5.87
2     be    32,394,756  3.23  of         30,331  3.00  of        30,733  3.05
3     and   24,778,098  2.47  and        28,973  2.85  and       28,069  2.79
4     a     24,225,478  2.42  to         26,036  2.55  to        26,319  2.61
5     of    23,159,162  2.31  a          23,926  2.35  a         23,102  2.29
6     to    16,770,155  1.67  in         19,923  1.96  in        19,423  1.93
7     in    15,670,692  1.56  that       12,279  1.21  that      10,572  1.05
8     I     14,217,601  1.41  -’s        10,047  0.99  it         9,446  0.94
9     you   12,079,413  1.21  for         8,910  0.88  for        9,275  0.92
10    it    11,042,044  1.10  I           8,663  0.85  was        9,241  0.92

a. American English 2006, a corpus of a million words of written American English from 2006. This corpus was constructed in the same way as the Brown Corpus (it comprises 500 text samples, each of about 2,000 words). See Potts and Baker 2012.
b. British English 2006, a corpus of one million words of written British English, also from 2006 and constructed in the same manner as the Brown Corpus.
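A ranked frequency list of the kind shown in Table 9.1 can be produced with very little code. The sketch below is an illustration under simplifying assumptions: the tokenizer simply treats any run of letters (and apostrophes) as a word, which is cruder than what corpus software does, and the sample sentence and function name are invented.

```python
import re
from collections import Counter

def frequency_list(text: str):
    """Return (rank, word type, tokens, % of all tokens, tokens per million) rows."""
    tokens = re.findall(r"[a-z']+", text.lower())   # crude tokenizer: letter strings
    total = len(tokens)
    rows = []
    for rank, (word, freq) in enumerate(Counter(tokens).most_common(), start=1):
        rows.append((rank, word, freq, 100 * freq / total, 1_000_000 * freq / total))
    return rows

toy = "The cat sat on the mat and the dog sat by the door."
for rank, word, freq, pct, per_million in frequency_list(toy)[:3]:
    print(rank, word, freq, f"{pct:.2f}%", f"{per_million:.0f} per million")
```

Percentages and per-million figures are two ways of expressing the same normalization; either makes it possible to compare corpora of very different sizes, as in Table 9.1.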

It is important to be aware of what precisely the frequencies are frequencies of, whether they are of sequences of letters or of lexical items. In most cases it is letter sequences that are counted. Clearly in the AmE06 and BE06 corpora different inflected forms of be are counted separately; this is not the case in COCA, however, where inflected forms as well as cliticized forms are included in the count for BE. As this indicates, software will sometimes return the frequency of lemmas – the citation form for a set of allomorphs or inflected forms of a lexeme. This is not done consistently in COCA: for example, I is treated as a separate lemma from me.
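The difference between counting strings and counting lemmas can be sketched as follows. The lemma table here is hand-made and covers only the verb BE; it is an assumption for illustration, not the table used by COCA or any other tool. For every other token the upper-cased string simply stands in for its lemma.

```python
from collections import Counter

# Hypothetical, hand-made lemma table covering only the verb BE.
# (A clitic such as 's is ambiguous between is, has and the possessive,
# so a real lemmatizer needs part-of-speech information to resolve it.)
LEMMA_OF = {form: "BE" for form in
            ("be", "am", "is", "are", "was", "were", "been", "being", "'m", "'re")}

def count_strings(tokens):
    """Count each distinct string of letters separately."""
    return Counter(tokens)

def count_lemmas(tokens):
    """Add up the counts of all forms that share a lemma."""
    return Counter(LEMMA_OF.get(tok, tok.upper()) for tok in tokens)

tokens = "it is what it was and it will be".split()
print(count_strings(tokens)["be"])   # tokens of the string "be" only
print(count_lemmas(tokens)["BE"])    # tokens of is, was and be together
```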

Other modes of presentation are possible. For instance, the list might be displayed in alphabetical order instead of frequency order; WordSmith Tools permits display in alphabetical order backwards from the end of the word, thus facilitating identification of frequencies of suffixes. Aside from frequency of the items, one might be interested in whether the word is widespread across the texts in the corpus, or is restricted to a few texts. Thus along with frequency the software may indicate the range of the word, the number of texts in which it occurs.
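Both ideas in this paragraph, sorting from the end of the word and reporting range, are easy to illustrate. The mini-corpus below is invented, and the sketch makes no claim about how WordSmith Tools or any other package implements these displays.

```python
from collections import Counter

texts = {                      # invented mini-corpus: text label -> tokens
    "t1": "she was walking and talking".split(),
    "t2": "he walked and she talked".split(),
    "t3": "walking is relaxing".split(),
}

# Frequency: total number of tokens of each word type across all texts.
freq = Counter(tok for toks in texts.values() for tok in toks)
# Range: in how many texts does each word type occur at least once?
rng = Counter(tok for toks in texts.values() for tok in set(toks))

# Sort word types by their reversed spelling, so shared endings line up.
by_ending = sorted(freq, key=lambda w: w[::-1])

print(freq["walking"], rng["walking"])
print(by_ending)
```

In the reverse-sorted list the forms in -ed and the forms in -ing end up next to one another, which is what makes this kind of display useful for investigating suffixes.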

Why is frequency information interesting and relevant?

There are a number of reasons why frequency can be interesting linguistically. For instance, looking at the top ten items in Table 9.1 reveals that they account for slightly over 20 per cent of the word tokens in each corpus. They are all monosyllabic (with the exception of the inflected form being), and all are grammatical items.


Also interesting are low frequency words. The COCA interface permits you to examine other intervals of frequency ranking for the 60,000 most frequent word types (e.g. in the 50th–100th interval), as well as other criteria such as the number of syllables. Thus you can generate a listing of words of nine syllables, for example. There are just eleven of them, of which the most frequent, anti-intellectualism, is way down the frequency list, in the 36,201st position. The COCA software does not allow displays of items below the 60,000 cut-off point (these have to be sought separately). However, AntConc will generate a full ranked list of all items in a corpus. An examination of AmE06 reveals that the last ten rank positions – items ranging from ten tokens down to just one token (the latter being so-called hapax legomena) – are occupied by fully 81 per cent of the word types but, on the figures in Table 9.2, account for only about 9 per cent of the tokens. As can be seen from Table 9.2, there is a rapid increase in the number of word types as the number of tokens decreases from ten to one.

Frequency data can be significant for other reasons than statistical patterns like those identified above. For instance, it can be important to know how relatively frequent synonyms are (see below pp. 226–7 on strong and powerful): this may help provide insights into meaning differences. The frequency of a word may be very different in different genres, registers or other varieties of a language, and this may also be suggestive of the meaning of the word. Different senses of a word might occur at different frequencies. For example, can and may overlap in their modal senses; it would be relevant to know how relatively frequent the overlapping senses are. In a study of the words see and watch, Chrispin and Fontaine (2023) find that the former verb is ten times as frequent as the latter. Moreover, see has a far wider range of senses than watch, one of which predominates by a large margin; by contrast, for watch there are two senses that are roughly equally frequent. An increase or decrease in frequency of use of a word or grammatical construction over time in a historical corpus might be indicative of the form coming into or going out of fashion (see further §16.5).

Table 9.2 The ten last frequency positions in AmE06

Rank position  Number of word tokens  Number of word types
8,268          10                     610
8,878          9                      733
9,611          8                      865
10,476         7                      1,020
11,496         6                      1,367
12,863         5                      1,836
14,699         4                      2,616
17,315         3                      3,651
20,966         2                      6,579
27,545         1                      16,889
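The figures in Table 9.2 can be worked through directly. The sketch below uses only the numbers in the table; the one assumption, flagged in the code, is that each rank position is the rank of the first word type with that number of tokens, so that 8,267 more frequent word types precede the first row.

```python
# Table 9.2: tokens per word type -> number of word types with that frequency.
spectrum = {10: 610, 9: 733, 8: 865, 7: 1020, 6: 1367,
            5: 1836, 4: 2616, 3: 3651, 2: 6579, 1: 16889}

low_types = sum(spectrum.values())                            # types with 1-10 tokens
low_tokens = sum(freq * n for freq, n in spectrum.items())    # tokens they contribute

# Assumption: 8,268 is the rank of the first 10-token type, so 8,267 more
# frequent word types come before it in the list.
total_types = 8267 + low_types

hapaxes = spectrum[1]
print(low_types, low_tokens)
print(f"{100 * low_types / total_types:.1f}% of word types")
print(f"hapax legomena: {100 * hapaxes / total_types:.1f}% of word types")
```

The same few lines, applied to a full frequency list rather than to the last ten positions, give the complete picture of how many word types occur once, twice, three times, and so on in a corpus.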


Other frequency information

Many corpus interfaces and software packages permit more sophisticated frequency searches than mere searches of word-like strings of letters. You may be able to search by lemmas rather than strings; this is possible in COCA, for example, albeit with some qualifications as outlined above. Typing a word in all capitals will count instances of the lemma, while typing it in lower case will count instances of the string. Thus searching for BE will find the frequency of all inflected forms of the verb be, while searching for be will find just instances of that graphemic shape.

If the software does not allow searches by lemma, there may be work-arounds using wildcards and so-called regular expressions. The star (*) is used as a wildcard for a sequence of any length (including zero), and the question mark (?) as a wildcard for a single letter. However, searches using these variables will probably find many other items than the affixed form you are looking for. For instance, searching for work* with AntConc in AmE06 gives not only works, worked and working, but also derived forms such as worker and workable and compounds such as workshop and workload. Of course this will not work for inflectional variants like be ~ was ~ is ~ am, etc. These can be found if one lists the various forms separated by the ‘or’ operator |. One can search for these inflectional forms in AntConc with the string (be|was|is|am) – where of course all of the alternate forms must be listed. (In AntConc the brackets are essential: otherwise the search also returns strings with the alternatives as substrings, e.g. history and Christmas will be included.)

Many programs permit one to look for sequences longer than words, though they might need to be accessed in different ways to single words. One might, for instance, be interested in the frequency of cats and dogs, or of an idiom such as look a gift horse in the mouth. The wildcard * can usually be used to represent any word, so one could use cats * dogs to search for three-word sequences involving the two lexemes. I suspected that look a gift horse in the mouth was a relatively fixed idiom (allowing of course inflection of the verb), but when I ran the search LOOK a gift * in the mouth I found also other choices than horse, including whore, vampire, muscle-maker and pig. However, each of these was instanced just once, far less frequently than the standard form (sixty-six instances). Another search revealed that horse can be pluralized (again instanced just once).

If your corpus is tagged for part of speech, you should be able to search for instances of the string belonging to a particular part of speech. For instance, in COCA you can look for all instances of dog as a verb. (Though many of these turn out to be misclassifications.) This corpus also allows you to search for all instances of the given form followed by a word of a specified category, such as the verb dog followed by a pronoun.

There are differences among speakers of a language in how often they use particular words, some of which may be idiosyncratic (and indicative of their idiolect), others characteristic of social groups to which they belong, or of the genre of the text. The software with COCA permits you to see the relative frequencies of a word in different genres and in six 5-year periods from 1990 to 2019. A search of OK revealed this form to be instanced 136,788 times, and to be the 708th most frequent word. Searching for frequency by section allows you to see the different frequencies according to genre and time. Unsurprisingly, there are significant differences in the frequency of OK according to genre: it is most frequent in the spoken part of the corpus, and in TV programmes, least frequent in academic texts.
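Comparing frequencies across genres only makes sense if the sections are normalized for size. The following sketch shows the calculation; the counts and section sizes are invented purely for illustration and are not COCA figures.

```python
# Invented counts for illustration only; these are not COCA figures.
sections = {
    # section:  (tokens of the word, size of the section in words)
    "spoken":   (900, 2_000_000),
    "fiction":  (300, 1_500_000),
    "academic": (20, 1_000_000),
}

def per_million(hits: int, size: int) -> float:
    """Normalize a raw count so that sections of different sizes compare."""
    return hits * 1_000_000 / size

rates = {name: per_million(hits, size) for name, (hits, size) in sections.items()}
for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:10s}{rate:8.1f} per million")
```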


Again, it must be cautioned that, even though the search specified the part of speech as adverb (interjections are treated as adverbs in COCA), what is counted is instances of the string of letters, not just the interjection. In fact, the system treats several spelling variants of the form, as well as their capitalized versions, as allographs. Examination of the actual instances reveals that some are personal names, some are abbreviations for the state of Oklahoma, some are words of other languages (in quotations), and some are instances of verbal use. A good number of the instances of OK in the academic component of the corpus in fact belong to the first three of these types. This may be a specific problem with the part-of-speech tagging in COCA; however, it underlines the need for caution in interpreting the results of frequency searches.
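The wildcard and alternation searches described earlier in this section can be imitated with Python's regular expressions. This is a sketch of the general technique rather than of how AntConc or COCA work internally: the word list and sentence are invented, and phrase_search is a hypothetical helper in which * stands for exactly one word.

```python
import re

words = ["work", "works", "worked", "working", "worker", "workshop",
         "is", "was", "history", "Christmas", "am", "be"]

# The wildcard search work* corresponds to the regular expression work\w*
print([w for w in words if re.fullmatch(r"work\w*", w)])

# Alternation without anchoring also matches inside longer strings ...
print([w for w in words if re.search(r"be|was|is|am", w)])
# ... whereas grouping plus word boundaries restricts it to whole words.
print([w for w in words if re.search(r"\b(be|was|is|am)\b", w)])

def phrase_search(tokens, pattern):
    """Find token sequences matching a pattern in which * stands for any one word."""
    n = len(pattern)
    return [tokens[i:i + n] for i in range(len(tokens) - n + 1)
            if all(p == "*" or p == t for p, t in zip(pattern, tokens[i:i + n]))]

tokens = "never look a gift horse in the mouth or a gift pig in the mouth".split()
print(phrase_search(tokens, ["a", "gift", "*", "in", "the", "mouth"]))
```

The second search shows why unanchored alternatives pick up history and Christmas: both contain is as a substring. The third shows the effect of tying the alternatives to whole words.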

Keywords

Keywords and their identification

Keywords are words that occur statistically more frequently (positive keywords) or less frequently (negative keywords) in a target text or corpus of texts than they do in a reference corpus. Keyword software identifies words showing statistically significantly different relative frequencies in the two corpora. In COCA it is possible to input a text of your own choice and identify keywords showing significantly higher frequencies than in the corpus itself. However, the permissible size of the target text is quite small. The Keywords tool of AntConc is more versatile, and can be used with larger target texts and corpora. I used it with BE06 (target corpus) and AmE06 (reference corpus). With the standard settings, around 500 words were identified as positive keywords. Some of these reflect different spelling conventions (e.g. colour occurs 105 times in the target corpus, not at all in the reference corpus). Other keywords are revealing of the different geographical and political contexts of the dialects: the top ten keywords include UK, British, Britain and London.

Examination of negative keywords reveals similar patterns. There are negative keywords reflecting different spelling conventions (e.g. color occurs ninety-four times in AmE06 against twice in BE06). And in the top ten negative keywords we find U (all from the acronym U.S.), American, states, federal and Bush (almost all instances of which refer to the then president). However, there are surprises, and it is not obvious why yesterday, child and police are identified as positive keywords.

Keyword software provides information on the relative frequency of the keyword in the target and reference corpora, and possibly also information on the range of the keyword. It will also show a statistical measure of the keyness, or relative importance of the differences. There are different statistical measures for keyness (AntConc allows you to choose among several). The essential point for us is that the value indicates the relative strength of the word as a keyword: a larger value indicates that the word has a greater significance as a keyword. The AntConc software displays keywords in descending order of keyness, but allows other orders (e.g. ascending order of keyness, alphabetically, etc.). Keyword software will normally also give a statistical measure of the level of confidence we can have that the identified difference in frequency of the keyword is genuine. It is beyond the scope of this chapter to discuss these statistical measures.
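Although the statistics are not discussed here, it may help to see how one keyness value is arrived at. The sketch below computes log-likelihood, one widely used keyness measure in corpus linguistics, for colour and color using the counts reported above. Two things are assumptions: that log-likelihood corresponds to the setting used for the analysis described in this section (the text does not say), and that both corpora can be treated as exactly one million words.

```python
import math

def log_likelihood(a: int, b: int, c: int, d: int) -> float:
    """Log-likelihood keyness for a word with a tokens in a target corpus of
    c words and b tokens in a reference corpus of d words."""
    e1 = c * (a + b) / (c + d)      # expected tokens in the target corpus
    e2 = d * (a + b) / (c + d)      # expected tokens in the reference corpus
    ll = 0.0
    for observed, expected in ((a, e1), (b, e2)):
        if observed > 0:            # a zero count contributes nothing
            ll += observed * math.log(observed / expected)
    return 2 * ll

def direction(a: int, b: int, c: int, d: int) -> str:
    """Positive if the word is relatively more frequent in the target corpus."""
    return "positive" if a / c > b / d else "negative"

SIZE = 1_000_000   # assumption: both corpora treated as exactly one million words

# (word, tokens in BE06 as target, tokens in AmE06 as reference)
for word, a, b in (("colour", 105, 0), ("color", 2, 94)):
    print(word, direction(a, b, SIZE, SIZE),
          round(log_likelihood(a, b, SIZE, SIZE), 2))
```

The larger the value, the stronger the evidence that the difference in relative frequency is not due to chance; the direction of the difference has to be read off separately from the relative frequencies.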

Significance of keywords

There are a variety of different reasons why a word might be identified as a keyword. Keywords may reflect what the texts are about, their subject matter. Thus a number of keywords identified in the comparison of BE06 and AmE06 writing reflect the different political and geographical concerns of texts from the two countries. In other cases the keywords had nothing to do with the subject matter of the texts. The frequency of occurrence of the word colour (disregarding the spellings) is approximately the same in both corpora, and what is significant is different spelling conventions. Keywords might also throw light on characteristic lexical choices of a register or genre if a target corpus in that variety is compared with a large general corpus as the reference corpus.

One would presume that keywords will normally be lexical items, and that grammatical items would be more or less equally frequent across all text types. But there are exceptions. AntConc identifies it and be as keywords in the above-mentioned comparison of British and American English. On the other hand, if your target was a corpus of newspaper headlines or telegrams (see note 1 to Chapter 12, p. 472 below) you might expect grammatical words to show up as negative keywords.

Different speakers and authors use different words with different frequencies, and keywords have been used in the identification of speakers/authors, and determining whether different texts or collections of texts are likely to have been produced by the same person. This approach has been used in literary studies in determining whether a particular text is likely to have been produced by a particular author. More recently, in forensic linguistics keywords have been used – along with various other measures – to determine whether a particular text is likely to have been produced by a particular person. The target text is compared with a corpus of similar texts known to be produced by the person. Also in the forensic domain keywords have been used to determine whether a transcript is likely to be a verbatim record of testimony. Coulthard (2000), for instance, observes that then is far more frequent in the written statements of police officers than of witnesses, and significantly more frequent than it is in the spoken component of the COBUILD corpus. This led him to conclude that a transcript that was allegedly a verbatim record of a spoken statement was very unlikely to have been that: its usage of then was more characteristic of written police statements than witness statements or spoken English generally.

Investigating Language in Use: Corpus Linguistics

Concordances

Basics of concordancing: generating a concordance

A concordance is a list of all instances of a word or phrase in a corpus together with a specified number of preceding and/or following words from their textual environment. A concordance permits us to explore how the targeted word is used, and ultimately what it means. Concordancing software, often referred to as KWIC (key word in context), is available for major general corpora such as COCA and COBUILD. (For some corpora there are limitations on the number of concordance lines displayed if you don’t have a licence.) It is usually integrated with frequency (and other) software permitting one to move seamlessly from a frequency listing to the instances that are counted. This is useful since one wants to know what has actually been counted. For instance, when the keyword component of AntConc identified U as a negative keyword in the British–American comparison mentioned in the previous section I obviously wanted to know what word this was, since it is not a standard spelling of any word – was it a non-standard spelling of you?

Concordance software typically displays a list of occurrences of the target word in a sequence of lines centred on that word, which is usually highlighted – e.g. in bold – along with a given number of words on either side of it. Of course, you will need to check whether the concordance is finding instances of a string of letters, or a lemma. COCA allows concordances of both: a search term in lowercase matches the string, while one in all capitals matches the lemma. It is possible to sort the lists – for instance, alphabetically – on the preceding or following word or words. If there are a large number of instances of the target word, it will be possible to make a random selection of e.g. 100 or 200. The display in COCA permits a number of choices, including colour-coding of the target and nearby words according to part of speech.

One can also restrict the search according to part of speech of the target word (useful for words such as book and dog which can be either a noun or verb). Beyond generating a concordance for a particular word, it is usually possible to create a concordance for combinations of words, such as for the combination mistakenly think. The conventions mentioned in the section on Frequency above can also be applied. For instance, using mistakenly THINK as the search string in COCA returns the instances of the string with the range of inflected forms of think. Wildcards can also be used in the searches. For instance, Figure 9.3 is a screenshot of a KWIC search of COCA for prevaricate followed by any preposition; the search words are highlighted in green. Notice that as well as providing the linguistic context (eleven words on each side) the year, genre and label for the text in which the example occurred are given.

Figure 9.3 Concordance lines for prevaricate followed by a preposition in COCA.
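The basic idea behind a KWIC display can be sketched in a few lines of Python. This is only an illustration of the principle, not how COCA or AntConc are implemented: the function name kwic, the simple letter-based tokenization and the sample text are all assumptions made for the example, and the search matches a string of letters rather than a lemma.

```python
import re


def kwic(text, target, width=4):
    """Return (left context, keyword, right context) for each hit of `target`.

    `width` is the number of words shown on each side of the keyword.
    Matching is on the word form (case-insensitive), not on the lemma.
    """
    # Assumed tokenization: runs of letters and apostrophes count as words.
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == target.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


# Invented sample text, standing in for a corpus.
sample = (
    "So he dithered, prevaricated, and went catatonic. "
    "She mistakenly thought the book was hers. "
    "They mistakenly think the answer is simple."
)

# Print the hits centred on the keyword, as in a concordance display.
for left, keyword, right in kwic(sample, "mistakenly", width=3):
    print(f"{left:>25}  [{keyword}]  {right}")
```

Sorting the returned lines on the first word of the right context, or the last word of the left context, would mimic the sorting options described above.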


Linguistics

With these more advanced search tools one can search for certain grammatical structures and patterns, in particular those that involve a specific pattern of words. For instance, there is a construction in English called the pseudo-cleft exemplified by what we want is Watney’s. We could begin by searching for what * * BE. Of course, this search will return many other sequences such as what the hell is. And of course there will be genuine instances that are not included, such as what the hell we want is Watney’s. In AntConc one could make similar searches using regular expressions. (What would an appropriate regular expression be for the above pseudo-cleft?)
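One possible answer to the question just posed can be sketched with Python's re module. The pattern below simply mirrors the search string what * * BE; the list of forms of be and the names BE_FORMS and PSEUDO_CLEFT are choices made for this illustration, and the regular expression syntax accepted by AntConc may differ in detail, so treat this as a starting point rather than a definitive answer.

```python
import re

# Forms of the lemma BE that may follow the two intervening words.
BE_FORMS = r"(?:am|is|are|was|were|be|been|being)"

# "what" + any two words + a form of BE, ignoring case.
PSEUDO_CLEFT = re.compile(rf"\bwhat\s+\w+\s+\w+\s+{BE_FORMS}\b", re.IGNORECASE)

examples = [
    "What we want is Watney's.",           # the target construction
    "I wonder what the hell is going on.",  # an unwanted hit
    "What the hell we want is Watney's.",   # a genuine instance that is missed
]

for sentence in examples:
    match = PSEUDO_CLEFT.search(sentence)
    print(sentence, "->", match.group(0) if match else None)
```

As the examples show, the pattern shares the limitations noted above: it returns sequences that are not pseudo-clefts, and it misses pseudo-clefts with more than two words between what and the verb, so the hits still need to be checked by hand.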

Uses of concordances

Examination of concordances can tell us a good deal about the contexts of usage of a word or string of words, as well as its senses. This is one of the most likely ways you will want to use a corpus in your linguistic courses. A simple example is the word prevaricate, mentioned above. Corpus investigation shows that in addition to its use in the sense of lying or misleading – the only sense listed in many online dictionaries – it is not infrequently used in the sense of stalling or dithering (with no component of misleading).3 Of course, to appreciate this you have to use your intuitions as a speaker to understand the segment of text – the corpus itself won’t give you the information. Example (9-1) from the COCA corpus provides a clear illustration of the second sense.

(9-1) The truth is that unlike killing ObL Captain Zero didn’t have six months to agonize (and get his arm twisted) over the “courageous” decision to do anything requiring balls. * # So he dithered, prevaricated, and went catatonic in the face of actually having to make the simplest decision.

From the contextual information given in Figure 9.3 it is clear that line 3 uses the word prevaricate in the sense of lying or misleading. It is not so clear in the other lines whether this sense is invoked, or the sense of stalling or dithering. By clicking on the number in the leftmost column one is provided with more context. This reveals that it is likely that all of the examples invoke the former sense, suggesting that followed by a preposition prevaricate always means ‘mislead’. Extending the search to other inflected forms of prevaricate further strengthens this hypothesis. (These cautious wordings are motivated by the observation that even with context it is not always clear what meaning was actually intended.)

I have already mentioned an investigation I made a decade ago of expressions like there’s family and (there’s) family (McGregor 2013).
My main interest in this construction was not its (in-) frequency, but rather its meaning. In fact, very few investigations have been made of the construction, and the sources that do discuss it usually say that it expresses the meaning that different kinds of the noun (in this instance family) are identifiable; most also draw attention to good and bad kinds. Examination of examples in context revealed a very different situation. Examples were not simply drawing attention to the existence of different types of thing. In fact, what turned out to be common across all of the examples was that they asserted the non-uniformity of the category in the face of a presupposition of uniformity. In the majority of instances, the category is mentioned in the immediately previous text. As an illustration, consider the following example, from the BNC.


(9-2) Most of us need to use a moisturiser each day, although oily/combination skins can get away with the lighter types, applied to neck, cheeks and eye area, and avoiding the very greasiest parts of the face, There are moisturisers . . . and moisturisers. Some last on the skin longer than others. You need to test different products to find one that really suits. New formulae claim anything from 15 to 24 hours ‘efficacy’, so check the small print on the pack!

The instance of our target construction, bolded in (9-2), continues the theme of the text, moisturisers, which have to this point been treated as though a unitary category. There are moisturisers . . . and moisturisers asserts to the contrary, that it is a differentiated category, and goes on to identify some specific differences among moisturisers, and discusses the need to test different products.

Many corpus studies have examined the usage of discourse particles, such as ok, right, now, well, oh and the like (see p. 198 above), addressing the question of where they are likely to be used. For instance, one might want to know whether they are typically used at boundaries of transactions or other unit types, and if so, what effect they have. Examination of the use of such items with concordance software can be very revealing. In this case one will generally need more context than that usually provided in concordance lines, and you will need to have an understanding of the overall structure of the text. (Ideally this information would be included in the markup of your corpus.)

Collocations

Collocations and their identification

A collocation is a statistically significant co-occurrence of words (recall the term collocate from §6.2). The software packages AntConc and Wordsmith Tools as well as interfaces for general corpora such as COCA permit searches for collocates of selected words. By default a span between the target word and its collocates on both sides is set, five words to the left and right in AntConc, and four in the COCA interface; this can be increased or decreased. For each collocation identified its frequency in the corpus is indicated, and a figure given for its strength. There are different statistical measures of strength, and programs such as AntConc allow choices, which may identify different collocations or assign them different relative strengths; again it is beyond the scope of this chapter to discuss the various measures. The range may also be indicated, as might also be a measure of the statistical significance of the collocation.

There are various possibilities for the presentation of the collocate lists. In AntConc by default they are presented in order of decreasing strength. COCA provides an intuitively appealing display: the collocates are organized according to their part-of-speech, and then in each part-of-speech according to strength. The collocations are colour-coded for significance. For instance, a search of the collocates of mistakenly revealed a range of collocations including with nouns (most strongly with police), adjectives (most strongly with wrong), verbs (most strongly with think and believe) and adverbs (most significantly with often). The actual instances can be seen in a KWIC-style display with a key-click on the context button next to each collocate in COCA. In AntConc this can be done with a click on the collocate.
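The notions of span and strength can be made concrete with a small Python sketch. It counts the words that occur within a given span of a target word and scores each with a simple mutual-information-style measure (observed co-occurrences compared with the number expected by chance). That measure is just one illustrative choice and is an assumption of this example; it is not claimed to be the formula used by AntConc or COCA. The function name collocates and the invented sample sentence are likewise hypothetical.

```python
import math
from collections import Counter


def collocates(tokens, node, span=5):
    """Return {word: (co-occurrence frequency, strength score)} for `node`.

    Words are counted if they fall within `span` tokens to the left or right
    of an occurrence of `node`. The score is log2(observed / expected), where
    `expected` assumes words are distributed independently of one another.
    """
    total = len(tokens)
    word_freq = Counter(tokens)
    co_freq = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            co_freq.update(window)
    scores = {}
    for word, observed in co_freq.items():
        expected = word_freq[node] * word_freq[word] * (2 * span) / total
        scores[word] = (observed, math.log2(observed / expected))
    return scores


# Invented sample, far too small for the scores to mean anything in practice.
tokens = (
    "the police mistakenly arrested the wrong man because "
    "they mistakenly believed the wrong report"
).split()

scores = collocates(tokens, "mistakenly", span=2)

# List collocates in order of decreasing strength.
for word, (observed, score) in sorted(scores.items(), key=lambda item: -item[1][1]):
    print(f"{word:<10} freq={observed}  score={score:.2f}")
```

With a span of two, wrong is not picked up as a collocate in this sample; widening the span to three or more would include it, which illustrates why the span setting matters.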


Collocation software also permits searches for collocates of strings of words. More sophisticated searches may also be possible. The COCA software permits searches for the collocates of a lemma or of a particular word form; it is also possible to restrict the searches according to the part-of-speech of the word or the collocate.

Linguistic significance of collocations

In §6.3 it was remarked that part of one’s knowledge of a word is the words it habitually collocates with, and that according to J. R. Firth a component of the meaning of a word is specified by its collocations. Investigation of collocations can be revealing of the meaning of a word, and provide information additional to that provided by a concordance. For instance, we have seen that mistakenly collocates quite strongly with the adjective wrong, which conveys a related meaning. This is not surprising: one expects that if some act is performed mistakenly a wrong choice will have been made somewhere, and many of the examples are of this type, as shown by (9-3).

(9-3) . . . they were children and she was mistakenly placed with the wrong family.

But collocates can be revealing of other things than meaning. For instance, we saw that police is the strongest collocating noun. As it turns out, three quarters of the examples have police before mistakenly, and in the majority of these the police are the Actors of some mistakenly performed action. As another example, a search for collocates of mistakenly think and mistakenly believe (these being the most strongly collocating verbs with mistakenly), restricted to the left in order to get an idea of the usual Actor, revealed interesting similarities and differences. The most frequent collocations for mistakenly think were (in order) people, many, who, because and some; for mistakenly believe they were many, people, some, because and Americans. This suggests that mistaken thoughts and beliefs are normally generic ones, attributed to non-specific groups of people. This would of course need to be checked further by looking at the actual instances, to ensure that the collocates were actually in the Actor NPs. A quick examination of the collocates of many with mistakenly believe confirmed this suspicion.
The collocation of both strings with because is also suggestive, and consistent with the hypothesis that mistaken beliefs are likely to be construed as reasons for unexpected behaviour.

Examination of collocations of semantically related words such as synonyms can be revealing. For instance, consider strong and powerful in the COCA corpus. These synonyms differ significantly in frequency, strong being the more frequent, occurring in position 497 in the COCA frequency listing, whereas powerful is in position 1256. As Table 9.3 reveals, strong also shows far more collocates than powerful. An examination of the collocates shows the words are used somewhat differently. It seems that the nouns powerful collocates with tend to be concrete ones, and include words for artefacts and tools, whereas strong collocates with a higher proportion of more abstract nouns (such as evidence, support, relationship and the like) in addition to concrete nouns – though the list includes no artefacts or tools. Collocates belonging to other parts-of-speech are also revealing. Powerful collocates strongly with the adjectives rich, wealthy and influential, describing non-inherent attributes, whereas the adjectives collocating with strong tend to describe personal features. As expected, both words collocate rather significantly with one another. It will be noted that only powerful collocates with the quantifiers more and most. The reason for this is doubtless a morphological one: strong has derived comparative and superlative forms, whereas powerful does not, and must express these notions periphrastically.

Table 9.3 Comparison of the strongest collocates of strong and powerful in the COCA corpus, listed for each part of speech in order of strength

collocate | strong | powerful
noun | support, evidence, sense, relationship, wind, economy, feeling, force, position, voice, arm, argument, growth, opinion, performance, candidate, tie, connection, supporter, correlation, presence, leadership, message, bond, defense, belief, influence, desire, suit, predictor, opposition, commitment, faith, signal, reaction, association, storm, advocate, emotion, foundation, safety, link, incentive, demand, tradition, coffee, showing, possibility, emphasis, muscle, personality | force, tool, interest, weapon, nation, message, influence, leader, voice, drug, computer, engine, figure, storm, argument, committee
adjective | weak, healthy, powerful, smart, independent, tall, emotional, brave, confident, magnetic | rich, strong, wealthy, influential
verb | grow, build, remain, maintain | become
adverb | very, enough | most, more, very

Clusters

Closely related to the notion of collocation is the notion of a cluster, which refers to a sequence of words that is repeated a number of times in a corpus (see also under Binomials in §4.4). In COCA you can search for clusters of two to four words involving a given word. The software returns clusters with the word in initial and final position, along with frequency information. For example, a search for clusters with salt revealed 5,884 instances of salt and pepper, as against just 86 of pepper and salt – see p. 99 above, where slightly different frequencies were obtained in searches for these binomials. This tool can also be used alongside the collocation tool to examine differences in usage of synonyms such as strong and powerful. The most common two-member clusters for both words in COCA involve enough. However, for strong we find clusters with sense, support, evidence and case, whereas for powerful the most frequent are with tool, man, people and force. This adds further to our findings with the collocation tool.


A similar thing can be done in AntConc with the tool called N-gram. This tool also permits one to find repeated clusters of various sizes, thus facilitating the identification of common expressions in the corpus.
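The kind of search described above, for clusters with a given word in initial or final position, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name clusters and the invented sample are hypothetical, and the sketch is not a description of how COCA or the AntConc N-gram tool work internally.

```python
from collections import Counter


def clusters(tokens, word, n=3):
    """Count n-word sequences that have `word` in initial or final position."""
    counts = Counter()
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        if gram[0] == word or gram[-1] == word:
            counts[gram] += 1
    return counts


# Invented sample standing in for a corpus.
tokens = (
    "pass the salt and pepper please then add salt and pepper "
    "to taste not pepper and salt"
).split()

# Show the clusters involving "salt", most frequent first.
for gram, freq in clusters(tokens, "salt", n=3).most_common():
    print(" ".join(gram), freq)
```

Dropping the condition on the position of the word turns this into a general n-gram counter, which is closer to what is needed for finding common expressions across a whole corpus.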

Limitations

As we have seen in this section, corpus linguistics can be used to address a wide range of linguistic questions. It is not the only methodology one might use, nor is it necessarily the best for all questions. For instance, examination of a corpus is unlikely to provide insight into speakers’ use of, or attitudes towards, linguistic variables such as the use of post-vocalic /r/ in English (see Question 4, p. 181). For many research questions it is appropriate to use a corpus approach in conjunction with other approaches. For instance, my investigation of English expressions like there’s family and (there’s) family employed corpora as well as other approaches, including my own observations of language as used around me and in a non-digital corpus, the novels of Agatha Christie, which I read in their entirety.

Looking for something in a corpus is a comparatively easy task, though you may be overwhelmed by hits, and need to restrict your search in some way, or examine a random sample of hits. A more difficult task is to locate something absent that might have been present. For instance, one might be interested in what motivates usage vs. non-usage of that in constructions like . . . say/request/think/believe that . . . Finding instances of the verb followed by that will be easy – though it is likely to produce irrelevant material that will need to be sifted through by hand. But finding cases where the verb could have been followed by that but is not will be more difficult.

Corpus research is also time-consuming, especially if you need to build your own corpus. And it is easy to be overwhelmed by the number of examples of a particular phenomenon instanced in large general corpora. One has to be vigilant and check the results output by the analytical software to ensure that they really are instances of the target phenomenon; this can be a time-consuming (not to say boring) task. On the other hand, some phenomena are very rare, and even medium-sized corpora may not provide many instances. This includes some grammatical constructions, as remarked on above.

Summing up

Corpus linguistics is a relatively new discipline within linguistics, dating from the 1980s. It is concerned with the empirical investigation of language in use, based on computerized corpora of genuine texts and discourses in a language or language variety. The ultimate goal of corpus linguistics is to understand language, and to explain rather than merely observe patterns in usage.

A wide range of different types of corpora have been constructed for a number of major languages, some of which are available for use on the internet, though registration is usually necessary. These include (among other types): general corpora, which aim to provide a snapshot of an entire language or variety of a language; specialized corpora, which aim to present a sample of a particular domain of use of a language such as a written or spoken language; historical corpora, which present comparable and representative data from different points of time; learner corpora, which present language as used by people who are in the process of learning the language; and multilingual corpora, which present comparable corpora in two or more languages. To facilitate searches for grammatical structures and categories some corpora are parsed. The internet is increasingly used as a corpus.

Sometimes no suitable corpus is available for addressing your question, or in the language or variety you are interested in studying. In that case you may need to construct your own corpus. There are a number of considerations that need to be addressed prior to actually putting a corpus together, including the nature and quantity of the material to be collected, how it should be represented, what types of markup and annotation are to be provided, and issues of copyright and ethics.

Software is essential for analysing corpora, and for addressing linguistically relevant research questions. Programs such as AntConc, Sketch Engine and Wordsmith Tools can be used on corpora on your own computer (provided that the file format is right). Many online corpora have their own built-in software tools. These include software for extracting information on frequency of words and displaying ranked frequency lists; there is also software that permits the identification of keywords, words showing markedly different frequencies in a target set of texts as compared with a reference corpus. Concordance software permits the presentation of a given word in its textual environment, usually in the KWIC format. Collocation software searches for words that occur significantly more frequently than expected in the environment of a given word. Cluster software permits the identification of continuous sequences of words that recur throughout a corpus. More complex searches than just for single words (sequences of letters) may also be possible with software of all of these types. It may be possible to search instead for lemmas, for the word as a certain part-of-speech, or for sequences of words; it may be possible to use regular expressions in your searches.

Guide to further reading

Good textbook introductions to corpus linguistics are Teubert and Čermáková (2007), Weisser (2016) and McEnery and Brezina (2022). I particularly recommend Barth and Schnell (2022), which is one of the few textbooks that provides extensive discussion of corpus linguistics in language documentation and typology. These textbooks also provide more detailed information on statistical measures and methodologies. Baker (2018) is a very readable article-length treatment of corpus methods in linguistics. For more on the ways corpus linguistics can be used in grammatical investigations, see Jones and Waller (2015).

The Routledge Handbook of Corpus Linguistics (O’Keeffe and McCarthy 2022) comprises forty-seven articles dealing with a range of topics in corpus linguistics, including building and designing a corpus, how to explore a corpus for linguistic purposes, uses of corpora in pedagogy and language learning and other applications. It is worth dipping into if you are interested in undertaking a project in corpus linguistics.

The first dictionary of English based on a general corpus was Sinclair (1987), which employed the COBUILD corpus; it is now in its tenth edition (2022). These days most dictionaries of English use large general corpora. Also based on the COBUILD corpus was a grammar of English, Sinclair (1990), now in its fourth edition (2017). Other grammars of English based on general corpora are Quirk et al. (1972), based on the non-electronic corpus referred to on p. 209 above (later editions employed the electronic version), Greenbaum (1996), and Biber et al. (1999), republished as Biber et al. (2021).

More advanced methods than those discussed in §9.4 are necessary if one wants to use corpus linguistics in the identification of registers; see, for example, Biber (1995) and Biber and Conrad (2001). Winchester (2003) provides a very readable account of the history of the Oxford English Dictionary, with a focus on the compilation of the first edition.

Issues for further thought and exercises

1 Use a general corpus of English other than COCA, BE06 or AmE06 and find the ten most frequent words. To what extent does the list agree/disagree with the lists given in Table 9.1? Were strings of letters or lemmas (or a mixture) counted?

2 A distinction is sometimes drawn between corpus-based and corpus-driven research. Find out how these two approaches to corpus linguistics differ, and write a paragraph or two describing each, giving examples to illustrate the difference.

3 Find a large general corpus in a language other than English and generate a listing of the ten most frequent words. How do these top ten words compare with the most frequent words in English as per Table 9.1? Are there any lexical words among them, or any words of more than one syllable? (Again, find out what the software is counting.) If the language is your first language or a language you are fluent in, do your findings agree with your intuitions?

4 Choose a large general corpus of English such as COCA, BNC or COBUILD, and use it to find out about the use of some word – for example, like, bloody, watch, right or fine. How frequent is the word? What words collocate with it? Examine a concordance and see whether you can identify different senses of the word. What are their relative frequencies?

5 Use the same corpus as in question 4 to examine the usage of be wondering (with all inflected forms of the verb be). How frequent is the string? What are the most likely pronouns to occur immediately to the left of be wondering? And what connective words (such as if, whether, about, how and the like) are most likely to immediately follow it? What are the most likely combinations of pronoun and connective word?

6 Take a pair of synonyms such as rich and wealthy, or couch and sofa. How do the synonyms compare in terms of frequency? How similar and different are their collocational patterns? Do the differences in their collocations reveal anything about the differences in their meanings?

7 Using a large general corpus, find out what grammatical patterns or constructions the verb hear appears in. Which are the most common constructions? Are there differences in frequency of these patterns according to genre?

8 The expression falsely believe/think is roughly synonymous with mistakenly believe/think. Which are the more frequent expressions? Are there differences in the collocations of the expressions with the two different adverbials? If so, do they reveal any differences in the meanings of the expressions? Examine the instances in a concordance, and test the hypotheses mentioned on p. 226 above for mistakenly believe/think. (You will need to use a large general corpus like COCA to answer this question since the expressions are quite infrequent.)

9 Build a small corpus of about a dozen texts of a single type from newspapers such as editorials, features, sports commentary or obituaries. Choose a suitable reference corpus and use keyword software in a program such as AntConc to identify positive and negative keywords of this corpus.

10 Examine a historical corpus of English to find out how the usage of the word gay has changed since 1900. Does the frequency of its use change over time? What about the words it collocates with? Does the corpus analysis support your expectations based on your knowledge of the history of the word?

11 Language documentation is another field of linguistics that gathers, compiles and analyses electronic corpora. Find out about this field, and the ways in which it is similar to and differs from corpus linguistics. (You could begin by looking at Chapter 10 of Barth and Schnell 2022.)

Research project

Select a large general corpus of English, a dialect of English, or your own first language, and write up answers to the following questions:

a. What is the composition of the corpus (things to consider are its size, the type of texts it includes, where and when they were produced, and so on)? How representative and balanced does the corpus seem to be? Can you identify any aspects in which it fails to be representative of its target variety or varieties, or aspects in which it is not balanced?

b. Identify some linguistic item of interest to you, such as for example an interjection (e.g. huh, OK), a swear word, or a fixed expression (e.g. and then X goes like). Formulate a question about this item that you can address in the corpus. For instance, you might want to know in what circumstances the item tends to be used, whether it is more or less commonly used by males or females or younger or older people, or what genres of text it is more or less associated with. (Here you need to know enough about the corpus to be sure that your question is answerable – for example, if gender information of the text producer is not available, you won’t be able to answer a question about differences in use according to gender.) Use the software tools for corpus analysis to investigate your question.

c. How well do the results of your corpus investigation agree with your intuitions about the phenomenon? Did the corpus investigation reveal anything unexpected?


Part III Language: A Human Phenomenon


10 Language in Its Biological Context

In this chapter we begin our examination of language as a human phenomenon by adopting the widest perspective, and consider it in its biological context. The main issue concerns the status of language in relation to communicative systems employed in the non-human animal world: is language unique to human beings? If so, is it a system without precedents in the biological world – possibly the result of a fortuitous genetic mutation – or was language an evolutionary development from some simpler system of communication?

Chapter contents

Goals
Key terms
10.1 Natural communication systems of other animals
10.2 Teaching human language to animals
10.3 Origins and evolution of human language
Summing up
Guide to further reading
Issues for further thought and exercises
Research project


Goals

The goals of the chapter are to:
● describe some systems of communication used in the animal world;
● evaluate the extent to which natural animal communication systems satisfy the design features of human language;
● discuss the ability of members of other species to learn human language; and
● introduce and discuss some theories of the origins of human language.

Key terms

alarm calls
ape gestures
ape vocalizations
bee dances
bird calls
bird songs
bodily signs
bonobos
chimpanzees
feral children
FOXP2 gene
genetic encoding of language
gestural origins of language
gorillas
grooming and gossip hypothesis
indexical signs
joint attention
language evolution
language origins
primates
signing
Specific Language Impairment (SLI)
vervet monkeys

10.1 Natural communication systems of other animals

In this chapter we explore the question of the uniqueness of human language from two different angles. First, in this section, we examine human language in relation to the natural communication systems of other animals:1 what properties do they share, and to what extent are they different? Second, in §10.2, we discuss the ability of animals to understand and use human language: to what extent are they capable of learning human languages?

These two questions are of interest in relation to the origins of human language, which we deal with in §10.3. If we could show that non-human animal communication systems exist that share key features of human languages, and that our closest biological relatives have systems that most resemble human language, this would count as evidence in favour of the evolution of language from animal communication systems, and that language differs in degree rather than kind from these other systems. Even if we could find evidence that other species can learn human language to a significant degree, this might count in favour of the evolutionary development of human language from systems of animal communication. Not finding such evidence does not, however, argue against an evolutionary story: it may be that there are no living species sufficiently close to us biologically to reveal the continuity. Our lineage diverged from that of our closest biological relatives the chimpanzees some five to six million years ago; the only remains of the intermediate species that emerged and lived during these millions of years are fossils; the species themselves are extinct.

Our two questions are of interest for other reasons as well. If animals have the capacity to learn and use human languages this would count as evidence that language is not a peculiarity of human beings, or a genetic endowment of our species. It would argue in favour of the idea that language is not encoded in a module in the brain entirely separate from general intelligence. Again, if it is found that animals do not have this capability, it does not follow that language is encoded in our genes, or that it is stored and processed separately in the brain. It could only be concluded that animals lack the necessary genetic or neurological hardware.

Many stories have appeared over recent decades, and continue to appear, in the popular media about apes and other animals with amazing talents for human language. Popular magazines, papers and television series talk of animals with large vocabularies, with grammar, with the ability to create novel utterances that they have not previously heard, with the ability to communicate their thoughts and feelings to their human trainers and so on. Such claims are often highly exaggerated and emotional; critical scientific evaluation is required before they can be accepted.

Commonalities of signs in communication systems of humans and animals

Certain bodily signs indicating emotions are shared among humans and animals. For example, Charles Darwin describes in his The Expression of the Emotions in Man and Animals the involuntary erection of feathers in birds and hair in mammals when angry or fearful (1898: 94–101). These involuntary behavioural events can be interpreted (not necessarily consciously) by other animals, and not only members of the same species, as indicators of the animal’s emotional state; the other animal might as a consequence adopt an appropriate mode of behaviour – for instance, flight from a gorilla displaying signals of anger.

Involuntary signs like the erection of hair or feathers are indexical signs or indexes, according to the classificatory scheme developed by the American philosopher Charles S. Peirce (1955). An indexical sign is characterized by an association between a form and a meaning that arises through habitual co-presence; the form, as it were, points to the meaning. Other examples are smoke, which is an index of fire, and the first person pronoun I, which is an index pointing to the speaker. Note that, as these examples show, indexical signs can be either voluntary or involuntary.

Many animals signify submission by lowering their bodily position below that of a more dominant animal – for example, by cowering or curling up. Conversely, a dominant animal may raise its position – or the position of its head – above that of a subordinate. Similar bodily signs are employed by human beings. Thus, a person might bow their head as a sign of submission, or stand erect over another in an attempt to intimidate them. Lower-pitched vocalizations are associated with aggression and dominance across animal families. This is based on an association between lower pitch and a larger vocal tract and bigger vocal folds, which in turn correlate with larger body size. A number of animal species – including red and fallow deer, koalas, lions and tigers – have a descended larynx, or at least a larynx that can be lowered by the animal during vocalization. This results in lowering the fundamental frequency of the sound (the rate of vibration of the vocal folds), thus exaggerating the perceived size of the vocalizing animal; this might be employed to intimidate rivals or perhaps to attract females.2 Bodily signs carry important messages, often unconsciously and involuntarily; this is perhaps part of their usefulness, since they cannot be used deceptively. On the other hand, they can form the basis for deliberately used signs expressing similar meanings. The intentional use of these systems is well developed among human beings, who can smile or laugh deceptively, or deliberately modify their height with items of clothing (shoes, or hats) to signal dominance.

Natural communication systems of some animal species

Aside from the systems of bodily signs that indicate size or the internal state of the individual, many animal species have communication systems that are used to transmit information about the external world. In the following subsections we describe four such systems.

Bees

Some bee species have a system of communication that is used by foraging bees on their return to the hive to convey information about the location of nectar sources. The system of European honeybees, discovered by Karl von Frisch (for which he was awarded a Nobel Prize), involves two different types of dances. If the nectar source is close to the hive, a round dance is performed; other bees perceive the scent of the nectar on the dancing bee, and set off in all directions looking for it. If it is at some distance from the hive (over about 50 metres or so) the bee performs a tail-wagging dance that follows a figure 8 shape. The angle of the diagonal of this shape indicates direction of the nectar source. Usually it is performed in relation to the vertical; the angle of the diagonal of the figure to the vertical indicates the direction of the nectar source in relation to the sun. Thus if the diagonal is at 40° to the left of the vertical, as shown in Figure 10.1, the feeding place will be in a direction at an angle of 40° to the left of the current direction of the sun. The speed of the tail-wagging dance indicates the distance of the nectar source from the hive: the more distant the source, the slower the rhythm of the dance. The bee also brings along attached to its body minute particles of the pollen from the flower; only those bees that gather from that type
of flower will take any notice of the dance. Similar dances are performed by scouting bees to indicate a new site for the hive.

Figure 10.1 Indication of direction of nectar source by honeybee’s tail-wagging dance. The figure is part of the Nobel Prize lecture of Karl von Frisch (Figure 3 in https://www.nobelprize.org/uploads/2018/06/frisch-lecture.pdf). Copyright © The Nobel Foundation 1973.

The bee’s dance shows displacement (see §1.3), since the source of nectar may be some distance removed in both time and space. However, displacement is limited in the sense that only an immediately relevant source is indicated, not a source that the bee visited on a previous foraging trip. There is also a minor degree of productivity in the system, in that the bee is not restricted to conveying information about known sources, and can modify the dance parameters so as to convey fairly precise information about a new source. But the message remains constrained to indicating horizontal direction, distance and type of nectar; the system does not permit indication of vertical direction. Thus when von Frisch showed some honeybees a supply of sugary water at the top of a radio beacon, they duly performed the round dance in the hive at its foot. Other bees searched around for honey in the vicinity of the hive, looking everywhere except upwards. Eventually they gave up the search. Bees improve their interpretation of the signs with increased age, perhaps suggesting some degree of cultural transmission. However, transmission is primarily genetic. Thus cross-breed offspring of different varieties of bees tend to know the dance of just one variety. And bees raised in isolation can interpret and perform the dance correctly when introduced to the hive.
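The direction coding of the tail-wagging dance described above can be made concrete with a short sketch. It is an illustration only: the function names, the signed-angle convention (negative for left of the vertical) and the use of compass bearings measured clockwise from north are assumptions introduced here, not part of von Frisch’s account. The 50-metre threshold is the approximate figure given in the text. No formula is offered for distance, since the text says only that more distant sources are danced more slowly.

```python
# Approximate threshold from the text ("over about 50 metres or so").
ROUND_DANCE_LIMIT_M = 50


def dance_type(distance_m: float) -> str:
    """Which dance a returning forager performs (hypothetical helper)."""
    return "round" if distance_m <= ROUND_DANCE_LIMIT_M else "tail-wagging"


def nectar_bearing(sun_bearing_deg: float, dance_angle_deg: float) -> float:
    """Compass bearing of the nectar source, in degrees.

    sun_bearing_deg: current bearing of the sun, clockwise from north
        (assumed convention).
    dance_angle_deg: angle of the dance diagonal from the vertical;
        negative means left of the vertical, positive means right
        (assumed convention).
    """
    return (sun_bearing_deg + dance_angle_deg) % 360


# Example from the text: the diagonal is 40 degrees to the left of the
# vertical, so the source lies 40 degrees to the left of the sun.
# Here the sun is assumed to be due south (bearing 180).
print(dance_type(30), dance_type(400))
print(nectar_bearing(180, -40))
```

The modulo keeps the result within 0 to 360 degrees, so a dance angle to the left of a sun that is just east of north wraps round correctly.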

Birds

Most birds have systems of communication employing vocalizations; many birds also communicate by non-vocalized sounds such as beak clapping, and by visual displays of objects (e.g. bower-birds), or dances (e.g. brolgas). Vocalizations fall into two types. First, there are calls, brief bursts of sound such as whistles, screeches and chirps. These are generally of just a few syllables’ duration, and include alarm calls, food calls (indicating the location of food), signals between parents and offspring, and flocking calls.

Second, there are songs, which are more complex sequences of sounds that together form units that are typically separated from other songs by a relatively long pause. Experimental and observational evidence shows that bird songs are used mainly (though not exclusively) by males, and often for one of two purposes: to attract a mate, or to mark out territory. Songs may be produced either individually, or by groups of two or more birds, as in the case of kookaburras. Some species have large inventories of songs, perhaps of the order of 200 for nightingales.

The songs of some songbird species appear to be innate. In other species there is evidence of learning, although a genetic template for the song may also be present. In such species young birds learn by copying adult birds; if prevented from hearing the songs of adult birds, they still sing, but their songs are typically slower, simpler and more variable than the normal songs of their species. This suggests some measure of cultural transmission, which is supported by two additional types of evidence. First, there is a critical period for acquisition of song in male zebra finches and various other species, a developmental period during which a bird must be exposed to adult songs in order for it to master the song. Second, some species, including the white-crowned sparrow and rufous-sided towhee (both of the USA), show dialect variation in their songs.

Dialect variation permits birds to recognize members of different groups. In addition, this identifying function can be one of the motivations for bird song. Cues to sex and age may also be provided by voice quality, which can permit recognition of individual birds. For example, a seabird returning to a large colony might be recognizable to its family members by its distinctive voice quality. In some species productivity is also apparent: some birds improvise on heard songs, perhaps to attract a mate. Female birds of various species (e.g. red-headed parrot finches and zebra finches) show strong preferences for males with elaborate courtship songs.

Vervet monkeys

Many species of monkeys use vocalizations as well as facial expressions and posture to communicate with one another. Vervet monkeys, which live in a variety of habitats in southern, eastern and western Africa, use bodily signs including head-bobbing (in threat displays), rapid glancing towards and away from another individual (communicating subordination), penile displays (demarcating territory) and tail-signals (indicating the degree of assuredness by degree of erection). Vervet monkeys also have a system of vocalizations including at least twenty different sounds. Among these are alarm calls warning of the presence of predators. A high-pitched chutter warns of the presence of a snake; a chirp (short but loud barking call) gives warning of leopards and lions; a rraup or short cough-like call warns of an eagle; an uh warns of a minor predator such as a hyena; and a nyow indicates the sudden appearance of a minor predator. These warning signs elicit immediate defensive action: on hearing the ‘eagle’ warning call, vervets look skywards, and run for cover; on hearing the ‘snake’ call, they raise themselves on their hind-legs, and search the ground for a snake. In one experiment alarm calls were played through a concealed loudspeaker, resulting in the appropriate evasive action. Vervet monkeys also have vocalizations that communicate information about interactive relations between individuals. A low-pitched chutter expresses an aggressive threat, or solicits
support from other group members; a woof-whoof by subordinate males indicates submission; a series of nasal grunts are emitted by group members when the group starts to move. The vocalizations of vervet monkeys appear to be arbitrary. They are also to some extent culturally transmitted: the vocalizations of young vervets increase in accuracy with age, and infants check adult responses before responding themselves. Adult vervets also react differently to alarm calls of adult and young individuals: on hearing an alarm signal by a young monkey they first check for themselves whether a predator is present before initiating evasive action. Comprehension also precedes production. Alarm calls are made when danger is present. The system shows no displacement; for example, calls are not used to refer to predators present on different occasions. However, vocalizations are occasionally used deceptively: one monkey might give a sign of submission, then bite a rival. Vervets are not creative in combining signs to form new and more complex signs, or inventing new signs to make new meanings. Also unlike human language is the fact that all vocal signs are concerned with regulating the immediate behaviour of other vervets. For example, there are no vocalizations for informing of the presence of harmless species that pose no threat, and call for no evasive action. Nor is there a vocalization for perhaps the most significant predator today, people. (Why might this be?)

Apes

Apes also have systems of communication that include vocalized and non-vocalized signs, including bodily gestures. Chimpanzees have at least a score of vocalized calls that have been described by Jane Goodall (1986). These constitute indivisible whole utterances, and cannot be combined into sequences or decomposed; they do not show duality of patterning (§1.3). Moreover, chimpanzee vocalizations, like the vocalizations of other apes, appear to be largely involuntary, and beyond conscious control. Goodall describes an instance in which a chimpanzee found a cache of bananas. He wished to keep it to himself, but was unable to suppress the pant-hoot vocalization; nevertheless he deliberately muffled it by placing a hand over his mouth. Conversely, ‘production of sound in the absence of the appropriate emotional state seems to be an almost impossible task for a chimpanzee’ (Goodall 1986: 125). This is presumably adaptively advantageous: since most ape vocalizations are warning signals, or signals relating to territory and mating, being involuntary makes them difficult to fake.

Gestural communication is better developed and more flexible in apes than vocalizations. Some gestures are involuntary bodily displays expressing emotional states (see p. 237). But there are others. Studies of gorillas and chimpanzees in relatively free-ranging naturalistic zoos have revealed the use of systems of gestural communication containing over thirty signs. In contrast with vocalizations, manual gestures are voluntary, and can be controlled. Moreover, unlike vocalizations, they are intentionally directed to specific recipients, and are generally produced when the intended recipient is watching. According to Michael Tomasello (1999, 2008), chimpanzees employ two main types of intentional gesture in natural communication with other chimpanzees. One type is an attention attractor, a gesture aimed at getting another to look at the gesturer. The second type is a ritualized
gesture signifying an incipient or desired action. For example, many young chimpanzees use a stylized gesture to their mother, such as a brief touch on the top of the rear end, to request her to lower her back so they can climb on. These gestures are apparently idiosyncratic, and used exclusively with the individual’s mother; many attention attractor gestures are also idiosyncratic rather than conventionalized at the social group level. Chimpanzees employ gestures for social-regulative purposes such as attracting attention to the self or requesting action. They do not use gestures to direct attention to something for the purpose of sharing interest in it; thus they do not draw other chimpanzees’ attention to an object by pointing at it.

10.2 Teaching human language to animals

The animal communication systems discussed in §10.1 all fail to satisfy one or more of Hockett’s design features of human language (see §1.3). In all of the examples we discussed, displacement was either absent, or minimal – if the animal could communicate about something that was not physically present, it had to be something of current relevance. Moreover, animal communication systems seem to be limited in terms of productivity: they are used to convey a rather limited range of meanings, typically from a predetermined set. Perhaps it is a mere accident that animals did not develop communicative systems as elaborate as human language; maybe some animals actually do have the capability of learning human language. Numerous instances have been reported over the past century of animals learning a human language, as well as of performing a range of other complex mental operations, such as arithmetic, that one normally thinks of as uniquely human. In this section we focus on attempts to teach a human language – or a simplified version of a human language – to apes. But before we embark on this, we look briefly at the linguistic ability of one non-primate species, a species with a long history of domestication.

Dogs’ understanding of human language

Dog owners often speak to their pets, which they believe to be capable of understanding much (if not everything) that is said to them. For example, an owner says heel, and the dog returns to its owner; or fetch, and it fetches a thrown ball. An investigation by Juliane Kaminski and associates at the Max Planck Institute for Evolutionary Anthropology in Leipzig examined a border collie, Rico, reported by its owners to know words for over 200 different items, which it would fetch when instructed (Kaminski et al. 2004). The dog had been trained from ten months of age to fetch items placed in different locations around the owners’ apartment, and had been rewarded for fetching the correct object. In order to circumvent the ‘Clever Hans’ effect,3 the experimenters had the owner request Rico to fetch from an adjacent room two items randomly selected from the 200 that the dog was allegedly
familiar with. The owner and experimenter were both out of sight of the dog when it selected the items. Rico performed well on the task, and the experimenters concluded that he did know the labels for the items. Even more strikingly, the dog could rapidly learn names for unfamiliar toys. The owner would first ask Rico to bring a familiar object. Then Rico was asked to bring an unfamiliar item with a name that he had not previously heard. He was able to fetch the novel object from a group of eight items consisting of seven familiar objects, performing this accurately in seven out of ten attempts. It appears that Rico was operating on the principle that a new word would belong to an unfamiliar and hitherto unnamed object (see ‘Novel name–new category’ principle in §12.2). Furthermore, after a period of a month during which he had no access to the new object, he was able to remember many of the new labels, fetching the correct thing from a group of novel and known objects on half of the trials. Does this study show that ‘dogs understand language’, as a CNN headline on Thursday 10 June 2004 put it? Certainly not, if by language is meant human language. Two hundred words for material objects is a far cry from the rich lexicons of human languages, which consist of many thousands of words, including words for events as well as things. Rico’s understanding of words for objects is based on fetching. One wonders whether he could perform an instruction to do something other than fetch the object, not to fetch something, or learn a word for something not fetchable, such as a sofa. Could he, for instance, understand and perform Bite the cushion and lie on the sofa?

Apes

Not surprisingly, the most serious attempts to teach a human language to animals have been made with our closest biological relatives, the apes. Early attempts were made in the first half of the twentieth century, and were resounding failures. In the 1920s, Robert Yerkes attempted unsuccessfully to teach chimps to speak, and proposed that sign language be taught instead. In the 1930s, Winthrop and Luella Kellogg acquired a seven-month-old chimpanzee Gua, which they brought up like a human child, alongside their own son. Although Gua was able to understand over seventy words, she never spoke. In the late 1940s, Keith and Cathy Hayes acquired Viki, whom they attempted to teach English. Despite intensive coaching over a period of three years, she learnt to say just four words – mama, papa, cup and up – though she was able to recognize around 100 words.

These attempts failed partly for physiological reasons. The human vocal tract is adapted for speech, with a short jaw, rounded tongue and lowered larynx with a right-angled bend between the pharynx and oral cavity; chimpanzees lack these physiological adaptations (compare Figure 10.2 with Figure 2.2), and are incapable of articulating the range of sounds of human languages. In particular, the range of distinct vowel qualities they are able to produce is reduced compared with the range humans can produce. Moreover, as has already been mentioned, vocalizations in apes are largely involuntary. For these reasons, recent experiments have attempted to teach apes signs of American Sign Language (ASL) or invented systems employing plastic tokens or keys on a computer keyboard.

Figure 10.2 Vocal tract of the chimpanzee.

Teaching ASL signs to chimpanzees

The first attempt to teach ASL signs to a chimpanzee was made in the mid-1960s by Beatrix and Allen Gardner. They and their research assistants raised a female chimpanzee named Washoe, acquired in 1966 at about a year of age, in a domestic environment. Washoe was not subjected to rigorous training schedules, but left to acquire ASL in a relatively ‘natural’ way. The humans around her communicated with one another and with Washoe in ASL. By 1975 Washoe had learned to use around 150 signs. Washoe was also able to combine signs to express more complex meanings. She made a number of two- and three-sign sequences of her own invention, such as listen eat for ‘listen to the dinner gong’, open food drink for ‘open the refrigerator’, and gimme tickle for ‘tickle me’. She is also said to have made novel words, such as combining the sign for ‘water’ with the sign for ‘bird’ on seeing a swan. Similar abilities were reported a short time later for a gorilla called Koko.

Another famous attempt to teach a chimpanzee ASL was made by Herbert S. Terrace and colleagues. Beginning in 1973, Nim Chimpsky was taught ASL under controlled conditions from the age of four months; detailed records were kept of his progress, including video recordings. Like other chimpanzees, Nim acquired an active vocabulary of around 125 signs, and comprehension of some 200. He put the signs together into sequences, as shown in Figure 10.3, which illustrates a three-sign sequence. These sequences showed some regularities of ordering. For example, in
two-sign sequences more occurred in initial position about 80 per cent of the time, and the verb preceded the object with about the same frequency. Closer examination revealed that Nim’s signing showed preferences for ordering particular words – for example, more was preferred at the beginning of a sequence, Nim at the end. Many other words showed random distribution. In Nim’s two-sign sequences involving eat, the order eat Nim occurred 302 times as against 209 instances of Nim eat and 237 of me eat; in all instances Nim was the eater. It cannot be concluded from Nim’s utterances that he was using a consistent ordering of signs to distinguish who is acting on who.

Figure 10.3 Nim Chimpsky signing me hug cat to trainer. (From Terrace et al. 1979: 892. Reprinted with permission from AAAS.)

Nim’s average utterance length remained at about 1.5 signs. Furthermore, his multi-sign utterances were characterized by a high degree of repetition. About 20 per cent of his three-sign utterances involve repetition of a sign; another 28 per cent involve both the sign for Nim and the sign for me. Nim’s longest utterance, consisting of sixteen sign tokens, give orange me give eat orange me eat orange give me eat orange give me you is highly repetitive, and contains just five different sign types. Terrace also found that a high proportion of Nim’s utterances were full or partial imitations of signs recently given by his trainers. Nim rarely initiated a conversational exchange. Almost
90 per cent of his utterances were given in response to his teachers. Re-examining films of Washoe’s and other apes’ use of signs, Terrace concluded that the same held true of them: all were producing a high proportion of prompted repetitions of the signs made by their trainers. It seemed that the chimpanzees were producing signs in order to receive rewards.
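The figures quoted above for Nim’s signing can be checked with a simple tally. The sketch below is illustrative only: the variable names are introduced here, and the data are just the sixteen-sign utterance and the three counts for sequences with eat reported in the text.

```python
from collections import Counter

# Nim's longest utterance, as glossed in the text.
longest = (
    "give orange me give eat orange me eat orange "
    "give me eat orange give me you"
).split()

sign_counts = Counter(longest)
n_tokens = len(longest)      # every sign produced, repeats included
n_types = len(sign_counts)   # distinct signs only

# Two-sign sequences involving 'eat' (Nim is the eater in all of them).
eat_orders = {"eat Nim": 302, "Nim eat": 209, "me eat": 237}
total = sum(eat_orders.values())
shares = {order: count / total for order, count in eat_orders.items()}

print(n_tokens, n_types)
print(sign_counts.most_common())
for order, share in shares.items():
    print(f"{order}: {share:.1%}")
```

On these counts the eater is signed after eat in about 40 per cent of the sequences and before it in about 60 per cent, which is the kind of split that makes it hard to argue for a consistent ordering rule.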

Teaching chimpanzees to use tokens or keys

Other investigators have employed, instead of ASL, systems of arbitrary signs made up of plastic tokens or keys on keyboards labelled with symbols. In 1966, David and Ann Premack began to train their chimp Sarah to manipulate plastic tokens as signs. Many of these tokens were quite arbitrary: for instance, the sign for ‘banana’ was a square shape; for ‘chocolate’, an X shape with a vertical line through the middle. Sarah understood over 100 signs, and is reported to have understood conditional sentences such as if apple, then chocolate – given the choice between taking an apple and a banana she would choose an apple in order to get the (greatly desired) chocolate reward.

More recently plastic tokens have been replaced by light-up keys on keyboards connected to computers. Sue Savage-Rumbaugh and colleagues trained bonobos (or pygmy chimpanzees) to use such symbols on a portable keypad. Their most notable success was with Kanzi, who acquired proficiency in the system not through direct training, but as an observer in his mother’s less than successful training sessions. He eventually acquired a vocabulary of some 250 signs, and is said to use key order to express meaning – Kanzi tickle Sue to mean that he would do the tickling vs. Sue tickle Kanzi to mean that he wanted Sue to tickle him.

Another bonobo in Savage-Rumbaugh’s training programme, Panbanisha, is reported to have been strolling along with a group of scientists when she suddenly pulled one of them aside and repeatedly pressed the keys fight, mad and Austin in various combinations on her keyboard. It was understood that she meant by this that there had been a fight in Austin’s building (Austin was another chimp). On investigation this proved to have been the case.
According to Savage-Rumbaugh, Panbanisha had never before put these three symbols together; moreover, the message was manifestly not motivated by a desire for a food reward, and was initiated by the chimpanzee. Notice, however, that Panbanisha was employing the same strategy as Nim Chimpsky in her longer utterances: repetition of a few signs, with no attempt to elaborate or reformulate the message.

Evaluation of apes’ language abilities

Investigators disagree on whether use of signs by apes is comparable with human language. Nevertheless, differences are manifest, and it cannot reasonably be claimed that the systems that apes have learnt show all of the features of human languages. No ape has been demonstrated to actively use anything like the many thousands of signs that the average speaker of any human language controls, including ASL; and the utterances produced by apes, as we have seen, tend to be short and, if longer than a few symbols, highly repetitious. Whether the combinations of symbols follow a grammatical system is uncertain. The communication systems taught to apes fail to satisfy two of Hockett’s six design features discussed in §1.3. No study has convincingly demonstrated that the signs show either duality of
patterning or reflexivity. However, the other four design features are satisfied to some degree at least: the signs are arbitrary; at least some degree of displacement is evident (e.g. in the case of Panbanisha’s communication about Austin); the systems are culturally transmitted and learned; and there is indication (not uncontested) of some degree of productivity. It also seems that chimpanzees are more prone to interrupt utterances by their teachers than usually occurs in human conversational exchanges. Moreover, chimpanzees rarely initiate communicative acts, though they are not incapable of doing so. In an experiment described in Menzel (1999), a female chimpanzee called Panzee observed, over a number of trials, an experimenter hide an item (e.g. food) in the trees outside her enclosure. After delays of up to 16 hours she could interact with a human who did not know about the hidden object. From the beginning of the experiment Panzee did whatever was necessary to gain the person’s attention, used her keyboards to indicate the type of object hidden, and manual pointing to indicate the location of the object. It cannot be concluded from either studies of natural animal communication systems or attempts to teach apes and other animals to use human language that the evidence favours the evolution of human language from animal communication systems. Nor does it favour the evolution of our language production and comprehension abilities from the general cognitive abilities of our ancestors. Nevertheless, as observed at the beginning of §10.1, this lack of evidence does not argue either for a non-evolutionary scenario, or for a separate language module in the mind. Indeed, overall it seems that the language abilities of human beings differ from the abilities of animals mostly in degree rather than kind. 
The evidence suggests that some of the cognitive mechanisms involved in speech comprehension and production may have been in place prior to the emergence of human language. Perhaps the apparent qualitative differences between human language and animal communication systems are the result of the piling-up of quantitative differences. This brings us naturally to our next topic, the origin and evolutionary development of human language.

10.3 Origins and evolution of human language

Our unique ability to speak has inspired wonder, and explanations for it have been put forward since time immemorial. Many religions have myths accounting for language origins and/or diversification. Often a divine source is invoked. According to the Judaeo-Christian tradition, God gave Adam the power to name things; the Tower of Babel story accounts for the subsequent diversity of languages. Babylonians attributed language to the god Nabu, Egyptians to the god Thoth, and Hindus to Sarasvati, wife of Brahma. According to some Australian Aboriginal societies, languages were implanted in particular tracts of country by mythical beings during the Dreamtime, a formative stage in which the world came to be as it is.

The origin and evolution of human language is also of interest to science. We cannot, of course, observe the evolution of human language, and no records remain of the communicative systems used by our ancestors until the advent of writing (see Chapter 14). This means that we are restricted to the interpretation of other, indirect observational evidence. This does not make the study of
origins and evolution of language unscientific: many fields of scientific endeavour are restricted to the interpretation of indirect evidence. One type of indirect evidence that can be brought to bear on the topic comes from feral children, children who grow up in virtual social isolation, in circumstances in which they have been exposed to little or no language. One of the most recent cases is Genie, discovered in California in 1970 at the age of 14. Genie had been confined to a small room and had experienced only minimal human contact from the age of 18 months. Like other feral children, Genie spoke no language when discovered, nor did she subsequently learn English fully. Genie did, however, learn a relatively large vocabulary, though her syntax remained quite simple. She apparently went through many of the same early stages of language learning that children normally go through (see §12.1).

Introductory textbooks often refer to a report by the Greek historian Herodotus that around 600 BC the Egyptian pharaoh Psammetichus segregated two newborn infants in an isolated mountain hut with a shepherd who was instructed to allow no one to speak in their presence. The pharaoh’s idea was allegedly that in isolation from linguistic input they would speak the original human language. The first word they produced was reported to have been bekos. This was discovered to be the word for ‘bread’ in Phrygian, a now extinct Indo-European language spoken in the north-west of modern Turkey. Thomas (2007) shows that this is a mythical interpretation of what Herodotus wrote that has no foundation in the text itself, which does not suggest that Psammetichus’ goal was to determine the original human language by experiment. Accounts of others (including King James IV of Scotland) doing similar experiments to determine the original human language also appear to be tall tales.

Nineteenth-century theories of language origins

The nineteenth-century linguist Max Müller suggested a famous classification of theories of language origins, distinguishing the ‘la-la’, the ‘bow-wow’, the ‘ding-dong’, the ‘pooh-pooh’ and the ‘yo-heave-ho’ theories. The ‘la-la’ (or ‘sing-song’) theory sees the origins of human language in a communication system resembling bird song (see §10.1). The Danish linguist Otto Jespersen favoured this theory, presenting an idyllic Rousseauan view of humankind’s origins:

The genesis of language is not to be sought in the prosaic, but in the poetic side of life; the source of speech is not gloomy seriousness, but merry play and youthful hilarity . . . In primitive speech I hear the laughing cries of exultation when lads and lassies vied with one another to attract the attention of the other sex, when everybody sang his merriest and danced his bravest to lure a pair of eyes to throw admiring glances in his direction. Language was born in the courting days of mankind. (Jespersen 1922: 433–4)

The ‘bow-wow’ theory proposes that human language began with mimicry of natural sounds of the environment. A bird’s or animal’s call might be imitated, and this imitation then used to refer to the creature; or a noise might be imitated and used as a verb to denote an event associated with the noise – for example, words like splash, bang and crash might have begun as ideophones representing sounds (see box on pp. 87–8), then later come to be used as verbs referring to events making those noises. According to this view, language origins lie in iconic rather than arbitrary signs.

The ‘ding-dong’ theory also holds that language originated in natural connections between meanings and sounds. These could be iconic connections, as in the imitation of physical sounds. Alternatively they might be indexical connections, as in the case of mama for ‘mother’, supposedly deriving from the sound made by a baby as its lips approach its mother’s breast.

According to the ‘pooh-pooh’ theory language originated in natural cries of emotion such as anger or pain, as when someone utters yow or ouch in pain, or yuck as an expression of distaste. Darwin championed this theory.

The ‘yo-heave-ho’ theory proposes that the sounds uttered by persons when engaged in strenuous physical exertion provide the source of earliest language. The grunts and groans that are naturally emitted in circumstances of exertion might then have taken on other meanings or senses in social contexts, perhaps being interpreted by hearers as requests for assistance.

While the ‘bow-wow’, ‘ding-dong’, ‘pooh-pooh’ and ‘yo-heave-ho’ theories may account for some words in human languages, especially interjections and onomatopoeic lexemes, it is difficult to see how they can explain much more. It is not clear why or how morphology or syntax arose at all. As for the ‘sing-song’ theory, why would we have anything but unanalysable songs (holophrases) used in a delimited range of circumstances? Why didn’t these songs remain the domain of one of the sexes, as in the case of most bird species? What drove the emergence of utterances analysable into components on both levels, form and meaning – i.e. duality of patterning?

More recent theories of language origins

In 1866, the Linguistic Society of Paris imposed a ban on papers on the origins of human language, a restriction it reaffirmed in 1911. Until fairly recently linguists have by and large supported the sentiments of the ban on the grounds that investigations of origins can only be speculative and, in the absence of technology for time travel, unverifiable. Anthropologists, archaeologists, geneticists, psychologists, neurobiologists and others have been more daring, and have not shied away from speculation informed by findings in their disciplines. It is only in the last couple of decades that linguists themselves have turned attention again to the question of origins, and the domain has become accepted as a field of investigation. Even today, however, many linguists consider it to be too speculative to permit investigation by scientific means.

While it is true that one can only speculate, speculation does play an important role in science. And serious speculations will perforce be constrained and informed by the fields within which the investigator works. Moreover, if there is convergence in the speculations and findings across different fields one may feel more confident about a speculation. The area provides a good domain for interdisciplinary research, provided one enters it with an open mind, and does not adopt the rhetorical stance of some linguists who stipulate that only linguists have the warrant to make statements about language origins – for surely this is an area where we cannot rely on a single discipline.

In the following subsections I outline with a very broad brush a few of what seem to me to be the more interesting recent proposals about language origins. No attempt is made to be comprehensive: there are far too many theories to mention in an introductory survey; some are too complex to summarize in a few paragraphs, and have been left out for that reason. Nor do I attempt to be critical – all the proposals are based on circumstantial evidence, and can be fairly easily critiqued on the grounds that they leave unexplained a raft of known facts about the structure and/or functions of human language. In other words, at best they might account for the emergence of a communicative system of complexity less than that of human language; all take recourse to much hand-waving.

Gestural origins

One popular notion, with a long history, is that human language has a gestural origin, that it originated in bodily gestures that were later transferred to the vocal medium. Our ancestors such as the australopithecines may have communicated with bodily signs before their vocal tracts were capable of speech. One attraction of this idea is that apes have intentional control of manual gestures but not of vocalizations (see §10.2), and the same was presumably true of our common ancestor, and likely also of some of the descendant hominid species. Following Max Müller’s lead, we might refer to this theory as ‘noddy’.

One problem with ‘noddy’ is how to account for the switch from manual gestures to vocal gestures. In fact, however, as Michael Arbib has observed, it is not really that a switch occurred in the development of human language from the manual to the vocal medium; rather, the relative load of the latter increased. Facial and manual gestures arguably form with speech a single multimodal communication system, as also argued by gesture theorists such as David McNeill (see §13.1). Perhaps over a long period of time manual gestures became increasingly accompanied by vocalizations that may have begun with grunts; this process may have continued until the point was reached where the balance shifted from primacy of the visual to primacy of the vocal. This shift could have been sustained and enhanced by practical advantages such as the possibility of using vocalizations in the dark, and the freeing up of the hands for other tasks, thus allowing one to carry out manual activities at the same time as speaking. But it would have necessitated biological changes to both the vocal organs and the brain.

The advent of bipedalism in our lineage some five or six million years ago was, according to some (e.g. Corballis 2003, 2012; Lieberman 2003), a crucial first step in these biological changes. Alternatively, a genetic mutation that occurred some time in the last 100,000 years may have been responsible (Corballis 2003). Arbib (2003, 2011, 2012) also suggests that biological evolution led to a language-ready brain, a key development being the evolution of the system of mirror neurons that link production and perception of motor acts of grasping. Intriguingly, these are found in a region of the cortex of a monkey’s brain that is considered to be the analogue of Broca’s area in humans.

The grooming hypothesis

Robin Dunbar (1996, 2010, 2012) proposes the grooming hypothesis – the ‘yackety-yack’ theory – which assigns primacy to the interpersonal and social dimensions in the emergence of language. He observes that grooming is the favoured mechanism among primates for bonding social groups. However, human groups tend to be too large – of the order of 150 members – for grooming to be viable. Individuals would need to spend about 40 per cent of their waking day grooming; given that this is time during which they could do little else, it is far too much to be practicable. (The highest proportion of time observed among any living primate is half of this, among Gelada baboons.) Speech provides a means of grooming at a distance; it can also be done while engaged in other activities, and is not restricted to pairs – multiple partners can be groomed simultaneously. As Dunbar observes, much of our everyday use of language is in gossip, which can be seen as an investment in the verbal servicing of social relationships. The social character of language is further supported by the preferred topics of natural conversation: social topics, he avers, make up around two thirds of conversation time, whereas instrumental topics (work, tool manufacture and use, etc.) make up only 10–20 per cent.

Language as a genetic predisposition

Everyone agrees that we are genetically adapted for speech: although both the baby and the rattle are exposed to the same linguistic input, only the baby acquires language. The human brain and/or mind cannot be a tabula rasa. What investigators disagree on is the extent of this genetic endowment, whether the minimal view that our genes give us a language-ready brain or the maximal view that we have a genetic blueprint for language – that language is genetically encoded.

Two divergent opinions are held by those who maintain the maximal position, which we’ll call the ‘just genes’ theory. On the one hand, there are those who, like Noam Chomsky, suggest that language arose in one unique and isolated biological event, as the result of a single genetic mutation and not by the normal evolutionary process of natural selection (Chomsky 1986) – the ‘Oops!’ theory. On the other hand, there are investigators like Steven Pinker who argue that language is a biological adaptation that evolved in the human species via the normal evolutionary process of natural selection. Language is, according to Pinker’s story – the ‘chatting-up’ theory⁴ – an innate specialization that evolved for the encoding of propositional information in a form that permits it to be conveyed from one individual to another.

If the ‘Oops!’ theory is correct, a single gene might be responsible for language. A possible candidate for this is the FOXP2 gene, the first gene to be shown to be relevant to language. A mutation in this gene was shown by geneticists in 2001 to be associated with a type of language disorder – called Specific Language Impairment (SLI) – characterized by articulation difficulties and grammatical impairments. However, it seems increasingly unlikely that a single gene or genetic mutation could be responsible for language.⁵ Language is not completely wiped out in those individuals showing the mutation in FOXP2, and other genes have been shown to be associated with SLI. Furthermore, an investigation by a team of geneticists into the distributions of the FOXP2 gene across a range of animal and human populations revealed that the most likely scenario is that the gene has been the target of selection during recent human evolution (Enard et al. 2002).

Language and social cognition

Many investigators now consider that the last and perhaps most significant steps in the evolution of language – in particular the development of complex syntax – were cultural rather than biological. These investigators focus attention on the social and cultural environment in which language arose. According to this approach, there was no specific biological adaptation for human language in the shape we find it today. Rather, we reached a stage of having a brain that was ready for language, before we had language. Language subsequently evolved in a cultural, not biological, setting. The emergence of language in human beings is thus in some ways comparable with the emergence of writing, which is known to have arisen in certain cultural settings and clearly cannot be coded in the human genome.

Michael Tomasello has proposed in a range of publications (e.g. 1999, 2003b, 2008) that a crucial aspect of this was the emergence of a type of social cognition that enabled the development of human culture, and human symbolic communication within it. Crucial in this was the evolution of the ability to recognize other individuals as intentional agents who one can share attention with. According to this ‘looky-look’ theory, the capacity for joint attention is crucial to the development of sharing of experience – and thus information – as well as collaborative action. Fully modern human languages developed, according to ‘looky-look’, via processes of grammaticalization (roughly, the emergence of grammatical elements from lexical elements – see §16.5) operating over periods of millennia on the grammatically less complex communicative systems that arose in the biological evolution of our species.

Concluding remarks on language evolution

It seems fair to say that the majority viewpoint currently tends towards the notion that our genetic make-up permits us to learn language, rather than that language is encoded in our genome. To use a computer analogy, our genes gave us the necessary biological hardware, but not the software, which emerged more recently in the human cultural context, after the biological machinery was already in place. Many investigators now situate the final steps in the evolution of language in human culture, in the interpersonal context. It is also widely accepted that unanalysable and independent symbols emerged first; syntax came much later.

Although investigations of the communicative systems and abilities of animals do not unassailably support the gradual evolution of language from other communication systems, it cannot be doubted that the majority of fundamental biological components and processes involved in vocal production and perception are shared with animals. They are modifications of existing features rather than entirely novel. Given this, it is difficult to disagree with those who hold that comparative investigation of non-human communicative systems is likely to provide a fruitful perspective on the evolution of human language.

Another point of widespread (though not universal) consensus is that investigation of the evolution of language is not the prerogative of linguistics, but is best approached from many different disciplinary perspectives. We have mentioned anthropology, archaeology, psychology, genetics and neurobiology. Computer and mathematical modelling have recently come to prominence as means of testing theories, especially where multiple factors are involved.

Summing up

Many animal species have natural systems of communication employing bodily gestures and vocalizations to express emotional states, to warn conspecifics of dangers, to demarcate territorial boundaries, and to attract mates. These systems do not satisfy all of the design features of human languages. This does not mean that other animals are incapable of producing or comprehending human language, and many attempts have been made to teach human language to other species. The most successful have focused on apes. The systems apes have learnt show some of the design features of human languages, at least to some degree; duality of patterning and reflexivity are conspicuously absent, however.

Studies of natural communication systems of animals and attempts to teach animals human language do not argue strongly either for or against the evolution of language from animal communication systems. However, some cognitive mechanisms essential to language appear to have been in place prior to the evolution of modern humans.

Speculations on the origins and diversification of human language can be traced back to mythology. The nineteenth century saw the emergence of many theories, including ‘bow-wow’, ‘ding-dong’, ‘pooh-pooh’, ‘yo-heave-ho’ and ‘la-la’. Recent years have seen the emergence of more sophisticated theories. According to ‘noddy’, language has its origins in gestures; ‘yackety-yack’ suggests that language emerged to facilitate gossip and is a replacement for manual grooming. ‘Just genes’ proposes that language is genetically encoded. According to one variant, ‘chatting-up’, language evolved by the normal evolutionary processes of natural selection; an alternative variant, ‘oops’, maintains that language emerged as an accidental genetic mutation. Specific Language Impairment (SLI) is associated with a mutation of the FOXP2 gene, which was for a time heralded as the ‘language gene’. However, recent evidence indicates that neither SLI nor the FOXP2 gene is specific to language.

A clutch of recent theories consider the final steps in the emergence of human language to have been cultural: biological evolution gave us a language-ready brain, but language arose in a sociocultural setting. One such theory is ‘looky-look’, which argues that joint attention was a critical development.

Guide to further reading

Animal communication systems are surveyed in Bright (1984), Morton and Page (1992) and Rogers and Kaplan (2000). The communicative dances of honeybees are nicely described in Karl von Frisch’s 1973 Nobel lecture (Frisch 1992/1973, available online at https://www.nobelprize.org/prizes/medicine/1973/frisch/lecture/). Chapter 5 of Chittka (2022) provides some updates and an evolutionary perspective. Vocal communication of birds is described in Kroodsma et al. (1982). Cheney and Seyfarth (1990) deals with communication systems of monkeys; examples of vervet monkey vocalizations can be found on YouTube. Goodall (1986) treats chimpanzee communication.

Radick (2007) is a comprehensive discussion of investigations of the communication systems of primates and their scientific relevance. Attempts to teach language to chimpanzees are described in Hayes (1951), Gardner and Gardner (1971), Premack and Premack (1993), Savage-Rumbaugh and Lewin (1994) and Terrace (1979). For a critical overview, see Sebeok and Umiker-Sebeok (1980). On the Clever Hans effect, see Sebeok and Rosenthal (1980). Genie’s learning of English is described in Curtiss (1977), and Curtiss et al. (1974) (reprinted in Lust and Foley 2004); see also Rymer (1994).

For a brief survey of some of the main approaches to the evolution of human language, see Carstairs-McCarthy (2017). Older but still good overviews of the field from the perspective of linguistics are Aitchison (1996), and Chapter 2 of Foley (1997). The classic work dealing with biological aspects of language evolution is Lenneberg (1967); this book is now dated, and many of the ideas it presents have been challenged. Fitch (2000) is a more recent overview; also interesting is Lieberman (2000). Tomasello (2008) provides a readable account of his ideas on the social origins of human language. Tallerman and Gibson (2012) presents a wide range of approaches in sixty-five short articles by specialists; I recommend it to anyone who seriously wants to find out about current ideas on language evolution. Other recent edited collections are Botha and Knight (2009) and Bannan (2012). Johansson (2005) identifies a number of facts that must be taken into account in any viable theory of the evolution of language.

Issues for further thought and exercises

1 It is sometimes suggested that linguists’ attempts to show that other animals’ communication systems are not human languages reflect neurotic desires to prove that human beings are superior to other animals. Do you think this is a valid criticism? Why or why not? (Why don’t linguists concern themselves with the proposition that barking is restricted to dogs, or meowing to cats, for instance?)

2 The involuntary erection of hair and feathers was classified as an indexical sign in §10.1. Can you explain why in more detail? It is also possible to regard it as an iconic sign. Explain how. Use your explanation to suggest an evolutionary account of the development of this involuntary action.

3 To what extent do the systems of bodily signs discussed in the section ‘Commonalities of signs in communication systems of humans and animals’ in §10.1 satisfy the properties of human language? Evaluate them in relation to Hockett’s design features.

4 To what extent are the signs of the bee’s dance arbitrary or otherwise? What aspects are arbitrary, iconic and indexical?

5 Evaluate the animal communication systems described in the section ‘Natural communication systems of some animal species’ in §10.1, in terms of the full set of Hockett’s design features. Tabulate your findings and discuss whether the differences from human language are a matter of degree or kind.

6 Review the notion of duality of patterning, and explain in a few sentences what it means. Do you think that duality of patterning is a useful design feature for all communicative systems? If not, when – under what conditions – do you think that this becomes a relevant consideration? Explain your reasons.

7 The table below shows Nim Chimpsky’s twenty-five most frequent two- and three-sign combinations, from Terrace et al. (1979: 894). (Reprinted with permission from AAAS.) What are the relative frequencies of two- and three-sign combinations? Two types of repetition are illustrated in these combinations. What are they, and what is their frequency? Calculate the frequency both in terms of the combination types and their tokens, and in relation to the length of the combination. Is repetition more frequent in three-sign combinations than in two-sign combinations? What other generalizations can you make about the utterances listed? What other information would you like to know about these combinations if you were going to write a description of the grammar of Nim’s utterances?

Two-sign combinations    Frequency    Three-sign combinations    Frequency
play me                  375          play me Nim                81
me Nim                   328          eat me Nim                 48
tickle me                316          eat Nim eat                46
eat Nim                  302          tickle me Nim              44
more eat                 287          grape eat Nim              37
me eat                   237          banana Nim eat             33
Nim eat                  209          Nim me eat                 27
finish hug               187          banana eat Nim             26
drink Nim                143          eat me eat                 22
more tickle              136          me Nim eat                 21
sorry hug                123          hug me Nim                 20
tickle Nim               107          yoghurt Nim eat            20
hug Nim                  106          me more eat                19
more drink               99           more eat Nim               19
eat drink                98           finish hug Nim             18
banana me                97           banana me eat              17
Nim me                   89           Nim eat Nim                17
sweet Nim                85           tickle me tickle           17
me play                  81           apple me eat               15
gum eat                  79           eat Nim me                 15
tea drink                77           give me eat                15
grape eat                74           nut Nim nut                15
hug me                   74           drink me Nim               14
banana Nim               73           hug me hug                 14
in pants                 70           sweet Nim sweet            14
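A short script is one way to begin the tabulation that exercise 7 asks for. The Python sketch below is illustrative only. It assumes the frequencies listed in the table, and it operationalizes just one possible kind of repetition (a sign occurring more than once within a single combination); that definition is an assumption made here, not one supplied by Terrace et al. The names TWO_SIGN, THREE_SIGN, has_repeated_sign and summarize are hypothetical.

```python
# Frequencies of Nim's 25 most frequent combinations, as listed in the table.
TWO_SIGN = {
    "play me": 375, "me Nim": 328, "tickle me": 316, "eat Nim": 302,
    "more eat": 287, "me eat": 237, "Nim eat": 209, "finish hug": 187,
    "drink Nim": 143, "more tickle": 136, "sorry hug": 123, "tickle Nim": 107,
    "hug Nim": 106, "more drink": 99, "eat drink": 98, "banana me": 97,
    "Nim me": 89, "sweet Nim": 85, "me play": 81, "gum eat": 79,
    "tea drink": 77, "grape eat": 74, "hug me": 74, "banana Nim": 73,
    "in pants": 70,
}
THREE_SIGN = {
    "play me Nim": 81, "eat me Nim": 48, "eat Nim eat": 46,
    "tickle me Nim": 44, "grape eat Nim": 37, "banana Nim eat": 33,
    "Nim me eat": 27, "banana eat Nim": 26, "eat me eat": 22,
    "me Nim eat": 21, "hug me Nim": 20, "yoghurt Nim eat": 20,
    "me more eat": 19, "more eat Nim": 19, "finish hug Nim": 18,
    "banana me eat": 17, "Nim eat Nim": 17, "tickle me tickle": 17,
    "apple me eat": 15, "eat Nim me": 15, "give me eat": 15,
    "nut Nim nut": 15, "drink me Nim": 14, "hug me hug": 14,
    "sweet Nim sweet": 14,
}


def has_repeated_sign(combination):
    """True if any sign occurs more than once within the combination."""
    signs = combination.split()
    return len(set(signs)) < len(signs)


def summarize(table):
    """Count combination types and tokens, overall and with a repeated sign."""
    repeated = {c: f for c, f in table.items() if has_repeated_sign(c)}
    return {
        "types": len(table),
        "tokens": sum(table.values()),
        "repeated_types": len(repeated),
        "repeated_tokens": sum(repeated.values()),
    }


two = summarize(TWO_SIGN)
three = summarize(THREE_SIGN)
for label, summary in (("two-sign", two), ("three-sign", three)):
    share = summary["repeated_tokens"] / summary["tokens"]
    print(label, summary, f"repeated share of tokens: {share:.1%}")
```

Dividing repeated_tokens by tokens gives the proportion of tokens that show this kind of repetition at each combination length. Any other kind of repetition you identify would need its own test in place of has_repeated_sign.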

8 Find out about attempts to teach a system of signs to dolphins. (Some references are Evans and Bastian 1969; Herman 1980: 178–80; Herman et al. 1984; and Richards et al. 1984. There are also many descriptions on the internet.) Write a brief description of one attempt, and discuss the extent to which the animal appears to have acquired a system comparable to human language. Compare the dolphin’s ability to use the sign system with that of chimpanzees.

9 What are some possible motivations for replacing signs of American Sign Language by signs made up of plastic tokens and computer keyboards? To what extent do such systems resemble human languages, whether sign languages, speech, or writing? Do the differences render comparisons with human language spurious or difficult to interpret? (For instance, does the fact that the entire system of symbols is simultaneously visible facilitate production; does it limit creativity?)

10 Which notion do you favour, the idea that language is genetically encoded, or that our genetic make-up permits language, but does not determine it? Identify and discuss evidence for and against your preference.

11 We mentioned in various places in the text bonobos (sometimes called pygmy chimpanzees). What are they? Why do many researchers of primate communication use bonobos in preference to chimpanzees?

Research project

A number of theories of the evolution of human language adopt the view that modern human languages developed from a somewhat simpler proto-language (not to be confused with the reconstructed precursors of modern languages discussed in Chapter 17), which evolved over time to show the complexities of modern languages. What are some theories that adopt this view? What are some theories that do not assume it? Discuss the evidence for and against the idea of a proto-language. Discuss also the processes that have been proposed for the development of fully modern languages from proto-language.

11 Psycholinguistics: Language, the Mind and the Brain

In this chapter we develop a second take on language as a human phenomenon, and consider it from the perspective of the individual user. Our focus is on human beings as speakers and hearers, and the psychological and cognitive attributes that on the one hand are requirements for our possession of language, and on the other hand permit us to engage effortlessly and rapidly in the production and comprehension of speech. Another important set of concerns are the relations between language and other cognitive phenomena, the question of where language fits within our cognitive system.

Chapter contents
Goals
Key terms
11.1 Language and cognition
11.2 Language processing
11.3 Language and the brain
Summing up
Guide to further reading
Issues for further thought and exercises
Research project

Goals

The goals of the chapter are to:
● explore the relation between language and thought, and discuss the Sapir-Whorf hypothesis, the hypothesis that the structure of the language we speak influences how we think about the world;
● discuss some modern revisions to the Sapir-Whorf hypothesis;
● present some fundamental facts about speech production and perception, and comment on what they indicate about the mental organization of language;
● introduce some important experimental methods used in studying speech processing;
● overview the basic physiology of the brain;
● introduce the main questions in the study of neurolinguistics; and
● outline some of the main methods of investigating neurolinguistics.

Key terms

anomic aphasia
aphasia
arcuate fasciculus
Broca’s area
Broca’s aphasia
categorical perception
cerebral cortex
conduction aphasia
contralateral control
dichotic listening test
electroencephalograms (EEGs)
exchange errors
functional magnetic resonance imaging (fMRI)
garden path sentences
global aphasia
lateralization
lexical lookup
localization
magnetoencephalograms (MEGs)
neurolinguistics
neuron
positron emission tomography (PET)
psycholinguistics
Sapir-Whorf hypothesis
slips of the tongue
split-brain patients
spoonerisms
subtraction paradigm
Wada test
Wernicke’s area
Wernicke’s aphasia

11.1 Language and cognition

We begin with the relation between language and other forms of cognition. On this hotly debated issue there is as yet no consensus among linguists or psycholinguists. At one extreme is the notion that language forms a distinct module separate from other cognitive processes; this view tends to be associated with linguists and psychologists working within formal theories of language, such as Noam Chomsky, Jerry Fodor and Steven Pinker. At the other extreme is the idea that there is no distinction between the cognitive processes employed in language and those employed in other domains of thought; this view is associated with investigators working within many functionally oriented paradigms, including Ronald Langacker, George Lakoff and Talmy Givón. We will not enter this debate, but merely comment that the balance of evidence seems to favour an intermediate position: a degree of separateness, along with commonalities with other cognitive phenomena.

Language and thought: the Sapir-Whorf hypothesis

We discuss instead an even more vexed question that has engaged scholars at least since the ancient Greeks: is there a relationship between the language one speaks and the way one thinks about and conceptualizes the world? One highly influential idea holds that the answer is in the affirmative: the structure of the language we speak does correlate with the way we think. Proponents include Wilhelm von Humboldt (1767–1835), Franz Boas (1858–1942), Edward Sapir (1884–1939) and Benjamin Lee Whorf (1897–1941). It is now generally referred to as the Sapir-Whorf hypothesis, often just the Whorfian hypothesis.

The Sapir-Whorf hypothesis can be separated into two components. The first is the principle of linguistic relativism, according to which lexical and/or grammatical differences between languages correlate with non-linguistic cognitive differences. For instance, the existence of a number of terms in a language for objects from a conceptual domain – say ‘mound’, ‘ridge’, ‘hill’, ‘mesa’, ‘plateau’, ‘cape’ and ‘mountain’ – would correlate with habitually thinking about these geographical projections as different. If a single term is used the range of objects will tend to be regarded as members of a single category. The principle of relativism holds that language and habitual modes of thought are correlated; it does not presume a causal relation between them.

The second aspect of the Sapir-Whorf hypothesis is the stronger principle of linguistic determinism, the notion that differences in the lexical and grammatical systems of languages cause differences in cognitive styles of their speakers, in their habitual ways of thinking. According to this view the presence of different lexemes ‘mound’, ‘ridge’, ‘hill’, ‘mesa’, ‘plateau’, ‘cape’ and ‘mountain’ in a language would imply a different conceptualization of the world to that found among speakers of a language where the lexical distinctions were not drawn.
Whorf is usually understood to have advocated a strong version of linguistic determinism, though his stance was often equivocal. His thinking was more sophisticated than simple examples like the above might suggest. He considered that it was not only lexical features that are relevant, but, more importantly, grammatical structures. Thus he contrasted the linear notion of time shared by speakers of English and other “Standard Average European” (SAE) languages, in which time progresses ever onwards into the future, with a cyclic view of time he attributed to speakers of Hopi (Uto-Aztecan, USA). An aspect of this difference, Whorf averred, was related to the presence of tenses in English, which is consistent with a timeline extending indefinitely into the future, and the absence of tenses in Hopi, which is consistent with a cyclical view of time. (His analysis of Hopi as a tenseless language has been criticized – albeit not very convincingly: tense is not a universal grammatical category – by later investigators, notably Malotki 1983.)

This single difference between Hopi and English is not telling, and Whorf sought not just single isolated lexical or grammatical features, but sets of interlocking linguistic phenomena. In the case of the Hopi notion of time, he linked the absence of tenses with other facts about the language, including expressions used for quantifying time. Rather than measuring by numbers of units such as days, Hopi speakers (he claims) use expressions like ‘the fourth day’. The difficulty here is that it is not obvious how either linguistic feature – absence of tenses and use of ordinal expressions in the quantification of time – would imply or induce a notion of cyclical time. And putting them together does nothing to strengthen the case. On the other hand, why doesn’t the presence of lexical cycles of hours of the day, days of the week and months of the year conflict with the alleged SAE linear notion of time?

Revisions to the Sapir-Whorf hypothesis

The Sapir-Whorf hypothesis was subjected to intensive empirical testing by linguists, anthropologists and psychologists in the 1950s and 60s. One domain that was tested early on was colour, since languages were known to differ in the range of lexical distinctions they make. A classic study by Eleanor Heider investigated colour perception among the speakers of Dani (Papuan, Papua), which distinguishes just two colour terms (Heider 1972). The investigation revealed that Dani speakers could distinguish colours not distinguished lexically, and recognized focal colours – the shades considered to be the most typical of the colour in a language that has a lexical item for it – better than non-focal colours.1 This argues against an extreme determinism, that the structure of one’s language determines perception. It does not, however, refute a weaker version that language may affect the ease of distinguishing and remembering colours. And indeed, the weaker version has been supported by some empirical findings. One experimental study (Kay and Kempton 1984) involved speakers of English and Tarahumara (Uto-Aztecan, Mexico), a language that does not have separate terms for ‘blue’ and ‘green’. Three colour chips were presented to each participant on each trial, from which the participant was to pick the odd one out. In some trials two chips were quite close in actual physical colour (i.e. wavelength of reflected light), but would be classified as blue and green by speakers of English; a third chip was a focal green, but more distant in physical colour from the other green than that green was from the blue. Whereas the Tarahumara speakers chose as the odd one the chip that was most different in physical colour, speakers of English selected the one that would be labelled blue.
Following something of a lull from the early 1970s, the Sapir-Whorf hypothesis has recently made a comeback, and reappeared in new guises, stimulating some intriguing new research. One revision that has yielded interesting results is encapsulated in Dan Slobin’s aphorism thinking for speaking: the nature of the language we speak influences the way we think for speaking (Slobin 1996a). The focus is on the dynamic processes of thinking rather than on ‘thought’ in the abstract. Slobin examined stories told by children describing drawings in a wordless picture book, and argued that speakers of different languages attend to different aspects of the depicted situations in constructing their stories. They are forced to do so by lexical and grammatical features of their

language. Thus, speakers of English attend to whether an event is in progress, and pay a good deal of attention to manner of motion; speakers of Spanish, by contrast, pay more attention to the paths of motion and whether an event is completed or not. This is a consequence of differences between English and Spanish in the grammatical categories distinguished in verbs and in the semantic structure of motion verbs: in English verbs of motion tend to incorporate components of manner (as in e.g. crawl, waddle, shuffle), whereas in Spanish they tend to incorporate components of path (as in entrar ‘enter’, salir ‘leave’, bajar ‘go down’). If the thinking for speaking version of the Sapir-Whorf hypothesis can be sustained, one could then look further, and examine whether there are extensions to other cognitive domains. Research by Stephen Levinson and John Haviland over the past two decades has advanced this style of argument for spatial terms in some languages: that the spatial distinctions and categories speakers need to attend to for speaking are carried over to other domains of cognition. The Pama-Nyungan language Guugu Yimithirr uses the cardinal terms ‘north’, ‘south’, ‘east’ and ‘west’ to specify relative locations of objects and directions of travel rather than body-centred terms ‘left’ and ‘right’. The system of cardinals is used even for identifying parts of the body: if I was facing north, I would refer to an itch in my west ear rather than my left ear. To speak Guugu Yimithirr as a native speaker requires that you pay constant attention to the cardinal directions. Levinson (1997) argues that this extends beyond thinking for speaking, and that Guugu Yimithirr speakers carry this type of thinking over to spatial behaviour generally. For example, suppose you are facing a table on which three objects have been laid out, as shown in Figure 11.1. 
You are asked to turn around 180° to face the other table, and place the three objects in the same arrangement on this table. How do you place them? Most speakers of English and Dutch orient the objects on the second table according to a self-centred system, so that the doughnut goes in the centre, with the star to the right, and the pencil to the left of it. But speakers of Guugu Yimithirr usually place the objects so they maintain their absolute cardinal orientations. So if you were facing west to begin with, the star will be placed to the north on the second table, the doughnut in the middle, and the pencil to the south. Speakers of Guugu Yimithirr and English (or Dutch) tend to place the star on opposite sides of the doughnut. It seems that the cognitive system employed for speaking about space also influences thinking about space for other types of behaviour.

Figure 11.1 A version of Levinson’s spatial arrangement experiment.
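The logic of the arrangement task can be made concrete with a small sketch. The Python below is purely illustrative and is not part of Levinson’s materials: the function names, and the encoding of a table as a left-to-right row seen by the participant, are assumptions made here. It suggests why a body-centred copy and a cardinal copy of the same row come out as mirror images once the participant has turned through 180°.

```python
# Compass points in clockwise order, so that +1 is a quarter turn to the right.
COMPASS = ["north", "east", "south", "west"]


def side(facing, quarter_turns):
    """Compass point reached by turning clockwise from `facing`."""
    return COMPASS[(COMPASS.index(facing) + quarter_turns) % 4]


def to_cardinal(row, facing):
    """Describe a left-to-right row of three objects by compass position."""
    left, centre, right = row
    return {side(facing, -1): left, "centre": centre, side(facing, 1): right}


def as_seen(layout, facing):
    """Left-to-right row that a viewer facing `facing` sees for a layout."""
    return [layout[side(facing, -1)], layout["centre"], layout[side(facing, 1)]]


def body_centred_copy(row):
    """Relative strategy: keep each object on the same side of the body."""
    return list(row)


def cardinal_copy(row, facing_before, facing_after):
    """Absolute strategy: keep each object at the same compass position."""
    return as_seen(to_cardinal(row, facing_before), facing_after)


first_table = ["pencil", "doughnut", "star"]   # as seen when facing west
before, after = "west", side("west", 2)        # turn around: now facing east

print(body_centred_copy(first_table))
print(cardinal_copy(first_table, before, after))
```

Under these assumptions the body-centred copy keeps the star on the participant’s right, whereas the cardinal copy keeps it to the north, which is now on the participant’s left.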

11.2 Language processing

Imagine a communication system with a fixed set of symbols and a fixed set of meanings that they can convey, such as a system of semaphore flags signalling messages about a limited range of movements of a vehicle. Production and understanding of signs within such a system by a human operator might be a simple process of linking existing meanings with pre-determined flags or flag positions. Such a psychological process could not work for speech production or comprehension, where the range of possible meanings is not laid out in advance: speakers make new meanings that have never been made before, occasionally using new forms not previously used or not yet conventionalized. Speakers must have mental models that permit them to construct and interpret novel forms, as well as to assign appropriate meanings to them. For language processing, therefore, human beings must have not just an internalized lexicon, but also an internalized grammar of their language, that they access in production and comprehension. A major concern of psycholinguistics is to develop models of these mental grammars and lexicons, and the psychological processes by which they can be accessed in speech production and comprehension. In this section we outline some of the basic features of speech processing that must be accounted for in any model.

Comprehension

Perception of speech sounds

Four difficulties in perception of speech sounds

A crucial component of the comprehension of speech is processing the sounds that reach the hearer’s ears. This is no trivial task. Recall from §2.1 and Figure 2.1 that speech sounds form a continuous stream, rather than a sequence of discrete sounds; the boundaries between the sounds are indistinct, as are the boundaries between words. True, there are sometimes indications of boundaries in the speech signal. Words are occasionally separated from neighbouring words by pauses. And an allophone of a phoneme can sometimes indicate the position of the phoneme within a word. For example, great ape and grey tape can be distinguished by an aspirated [tʰ] in grey tape, which does not normally occur in great ape. Conversely, realization of the /t/ as [d] or [ɾ] would be most likely in great ape. (This is not to say that these minimal pairs are always distinct in pronunciation.) On the other hand, allophony also contributes to processing difficulties since it means that quite different stretches of sound – for example, [nɒtʔætʔɔːɫ], [nɒtætɔːɫ], [nɒdədɔːɫ] and [nɒɾədɔːɫ] – must be recognized as representing the same sequence of words, ‘not at all’. The third form, moreover, admits another interpretation. (What is it?)

Another source of difficulty in processing comes from the enormous variations within the sound waves between speakers. The sound waves of The farmer kissed the duckling spoken by a child, a female speaker of British English and a male speaker of New Zealand English would be quite different to the sound wave of my production. The hearer has to factor out the differences in the acoustic signal that reaches their ears, and recognize that the distinct sound waves represent the same sentence – while at the same time identifying the social and other meanings these differences convey, and recognizing that other equally minor differences in the sound wave indicate different sentences. The fact that speech often occurs in a noisy environment gives rise to a further difficulty in processing. We must disregard large components of the sound that reaches our ears (the TV blaring, children screaming, traffic on a nearby highway, etc.) in processing speech, though as a matter of survival, these other noises cannot be entirely ignored.

An interesting variant on the observation that speakers ignore components of the sounds that reach their ears as irrelevant to language is that when you first hear speech in a click language such as Xhosa (listen to Miriam Makeba’s The Click Song – available on YouTube) or Juǀ’hoan (Kx’a, Namibia) (watch the movie The Gods Must Be Crazy), the clicks will probably sound like non-linguistic noises going on at the same time as the spoken word. Speakers of non-click languages have strategies for differentiating speech sounds from non-speech sounds that do not always give the correct results when applied to a click language.

Categorical nature of speech perception

An important feature of the perception of speech sounds is that it is categorical. When you hear a stretch of speech you categorize the phones as phonemes, ignoring the physical differences between them. You do not perceive a bilabial stop as now having a long VOT (see §2.3), now an average length VOT, now a very short VOT: a speaker of English perceives it as either a /b/ or a /p/. Given the same phonetic input, a speaker of Shua or Thai would identify one of three different stops, /b/, /p/ or /pʰ/, while a speaker of Warrwa would, by contrast, categorize it as the single phoneme /b/, regardless of the length of VOT. Experiments have been done in which aspects of the sound signal of speech (or artificial machine-generated speech) are modified by small degrees, and presented to experimental subjects. For instance, the VOT of the initial stop of [ba] might be varied in increments of 10 ms from –10 to +80 ms. When you listen to the sequence of syllables, you do not hear a gradual increase in VOT; rather, at a certain point you hear a definite switch from /ba/ to /pa/. By contrast, if you listen to a musical tone that varies from 200 Hertz (cycles per second) to 1,000 Hertz, you don’t hear it as suddenly jumping at some point from low to high pitch.
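The step-like character of categorical perception can be sketched in a few lines of Python. This is a toy model, not a description of any actual experiment: the category boundaries (25 ms for English, –5 and 35 ms for Thai) are invented here purely for illustration, and real boundaries vary with place of articulation, speaker and listener. The sketch maps each point on the –10 to +80 ms continuum mentioned above onto a phoneme and counts how many times the perceived category changes.

```python
from bisect import bisect_right

# Category boundaries on the VOT scale (in ms) and the phonemes they separate.
# The boundary values are invented for illustration only.
SYSTEMS = {
    "English": ([25], ["/b/", "/p/"]),
    "Thai": ([-5, 35], ["/b/", "/p/", "/pʰ/"]),
    "Warrwa": ([], ["/b/"]),
}


def categorize(vot_ms, language):
    """Return the phoneme category that a given VOT falls into."""
    boundaries, phonemes = SYSTEMS[language]
    return phonemes[bisect_right(boundaries, vot_ms)]


def count_switches(labels):
    """Number of points at which the perceived category changes."""
    return sum(1 for a, b in zip(labels, labels[1:]) if a != b)


continuum = list(range(-10, 81, 10))   # -10, 0, 10, ..., 80 ms

for language in SYSTEMS:
    labels = [categorize(vot, language) for vot in continuum]
    print(language, labels, "switches:", count_switches(labels))
```

Although the input changes in ten equal steps, the modelled English listener reports a single switch, the Thai listener two, and the Warrwa listener none, which is the pattern the paragraph above describes.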

Role of vision and other senses

In a face-to-face conversation you not only hear your interlocutor, you also see them. Indeed, conversational partners typically spend a good deal of time looking at their interlocutors’ faces while speaking to them. The visual channel provides additional information to the hearer that can assist in the perception of the spoken word, especially in a noisy environment. Try turning your television sound down to a point where you can only just hear what is being said by a newsreader, and then turn away. What do you notice? A particularly good illustration of the relevance of visual cues comes from the so-called McGurk effect, named after Harry McGurk, the psychologist who first observed it: when the sound of one syllable is dubbed onto film of a speaker articulating another, hearers often report a syllable that matches neither, typically hearing [da] when spoken [ba] is paired with a visible [ga]. Despite the term lip-reading, it is notable that it is not just visual information regarding the lips that is taken into account by hearers. Other more subtle information is also integrated. For instance, movements of the speaker’s jaw provide information about velar consonants and back vowels, for which, of course, direct visual evidence of the dorsum of the tongue is not usually available. Small differences in air pressure changes in the mouth associated with voiced vs. voiceless stop consonants are manifested in small but perceptible movements of the cheeks. Interestingly, although information concerning the spoken word is distributed across the face of the speaker, it is the eyes of the speaker that hearers tend to look at most. Direction of gaze provides a rich source of information about what the person is thinking about, which can be used in top-down processing (on which see next subsection). A striking example is how Sherlock Holmes uses Watson’s eye-gaze in The Resident Patient (Adventure VIII in Doyle 1894) to interpret his unspoken thoughts. Generally, peripheral vision is sufficient to garner the speech-relevant information from the lips, jaw and other areas. However, when background noise increases, eye-gaze tends to shift downwards to the lower part of the face, where the best visible source of speech information is found. Recent research reveals that information from other senses is also integrated with the auditory signal in speech perception.
The interpretation of heard syllables was shown in one experimental investigation to be affected by touch, when the subject placed their fingers on the lips, cheek and neck of a speaker.

Further support for the integration in the human brain of information from different sensory modalities comes from brain scanning (see pp. 274–5). In one experiment subjects who had had no formal experience with lip-reading saw the face of a person saying the numbers one to nine. The primary auditory regions of their brains showed activity similar to that observed when they heard the numbers spoken. Another experiment revealed that hearing the voice of a familiar person induces activity in the fusiform gyrus, an area of the brain that is involved in recognizing faces.

Identification and recognition of words

Recognition of words involves more than just bottom-up processing, more than processing of the incoming sound waves on a phoneme-by-phoneme basis. Processing involves top-down aspects as well. Hearers also use clues from the wider context, including the sentential environment. Experimental subjects make fewer errors in identifying words in sentences than when the same words are presented in isolation. Further evidence is provided by experiments in which segments
of speech are removed without affecting comprehension. In one study the [s] representing the plural morpheme -s in sentences such as Cats like fish was replaced by a cough. Hearers reported hearing the sibilant even though it was not present in the actual sound wave; in fact, when told that a sound was missing, they could not accurately identify which one it was. Other factors are known to affect the identification of words. Frequency is one: high-frequency words are processed more quickly and easily than low-frequency words, and are more readily identified in noisy conditions. Also relevant is the existence of phonologically similar words, which slow down identification through interference. In one experiment it was shown that if word frequency is held constant, words with many phonologically similar neighbours – that is, words differing from the target word by a single phoneme – are identified more slowly than words with few neighbours. Recall from §6.2 that words often have a number of different senses, and may be homophonous with other words. Experimental evidence suggests that even when they appear in a sentence, words immediately invoke a range of polysemous senses, as well as homophonous words; the appropriate sense or lexical item is selected only after a slight delay. The experiments use lexical decision tasks, in which sentences with a polysemous word like bug are presented to subjects through headphones; shortly after the word bug is heard, a target form is presented on a screen, to which the subjects should respond by indicating (usually by pressing a button) as quickly as possible whether or not it is a word. The word bug in a sentence such as For several weeks after the exterminator’s visit they did not find a single bug in the apartment immediately facilitates the recognition of both insect and spy, decreasing the time taken to recognize them as words.
However, after a few hundred milliseconds only insect is facilitated, suggesting that within this very brief space of time the other interpretations have been discarded as irrelevant.
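The notion of a phonologically similar neighbour used in the discussion of word identification above lends itself to a short illustration. The Python sketch below is not taken from any of the experiments mentioned; the toy lexicon and the function names are invented here, and it assumes that ‘differing by a single phoneme’ covers the substitution, addition or deletion of one phoneme. It counts the neighbours of a target word in a small lexicon.

```python
def is_neighbour(a, b):
    """True if phoneme sequences a and b differ by exactly one phoneme."""
    if len(a) == len(b):
        # One substitution.
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) == 1:
        # One phoneme added or deleted.
        longer, shorter = (a, b) if len(a) > len(b) else (b, a)
        return any(longer[:i] + longer[i + 1:] == shorter
                   for i in range(len(longer)))
    return False


def neighbours(target, lexicon):
    """Words in the lexicon that are neighbours of the target word."""
    return sorted(word for word, phones in lexicon.items()
                  if is_neighbour(lexicon[target], phones))


# Toy lexicon: spelling mapped to a tuple of phonemes (broad transcription).
LEXICON = {
    "cat": ("k", "æ", "t"),
    "bat": ("b", "æ", "t"),
    "cap": ("k", "æ", "p"),
    "cut": ("k", "ʌ", "t"),
    "at": ("æ", "t"),
    "cats": ("k", "æ", "t", "s"),
    "dog": ("d", "ɒ", "g"),
    "fog": ("f", "ɒ", "g"),
}

print(neighbours("cat", LEXICON))
print(neighbours("dog", LEXICON))
```

In this toy lexicon cat has five neighbours and dog only one, so, with frequency held constant, the finding reported above would lead one to expect cat to be identified more slowly than dog.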

Comprehension of sentences

Comprehension of sentences involves not just lexical lookup, the identification of the component words (as discussed in the previous subsection), but also parsing, the assignment of a grammatical structure to the sentence. Parsing begins immediately from the very beginning of an utterance: hearers do not wait until the entire utterance has been produced before they begin processing it, as any self-respecting grammarian would. Evidence from conversational interaction indicates that interactants continually monitor what is being said, projecting what is to follow; they switch speaker and hearer roles so rapidly that there is often no gap in speech. This would be impossible if processing was delayed until the end of utterances. There is a downside to beginning parsing so soon. In sentences like The horse raced past the barn fell – called garden path sentences – beginning parsing from the start of the sentence results in raced being interpreted as the main verb in the intransitive clause the horse raced past the barn. But then the next word is inconsistent with this analysis; the only possibility is that raced past the barn is a relative clause and part of an NP with the horse (i.e. the horse that was raced past the barn). Intonation and prosody often provide cues to parsing spoken sentences, including garden path sentences. In fact, the garden path sentence just cited will probably only garden path a reader, not a hearer, because of the distinctive intonation contour it will almost certainly be produced on.

Production

Production of sentences does not proceed on a one-phone-at-a-time or a one-word-at-a-time basis, with each phone or word being processed sequentially. Like comprehension, production also has top-down components, and entire sentences are planned ahead of time, before any part is produced. That this is so finds support in speech errors or slips of the tongue. As it turns out, these are not random. Many errors involve exchanges with later elements, indicating that larger units must have already been planned. Exchange errors occur at all levels. At the phonological level are spoonerisms, named in honour of the Oxford don Reverend W. A. Spooner (1844–1930), who was renowned for this type of error. Among the famous spoonerisms attributed to him are Let us drink a toast to our very queer dean; and You’ve hissed all my mystery lectures and tasted the whole worm. The transpositions of these two examples are typical, and involve switching of phonemes from identical positions in syllables in nearby lexical words, indicating the significance of the syllable as a processing unit. One does not find errors like to our very near queed, where the transposed phonemes are from different syllabic positions. Moreover, it is typically consonants in syllable-initial position that are transposed.2 At the morphological level are errors in which lexical morphemes are transposed; the bound morphemes normally remain in place, as in slicely thinned for thinly sliced, where each lexeme gets the morphological marker appropriate to its position and role. At the syntactic level are transpositions of lexemes within syntactic constructions, as in He is writing a mother to his letter. Such errors are like spoonerisms in that transposed words normally come from the same phrasal position – in this instance, they are both the heads of their respective phrases.
The latter type of error reveals another important characteristic of speech errors: choice of phonologically conditioned allomorph (see §3.4) is in accordance with the replacing item, not the replaced item. Thus the allomorph of -ed in thinned is /d/, the allomorph appropriate to thin, not to slice (which takes the voiceless /t/ allomorph); and if instead of mother in the example of the previous paragraph we had aunt, an would have been chosen as the allomorph of the indefinite article, not a – He is writing an aunt to his letter, not He is writing a aunt to his letter. Exchange errors are not the only types of error that occur in speech. Another type is anticipatory errors, where a later form is anticipated, as in kindler and gentler. There are also wrong word choices, where a phonologically or semantically similar word occurs instead of the intended word – for example, sexton instead of sextant. Errors can also be mixed, involving both phonological and semantic components, as in The competition is a little strougher where strougher is a blend of stronger and tougher. Based on speech errors, it has been suggested that syntactic constructions are planned about two clauses in advance, whereas phonological structures are constructed about one clause ahead (Garrett 1988). Of course, the content of the message to be communicated must also be planned ahead, a task which is typically more attention-demanding than the relatively automatic and unconscious processes involved in linguistic planning. But the two must be carefully coordinated so that the right information is processed and expressed in language at the right time. The language production system and more general processes of cognition must work together in concert, and cannot be entirely separate.
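The ordering implied by these errors, with lexical items exchanged first and phonologically conditioned allomorphs chosen afterwards, can be sketched as a two-stage procedure. The Python below is a deliberately simplified illustration, not a model proposed in the literature cited: the function names are invented, the list of voiceless consonants is incomplete, stems are given in a rough phonemic spelling, and the choice of a versus an is approximated from the first letter of the noun.

```python
VOICELESS = set("ptkfsθʃ")   # simplified set of voiceless consonants


def past_suffix(stem):
    """Phonologically conditioned allomorph of -ed for a phonemic stem."""
    final = stem[-1]
    if final in "td":
        return "ɪd"
    return "t" if final in VOICELESS else "d"


def indefinite_article(noun):
    """'an' before a vowel; spelling stands in for sound in this sketch."""
    return "an" if noun[0] in "aeiou" else "a"


def exchange(plan, i, j):
    """Swap the lexical items in slots i and j, leaving the frame in place."""
    swapped = list(plan)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return swapped


def spell_out_adverb_participle(stems):
    """Frame: STEM-ly STEM-ed, with allomorphs chosen after any exchange."""
    first, second = stems
    return f"{first}li {second}{past_suffix(second)}"


def spell_out_sentence(nouns):
    """Frame: He is writing ART NOUN to his NOUN."""
    first, second = nouns
    return f"He is writing {indefinite_article(first)} {first} to his {second}"


print(spell_out_adverb_participle(["θɪn", "slaɪs"]))                   # intended
print(spell_out_adverb_participle(exchange(["θɪn", "slaɪs"], 0, 1)))  # slip
print(spell_out_sentence(exchange(["letter", "aunt"], 0, 1)))         # slip
```

Because the suffix and the article are selected only after the exchange has applied, the sketch yields thinned with /d/ and an aunt, matching the pattern of attested errors rather than the unattested a aunt.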

Relations between production and comprehension

Speech comprehension and production are both complex processes that are only partly understood. They are not mirror-images – comprehension is not production put into reverse; the available evidence suggests that at least some of the psychological mechanisms involved are different. For one thing, as we have seen, in production entire utterances are planned ahead of time. In comprehension, by contrast, parsing begins with the first word and must be to some extent incremental; if it were the precise reverse of production, it would operate on complete utterances. On the other hand, comprehension and production cannot be entirely separate processes. Speakers monitor their own speech production and correct errors – recall the feedback loop in the speech chain model (see §2.1). This suggests that the comprehension system is involved at least to some extent in speech production. Conversely, according to one theory of phonetic perception, the motor theory, processing does not just involve processing of acoustic signals reaching the brain from the ear. It also involves matching these signals against mentally reconstructed sub-vocalized sequences of articulations; thus the acoustic signal would be analysed at least in part via reconstruction of its production.

11.3 Language and the brain

Basic structure of the human brain

The human brain, which is roughly spherical in shape, is divided into two hemispheres, the left hemisphere and the right hemisphere. There are a number of connections between the two hemispheres, the most important of which is by a bundle of nerves called the corpus callosum. Bodily senses and control are largely contra-lateral; that is, each hemisphere manages the opposite side of the body. The outer layer of the brain, the cerebral cortex, is a layer about 2–4 millimetres in thickness, made up of the bodies of several billion brain cells or neurons. Many cognitive functions are carried out in the cortex. The cerebral cortex is deeply folded and fissured, and is divided into four main lobes in each hemisphere: the frontal lobe, the parietal lobe, the occipital lobe and the temporal lobe. These are shown in Figure 11.2. Also shown in Figure 11.2 are the brain stem, which controls the automatic functions necessary to keep the body alive (e.g. the beating of the heart), and the cerebellum, which (among other things) helps control movement and cognitive processes requiring precise timing. The branch of psycholinguistics concerned with the brain is called neurolinguistics.

Figure 11.2 Major structures of the human brain.

Localization and lateralization

In 1848, in Vermont in the USA, Phineas Gage, a railway foreman, was tamping gunpowder into a blasting hole in a rock, when the powder exploded. The three-and-a-half-foot (a bit over a metre) long tamping rod was projected through Phineas’ left cheek and out through the top of his forehead, falling some fifty metres distant. Remarkably, Phineas never lost consciousness, was fully aware of what had happened, and survived for many years after the injury. His language abilities were reported to have been unaffected; however, within a few months of the accident his personality had changed dramatically. This story is consistent with localization of certain cognitive functions in particular regions of the brain. In particular, it suggests that the extreme front part of the brain is the site for emotions; language ability must be localized elsewhere, as must vision and motor control of the limbs. Furthermore, in most individuals the left hemisphere is more dominant in language processing than the right. Most right-handers show left-hemisphere dominance, as do most left-handers (though the proportion is slightly lower). This feature is known as lateralization. Although neuroscientists disagree about the precise details and extent, it seems certain that language is localized and lateralized to some degree. Two areas in the dominant hemisphere are particularly important in language processing:

● Broca’s area, named after Paul Broca, a nineteenth-century French physician and anthropologist, is a small patch about two centimetres across in the frontal lobe, in the anterior (front) part of the language-dominant hemisphere. If you put your finger to your head just above the left temple, that’s about where it is. Broca’s area is believed to be associated with speech production.

● Wernicke’s area, named after the German physician Carl Wernicke, is a slightly larger area than Broca’s, and located further towards the posterior (back) of the brain. Put your finger just above and slightly behind the left ear. Wernicke’s area is believed to be associated with speech comprehension.

Figure 11.3 shows the approximate locations of Broca’s and Wernicke’s areas, as well as some other major brain areas of the left hemisphere. Broca’s and Wernicke’s areas are connected by a bundle of neurons called the arcuate fasciculus (not shown). As just mentioned, the extent to which language is localized in the brain is an issue of contention. Recent neuroscientific research has shown that there is a substantial degree of plasticity in the human brain. For instance, one region can take over the functions of another region that has been damaged. More surprisingly, experience can change both the physical structure and functional organization of the brain. There is also evidence of fairly substantial variation among individuals. At the same time, there are significant limitations on the plasticity of the brain.

Figure 11.3 Some of the major areas of the left hemisphere of the brain. (Adapted with permission from http://brainmuseum.org. Specimens used for this publication are from the Defense Health Agency Neuroanatomical Collections Division of the National Museum of Health and Medicine, the University of Wisconsin and Michigan State Comparative Mammalian Brain Collections supported by the US National Science Foundation.)

Evidence for localization and lateralization

In what follows we discuss five different types of evidence that argue in favour of some degree of localization and/or lateralization of language functions in the brain. This evidence is mostly indirect, coming from circumstances in which things fail to function properly.

Aphasia

Aphasia is an impairment of language function (as distinct from muscular paralysis of the speech organs) due to brain damage, often as a result of a stroke, a tumour, or head injury. The original evidence for Broca’s and Wernicke’s areas as language centres came from post-mortem studies of aphasic patients. Broca and Wernicke both found associations between certain types of aphasia and damage to the regions of the brains of their patients that were named after them.

Broca’s aphasia

In 1861, Paul Broca described the results of an autopsy he performed on a patient by the name of Leborgne, who had suffered severe aphasia for more than two decades. Leborgne was able to utter no more than a few swear words and the syllable Tan, after which he is often named. By use of gestures, however, he was able to answer some questions, and he understood much of what was said to him. The autopsy revealed extensive damage in the region now known as Broca’s area. Broca subsequently found similar damage to the brains of almost a score of other aphasic patients displaying similar impairments in speech production. In this type of aphasia – called Broca’s aphasia or agrammatic aphasia – the person has difficulty producing fluent speech and uses lexical words almost exclusively; grammatical morphemes are rarely used. These features are illustrated in the following utterance by a patient who was asked why he had returned to hospital:

(11-1) Ah . . . Monday . . . ah Dad and Paul . . . and Dad . . . hospital. Two . . . ah . . . doctors . . . and ah . . . thirty minutes . . . and yes . . . ah . . . hospital. And er Wednesday . . . nine o’clock. And er Thursday, ten o’clock . . . doctors. Two doctors . . . and ah . . . teeth

Comprehension of speech by Broca’s aphasics is typically much better than production. Thus it is clear that in (11-1) the patient understood the question asked of them, though they experienced great difficulty in formulating their reply. Deaf patients with damage to Broca’s area show similar deficits in sign language – namely, dysfluency and agrammaticism – but relatively intact comprehension. This suggests that Broca’s area is specialized for language, rather than speech as such.

Wernicke’s aphasia

Shortly after Broca’s studies, Carl Wernicke investigated a number of aphasic patients, and found extensive damage in what is now called Wernicke’s area. This type of aphasia, called Wernicke’s aphasia or fluent aphasia, is characterized by severe difficulties in comprehension, but quite fluent speech, which is often incomprehensible and may include nonsense words. Example (11-2) is an attempt to refer to a kite. Notice that the utterance is much more fluent than the example of Broca’s aphasia in (11-1). The speaker is evidently trying hard to convey meaning, and their difficulty lies primarily in finding appropriate words; they attempt to circumvent this by circumlocution.

(11-2) It’s blowing, on the right, and er there’s four letters in it, and I think it begins with a C – goes – when you start it then goes right up in the air – I would I would have to keep racking my brain how I would spell that word – that flies, that that doesn’t fly, you pull it round, it goes up in the air.

Other types of aphasia

Wernicke also described a third type of aphasia, now called conduction aphasia, in which the arcuate fasciculus suffers damage. Comprehension and fluency of the speech of conduction aphasics are usually little affected. However, sufferers experience difficulties in repeating words spoken by another person, and in monitoring their own speech, thus leading to frequent hesitations and pauses. Anomic aphasia is the inability to name things seen. Strangely, this does not necessarily extend to things perceived by other means – for example, by touch or smell. This type of aphasia manifests itself in a variety of different forms: some people lose words for only vegetables, or just words for inanimates. One case reported is of two women who had no trouble with nouns, but severe difficulty with verbs. For words with dual membership, such as milk, they experienced no difficulty with the word used as a noun, but couldn’t cope with it as a verb. There is no evidence of any specific site for brain damage giving rise to anomic aphasia. In the following example, notice that the patient cannot find the noun comb, although she uses the verb comb.

(11-3)
Doctor: Can you tell me what this is? [Showing a pen]
Patient: Geez, you know . . . isn’t that funny, oh I know, it’s one of those things, . . . it’s . . . . it’s funny, you know . . . I know that it is . . . you know . . . it’s hummmm . . . it’s one of those things.
Doctor: How about this? [Produces a comb]
Patient: Ooohhh. . . . isn’t that funny . . . I’m getting old . . . it’s so terrible, ohhh . . . you know . . . I just . . . it’s that funny, oh geez . . . you know . . . I know, it’s that thing you use to comb your hair with.

Mark Ashcraft, a cognitive psychologist, describes an episode in which he suffered from a temporary restriction of blood flow to part of his brain while working late in the office one evening.
He suddenly found himself unable to understand familiar labels on the computer printouts, and could not remember any of the terms of his profession. He describes the episode as follows: The most powerful realization I had during the episode, and the most intriguing aspect to me since then, was the dissociation between a thought and the word or phrase that expresses the thought. The subjective experience consisted of knowing with complete certainty the idea or concept I was trying to express and being completely unable to find and utter the word that expresses the idea or concept. (Ashcraft 1993: 49)


Global aphasia, as the label suggests, involves disturbance to all language functions, to all processing components. Global aphasia typically involves damage to a large portion of the frontal and temporal lobes.

Summary of aphasia types

Table 11.1 summarizes the main features of the five types of aphasia, according to the standard or classical model.

Problems of interpretation

The account of aphasia presented in the previous subsections, which links the type of aphasia with language centres in the brain, can be criticized on more than one count. As Sigmund Freud pointed out, we can’t conclude that a function is localized in a certain area of the brain because damage to that area results in aphasia. It could be that the area is involved in a crucial way in a task that is widely distributed across areas of the cortex; for instance, it could be where several lines of connection cross. Henry Head, who in the 1920s studied language disturbances resulting from gunshot wounds to the head, found a wide range of aphasic symptoms among individuals with similar injuries. He concluded that the classification into Broca’s, Wernicke’s and conduction aphasias was not clear-cut.

Table 11.1 The five classical types of aphasia

Syndrome | Symptoms | Location of brain lesion
Broca’s aphasia | Utterances typically short, with grammatical morphemes usually omitted; speech effortful and non-fluent. Comprehension much better. | Broca’s area; front part of the temporal lobe and back part of the frontal lobe
Wernicke’s aphasia | Poor comprehension, but fluent production that is often incomprehensible. Common use of nonsense words. | Wernicke’s area; back part of temporal lobe
Conduction aphasia | Problems with repetition of speech, though comprehension is usually good. Sound and meaning appear disconnected. Reasonably fluent, though rhythm of speech may be disrupted by pauses and hesitations. | Arcuate fasciculus; bundle of fibres connecting Broca’s and Wernicke’s areas
Anomic aphasia | Inability to name things or events. Use of circumlocutions. | No specific location
Global aphasia | Disturbance to all language functions. | Large part of frontal and temporal lobes


One can also question the extent to which it is valid to draw conclusions about localization of brain function from autopsies. It could be that changes to brain functions, as well as to the damaged region itself, took place between the onset of aphasia and the death of the patient. Indeed, aphasics not infrequently develop ways of compensating for their injuries.

Split-brains

Studies of patients who have had one of their hemispheres removed provide some evidence of lateralization of language processes. Removal of the right hemisphere, an operation sometimes performed in cases of malignant brain tumours, usually does not result in the loss of language, though other cognitive processes may be affected.

A clinical procedure formerly used on patients suffering from epilepsy was to surgically cut the corpus callosum connecting the two hemispheres of the brain, in the hope that this would prevent the spread of seizures from one hemisphere to the other. The procedure also inhibits the sharing of information between the two hemispheres: with the corpus callosum severed, the two hemispheres are to a considerable extent independent. This condition is called split-brain. Certain information will reach only the right hemisphere, and other information only the left hemisphere. If the image of an object is presented to the right visual field of a split-brain patient, this will be processed by the left hemisphere, and the object can be named. If it is presented to the left visual field, it will be processed by the right hemisphere, and cannot be named. In one experimental study, a word was presented to the left visual field of a split-brain patient, who was to select by feel the correct object from a group of objects behind a screen. This could be done, though when asked what the object was, the person would reply ‘I don’t know.’ This was their left hemisphere doing the talking, unaware of what the right hemisphere knew. This experiment reveals that the right hemisphere is capable of at least some linguistic processing, such as identifying an object by name. Some studies have suggested that the right hemisphere is capable of processing concrete lexemes, but poor at more abstract items, though this finding is controversial.

Dichotic listening

The dichotic listening test is an experimental technique for determining which hemisphere is dominant in language processing in an individual. It relies on the contralateral processing by the brain of sensory input. The subject wears a set of headphones through which two different signals – which might be syllables, numbers or words – are input simultaneously. Most people show a right-ear advantage: that is, it is the signal played into the right ear that most people tend to correctly identify. If boy is played in the left ear, and girl in the right, it is most likely that the subject will report hearing girl. This is consistent with left-hemisphere dominance for language processing in most people: the signal received through the left ear is sent first to the right hemisphere, and only then via the corpus callosum to the left hemisphere for processing, whereas the signal coming from the right ear is processed earlier, since it goes directly to the left hemisphere. By contrast, when the sounds played in the headphones are not speech sounds – for example, music, coughs, traffic noise and so on – a left-ear advantage tends to be shown. So, if the sound of laughing is played in the left ear, and coughing in the right ear, subjects are more likely to perceive the laughing input. These specializations of the ears appear to have more to do with the nature of the processing than the physical type of sound that is input. Thus speakers of Thai, a tone language (recall §2.5), reveal a left-hemisphere advantage when distinguishing CV syllables contrasting in tone. By contrast, speakers of English tend to show a right-hemisphere advantage. This difference is presumably a consequence of the different ways in which tone is employed – and processed – in the two languages.

Wada test

The Japanese neurosurgeon Juhn Wada devised a test to determine which hemisphere is dominant in language processing by injecting sodium amobarbital (also known as sodium amytal) into one of the carotid arteries of the neck. The result is deactivation of the ipsilateral (same side) hemisphere of the brain, and immediate contralateral paralysis of the body. The patient is asked to count backwards, beginning as the injection is given. Counting is always interrupted when the drug takes effect: momentarily if it affects the non-dominant hemisphere, and for a longer period of one to three minutes if it affects the dominant hemisphere. Using the Wada test, a study by Rasmussen and Milner (1977) revealed evidence of correlations between handedness and language lateralization. Another investigation conducted by Loring and others in 1990 used the Wada test with tasks such as counting, comprehension, naming and repetition. It revealed that of 103 patients, 79 had exclusive left-hemisphere language representation, 2 had exclusively right-hemisphere dominance, and 22 had bilateral language representation (Loring et al. 1990). Bilateral representation was much higher in left-handers than right-handers.
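The proportions implied by the Loring et al. (1990) figures can be worked out directly. The following is a minimal Python sketch of that arithmetic; the counts are those quoted above, while the labels and variable names are merely illustrative.

```python
# Patient counts as quoted in the text for Loring et al. (1990).
counts = {"left only": 79, "right only": 2, "bilateral": 22}

total = sum(counts.values())  # should equal the 103 patients reported

# Share of each pattern of language representation, as a percentage.
shares = {label: 100 * n / total for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {counts[label]}/{total} = {pct:.1f}%")
```

On these figures roughly three-quarters of the patients showed exclusively left-hemisphere representation, about a fifth showed bilateral representation, and exclusively right-hemisphere dominance was rare.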

Brain scanning

Recent advances in technology permit us to study the human brain in operation in relatively normal circumstances. Of the numerous brain-scanning technologies currently in use, this subsection describes two that provide evidence relevant to localization. Both give quite precise information about the location of brain activation, although both suffer from significant imprecision in the timing of the activation.

Positron Emission Tomography

Positron emission tomography scanning or PET scanning involves injection of a harmless radioactive isotope (often oxygen-15) into the blood stream. Since neurons in the more active areas of the brain require more oxygen, blood flow to those areas increases. The PET scanner detects the locations of the radioactive isotope; greater concentrations will be recorded in regions where blood flow is higher. The regions of the brain that are most active in the performance of a task can be mapped in three dimensions. The subtraction paradigm is an experimental design frequently used in PET studies of localization of language functions. This method can be illustrated by Petersen et al.’s investigation (1989; see also Petersen and Fiez 1993) of the processing of single words under different conditions.


In the first condition the subjects fixated visually on a small point on a monitor while brain activity was measured. In the second condition the subjects fixated on the same point, while words were either presented on the monitor just below the point, or aurally through headphones. In each case, the words were presented at the rate of 40 per minute. Next, subjects were required to say the words they read or heard. And in the final condition, the subjects were requested to give a verb describing an appropriate action for each displayed noun. For example, they might say eat in response to the word cake. The brain activity associated with the various component tasks can be determined by subtraction. Thus, taking away the level of brain activity associated with the perception of the fixation point from the level of activity associated with the visual or aural perception of words can reasonably be expected to give an indication of the activity associated with the comprehension of the spoken or written words. Taking the level of activity involved in perception from the level involved in speech production will indicate the activity involved in production of the words, and so on. Using this design with a number of experimental subjects it is possible to determine the regions of the brain that are most active in specific language-related tasks. Unsurprisingly, the visual and auditory regions were active in the viewing and hearing conditions; Wernicke’s area and large parts of the sensory cortex were also involved in the hearing condition. Broca’s area was active in the task of generating verbs, as was an area in the temporal lobe. The speaking words condition was associated with activation in a region between Broca’s and Wernicke’s areas, and involving parts of the motor and sensory cortexes. PET scanning suffers from numerous disadvantages, most of which are too technical to discuss here. One that we can mention is that since it involves the injection of a radioactive isotope, ethical considerations limit the number and duration of tests to which an experimental subject is exposed.
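The subtraction logic described for the PET study above can be made concrete with a small Python sketch. It is a toy illustration only: the activity values are invented, the units are arbitrary, and the region labels (region_A and so on) and function names are hypothetical placeholders rather than anything taken from Petersen et al.

```python
# Toy illustration of the subtraction paradigm.
# All activity values are invented, in arbitrary units, and the region
# labels are placeholders; nothing here is data from any actual study.
ORDER = ["fixation", "passive_words", "repeat_words", "generate_verbs"]

activity = {
    "fixation":       {"region_A": 2.0, "region_B": 1.0, "region_C": 1.0},
    "passive_words":  {"region_A": 5.0, "region_B": 1.0, "region_C": 1.0},
    "repeat_words":   {"region_A": 5.0, "region_B": 4.0, "region_C": 1.0},
    "generate_verbs": {"region_A": 5.0, "region_B": 4.0, "region_C": 3.0},
}


def subtract(task, control):
    """Return task activity minus control activity, region by region."""
    return {region: task[region] - control[region] for region in task}


def stepwise_differences(activity, order):
    """Subtract each condition from the next, more demanding one."""
    return {
        f"{higher} - {lower}": subtract(activity[higher], activity[lower])
        for lower, higher in zip(order, order[1:])
    }


differences = stepwise_differences(activity, ORDER)
for contrast, by_region in differences.items():
    # Regions with a positive difference are attributed to the added task component.
    added = [region for region, delta in by_region.items() if delta > 0]
    print(contrast, "->", added)
```

Each contrast isolates whatever the more demanding condition adds to the one below it, which is why the conditions are arranged so that each includes the one before it.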

Functional Magnetic Resonance Imaging

Unlike PET scanning, functional magnetic resonance imaging or fMRI is a non-invasive technique: that is, it does not require the injection of anything foreign into the blood stream. In fMRI, brain activity is measured indirectly through changes in oxygen levels in the blood stream, which are measured via different magnetic properties of oxygenated and deoxygenated blood. fMRI has certain advantages over PET: it is faster, gives better spatial resolution, and does not suffer from such severe restrictions on the amount of time, or number of times, a patient can be put in the scanner. It is also cheaper. fMRI is a more recent technology than PET, and a number of investigations carried out with PET have been repeated with fMRI, using the same experimental designs and methods, including the subtraction paradigm. The results of the two imaging techniques appear to be in general agreement. Like all technologies fMRI is imperfect, and suffers from disadvantages and limitations as far as neurolinguistic research is concerned. The machinery is noisy, which decreases its usefulness in studies of speech perception. Worse, subjects must remain virtually immobile: even tiny movements of the head resulting from jaw movements in speech can affect determination of the location of activity.


Concluding comments

Modern technology permits observation of the normal human brain in action in conditions approaching those of natural speech production. Findings from neuroimaging techniques such as PET and fMRI are in overall agreement with findings from earlier post-mortem dissection studies of aphasic patients, and show that Broca’s and Wernicke’s areas are indeed active in language production and comprehension tasks. This does not argue, as mentioned above, that these are the language areas. The balance of evidence suggests that language processing and other cognitive tasks are intertwined in the brain, and that strict localization is likely only for elementary cognitive processes, not complex ones such as language. There is a growing body of evidence that language processing is not completely restricted to the dominant hemisphere; the non-dominant hemisphere plays important roles as well – for instance, in interpreting metaphoric and figurative language, and humour. One fMRI study revealed a more prominent degree of lateralization in males than females in a rhyme-judgement task. The role of brain structures other than the cortex is increasingly recognized: subcortical structures are also important. Damage to some of these areas can lead to certain types of aphasias; moreover, aphasia resulting from damage to Broca’s area seems not to be long-lasting if it is restricted to the cortex. There is also evidence that the cerebellum plays a role in language processing beyond mere motor control and coordination. The Swiss Army knife model of the human brain, in which distinct regions are dedicated to different functions, is now widely considered to be a misrepresentation. The human brain shows a considerable degree of plasticity in language and other cognitive functions. As already mentioned, it is capable of recovery at least to some extent from damage through deployment of other areas (the neurons themselves are rarely replaced). Studies of individuals who had their left hemisphere removed in early childhood have revealed that the right hemisphere is capable of taking over most language functions.

Neurolinguistics is not just concerned with questions of localization and lateralization, but with other aspects of brain functioning in language production and comprehension as well. Two other brain-scanning technologies have been used extensively in neurolinguistic investigations, although they do not provide very reliable information on localization or even lateralization. These are electroencephalography and, more recently, magnetoencephalography. Electroencephalograms or EEGs measure electrical activity in the brain resulting from the firing of neurons through electrodes placed on the scalp. In experimental studies involving this technology, the subject is presented with a language task, and brain activity is recorded over a number of trials to determine whether there are consistent changes in activity associated with performing the task. In one experimental procedure subjects are presented with sentences that end in either an expected or an unexpected way – for example, The pizza was too hot to eat or The pizza was too hot to drink. A large change in electrical activity is observed some 400 milliseconds following the presentation of an anomalous word; this does not occur following an expected word. This change, known as the N400 component (the N stands for negative), is a reliable indication of an incongruent or unexpected input. Magnetoencephalography is a variant of EEG technology. Magnetoencephalograms (MEGs) measure magnetic fields rather than electrical fields. The experimental procedures employed are similar to those in EEG studies. EEGs and MEGs give very precise indications of timing of brain activity. EEGs give quite imprecise indications of the location of the activity in the brain; MEGs provide more accurate indication of location. Among the more exciting recent developments is the combination of fMRI with EEG or MEG in an attempt to match the high spatial resolution of the former with the excellent temporal resolution of the latter.
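The idea of recording over a number of trials to find consistent changes, as described above for EEG studies, can be sketched in Python. This is a simulation under loudly stated assumptions: the signal is synthetic, and the sampling interval, noise level, size and shape of the negative deflection, and number of trials are all invented for illustration rather than taken from any study.

```python
import math
import random

SAMPLE_MS = 10     # hypothetical sampling interval in milliseconds
N_SAMPLES = 80     # covers 0 to 790 ms after the final word appears
N_TRIALS = 200     # hypothetical number of trials per condition


def simulate_trial(rng, anomalous):
    """One synthetic recording: noise, plus a negative dip near 400 ms if anomalous."""
    epoch = []
    for i in range(N_SAMPLES):
        t = i * SAMPLE_MS
        dip = -5.0 * math.exp(-((t - 400) / 60.0) ** 2) if anomalous else 0.0
        epoch.append(dip + rng.gauss(0.0, 4.0))
    return epoch


def average(epochs):
    """Point-by-point mean across trials; random noise tends to cancel out."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]


rng = random.Random(0)  # fixed seed so the simulation is repeatable
expected = average([simulate_trial(rng, False) for _ in range(N_TRIALS)])
anomalous = average([simulate_trial(rng, True) for _ in range(N_TRIALS)])

# Difference between conditions, and the time of its most negative point.
difference = [a - e for a, e in zip(anomalous, expected)]
peak_index = min(range(N_SAMPLES), key=lambda i: difference[i])
peak_ms = peak_index * SAMPLE_MS
print("Most negative difference at about", peak_ms, "ms")
```

In a single simulated trial the dip is hard to distinguish from the noise; it stands out only once many trials are averaged, which is the reason for repeating the task.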

Summing up

Psycholinguistics enquires into such issues as the relation between language and other cognitive phenomena, and the processes by which we comprehend and produce speech. The field is characterized by major differences of opinion and approach. According to some, language is represented by a distinct mental module, largely separate from other modules. Others hold that there is nothing unique about language: it uses mental processes used in other domains. Many hold views in between these two extremes.

Another hotly debated issue concerns the relation between language and thought. According to the Sapir-Whorf hypothesis the structure of the language one speaks influences one’s conceptualization of the world; it comes in stronger and weaker versions: linguistic determinism and linguistic relativism. A recent reinterpretation is Slobin’s thinking for speaking.

Speech comprehension involves recognition, integration and identification of units and information at all linguistic levels – as well as information from other perceptual modalities than the auditory. Experimental findings support the idea that comprehension involves both top-down and bottom-up processing. Evidence from garden path sentences indicates that parsing begins immediately, at the first word of an utterance. Sentence production is more difficult to study than comprehension, and much of the evidence comes from slips of the tongue, which reveal that utterances are planned ahead.

The study of language in the brain and the brain functions involved in language processing is neurolinguistics. Two areas of the left (or language-dominant) hemisphere are especially important: Broca’s and Wernicke’s areas. Evidence for this localization comes from studies of aphasia and brain scanning. Different types of aphasia – including Broca’s, Wernicke’s, conduction, anomic and global – are distinguished according to impairments to different aspects of language; these tend to be associated with damage to different brain regions.


Brain-scanning technologies permit language processes to be studied in action, though all have limitations. Electroencephalography (EEG) and magnetoencephalography (MEG) provide excellent information about timing, but poor locational information; positron emission tomography (PET) scanning and functional magnetic resonance imaging (fMRI) provide accurate information about the location of brain activity, but are imprecise on timing events. Findings from experiments with PET and fMRI agree well with results of post-mortem studies of aphasics.

Guide to further reading

Probably the best place to begin reading about the Sapir-Whorf hypothesis is with Whorf (1956). The hypothesis has engendered an enormous literature, both pro and con. Lee (1996) attempts to come to grips with what Whorf was really saying. Gumperz and Levinson (1996) and Gentner and Goldin-Meadow (2003) contain many articles exploring and extending Whorf’s ideas. Good introductory textbooks on psycholinguistics are Aitchison (2011, 2012), Ludden (2016), Whitney (1998), Menn (2011), Traxler (2012), Warren (2013) and Harley (2013). Harley (2017) is an accessible introduction to psycholinguistics intended for the general reader. Aitchison (2003) is a useful glossary that explains psycholinguistic terminology clearly. On slips of the tongue, see Fromkin (1973a, 1973b, 1988) and Cutler (1982). See also Fromkin (1980) for a range of error types, and Bond (1999) on slips of the ear. Databases of speech errors gathered over many years by Victoria Fromkin and others are available online at https://www.mpi.nl/dbmpi/sedb/sperco_form4.pl. Rosenblum (2010) presents a beautiful account of the way the human brain integrates information from different sensory sources in perception, including perception of speech. His article in Scientific American (Rosenblum 2013) is a brief summary, which also makes for fascinating reading. The story of Phineas Gage is told engagingly in Fleischman (2002), though the details are not always entirely accurate. See also Macmillan (2000a, 2000b) and Ratiu and Talos (2004), which includes a video reconstructing the damage to Gage’s skull. Calvin and Ojemann (1994) tells a fascinating story of conversations with an epileptic patient before, during and after neurosurgery. Donald (1991: 82–6) provides a detailed account of a man (pseudonym Brother John) who experienced temporary aphasic seizures in which he would go through stages resembling different types of aphasia.
During these seizures other cognitive processes were unaffected; he was fully aware of what was going on around him and could later remember and describe what had happened. Good popular introductions to the brain are Carter (2010), Carter, Page and Parker (2019), Greenfield (2000) and O’Shea (2005). For a more detailed and technical treatment, see Kolb and Whishaw (2003/1980). For an overview of neurolinguistics, see Caplan (2017). Ahlsén (2006), Ingram (2007), Denes (2011), Brennan (2022) and Kemmerer (2023) are good textbook treatments of the subject.


Issues for further thought and exercises

1 Some languages have grammatical systems of gender for nouns, which are generally indicated by agreement of verbs, determiners and/or adjectives. For example, French nouns are either masculine or feminine; Standard Danish and Swedish nouns are either common (en) or neuter (et); and Bantu languages are known for large gender systems. One interpretation would be that in thinking for speaking, speakers of such languages would employ an isomorphic system of classifying things. To what extent do you think this is likely to be so? Assuming that such a system is used in thinking for speaking, how could you test whether it extends to other aspects of thought, to thinking in other cognitive domains?

2 What is the Stroop effect? (You can find information about it on the web, and in many books on psycholinguistics.) Write a paragraph description of the effect, explaining what it shows. Try it out on yourself and your friends.

3 An important experimental technique used in psycholinguistics is known as priming. Find out what it is, and write a paragraph description of the technique (in your own words); explain the nature of the technique, a simple experiment using it, and why it is believed to show what it does.

4 Record examples of speech errors over the next few weeks. (Carry a notebook around with you and note the errors down as soon as you can to avoid forgetting or misremembering them.) How would you classify the types of error you have found?

5 We’ve spoken of slips of the tongue, but slips of the ear also occur, errors of perception and comprehension (see Bond 1999). These are more difficult to identify in actual speech, but there are certain conditions under which it is possible to notice or infer them. Think of some such conditions, and then over the next few weeks attempt to observe examples. What types of error are represented, and how do they compare with the types of speech errors mentioned in the text?

6 One example of a common phonological error is the pronunciation of ku klux klan as klu klux klan. What sort of error is this? Can you think of any other examples of similar phonological errors? Steve Mirsky’s column ‘Antigravity’ in the February 2004 issue of Scientific American reports the following humorous exchange from a radio conversation:

‘The Klu Klux Klan.’
‘It’s not Klu. It’s Ku. It’s not Klu Klux Klan, it’s Ku Klux Klan.’
‘I didn’t say Klu Klux Klan, I said Klu Klux Klan.’
‘You said it again, you said Klu.’
‘I did not say Klu Klux Klan, I said Klu Klux Klan.’
‘You said it again, you said Klu.’

What does this dialogue suggest about language processing?

7 I recall as a child hearing one boy say to another Heads I win, tails you lose. He threw a coin, and of course won. The other child looked puzzled. The same thing continued over a number of throws, with the second child getting increasingly confused as he lost every time. What type of error is this? Can you think of (or have you observed) similar errors? What (if anything) does it reveal about language comprehension?

8 Below are some examples of speech by aphasics. Which type of aphasia do they appear to represent? Give your reasons.

a. Well this is . . . mother is away here working her work out o’here to get her better, but when she’s looking, the two boys looking in other part. One their small tile into her time here. She’s working another time because she’s getting, too.

b. Lower Falls – Maine – Paper. Four hundred tons a day! And ah – sulphur machines, and ah wood – Two weeks and eight hours. Eight hours – no! Twelve hours, fifteen hours – working working – working! Yes, and ah – sulphur. Sulphur and – ah wood. Ah . . . handling! And ah sick, four years ago.

c. I felt worse because I can no longer keep in mind from the mind of the minds to keep from mind and up to the ear which can be to find among ourselves. [Uttered by a patient in response to a question about his health.]

d. Examiner: What kind of work have you done?
Patient: We, the kids, all of us, and I, we were working for a long time in the . . . You know . . . it’s the kind of space, I mean place rear to the spedawn . . .
Examiner: Excuse me, but I wanted to know what kind of work you have been doing.
Patient: If you had said that, we had said that, poomer, near the fortunate, porpunate, tamppoo, all around the fourth of martz. Oh, I get all confused.

9 Some aphasics substitute other words for written words when asked to read them. Compare the following list of written words and read words (from two different occasions of reading), and state what the words have in common and how they differ. What does this suggest about the way words are stored in the brain?

Written word | First read response | Second read response
act | play | play
applaud | laugh | cheers
example | answer | sum
heal | pain | medicine
south | west | east

10 Supposing you were to give a Broca’s aphasic the following list of homophonous words to read, what differences would you expect in the reading of the words from the two columns?

ewe | you
bee | be
eye | I
hymn | him
four | for


11 Find out about one or more of the following disorders: jargon aphasia, dyscalculia, dyslexia, acquired dyslexia, Specific Language Impairment (SLI) and autism. Write a brief description of the disorder, mentioning its physiological manifestations and causes, and typical effects on language.

12 In an experiment described in Warren and Warren (1970), people listened to sentences of the form It was found that the ºeel was on the _____, where the º indicates a loud cough, and the underscore was filled by words such as axle, orange, table and shoe. What do you predict the results of this experiment were? Explain your reasoning.

13 Other technologies that have been used in neurolinguistic investigations include transcranial magnetic stimulation (TMS), diffusion tensor imaging (DTI), functional near-infrared spectroscopy (fNIRS) and intracranial electroencephalography (iEEG). Find out about one of these technologies, and write a short description of it, including how it has been used in investigations of language activity in the brain.

Research project

Discuss and evaluate the arguments of Levinson and his group for Whorfian effects in the domain of space. Be sure to identify the relevant linguistic features, the cognitive processes, and the motivation for the associations. Describe the experiments developed by the group, and how they are relevant to and argue for the claims. Aside from the references mentioned in §11.1, see Majid et al. (2004) for an overview.


12 Language Learning

This chapter is concerned with the processes by which human beings learn a language, with how they attain the ability to comprehend and produce utterances in it. We begin with first-language learning, the ways in which children learn to speak the language of the community they are born into. While there is considerable individual variation in the learning process, this variation falls within limits and learning follows regular patterns. We discuss four general learning strategies that have been suggested to be relevant to the child’s learning of their native language. Finally, we deal with the learning of language by adults, enquiring into the extent to which this process resembles first-language learning, and the extent to which adults can attain native-like command of a language.

Chapter contents
Goals
Key terms
12.1 Major features of child language learning
12.2 Strategies for child language learning
12.3 Second-language learning
Summing up
Guide to further reading
Issues for further thought and exercises
Research project


Goals

The goals of the chapter are to:
● describe the milestones of language learning by the child;
● describe common patterns in the child’s learning of phonetics, phonology, lexicon, morphology, syntax and semantics;
● discuss strategies that have been proposed for the child’s learning of their language;
● identify developmental patterns in the learning of a second language by adults;
● discuss the effects of age on adult learning of a language, and raise the question of whether an adult can attain native-speaker competence in a language; and
● mention some ways in which second-language learning processes vary among individuals.

Key terms

babbling
basic mastery
caretaker speech
conditioned-response learning
continued learning
cooing
critical period hypothesis
hypothesis testing
imitation
innateness
L1
L2
mismatch in meaning
object scope
one-word holophrastic stage
overextension of meaning
overgeneralization
second-language learning
syntactic bootstrapping
telegraphic speech
transfer/interference
two-word stage
underextension of meaning

12.1 Major features of child language learning

General characteristics of language learning

Preliminary remarks

Normal children in all societies gain, within the space of a few years, fluent control of a language, often two or more. By the time they are five years old they know several thousand words, have learnt the major phonological and grammatical systems of their language(s), as well as the fundamentals of
the semantic and pragmatic systems, and how the language is used in its social context. Exceptions are few: children with severe handicaps such as extreme mental retardation, or Down’s syndrome, may not learn a language fully; very rarely a child is not exposed to sufficient speech. The language(s) a child learns depends on the languages habitually spoken around them, by their parents and other community members, including other children they interact with. Children have no genetic predisposition to speak a particular language: if removed from their biological parents at an early age, and brought up by foster-parents who do not speak the language of the natural parents, the child will learn the foster-parents’ language like any native-born child. Although a good deal is known about the processes of learning of language by children, about what and when (in what sequence) the child learns, many questions relating to how and why remain unanswered. There are major disagreements on the issues of whether or not we are genetically programmed to speak, and the processes by which language is learnt. In no society are children explicitly taught to speak their first language(s). They learn their language spontaneously, in everyday interactive situations; explicit instruction is unnecessary, and, if given, usually has little effect. Profoundly deaf children, of course, are unable to perceive the acoustic input of languages spoken around them. But, if exposed to a sign language, they also learn it without instruction. This spontaneous learning of speech and signing contrasts with writing, which is usually learnt through explicit instruction; most children learn to write at school, or are taught by parents or siblings. The learning of all languages is believed to proceed through similar developmental stages. However, it must be cautioned that intensive investigations have been carried out in a vanishingly small fraction of the world’s approximately 7,000 languages. 
Most studies have focused on major languages of Europe (especially English) and Asia (particularly Mandarin Chinese and Japanese). Few studies have been undertaken of learning of indigenous languages of Australia, Papua New Guinea, the Pacific islands or North and South America. This represents a serious limitation on our knowledge base. We know most about language learning in Western, educated, industrialized, rich and democratic (WEIRD) societies, and much less about learning in hunter-gatherer societies.

Basic schedule of learning The child’s learning of their first language is a staged process. (For simplicity we speak of their first language, though as remarked already many children learn more than one language at the same time.) The stages, which are similar across the range of languages in which learning has been investigated, are as follows:

● pre-language stages of cooing, beginning at about two or three months, and babbling, beginning at around six months;
● one-word stage, beginning at about a year or so;
● two-word stage, beginning at 18 to 20 months;
● telegraphic speech, beginning at two to three years of age;
● basic mastery, at around four or five years;
● elaboration and expansion especially of lexicon – also to some extent grammar – continuing throughout life.


Linguistics

Children vary considerably as regards the times they reach these stages, some entering them very early, some very late – for example, Albert Einstein is said not to have begun talking until five years of age. Regardless of whether the child is fast or slow in learning their language, in the long run it seems not to matter: late talkers end up with full control of the language. Moreover, it should not be presumed that the stages are rigidly distinct; they merge into one another. Below we discuss these six stages in order.

Pre-language stages The earliest stages of language learning by the child are the pre-language stages, which last from about two months to a year of age. At around two months the child typically begins to produce vocalizations called cooing. These vocalizations consist of syllables, often repeated, made up of a velar consonant plus a back vowel, like [kuː], [gɑːgɑː] and [guː]. By about six months of age the child is generally sitting up, and producing a wider range of sounds, including stops, nasals and fricatives. In this stage, babbling, the child produces word-like utterances, typically CV syllables, though they are not recognizable as words of the language. The phones are not necessarily restricted to those of the surrounding language; for example, children in an English-speaking environment sometimes produce retroflex stops and bilabial fricatives. But as time goes on the phones in babbling tend increasingly towards the phones heard in the language environment. In the later stages, towards the end of the child’s first year, babbling becomes more controlled, and different intonation patterns may be used. Deaf children also babble vocally, though they usually cease to do so by about nine months of age, due to lack of auditory feedback. Deaf children exposed from birth to a sign language babble manually from about 10 to 14 months, producing a range of rhythmic and repeated motor actions of the hands. Not all of these actions are necessarily present in the sign language to which they are exposed. According to some investigators, hearing children also produce manual babbles as part of the process of gaining control of bodily movements, which include gestures; these differ from the babbles of deaf children exposed to a sign language.

One-word stage At around 12 to 18 months children produce their first recognizable words. These words occur alone, in single-unit utterances, hence the term one-word stage or holophrastic stage. A one-word utterance may be given different intonation contours to express different speech acts – for example, falling intonation for a statement, rising for a question or request. The first words tend to be similar both phonetically and semantically, regardless of the language. They tend to consist of CV syllables, and rarely contain consonant clusters. The first words are lexical rather than grammatical, and generally label concrete objects or individuals that the child interacts with, like mummy, daddy and kitty. Also common in this stage are words for negation (used in refusal, no), non-existence (remarking on disappearance or absence of something – for example, allgone), recurrence (used in requesting more, more) and attention (drawing attention to something or someone – for example, hi). In the one-word stage, language tends to be closely tied to the interactive context, and shows little displacement (see §1.3). Utterances tend to be interpersonally oriented, engaging the child interactively with the parent or caregiver.


The two-word stage By 18 to 20 months or so the child typically has an active vocabulary of some fifty words; this increases dramatically over the next few months, so that by two years of age the child’s vocabulary will normally have increased to around two to three hundred words. When a child has a vocabulary of around fifty words, they begin to put the words together in two-word utterances. Prior to this, children often string together sequences of separate utterances, with pauses in between them. The first two-word utterances tend to express the same kinds of meanings as found in the one-word stage, but do so more explicitly: negation or refusal – for example, no bed; recurrence, as in more milk; non-existence, as in allgone doggie; and attention – for example, hi daddy. New kinds of meaning begin to appear later in this stage, including: actor-action, as in mummy eat or eat mummy; quality-thing, as in bad kitty; possession, as in baby chair; thing-location – for example, doggy table; action-location – for example, go park; action-undergoer, as in eat brekky; and actor-undergoer, as in mummy dinner. At this stage in the speech of many children, words specifying things occur in second (i.e. final) position.

Telegraphic speech Multiple-word utterances usually make their first appearance sometime during the third year of life. Multiple-word utterances begin, at least in English, as strings of lexical words, without grammatical words or morphemes. This is called telegraphic speech, after the style of expression that used to be used in telegrams. In this stage, function words and morphemes, such as prepositions (in languages like English) and inflectional morphemes, begin to appear. The learning of inflected forms of words differs depending on the morphological complexity of the language: in a morphologically simple language like English it lags behind that of more morphologically complex languages, where the morphology is more salient.

Basic mastery By four or five years of age most children have a basic mastery of their language. Their vocabulary will stand at well over 1,000 lexemes, and the basic systems of phonology, morphology and syntax will be in place.

Continued learning Language learning continues throughout life. This is especially true of lexical items, which continue to be learnt in adulthood, although at a much slower rate than for the two-year-old child. Certain registers, such as scientific and legal registers in Western cultures, may not be controlled until late adolescence or even adulthood. Some aspects of grammar take a longer time than others for children to learn. For example, the numeral classifiers of many South-East Asian languages – words that specify the semantic type of an object and have to occur in NPs with numerals (e.g. for two pencils one might say ‘two long-objects pencils’, for two coins, ‘two round-objects coins’) – are often not fully learnt until the child is ten or more years of age. One study of ten-year-old Thai speakers revealed only about 90 per cent of correct usage of numeral classifiers, according to adult norms. My own adventitious observations


suggest that tag questions in English (see §5.3) may not be fully learnt until adolescence. Thus children of ten or twelve may use negative tags to negative clauses (as in He didn’t go yesterday, didn’t he?) in circumstances that seem jarring to an adult who would use a positive tag in that context (as in the more neutral He didn’t go yesterday, did he?).

Caretaker speech As mentioned in §7.2, many languages have special speech registers for talking to young children; adults do not speak to infants, or interact with them, in the same ways as with other adults. These registers, variously called baby-talk, motherese, child-directed speech and caretaker speech – we will use the last term – have characteristics that assist (or are believed to assist) the child’s understanding and learning of language. Caretaker speech in Western cultures tends to be characterized by a slow rate of delivery, exaggerated intonation, high pitch, palatalization of consonants, repetition, high frequency of diminutive forms (like the English -ie ∼ -y ending in doggie, kitty), simple syntax, short utterances (one study found that the average length of sentences addressed by mothers to two-year-olds was less than four words), and simple and concrete lexical items. Sometimes infrequent or complex phonemes are replaced by more common or simpler ones, and sometimes special lexical items peculiar to caretaker speech are used, often involving simpler syllabic structures (e.g. tummy instead of stomach) and repeated syllables (e.g. wee-wee, poo-poo, choo-choo). These characteristics tend to be broadly associated with caretaker speech in many cultures. Some studies suggest that the intonation patterns used by mothers (or other caretakers) when talking with young infants carry information about approval and disapproval, and that similar patterns are used in different societies to encourage the child to do something or to discourage them from doing it (Fernald 1985). But prosodic features are not universal. High pitch is not always a feature of caretaker speech: in Quiche Mayan (Mesoamerica) caretaker speech is characteristically low pitched, high pitch being used in speech to socially dominant individuals. Who primarily uses caretaker speech to the child differs from culture to culture. 
In Western cultures it tends to be parents, especially mothers, and close relatives of the infant (such as grandparents and siblings). In Western Samoa it tends to be siblings, adult neighbours and adult relatives other than parents. The role of caretaker speech in language learning – to what extent it really does facilitate learning, or is tailored to the needs of the child – is an issue on which there is considerable difference of opinion. Experimental evidence suggests that infants tend to prefer caretaker speech to ordinary adult-to-adult speech. Unfortunately, caretaker speech has not been studied in depth across a representative sample of languages.

Learning phonetics and phonology Before the end of their first year infants typically recognize a number of words, involving a range of different consonants and vowels, although they are unable to produce more than a few of them


themselves. The perception of speech sounds begins very early, some phonetic differences being perceived from a very young age. Even one-month-old babies are able to perceive the difference between [pa] and [ba], regardless of their language environment. Very young babies show preferences for the voice of their mother over the voices of other women; they also prefer the language their mother speaks: a baby of a French-speaking mother prefers to hear French over other languages. Before the child can produce any words, he or she has learnt some of the basic intonation patterns and auditory characteristics of the language. By a year of age, the child’s ability to hear sound contrasts that are phonemic in the language being learnt is enhanced, while the ability to hear sound differences that are not contrastive begins to deteriorate. Nasals and stops are generally among the earliest consonants learnt (as in mama and papa), [ɑ] and [i] the earliest vowels, with [u] appearing slightly later. Labial consonants tend to be mastered earlier than consonants at other places of articulation. By the age of four, children learning English are generally able to produce all the contrasting vowels and diphthongs, though a few consonants are likely to still cause difficulty. These include the fricatives [θ] and [ð], which are rare across the world’s languages and are the last consonants to be fully mastered, as well as affricates ([ʧ] and [ʤ]); the alveo-palatal fricatives [ʃ] and [ʒ], and the voiced fricatives [v] and [z] also cause difficulty. In some phonetic environments, [l] and [ɹ] are not easily distinguished. The position of a consonant in a word is relevant to its learning. Consonants are more likely to be correctly produced at the beginning of words than elsewhere. Final consonants generally emerge latest in production. 
An important characteristic of language learning is that perception precedes production: children are often able to perceive contrasts that they are unable to produce. This is nicely revealed in the following quote: One of us, for instance, spoke to a child who called his inflated plastic fish a fis. In imitation of the child’s pronunciation, the observer said: ‘This is your fis?’ ‘No,’ said the child, ‘my fis.’ He continued to reject the adult’s imitation until he was told, ‘That is your fish.’ ‘Yes,’ he said, ‘my fis.’ (Brown and Berko 1960: 531)

Trends are also discernible in the ways in which children change the sound-shape of words they produce, replacing certain phones by non-adult ones. Among the trends are the following:

● Velars are often replaced by alveolars; for instance, gone might appear as [dɒn].
● Fricatives tend to be replaced by stops; thus see might be pronounced [tiː].
● Word-final consonants tend to be omitted; for instance, kick might be pronounced as [ti] (with simultaneous replacement of the velar stop by an alveolar stop).
● Unstressed syllables are often omitted, as in [nɑːnə] for banana.
● Consonant clusters tend to be avoided; thus sky might be produced as [kaɪ].
● There is a tendency for phones to harmonize within words. Thus dog might appear as [gɔg], where the first consonant has fully harmonized with the second; thumb could appear as [nəm], showing partial harmonizing of the initial consonant with the following nasal.
● Laterals and rhotics tend to be replaced by glides: [l] might be replaced by [j], as in [jɛg] for leg.
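
Several of the substitution trends just listed can be pictured as simple rewrite processes over a sequence of phones. The following Python sketch is only an illustration under stated assumptions: the segment inventories, the function names and the replacement of [ɹ] by [w] are hypothetical choices made for the example, and real child forms are far more variable than any fixed rule set suggests. Transcriptions follow those used in the list above.

```python
# Illustrative sketch: the inventories and rules below are simplifying
# assumptions made for this example, not an analysis given in the text.

CONSONANTS = {"p", "b", "t", "d", "k", "g", "m", "n", "ŋ", "f", "v",
              "s", "z", "θ", "ð", "ʃ", "ʒ", "l", "ɹ", "j", "w"}
VELAR_TO_ALVEOLAR = {"k": "t", "g": "d", "ŋ": "n"}
FRICATIVE_TO_STOP = {"f": "p", "v": "b", "s": "t", "z": "d"}
LIQUID_TO_GLIDE = {"l": "j", "ɹ": "w"}  # [ɹ] -> [w] is an assumed example


def front_velars(segments):
    """Replace velars by alveolars, as in gone -> [dɒn]."""
    return [VELAR_TO_ALVEOLAR.get(s, s) for s in segments]


def stop_fricatives(segments):
    """Replace fricatives by stops, as in see -> [tiː]."""
    return [FRICATIVE_TO_STOP.get(s, s) for s in segments]


def drop_final_consonant(segments):
    """Omit a word-final consonant, if there is one."""
    if len(segments) > 1 and segments[-1] in CONSONANTS:
        return list(segments[:-1])
    return list(segments)


def reduce_initial_cluster(segments):
    """Keep only the last consonant of a word-initial cluster (sky -> [kaɪ])."""
    count = 0
    while count < len(segments) and segments[count] in CONSONANTS:
        count += 1
    if count > 1:
        return list(segments[count - 1:])
    return list(segments)


def glide_liquids(segments):
    """Replace laterals and rhotics by glides, as in leg -> [jɛg]."""
    return [LIQUID_TO_GLIDE.get(s, s) for s in segments]


def child_form(segments, *processes):
    """Apply the chosen processes, in order, to an adult form."""
    for process in processes:
        segments = process(segments)
    return segments


# kick: velar fronting followed by final-consonant omission
print(child_form(["k", "i", "k"], front_velars, drop_final_consonant))
```

The processes are kept as separate functions because, as the examples show, a given child form may reflect one trend, or several at once (as with kick), but not necessarily all of them.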


Learning the lexicon We mentioned above that the child produces their first recognizable words at around a year of age. Between 18 and 36 months, the child’s vocabulary increases rapidly, doubling every six months; it doubles again over the next year. By four years of age, a typical child is estimated to have a vocabulary of around 1,600 words. The early lexicon of children learning English tends to be made up of a high proportion of nouns. This has been thought to reflect the inherent conceptual simplicity of nouns over verbs; it could also reflect a greater salience of nouns in caretaker speech in English. The early lexicons of children learning languages in which verbs occur more frequently than in English, and in languages in which they tend to occur finally in the clause, sometimes show a higher proportion of verbs (Snow 1995), though other studies have detected no such difference. As in the learning of phonetics and phonology, perception precedes production. By about 18 months of age, when a child has an active vocabulary of around fifty words, some studies have revealed that they can understand up to five times as many words. This difference remains throughout life: adult speakers recognize and understand many more words than they actively use.
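
The doubling schedule just described can be worked through as simple arithmetic. The sketch below is a rough illustration, not a model of any individual child: it assumes the figure of around fifty words at 18 months mentioned above as a starting point, and the function name is hypothetical.

```python
def projected_vocabulary(start_words=50):
    """Project vocabulary size (words) by age (months) under the doubling schedule.

    Assumption: start_words is the active vocabulary at 18 months.
    """
    sizes = {18: start_words}
    # Doubling every six months between 18 and 36 months.
    for age in (24, 30, 36):
        sizes[age] = sizes[age - 6] * 2
    # One further doubling over the following year.
    sizes[48] = sizes[36] * 2
    return sizes


print(projected_vocabulary())
# {18: 50, 24: 100, 30: 200, 36: 400, 48: 800}
```

With a starting point of fifty words the projection reaches 800 words at four years, the same order of magnitude as the estimate of around 1,600 cited above; a starting point of one hundred words would give 1,600. All of these figures are approximations, so exact agreement should not be expected.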

Learning of meaning Learning a lexical item is more than learning a phonetic or phonological form. The child also has to learn the meaning associated with the form. Learning meaning is not a straightforward process; the meaning of a word is not directly perceptible. Children do not, however, use lexemes meaninglessly; they assign content to the lexemes they learn, and there is a surprisingly good correlation between the meaning assigned by children and the meaning in the adult language. Errors in meaning assignment are of three main types. Overextension refers to the child’s generalization of the meaning of the word beyond the sense in the adult language. The word might be extended to all things sharing a general feature of colour, shape, size or whatever. For example, the word daddy might be used to refer to any man, doggy to all four-legged hairy animals, or moon to all round things. Overextension need not necessarily apply equally to the production and comprehension of a word. For example, one child used the word apple to refer to other similar round objects like balls and tomatoes, but was able to correctly pick out the apple from a collection of such items when asked to identify the apple. Less common is underextension, where the child assigns a narrower meaning to the word than in the adult language, using it to designate a more restricted range of objects or events. For example, the word doggy might be reserved for just the pet dog, or duck to just a toy duck. Rarely, children assign a completely mistaken meaning to a word; this is referred to as mismatch. Mismatches are usually motivated: they involve assignment of some aspect of meaning present in the context in which the word was heard to the wrong item. Thus one child who saw his first bicycle at a party for a child named Mikey for some time afterwards called all bicycles and tricycles mikeys. 
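
The three error types can be pictured, very roughly, as relations between two sets: the things the child applies a word to, and the things the adult language applies it to. The Python sketch below rests on that simplifying assumption; the function name and the referent labels are hypothetical, and the sketch models production only (as noted above, comprehension may be more adult-like than production).

```python
def classify_extension(child_referents, adult_referents):
    """Compare the child's use of a word with the adult sense, as sets of referents."""
    if child_referents == adult_referents:
        return "adult-like"
    if child_referents > adult_referents:      # proper superset
        return "overextension"
    if child_referents < adult_referents:      # proper subset
        return "underextension"
    if child_referents.isdisjoint(adult_referents):
        return "mismatch"
    return "partial overlap"


# Hypothetical referent sets for illustration.
adult_doggy = {"pet dog", "neighbour's dog", "stray dog"}

print(classify_extension(adult_doggy | {"cat", "sheep"}, adult_doggy))
print(classify_extension({"pet dog"}, adult_doggy))
print(classify_extension({"bicycle", "tricycle"}, {"the child Mikey"}))
```

The three calls correspond to doggy used for all four-legged hairy animals, doggy reserved for the pet dog, and mikey used for bicycles and tricycles.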
The meanings of some words are more difficult for children to learn than others, both in comprehension and production. Concrete vocabulary is easier to learn than abstract vocabulary,


and lexemes expressing relative meanings. Children tend to assign absolute meanings to adjectives such as big in their first few years. Kinship terms tend to be first given ego-centred meanings: mother and father being reserved for just the child’s parents; the relational nature of kinship terms may not be fully appreciated until the age of seven or older. (In fact, it seems that the meanings of some complex kinship terms found in many Australian languages are not learnt until late adolescence.) English-speaking children often do not learn the relative use of left and right – as illustrated in the dog is to the left of the boy – until eleven or twelve. Children under about eight years of age generally do not appreciate that many words are ambiguous, and thus may not be able to understand puns and jokes that rely on lexical ambiguity.

Learning morphology Cross-language differences are prominent in the learning of morphology. In English it begins rather late; in languages with richer morphologies, such as Hungarian, Spanish and Turkish, learning of morphology begins earlier. In English words often occur in their root or stem form, which is the most salient form. But in more morphologically complex languages the bare root or stem form may not be free; the lexical item might be encountered only in different inflected forms. This is the case for Spanish verbs, which are encountered in inflected forms such as como ‘I eat’, comes ‘you eat’, comía ‘I ate’, comías ‘you ate’, but never in the uninflected root form com, which is a bound morpheme. The child exposed to Spanish learns verbal suffixes earlier than does the child exposed to English because they are more pervasive in Spanish, and one is always present on a verb. In general, the more pervasive a morphological category is in a language the more rapidly it tends to be learnt. Children tend to learn the grammatical morphemes of their language in relatively consistent orders. An investigation of three children by Roger Brown (1973) revealed a high degree of similarity among the children in terms of the morphemes they learnt earliest, and the sequence in which they were learnt. For instance, the earliest learnt morphemes are likely to be: (1) the verb suffix -ing (at around 24 months); (2) the prepositions in and on (at around 24 months); (3) the regular noun plural suffix -s ∼ -z ∼ -əz (at around 24 months); (4) irregular past tense forms of frequent verbs (around 30 months); and (5) the possessive clitic -s ∼ -z ∼ -əz (around 30 months). Similar patterns have been observed in learning English morphemes in a number of children from different family environments, although there is individual variation in the order and especially the age at which the child learns the morphemes. 
It should be remarked that the order in which the grammatical morphemes are learnt does not perfectly reflect their frequency in adult speech. Indeed, the most frequent word in adult English, the (see Table 9.1), does not appear in the above list. This word does not usually appear until the child is three years of age, and then only in seventh position in the order of learning. The regular nominal plural morpheme -s ∼ -z ∼ -əz is, as just mentioned, one of the earliest learnt morphemes in English. However, the morphology of number marking in English shows irregularities, and the learning of plural marking of nouns is a staged process. The typical stages are as follows (based on Moskowitz 1978):


a. First, no nouns distinguish number: a single form is used regardless of how many things are referred to.
b. Next, the child has a single noun that distinguishes number, usually the irregular foot ∼ feet; other nouns do not distinguish number. The singular and plural forms of this word are both highly frequent, and are saliently different phonetically.
c. Third, another high-frequency irregular plural form is learnt alongside its singular form, typically man ∼ men.
d. Following this, the regular allomorph -s appears on nouns ending in voiceless consonants, and -z on nouns ending in voiced segments; the third allomorph -əz is not yet in use, and nouns like house that end in a sibilant appear in just one form. The two regular allomorphs are overgeneralized, and appear also on the irregular plural forms – so the plural of man becomes /mænz/. For foot, many children have two plural forms, /fʊts/ and /fiːts/.
e. In the fifth stage, the allomorph /əz/ appears on words ending in sibilants, such as house. However, it is overgeneralized to all nouns, so that the child is producing plural forms like /kætsəz/ or /kætəz/ instead of /kæts/.
f. Sixth, most overgeneralized plural forms, with the exception of /mænz/, are corrected.
g. Finally, all overgeneralizations are corrected.

Notable in this sequence is the fact that adult irregular plural forms of high-frequency words are learnt early (though not necessarily with the plural meaning); subsequently non-adult forms involving the regular allomorphs appear. These regular forms coexist for some time with the irregular forms, before being completely replaced. Observe also that when they appear, the regular plural allomorphs are overgeneralized. Learning of the past tense of English verbs follows a similar sequence of stages.
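
The phonological conditioning of the three regular plural allomorphs, and the more restricted pattern described for stage (d), can be sketched as follows. This is a minimal illustration under stated assumptions: words are represented as lists of phonemic segments, the segment sets are deliberately incomplete, and the function names are hypothetical.

```python
# Assumed, simplified segment classes for the illustration.
SIBILANTS = {"s", "z", "ʃ", "ʒ", "ʧ", "ʤ"}
VOICELESS_NON_SIBILANTS = {"p", "t", "k", "f", "θ"}


def regular_plural_allomorph(final_segment):
    """Choose the adult regular allomorph from the noun's final segment."""
    if final_segment in SIBILANTS:
        return "əz"
    if final_segment in VOICELESS_NON_SIBILANTS:
        return "s"
    return "z"  # voiced consonants and vowels


def adult_plural(segments):
    """Regular adult plural: stem plus the conditioned allomorph."""
    return segments + [regular_plural_allomorph(segments[-1])]


def stage_d_plural(segments):
    """Stage (d): only -s and -z are in use; sibilant-final nouns have one form.

    The rule is applied to every noun, so irregular nouns are overgeneralized.
    """
    if segments[-1] in SIBILANTS:
        return list(segments)
    return adult_plural(segments)


print(adult_plural(["k", "æ", "t"]))    # cat
print(adult_plural(["b", "ʌ", "s"]))    # bus
print(stage_d_plural(["m", "æ", "n"]))  # man, overgeneralized
print(stage_d_plural(["b", "ʌ", "s"]))  # bus, unchanged at stage (d)
```

The sketch uses bus rather than house for the sibilant-final case, since the adult plural of house also involves a change in the stem-final consonant, which this simple rule does not capture.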

Learning syntax We have already noted some of the major features of the earliest stages of learning syntax, beginning at around 18 months of age, when the child starts to combine words into longer utterances. The earliest stages, as we saw, are characterized by a lack of grammatical morphemes; these do not begin appearing until the child’s third year. To illustrate the staged nature of learning syntax, in the following subsections we outline the learning of just three syntactic constructions in English.

Negative constructions Three main stages have been identified in the learning of negative constructions in English. These stages overlap, and the child shows steady progress towards the adult norm. In the first stage, at around 18–26 months, negative markers no and not are put at the beginning or end of the utterance, as in No drink, No I can go and Gone no. In the second stage, which begins during the child’s third year, the negative word starts to be used between the subject and verb, as in You no do that, and I no eat it; in verbless clauses it occurs between the two noun phrases, as in That not mine. At about the same time, negative forms such as don’t and can’t appear as unanalysed


elements, also within utterances, as in I can’t see and I don’t want it. The third stage sees the appearance of other auxiliary forms with attached negative markers (isn’t, won’t), and their morphological analyses. Some examples are She won’t let go, She isn’t going and You’ve not got one. Some of the more advanced negative constructions are not learnt until the early school years. For example, the correct use of some and any in I’ve got some and I haven’t got any, and of hardly and scarcely.

Interrogatives A well-studied aspect of the learning of English syntax is how children learn interrogatives (see §6.3). Three main stages, again not discrete, have been identified. The first stage employs just intonation: high rising tone on an utterance signifies that it is a question. For example, Daddy there or I can go with high rising tone are used in requests for information. The second stage occurs during the child’s second year, when he or she begins to use interrogative words, first what and where, and later who, why, when and how. These words are put at the beginning of the clause, which is uttered on a rising intonation contour, as in Where horse go? and What you doing there? In the third stage children learn the auxiliary verbs be, have and do, and interrogative structures involving an auxiliary verb followed by the subject. This is learned first for yes-no interrogatives (as in Can I go? and Did he go?), and somewhat later for information interrogatives (as in When can I go? and Where are you going?).

Complex sentence constructions At about the age of three, sentences begin to appear that consist of more than one clause. To begin with, most of these are coordinate constructions using the conjunction and. Subordinate constructions (e.g. when and if clauses, and relative clauses) are increasingly used from this age, though they remain rarer than coordination constructions. Words like so, if, after, what, because and when are used in these constructions, though not necessarily in the adult way. An order of mention strategy is employed whereby the event of the first clause is presumed to occur before the event of the second, as in I fell down because I hurt my knee. The same strategy is used in the child’s comprehension of complex sentences. Children under six years of age experience difficulties in correctly interpreting complex sentence constructions, especially subordinate constructions. And more sophisticated conjunctive items, such as really, anyway, though, actually and of course, do not emerge until even later, perhaps not until the child is seven years old.

12.2 Strategies for child language learning In this section we discuss some strategies that have been suggested to explain how children learn to speak their first language. We begin with four major proposals, briefly mentioning evidence pro and con. Following this, we discuss some specific mechanisms that have been proposed for learning the meaning of lexical items.


Broad strategies for language learning Conditioned-response learning Conditioned-response learning is a theory of learning associated with the psychological theory of behaviourism, which was applied to language learning by B. F. Skinner (1957). According to behaviourism, language develops from adult reinforcement and shaping of the babbling of the infant, and subsequently matures like other learned behaviour. Two types of conditioned-response learning are involved: classical conditioning and operant conditioning. In classical conditioning a stimulus (e.g. presentation of meat to a dog) that invokes a natural response (salivation) is consistently accompanied by another stimulus (e.g. ringing a bell). Eventually, the accompanying stimulus (the bell ringing) invokes the response (salivation), even in the absence of the initial stimulus (presentation of meat). Learning the meaning of a word was believed by Skinner to follow this process: if an object is presented to the infant accompanied by the word for it, the child begins to associate the word with the object, ultimately responding to the word in the same way as to the object. When this happens, the child has learnt the word. Operant conditioning involves rewarding certain behaviour, which is thereby strengthened; unrewarded behaviour eventually disappears. The infant behaves in such a way as to obtain rewards. According to Skinner, children are reinforced by adults who reward their early attempts to speak with smiles, attention and the like. As time goes on, adults become more demanding, and reward only the best approximations to adult speech. This selective reinforcement, it is suggested, gradually shapes the child’s behaviour in the direction of the adult norm. Although conditioning might account for some aspects of language learning, there is much it cannot account for, and it has been severely criticized by linguists, beginning with Chomsky’s excoriating review (1959) of Skinner’s Verbal Behavior. 
One criticism is that children learn the grammar of their language despite the fact that they are rarely if ever reinforced for producing grammatical utterances, or receive negative reinforcement (punishment) for ungrammatical utterances. By contrast, children are not infrequently reinforced for telling the truth and punished for lying; yet they typically end up as inveterate liars as adults! Another criticism is that conditioning cannot account for comprehension, which, as we have seen, precedes production throughout language learning – how could it be conditioned? A third criticism is that it cannot account for the facts of learning of irregular morphology: why would the child who has learnt the correct irregular forms start using incorrect regular ones (see p. 292)? Finally, conditioned-response learning cannot account for the child’s production of word forms and sentences they’ve never heard before.

Imitation Another strategy for language learning is imitation. Human beings are excellent imitators, surpassing other primates. Imitation is a common means by which children (and adults) learn new things, including aspects of language. Children actively imitate the speech of those in their social environments, sometimes at inappropriate times, to the embarrassment of their parents. Caretakers

Language Learning

often encourage their charges to imitate what they say, and provide especially clear models of what should be said in the form of caretaker speech (see p. 288). Children frequently imitate new lexical items. Imitation may also be relevant to the learning of grammar. Children often imitate sentence patterns they are unable to produce spontaneously, and later stop imitating them when they are able to produce them. Imitation may thus serve to link comprehension with spontaneous production. But although some aspects of language learning can be accounted for by imitation, not all can be. Four pieces of evidence are often cited against the significance of imitation. The first is the overgeneralization that occurs at one stage in the learning of morphology, where former correct irregular forms are replaced or augmented by incorrect regular forms. The latter forms, such as wented, are said to be unlikely to have been imitated from adult speech. Second, children are sometimes unable to imitate adult utterances exactly, even when encouraged to do so. A good illustration of this is the dialogue in (12-1), in which the child is unable to replicate a pattern, despite numerous repetitions by the adult. It seems that children are unable to reproduce aspects of the utterance that lie beyond their competence.

(12-1)
CHILD: Nobody don’t like me.
MOTHER: No, say ‘Nobody likes me.’
CHILD: Nobody don’t like me.
(Eight repetitions of this dialogue.)
MOTHER: No, now listen carefully: say ‘Nobody likes me.’
CHILD: Oh! Nobody don’t likes me.
(McNeill 1966: 69)

Third, if children learn largely by imitation, why don’t they learn grammatical morphemes such as the and a ∼ an, which are among the most frequent morphemes in English, much earlier than they do? Fourth, imitation does not explain why comprehension precedes production in learning, and why children can perceive phonetic differences they are unable to produce. None of these objections argue that imitation plays no role in learning, and none are irrefutable. Studies of what is actually said to children are relatively few and limited, making it difficult to evaluate the claim that they are not exposed to overgeneralized forms like wented (e.g. in caretaker speech) – as a father I certainly used many of these in my caretaker speech. Furthermore, imitation can never be precise repetition in all details, and a rejoinder could be that the child is still in the process of learning what is a significant feature vs. what is not. Put in another way, comprehension may be a crucial component of ‘accurate’ repetition.

Following on from the last point, elicited imitation is a technique sometimes used to determine a child’s competence in a particular domain of grammar. Here the experimenter reads out a sentence to be repeated. If the child changes anything in their repetition, or fails to correct an error in the model sentence, this is presumed to indicate that it is an aspect of the grammar that has not yet been learnt.

Hypothesis testing In adopting the strategy of hypothesis testing the child is presumed to be behaving like a scientist, making guesses about how the language works, and testing these guesses against the evidence from speech, and the reactions of interlocutors. According to this theory, the child learns a language through their attempts at analysing it grammatically. One piece of evidence for hypothesis testing comes from overgeneralized forms such as wented and feeded that are believed to be inventions of the child, formed according to regular morphological processes. The child has apparently figured out the general rule, and applied it in novel cases: the child hypothesizes that past tense is formed by adding -ed, and tests it on the verbs go and feed. On the other hand, it seems implausible to attribute to babies and young children the abstract styles of thinking and experimentation employed by scientists, and honed over many years of training. It is not clear that children actively seek empirical confirmation or disconfirmation of their hypotheses – that they really do test their hypotheses against actual empirical data. Another criticism concerns where their hypotheses come from: on what basis do infants make their guesses, given their lack of experience, and that they are rarely given explicit training, and when they are, are often unable to make effective use of it? The hypothesis testing approach to language learning also ignores the fact that language is a skill, that learning a language is not just learning knowledge about the grammar and lexicon of a language, but the ability to use the language to produce utterances meaningful in the context.
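
The rule-based reasoning attributed to the child in this account can be made concrete with a short sketch in Python. It is an illustration under stated assumptions only, not a model of how children actually learn: the function names regular_past and child_past, the two-verb lexicon and the flag knows_exceptions are all hypothetical, and spelling adjustments are ignored.

```python
# Hypothetical sketch of overgeneralization; all names here are illustrative.
# The tiny lexicon of stored irregular past forms is an assumption.
IRREGULAR_PAST = {"go": "went", "feed": "fed"}


def regular_past(form: str) -> str:
    """Apply the hypothesized general rule: past tense = form + -ed.

    Orthographic adjustments (e.g. like -> liked) are deliberately ignored.
    """
    return form + "ed"


def child_past(verb: str, knows_exceptions: bool) -> str:
    """Return the past form a learner might produce for a verb.

    If the learner consults stored exceptions, the irregular form is used;
    otherwise the general -ed rule is applied across the board.
    """
    if knows_exceptions and verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]
    return regular_past(verb)


if __name__ == "__main__":
    for verb in ("walk", "go", "feed"):
        print(verb, child_past(verb, knows_exceptions=False),
              child_past(verb, knows_exceptions=True))
    # The rule applied to an already irregular past form
    print(regular_past("went"))
```

When the stored exceptions are not consulted, the sketch produces feeded for feed, and applying the rule to went gives wented, the kinds of overgeneralized form discussed above.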

Innateness A fourth proposal, promulgated in the 1960s by linguists working within generative grammar (see p. 17), is innateness, the notion that children are born with an innate capacity to acquire language, and that much of our knowledge about language is genetically encoded. The child has little to learn since much of the knowledge about language is hardwired in the brain. When children are exposed to a language, general principles of discovering the structure of the language are automatically put into operation; these principles constitute the child’s language acquisition device (LAD). The LAD is deployed to make hypotheses about the grammar of the language being learnt, thus answering one of the difficulties raised for the hypothesis testing account: the guesses of the child are based on innate knowledge. Given that the grammatical structure of languages varies considerably from language to language, what is innate must presumably be very abstract. One frequently cited argument for innateness is the speed and accuracy with which language is learnt by the child. And this is despite alleged serious inadequacies in the language sample the child is exposed to, dubbed the poverty of the stimulus. It is further argued that there are certain errors that children never make, that could not be inferred from the evidence available to children from utterances they hear. The problem with these claims is that they are not backed up by empirical evidence. Increasingly investigators are finding that the data available to the child is not as fragmentary and unruly as suggested. Moreover, it turns out that children do make some of the types of error that they have been claimed never to make.

The exact nature and properties of the LAD are highly controversial and are in constant flux, in tune with theoretical changes in generative grammar. While the innateness theory has dominated in language learning studies since the 1960s, there are signs of dissatisfaction with it. Increasingly investigators are exploring alternatives in which language is not seen as unique among human cognitive phenomena, requiring a dedicated LAD.

Strategies for learning meaning of words It is no trivial matter for a child to determine what a lexical item means, how it should be used, and what part-of-speech it belongs to. The following six principles have been proposed as strategies the child employs for determining the meaning of words (Golinkoff et al. 1994):
● Reference: assume that words refer to things, events and qualities.
● Extendibility: assume that words apply to more than just the specific thing, event or quality referred to in the first-observed instance – assume they label types, not tokens.
● Object scope: assume that words denoting objects denote whole objects, not portions of objects.
● Categorical scope: assume that words can be extended to objects in the same basic-level category as the thing referred to in the originally observed usage.2
● Novel name-new category: assume that novel word-forms apply to things, events or qualities that you do not yet have a name for. This principle is based on a dispreference of synonymy: a new label will more likely denote something new than be an alternative term for something already known.
● Conventionality: assume that speakers prefer specific over general lexemes.

These strategies can be thought of as heuristic principles for children to operate with so as to quickly and effectively learn the meanings of new words. They resemble the Gricean maxims (see §6.3) more than grammatical rules, in that they are effective (but not infallible) operating principles. Another strategy is to infer word meanings from grammatical properties of the utterance. This strategy, called syntactic bootstrapping, was tested in an experiment in which preschool children were shown a depiction of an unfamiliar action, such as a person doing something to a pile of materials with an unfamiliar tool (Brown 1957). One group of children were told In this picture you can see nissing; a second group were told that they could see some niss; and a third group, that the picture showed a niss. Children from each group were then asked to select another picture showing nissing, niss and a niss. The result was that the children selected a picture depicting the same action, the same material and the same tool, respectively.

Researchers disagree as to whether inferring word meaning is assisted by the syntax (syntactic bootstrapping) or the meaning of words provides children with cues to the grammatical analysis of sentences (semantic bootstrapping). Most likely the two processes interdigitate.

12.3 Second-language learning In many parts of the world children grow up speaking more than one language, having learnt them during childhood by processes described in §12.2. Sometimes a person learns, or attempts to learn, another language as an adult. We will refer to such a language as a second language (L2), regardless of whether it is the person’s second, third or later language. We will refer to the process of its learning as second-language learning or L2 learning; it is also called adult language learning.

Developmental stages in second-language learning Like L1 learning, L2 learning is a staged process; the stages are not, however, exactly the same. In this section we outline some aspects of L2 learning of English, a major focus of research on L2 learning.

L2 learning of phonetics and phonology The phonological system of the learner’s L1 may be reflected in errors in pronunciation of L2, especially early on. For example, the lack of voicing contrast in word-final stops in Dutch, Danish and German may be carried over to L2 English, and final voiced stops replaced by the corresponding voiceless stops. The absence of a velar nasal phoneme in Hungarian – [ŋ] occurs only preceding another velar consonant – can be reflected in the pronunciation of words such as singer as [sɪŋgə]. One investigation of the learning of stress in English by Polish and Hungarian L2 learners revealed that 95 per cent of the errors in stress assignment were the result of influence from L1. (Stress placement in both languages is predictable.) This phenomenon, referred to as interference or negative transfer, is not restricted to phonology, but is also found in the L2 learning of morphology and syntax. For instance, Spanish speakers learning English as a second language often transfer their possessive construction involving de ‘of’ as in el libro del profesor ‘the teacher’s book’, using the corresponding construction the book of the teacher instead of the possessive with -s. Negative transfers are usually more easily corrected in morphology and syntax than in phonology.

L2 learning of morphology One investigation of adult L2 learners, including speakers with sixteen different L1s, revealed that they learnt eleven English grammatical morphemes in roughly the same order: (1) progressive -ing; (2) singular copula is ∼ -’s (as in Bush is the president); (3) plural -s ∼ -z ∼ -əz; (4) articles the and a; (5) singular auxiliary is ∼ -’s (as in the dog’s barking now); (6) irregular pasts of some frequent verbs (e.g. went); (7) third person singular present verb suffix -s; and (8) the possessive enclitic -s ∼ -z ∼ -əz. This sequence is remarkably similar to the sequence of L1 learning of English morphology.

L2 learning of syntax L2 learning of English negative and interrogative constructions is similar to their learning in L1. Regardless of the learner’s L1, L2 learners typically begin by putting the negative particle not or no in clause-initial position, then before the main verb, and finally, in the correct position, with the correct choice of auxiliary verb, although tense marking may be imperfect, as in he didn’t felt it. Similarly, L2 learning of interrogative constructions progresses from rising intonation on an ordinary declarative structure, through use of initial WH words, to use of auxiliary verbs, and ultimately the correct ordering of subject and auxiliary. Interference from the speaker’s L1 can affect the order of the stages, and the length of time the learner remains at the stage; it can also result in additional stages, such as Saw you that? by L1 German speakers, modelled on the interrogative construction of German.

Effects of age on L2 learning Everyone learns their first language with apparent facility and ease; differences among adult L2 learners are more obvious: some L2 learners reach a high level of command of the second language, while others do not. The Polish-born author Joseph Conrad (1857–1924), who wrote a dozen or more novels in English including Lord Jim, is frequently cited as someone who attained a high level of command of English as an L2 learner – although Bertrand Russell said that he spoke it with a very Polish accent. Age is generally considered to be the most important factor affecting L2 learning. It is widely accepted that adults learn an L2 more rapidly in the short term, while children start off more slowly, but overtake adults within about a year or so. This claim has, however, been criticized, among other things on the grounds of the limited range of L2 learning environments in which it has been tested. That adult learners cannot achieve a native-like accent in their L2 has also been challenged. Some investigations reveal that a small proportion of adult learners can attain native-like accents, and fall within the range of native speakers; moreover, they cannot be reliably distinguished as second-language speakers by native speakers. Likewise, a small proportion of adult L2 learners can apparently attain native speaker-like competence. These observations contradict the frequently heard claim that it is impossible for an adult learner to gain full control of a second language. But they do not contradict the generalization that the older one is when exposed to the L2, the more difficult it is to learn the language, and the less successful the learner is likely to be. A widely held view is that there is an optimal age-window for learning a language. According to the critical period hypothesis, there is a biologically determined window for the full learning of language, extending from about two months of age to about thirteen years (Lenneberg 1967). 
After this age, the neurophysiological ability for language learning is lost or greatly impaired. The biological evidence for this hypothesis is, however, weak. And age could be relevant for other reasons. For instance, adults are unlikely to learn an L2 in environments remotely like those in which a child is reared and learns their L1; they are also superior in abstract thought, and already have knowledge of another language.

Transfer Aside from influences of the grammatical systems of L1 on L2, pragmatic functions such as manners and strategies of soliciting information, requesting action, refusing offers and the like can also be transferred from L1 to L2. More interestingly, the L2 system may have an inverse influence on the L1 system. For example, VOT (see §2.3) for stops in English is longer than the VOT of corresponding French stops. The VOTs of stops in the speech of French speakers who learnt English as an L2 tend to be longer than the VOTs for monolingual speakers of French. They have a VOT in between the VOTs of the two languages, regardless of the language they are speaking at the time, though the actual values of the VOT may depend on the language spoken. Evidence suggests that such bilinguals have two phonological systems, though both systems differ from the systems of monolinguals. By contrast, for bilinguals who learnt both languages in childhood the respective systems are apparently indistinguishable from the systems of monolingual speakers. The meanings of words in L1 can also be affected by an L2. Monolingual speakers of Korean use the term paran skej ‘blue’ for colours that are greener than colours covered by the same term when used by speakers who learnt English as an L2 in adulthood. Whether these L2 bilinguals have a single lexical system for both languages, or two separate systems – one for each language, and each possibly different from the respective monolinguals’ systems in the languages – is a moot point. Evidence is conflicting: some investigators argue for two possibly overlapping systems, others that the L1 system is still operating while the bilingual individual is speaking the L2.

Factors relevant to L2 learning Many factors are relevant to the success of L2 learning, including personality factors. One is motivation, the need or desire to learn the L2, which can be a desire for proficiency so as to participate in the life of the community, or for more practical purposes such as getting a job or promotion. A second factor is aptitude: adult L2 learners differ in their talent for learning a second language. A third consideration is the learner’s attitude to the second language. Negative attitudes are likely to lead to decreased motivation, and to failure to attain proficiency in the L2, while positive attitudes are likely to be associated with increased motivation and greater likelihood of success. A fourth factor is empathy, the ability to take another person’s perspective. It has been suggested that empathetic persons are more likely to succeed in language learning in natural communicative situations. Being less inhibited than others, they may be less embarrassed by making mistakes. Also relevant are the circumstances and manner in which the L2 is learnt. Sometimes a distinction is drawn between foreign-language learning (in which the L2 is learnt outside of the community of speakers – for instance, Hungarian and Finnish (Uralic, Finland) in Denmark) and second-language learning (where the language is learnt in its speech community). It seems reasonable to believe that the latter situation is more conducive to L2 learning than the former. But things are not always as simple as this. In the Netherlands and Scandinavian countries adult
monolingual speakers of English can experience difficulties in entering into speech interactions in the language of the country because speakers immediately switch to English when a foreigner is present.

Summing up Human beings are predisposed to learn a language, and something must certainly be genetically coded. There is considerable disagreement among investigators as to what this is. According to some, specific knowledge about the grammar underlying all human languages – universal grammar – is a part of our biological heritage. Others argue that nothing specific to language is genetically coded, that we have merely a language-ready brain (and body). Language learning proceeds by a remarkably consistent sequence of stages, though children differ according to when they reach the stages. They are: cooing and babbling; the one-word holophrastic stage; the two-word stage, and basic mastery. Some aspects of grammar are not learnt until adolescence, and continued elaboration occurs throughout an individual’s life. Regularities exist in the learning of all aspects of language. In all domains perception precedes production. The child’s learning of lexical semantics is surprisingly accurate, though errors of overextension and underextension do occur. To explain the accuracy of semantic learning, a set of strategies has been proposed: reference, extendibility, object scope, categorical scope, novel name-new category and syntactic bootstrapping. General mechanisms for the child’s learning of language are conditioned-response learning, imitation and hypothesis testing. None of these explain all aspects of language learning. According to a widely held view, language is too complex to be learnt by the child from the imperfect model it is exposed to. This notion, the poverty of the stimulus, has it that there is an innate language acquisition device (LAD) guiding the child in its construction of the grammar of the language it is exposed to. Second-language learning is also a staged process. Age is perhaps the most important factor in success of L2 learning. The critical period hypothesis has it that there is a biologically determined window for attaining native speaker competence in an L2. Also important to the success of second-language learning are personality factors including motivation, aptitude, attitude and empathy. A recurrent feature of L2 learning is transfer or interference from the L1 system. Transfer can also proceed from the L2 to the L1.

Guide to further reading Good article-length accounts of first language learning are Chapter 8 of Gleason and Ratner (1998), MacWhinney (2017) and Chapter 9 of O’Grady et al. (2017). Surveys of topics in first-language learning can be found in Fletcher and MacWhinney (1996); Clark (2009) and Saxton (2010) are
good textbook treatments. Three good anthologies are Bloom (1996), Bowerman and Levinson (2001) and Lust and Foley (2004). Ochs (1988) describes first-language learning in a non-Western culture, and Dan Slobin’s multivolume collection (1985a, 1985b, 1992, 1997a, 1997b) deals with first-language learning in many languages. Chapter 5 of Emmorey (2002) discusses first-language learning of deaf sign languages, primarily American Sign Language. Elman et al. (1997), Gopnik et al. (2001) and Sampson (2005) take non-innatist perspectives on language learning; Crain and Lillo-Martin (1998) and Pinker (1994) represent the innatist side. Michael Halliday (1977/1975) and Michael Tomasello (2003a) set out very different socialization and constructivist theories of first-language learning. The Child Language Data Exchange System (CHILDES), at https://childes.talkbank.org/, contains learning data – including audio and video recordings and transcriptions – from children in various languages. Cook (2017) provides an excellent account of current issues in second-language learning. Also worth reading are Chapter 10 of O’Grady et al. (2017) and Chapter 10 of Gleason and Ratner (1998). Recent textbooks on L2 learning include Gass et al. (2020) and Saville-Troike and Barto (2017); Gass and Mackey (2012) is a handbook. Evidence for the critical period hypothesis is discussed in Strozer (1994); Birdsong (1999) is a collection of articles presenting both sides of the argument. An alternative proposal, the perceptual magnet effect, is elaborated in Kuhl and Iverson (1995); see also Gopnik et al. (2001).

Issues for further thought and exercises
1 Two experimental methods that have been used to study speech perception in pre-speaking infants are the high amplitude sucking paradigm and the conditioned head-turn procedure. Find out about these two methods, and write a paragraph description of each, explaining their motivations (why do they work?).
2 How would you explain the use of the word-form mikey as a term for bicycles and tricycles by the child referred to in §12.1 by the strategies for determining the meaning of words given in §12.2?
3 Below are some utterances produced by three children at different stages of development. What is the most likely order of the stages of development of the children in these examples? Justify your ordering.
   a. You want eat?
      I can’t see my book
      Why you waking me up?
   b. Where those dogs goed?
      You didn’t eat supper
      Does lions walk?
   c. No picture in there
      Where momma boot?
      Have some?
4 What is the wug-test? Describe the test (see Berko 1958) and its motivations (i.e. what was the reason for developing it). Design a wug test to explore the learning of other grammatical categories such as: agentive derivations (-er as in farmer); the progressive aspect (the -ing form of verbs); and nominative and accusative cases (for a language with nominal cases as an inflectional category of nouns).
5 We mentioned that deaf children babble vocally, and later with their hands. See what you can find out about the learning of sign language by deaf children. What stages does it follow, and are the stages comparable to the stages of learning of spoken languages? (Some references are Meier and Newport (1990) and Newport and Meier (1985); the following website provides some information on the child’s learning of a deaf sign language: https://www.handspeak.com/; note that you will have to navigate to the appropriate place on this website under topic keywords.)
6 What is the gavagai problem? Give a brief description and comment on how serious a problem you think it might be to first- and/or second-language learners. Could the strategies mentioned in §12.2 assist in its resolution? What other factors might be brought into account?
7 The following is a small selection of two-word utterances of a child of two years and nine months of age (cited in Blake 2008: 237). Describe the morpho-syntax of this child’s speech as revealed by these examples.
   a. Bubble coming   ‘a bubble is coming’
   b. Bubble come     ‘a bubble is coming’
   c. Smack daddy     ‘I’m going to smack daddy’
   d. Naughty me      ‘I am naughty’ [Reply to ‘Why did you hit Lawrence?’]
   e. Gone pencil     ‘my pencil has gone’
   f. Smack Laurie    ‘smack Lawrence!’
   g. Sockie here     ‘your socks are here’
   h. Study bed       ‘he is studying in his bedroom’
   i. Finish tea      ‘have you finished tea [evening meal]?’
   j. Near daddy      ‘I’ll put this chair near daddy’
   k. Out bed         ‘I want to get out of bed’
   l. Clothes wet     ‘her clothes are wet’
8 What is foreigner talk? Compare it with caretaker speech, identifying similarities and differences. Do you think that foreigner talk is useful to the learning of an L2? Explain your reasons.
9 Find out about the immersion approach to second-language learning. Describe the method briefly (in a few paragraphs) and comment on its usefulness; do you perceive any inadequacies?

10 David Birdsong’s introductory chapter to his Second Language Acquisition and the Critical Period Hypothesis (1999: 1–22) outlines evidence for and against the critical period. Summarize the evidence he presents, and comment on its relevance to the critical period. How might proponents of the hypothesis deal with the counter-evidence? What do you conclude from Birdsong’s discussion?
11 The following is a transcription of a description of a classroom scene by an L2 speaker of English whose native language is Spanish. What ‘errors’ are represented? Which do you guess result from interference from Spanish, and which would you attribute to overgeneralization or other processes? Check a grammar of Spanish to see whether your guessed interference error is reasonable.
   In a room there are three womens . . . one is blond . . . blond hair . . . there are three womens . . . one woman is the teacher . . . and the other two womans are seat in the chair . . . one of them are . . . are blond hair . . . and the other woman . . . is black hair . . . the teacher is made an explanation about shapes . . . triangle circle

Research project The Guide to further reading mentions two socialization theories of language learning. Write an essay on one of them, providing a discussion of the main characteristics of the theory and the evidence supporting the theory. To what extent are the stages, learning processes and strategies discussed in §12.1 and §12.2 accounted for in the theory? What other stages, learning processes and strategies are proposed in the theory? What (if anything) is innate? (In answering this question you should consider not just what is said to be innate, but also what is assumed innate.) Provide an evaluation of the theory, highlighting what it accounts for and does not account for.

Part IV Language: Uniformity and Diversity

13 Gesture and sign languages

So far our attention has been directed to speech, to language conveyed in the auditory-vocal medium. In this chapter and the next we discuss the two other main mediums of human language: the visual-gestural and writing, respectively. We will see that the visual-gestural medium is employed in a range of ways in human communication, from gestures that accompany speech to fully fledged languages that are independent of speech. We discuss some of the characteristics of these visual-gestural systems, and how they relate to spoken languages.

Chapter contents
Goals
Key terms
13.1 The visual-gestural medium
13.2 Primary sign languages
13.3 Alternate sign languages
Summing up
Guide to further reading
Issues for further thought and exercises
Research project

Goals
The goals of the chapter are to:
● discuss the relation between gesture and speech and show that they are integrated together in a single system of communication;
● show that primary or deaf sign languages are languages in their own right, and satisfy Hockett’s design features;
● show that primary sign languages have phonological, morphological and syntactic structure;
● compare and contrast the structural features of primary sign languages with those of spoken languages;
● remark on the significance of sign languages to linguistics; and
● introduce alternate sign languages and sign systems used by hearing persons in contexts in which speech is inconvenient or proscribed.

Key terms
alternate sign languages
American Sign Language (ASL)
Auslan
beats
British Sign Language (BSL)
classifiers
eye-gaze
fingerspelling
gesture
handshape
home signs
imagistic gestures
location
manual gestures
movement
multi-channel signs
non-imagistic gestures
non-manual gestures
orientation
pointing
primary sign languages
sign space
village sign languages

13.1 The visual-gestural medium Preliminary remarks The visual-gestural medium (see p. 12) can be employed expressively and communicatively in many ways. As already mentioned, signed languages of the deaf employ this medium, using a system of codified gestures that function independently of spoken language. At the opposite extreme are bodily events that we share with animals, that express emotions (as mentioned in §10.1), such as the raising of hair (humans), feathers (birds) and fur (mammals) in fear. Most of these are not subject to conscious control. Between the two extremes are visible bodily movements that we more or less consciously and deliberately employ alongside speech, and sometimes instead of speech, such as movement of the hands, head and torso. These include idiosyncratic and nonconventionalized bodily movements that depict on the fly aspects of what is being spoken about, as well as conventionalized signs like shaking the head to signify ‘no’ or disagreement. The bulk of this chapter (§13.2) deals with sign languages of the deaf. We discuss their linguistic properties in some detail, and show that they are organized in ways similar to spoken languages; they have phonologies, morphologies, lexicons and syntax. Differences from spoken languages are also commented on. Following this is a briefer discussion (§13.3) of sign languages that are used in certain circumstances by hearing persons. Before we begin this discussion, however, it is useful to say something about the intermediate domain of visible signs that typically accompany speech, which for simplicity we will refer to as gestures.

It should be cautioned that it is difficult to draw precise boundaries between these phenomena, which merge into one another. Gestures are of particular interest because they straddle the boundaries with culturally shared systems of communication, including language, and spontaneous biologically given expressions, and because of their possible relevance to the evolutionary origin of language (recall the ‘noddy’ theory, p. 250).

Gestures As we use the term here, a gesture is a visible bodily activity, typically involving movement of part or parts of the body, that is used interactively to convey some meaning. In discourse a gesture may constitute either an utterance or a component of an utterance involving speech as well. For instance, some time ago I was walking with my wife along a road in Hobro (a village in Denmark) when a car passed us, moving in the opposite direction. It had only gone a short distance before it reversed to a little in front of us, with the driver pointing insistently towards the direction we had come from. After a second or two we realized that he was indicating that my wife had dropped a glove some metres back. (Presumably he had made a similar gesture when first passing us, but we had not seen it.) On another occasion I was walking along a street in another Danish village, Arden, when a young guy drove up to me asking (in Danish), ‘Excuse me, which is the road to Store Økssø [a popular lake]?’ I replied (in Danish), ‘That direction’, simultaneously pointing back along the street in the direction he had come from. To this he replied (again in Danish), ‘Back that way’, accompanying his utterance with a point. I nodded and uttered ‘Yes’ (in Danish). In the former case, speech would have been impractical. In the latter both language and gestures are almost certainly going to be used (in the absence of a conventional gesture for Store Økssø). We use gestures in diverse ways: to give expression to thoughts and ideas, to describe things, to greet and farewell, to draw attention to things, to indicate agreement or disagreement, to indicate directions, and so forth. Indeed, it is difficult not to use gestures when speaking: you have doubtless
observed people gesturing while talking on the telephone, despite the fact that the interlocutor is unable to see them. It is difficult to give directions without the use of gestures. Gestures have been classified in different ways by different investigators. For our purposes it is sufficient to outline the classification suggested by McNeill (1992, 2005), who distinguishes between imagistic and non-imagistic gestures. Imagistic gestures involve the depiction of a feature of an object or event in terms of shape, size, movement pattern or whatever. These are typically iconic. Let me illustrate with an example from Warrwa, from the beginning of a short text narrated by Maudie Lennard about the old gaol in Derby, Western Australia (for a detailed transcription of the spoken text and gestures, see McGregor 2004: 299–304). The spoken utterance is shown in (13-1). As she begins the word bidiwarri ‘hole’, the speaker moves her right hand down to the ground and begins tracing with her middle finger a circle next to her leg in a clockwise direction. She continues the gesture until the circle is completed about 1.5 seconds following the completion of the final word yunguru ‘round’. Figure 13.1 shows the handshape and beginning of the gesture. What is being referred to is a wheel-shaped object, iconically represented by the gesture.

Figure 13.1 Beginning of the gesture accompanying (13-1), by Maudie Lennard. (From The Languages of the Kimberley, Western Australia, William B. McGregor, Figure 13.1. © 2004. Reproduced by permission of Taylor & Francis Group.)

(13-1)  kinya-mirri  nyinka-n/  bidiwarri  nyinka-n-ka/  yunguru/
        this-EMP     this-LOC   hole       this-LOC-EMP  round
        ‘Here, by this hole, was a round thing.’
        Warrwa

Imagistic gestures can also be metaphoric, representing some abstract notion in terms of something more concrete, which is represented iconically. For instance, Kendon (2004: 100) describes a gesture used by a social worker in reference to rapid and spontaneous speech by a client: the speaker moved one hand rapidly outwards and upwards from a position near her waist, suggesting gushing water.

Non-imagistic gestures include pointing gestures and beats. Points are signs that draw attention to, or indicate the location of, some entity; they are indexical signs in Peirce’s terminology (see box on p. 237). We generally think of pointing as a movement performed with the index or pointing finger. However, the handshape employed in pointing varies according to what is being indexed, and its relevance at that place in discourse. Thus pointing with an open hand contrasts with pointing with the index finger. Eye-gaze and head movement are also used in pointing. In a number of Australian Aboriginal societies pointing with the index finger is rare, and the most neutral way of pointing is by protruding the lips in the direction of the referent. Beats are simple up-and-down rhythmic movements that have no particular meaning of their own, but mark out segments of discourse.

A number of linguists (including Edward Sapir, Dwight Bolinger, Kenneth Pike and Charles Hockett) have maintained that gesture and language form a unified whole, that no sharp boundary can be drawn between language and gesture. Similarly, gesture theorists Adam Kendon and David McNeill suggest that gesture and speech are tightly integrated, and even, in McNeill’s view, represent two aspects of a single process of utterance. Thus speech and gesture often express different, though related meanings, each complementing the other; together they permit the expression of more complex and nuanced meanings. This is illustrated by (13-2), the fifth sentence of the Warrwa text just mentioned.

(13-2)  marlu  yidany-ngana/  wuba-mirri  kanʔ      kan-kanu/  mangarri-ngana
        not    long-ALL       little-EMP  tha. . .  there-ABL  food-ALL
        kalbu/
        up
        ‘It wasn’t a long way, only a little way from there to the food.’
        Warrwa

During the production of the first two words the speaker slowly raises her right hand, pointing with her index finger to a position on the ground about a metre distant. She moves her hand in a slight circular movement, and then jerkily raises it slightly during the brief silence before the third word. On uttering this word, the middle finger replaces the index finger, as shown in Figure 13.2.¹ As she utters the syllable kan she drops her hand to almost the ground, and simultaneously gazes in the same direction. Then with her hand pointing down she begins the next word, kankanu ‘from there’. Finally she raises her hand to about the previous position, and simultaneously raises her eyes to gaze in the same direction. The gesturing evidently provides an indication of the distance to the food by measuring out the endpoints; the spoken utterance, (13-2), gives a less precise specification of the distance.

Figure 13.2 The middle-finger gesture produced simultaneously with wuba ‘little’ in the utterance of (13-2) by Maudie Lennard. (From The Languages of the Kimberley, Western Australia, William B. McGregor, Figure 13.7. © 2004. Reproduced by permission of Taylor & Francis Group.)

13.2 Primary sign languages

Primary sign languages – also called deaf sign languages – are natural human languages that are used in communities of deaf people and are the first languages of some group of signers. These languages use the visual-gestural medium as the sole medium. Primary sign languages are languages in their own right, and satisfy Hockett’s main design features (see §1.3). (Recall that we ignore Hockett’s first design feature, the vocal-auditory channel.) Moreover, they are not representations of the surrounding spoken languages in the visual-gestural medium, as are systems such as Signed English.

At the time of revising this chapter, Glottolog (a listing of the languages of the world – see §17.1) lists 139 primary sign languages, including American Sign Language (ASL, the sign language used in the deaf community in the USA), Auslan (the sign language of the deaf in Australia), British Sign Language (BSL), Danish Sign Language (Dansk Tegnsprog, DTS), French Sign Language (Langue des Signes Française, LSF), Hungarian Sign Language (HSL), Israeli Sign Language (ISL) and Nicaraguan Sign Language (NSL). These are distinct languages, not fully intelligible to signers of the other languages.

Primary sign languages are mostly rather young languages, with origins dating back just a few hundred years; some (like Nicaraguan Sign Language) are more recent, and emerged in the twentieth century. They generally arise in contexts where deaf people come together in numbers large enough to form viable communities, as happened in Europe with the advent of urbanization accompanying the Industrial Revolution. Prior to this, deaf people in European countries were usually isolated from one another, and circumstances were not conducive to the emergence of full sign languages. Residential schools for the deaf, the first of which were established in the early nineteenth century, also played an important role in the development of many primary sign languages, including British Sign Language, French Sign Language, American Sign Language and Nicaraguan Sign Language. These schools permitted the formation of signing communities and facilitated the development and expansion of gestural systems into full languages, usually within the space of a few generations (see e.g. Senghas, Kita and Özyürek 2004).

Sometimes in relatively isolated communities a high proportion of the population is deaf, usually hereditarily, due to a high incidence of a gene for deafness. In such circumstances sign languages sometimes also emerge; these are called village sign languages. Examples of village sign languages are Martha’s Vineyard Sign Language (a now extinct sign language of the island of Martha’s Vineyard in the USA), Kata Kolok Sign Language (Bali), Ban Khor Sign Language (Thailand) and Adamorobe Sign Language (Ghana). Village sign languages are normally known and used by all members of the community, deaf and hearing.

Most deaf children have speaking parents, and thus are unlikely to have been exposed to sign language from birth. Research in a number of cultures has revealed that in the absence of sign-language input deaf children usually develop a system of home signs to communicate with those around them. These systems of home signs are usually idiosyncratic and restricted to single families. They show some characteristics of language, but are not fully developed languages.

In the following four subsections we overview some of the main structural features of primary sign languages, dealing in turn with phonology, morphology, lexicon and syntax. It must be stressed that the coverage is very partial, and many important characteristics are omitted. We wind up the treatment of primary sign languages with some remarks on their significance to linguistics.

Phonetics and phonology of sign languages

Duality of patterning (see pp. 14–5) implies that the forms of signs can be analysed as patterns of meaningless components that are put together in various ways, reused as it were in a variety of different signs. In sign language linguistics the terms phonetics and phonology are used analogously with their use in linguistics generally to refer to the structure of signs in terms of these meaningless components. Phonetics, therefore, refers to the actual physical manifestations of the gestures, whereas phonology is concerned with the system lying behind these manifestations and what is used distinctively in it.

Most signs of sign languages are manual, made with the hands alone. There are one-handed and two-handed signs. An example of a one-handed sign in Auslan is WHITE, shown in Figure 13.3 a. Two-handed signs may be symmetric or asymmetric, depending on whether or not the handshapes are the same. Figures 13.3 b and c show two types of symmetric signs in Auslan. In the first, the sign for DAY, the hands move in mirror-image paths; in the second, CAT, one hand, the dominant hand, is active, while the other, the subordinate hand, remains (virtually) stationary and serves as a location for the other. In asymmetric signs the hands adopt different shapes; in these signs one hand is always dominant, the other subordinate; this is illustrated in EXIT (Figure 13.3 d).

In sign language linguistics it is conventional to represent signs by their glosses in English (or another relevant spoken language), rendering them in capital letters. There are other ways of transcribing signs, including various systems of representing them according to their phonetic or phonological structure. The Hamburg Sign Language Notation System (HamNoSys) is a widely used system, which, like the IPA, is intended for the transcription of signs of all sign languages; it is not intended as a practical orthography for any particular sign language. (For information on this system, see http://www.sign-lang.uni-hamburg.de/dgs-korpus/index.php/hamnosys-97.html.)

Figure 13.3 Four signs of Auslan: (a) WHITE; (b) DAY; (c) CAT; (d) EXIT. (Source: Johnston and Schembri 2003. Reproduced with permission from the authors. Videos of these signs can be found online at the Auslan–Signbank website, http://www.auslan.org.au/dictionary/.)

A second group is made up of non-manual signs, which are formed with a part of the body other than the hands, such as the face, eyes, mouth, head and torso. An example of a non-manual sign in BSL and Auslan is NO, expressed by a headshake. Non-manual gestures often serve grammatical functions (see below pp. 319–21).

Some signs involve a combination of manual and non-manual components; these are called multi-channel signs. For example, the sign for GULLIBLE in Auslan involves a manual gesture (hand moves upwards and holds nose between thumb and index finger; hold is released and hand moves downwards to neutral position) which is often accompanied by a forward tilt of the head. At least for some Auslan signers there may be a phonemic contrast in the facial expressions in MOUSE and ORGASM: both involve the same manual gesture, but differ in that MOUSE is signed with a neutral expression, whereas ORGASM is produced with rounded lips and sucked-in cheeks. LATE and NOT-YET have the same manual gestures in ASL. They differ in non-manual features: LATE is produced with a neutral facial expression, while NOT-YET involves the tongue protruding between the teeth.

Returning to manual signs, these are analysable into four component meaningless features: handshape, location, movement and orientation.

A wide range of handshapes are physically possible, though not all are used in any given sign language. Auslan shows some sixty-two broad phonetic handshapes, of which only about thirty-seven are phonemic. The remainder are non-contrastive allophonic variants of these thirty-seven. For instance, the so-called S handshape, a fist shape, can be made with the thumb against the index finger or bent over the index and middle fingers. The position of the thumb is variable in this handshape.

Location concerns the position of the hand on or near the body or in sign space, the volume from just above the head to about the waist, and elbow to elbow when the arms are loosely bent. Most signs are formed within sign space; and it is within this region that the hands and arms mostly move and may come close to or make contact with the body. A large number of distinctive locations on or near the body are relevant in Auslan (and other sign languages). There are some thirty-nine primary locations, plus a number of secondary ones on the subordinate hand. Some signs are produced in neutral space, meaning that they are not located close to any specific part of the body.

Movement features are of two major types, primary and secondary. Primary movement features are of two kinds: path features, which concern movement from one place to another – for instance, towards or away from the signer, in an arc or a straight line, or downwards or upwards (see Figure 13.3 for examples of path in Auslan); and local or internal features, which concern changes to handshape and orientation, as in opening and/or closing the hand. Secondary movements typically involve rapidly repeated changes of handshape or orientation (in local movements), e.g. rubbing, squeezing.

Orientation concerns the direction of the palm and fingers during the production of the sign. Orientation may be upwards, downwards, left, right, or towards or away from the signer’s body. Note the different orientation of the hand in the 1 handshape in WHITE and EXIT (see Figure 13.3 a and d).
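The idea that signs are bundles of meaningless features can be made concrete with a small data-structure sketch in Python. It is offered only as an illustration, not as an established notation: the class name Sign, the helper functions, and the placeholder feature values (hs-1, loc-1 and so on) are all assumptions introduced here. The only facts taken from the text are that ASL LATE and NOT-YET share their manual features and differ in a non-manual feature (neutral face versus protruding tongue).

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Sign:
    """A sign described by the four manual features plus a non-manual one."""
    gloss: str
    handshape: str
    location: str
    movement: str
    orientation: str
    nonmanual: str = "neutral"  # facial expression, mouth, head, etc.


def differing_parameters(a: Sign, b: Sign) -> list[str]:
    """Return the names of the features (gloss excluded) on which two signs differ."""
    return [
        f.name
        for f in fields(Sign)
        if f.name != "gloss" and getattr(a, f.name) != getattr(b, f.name)
    ]


def is_minimal_pair(a: Sign, b: Sign) -> bool:
    """Two signs form a minimal pair if they differ in exactly one feature."""
    return len(differing_parameters(a, b)) == 1


# Placeholder manual values (hypothetical labels): all that matters here is
# that the two signs share them, as the text states for LATE and NOT-YET.
MANUAL = dict(handshape="hs-1", location="loc-1", movement="mov-1", orientation="ori-1")

late = Sign("LATE", **MANUAL, nonmanual="neutral")
not_yet = Sign("NOT-YET", **MANUAL, nonmanual="tongue-protruding")

print(differing_parameters(late, not_yet))  # ['nonmanual']
print(is_minimal_pair(late, not_yet))       # True
```

Treating each feature as a separate field mirrors the way minimal pairs are identified for spoken-language phonemes: two forms that are alike in every respect but one show that the differing feature is used distinctively.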

Morphology of sign languages

Primary sign languages resemble spoken languages in that they typically show morphological structure; words are analysable into component signs of the types discussed in Chapter 3, including roots, derivational and inflectional morphemes. For instance, Auslan has a negative derivational suffix, a flat handshape in the palm up position (with fingers either separated or together), and a genitive suffix that appears to be an inflection, and resembles fingerspelled S, although with an upward rather than a downward movement.

Affixes are, however, infrequent in sign languages. There is a tendency for derivation and inflection to be expressed by means of changes to the lexical sign itself, as occur in irregular forms in English such as foot and feet, see and saw, which involve a process called ablaut. Put another way, the morphological structure of signs tends to be simultaneous rather than sequential, and is generally better understood in process terms (i.e. morphological processes) rather than in item terms (morphemes strung together in sequence).²

For instance, some verbs in ASL and many other sign languages have different beginning and/or endpoints according to who is acting on whom, much as verbs in languages like Spanish agree with their subject. GIVE is such a verb in ASL: the gesture begins at a spatial location associated with the giver and ends at one associated with the recipient. Thus ‘I give it to you’ will be expressed by an outwards gesture beginning at the speaker and extending towards the hearer, whereas ‘You give it to me’ will be directed in the opposite way. Not all verbs in ASL show agreement, and those that do differ in terms of which grammatical role they agree with, and whether the movement goes from subject to object or from object to subject.

In many sign languages – including ASL, Auslan, BSL, DTS and LSF – certain verbs show another morphological peculiarity. The handshape adopted in the sign varies depending on the nature of the entity involved in the event. These handshapes are called classifiers, and are of three types. First, for some verbs of movement and location there are entity classifiers that iconically represent salient properties of classes of referents such as shape (e.g. round, flat), type (e.g. human, vehicle) or physical attribute (e.g. solid, liquid). For example, in ASL the positional verb used to specify the location of money takes different handshapes according to whether the money is in the form of a coin (using the classifier for small round objects), a note (using the classifier for flat objects) or a pile of coins (using the classifier for dome shapes). Second, there are handling/instrument classifiers that imitate the action of the hands on objects; these are found with certain verbs that denote processes of handling objects. Third are size and shape specifiers, classifier handshapes that outline the shape and size of the referent entity. For example, in ASL there are surface handshapes that describe the surface of objects, indicating whether it is narrow or wide, flat or undulating, perimeter-shape handshapes that trace the outline of an object, and depth and width handshapes that specify relative depth and width of objects.

Another illustration of the non-sequential character of sign language morphology is provided by numeral incorporation in Auslan. The handshape of some signs can be modified by substituting handshapes for the numerals 2–9. This occurs in some time signs. For instance, the sign TOMORROW involves the 1 handshape (index finger extended); this may be replaced by the 2 handshape (index and middle fingers extended) to indicate TWO DAYS HENCE. (Location, movement and orientation of the sign remain unchanged.)

Compounding is common in many sign languages, including ASL, Auslan and BSL. For instance, in Auslan the sign PARENTS is a compound of MOTHER and FATHER; FURIOUS is a compound of THINK and BAD; and BOYFRIEND of BOY and FRIEND. Compounds are signed as single units with smooth and fluent transitions between their component signs; their durations are comparable to those of single signs. In addition, formal changes occur in compounding that affect the shapes of the individual signs, and distinguish compounds from sequences of signs. These include reduction and shortening of the first sign and loss of repetition of movement in the second sign, if present in citation form.

Reduplication is employed in many sign languages. In ASL and Auslan it sometimes distinguishes nouns from verbs. For example, the noun KEY in Auslan involves repeated movements, whereas the verb LOCK involves just a single movement. Not all noun-verb pairs, however, are distinguished in this way. Reduplication is also used in ASL and Auslan to mark nouns as plural. In Spanish Sign Language reduplication of some nouns marks plurality, whereas reduplication of verbs generally indicates that the event is continuous, intensive or iterated.

Pronouns exist in all sign languages, and distinguish first, second and third persons and various numbers.³ In ASL singular pronouns are formed by pointing with the index finger at the location of the real-world referent, or at a location that a non-present referent has been established at, or assigned to, in discourse. Sometimes the thumb is employed instead. Non-singular forms specifying two, three, four or five referents are made with the appropriate numeral handshapes and involve movement patterns that specify who is included in the set of referents. Thus in the first person a distinction is made between whether or not the addressee is included. General plural forms of the pronouns are made by movement of the hand along different arcs. Possessive forms of pronouns follow the same patterns, but use the B handshape (flat hand, fingers together), with the palm oriented towards the possessor or possessors.

Space is used meaningfully and/or grammatically in many sign languages. As we have just seen, pointing can be used to establish an entity at a particular location in the space shared between speaker and hearer. Subsequent pointing to that location will select the referent established there. We have also seen (p. 316 above) that space is used with certain verbs to convey information that is conveyed by agreement inflection in some spoken languages. Space is also used to iconically represent the arrangement and location of entities in the referent world. In addition, space is often used metaphorically, to represent temporal relations by position on timelines. For instance, in Israeli Sign Language past time is located behind the signer, future in front of the signer. Thus the signs for YESTERDAY and TOMORROW have the same handshape and location, but differ in direction of movement: in the former the hand moves backwards, while in the latter it moves forwards.

Sign language lexicon

As in spoken languages, lexemes in sign languages typically fall into different parts-of-speech. In Auslan the following parts-of-speech can be identified by virtue of their different morphological and syntactic behaviours: nouns, verbs, adjectives, adverbs, determiners, auxiliary verbs, prepositions, conjunctions, pronouns and interjections. Some differences in morphological properties of nouns and verbs were mentioned in the previous section. Nouns, verbs, adjectives and adverbs in Auslan are open classes; the other parts-of-speech are closed (see p. 84). Similar parts-of-speech systems have been identified in other sign languages.

In ASL and various other sign languages a distinction is sometimes drawn between native lexicon, made up of indigenous signs that developed within the language itself and which follow the phonological patterns of the language, and non-native lexicon, which are signs that are borrowed from other languages.

Within the native lexicon some linguists distinguish between core and non-core lexicon, where the former are the most lexicalized, the most thoroughly incorporated into the language, while the latter are more gesture-like, and include e.g. pointing signs. The lines between the three categories – core native, non-core native and non-native – are not easy to draw, and items may shift over time from the latter two categories into the core native lexicon. For instance, when a sign is first borrowed from another sign language it may be borrowed in its phonetic shape in the source language, which might not be a legitimate form in the borrowing language; it would thus be a non-native lexeme. Over time, the sign might change to satisfy the phonological patterns of the borrowing language, thus becoming nativized.

The non-native lexicons of ASL, BSL and Auslan have been significantly influenced by borrowings from English via fingerspelling, systems for spelling words with the manual signs of a hand alphabet. There are numerous hand alphabets in use. The hand alphabet used in ASL is a one-handed system, with origins in France; the system used in Auslan is two-handed, with origins in Britain. Fingerspelling is an essential part of sign languages such as ASL, BSL and Auslan, and is used in spelling out place names, personal names, and words without corresponding sign language equivalents. Sometimes ASL signers use fingerspelling instead of an existing sign for rhetorical purposes. Fingerspelled words are not infrequently abbreviated, or show some changes to their shape (recall from p. 316 above the change in direction of motion of the fingerspelled genitive S of Auslan). Fingerspelled items generally have the same meanings as the corresponding English words, whereas signs belonging to the native lexicon typically do not. Not all sign languages use fingerspelling to the same extent as these three sign languages; French Sign Language, for instance, uses fingerspelling less frequently, and tends to replace fingerspelled French words with native signs.

Idioms also exist in sign languages. In ASL the expression TRAIN GO SORRY ‘the train has left, sorry’ has an idiomatic meaning ‘I won’t repeat what I just said’.

Sign language syntax

Words in sign languages can be put together to form larger units that express meanings too complex to be expressed with just the morphological and lexical resources available. As in spoken languages, there are regularities in the ways signs are combined. At least for some sign languages it appears that a unit corresponding to the sentence of spoken languages (see §5.1) can be distinguished, which is hierarchically structured in terms of smaller units, clauses and phrases. For instance, in Auslan a noun such as GIRL can be combined with a pointing sign that serves as a determiner identifying the referent, an adjective, and/or a possessive pronoun. Combinations such as these are NPs, and there are restrictions on the order of signs making up these phrasal units.

Clauses are presumably identifiable in all primary sign languages, and show comparable grammatical possibilities to clauses of spoken languages. In Auslan experiential roles of Actor, Undergoer and Event (see pp. 120–2) can be identified for clauses with verbs (Johnston and Schembri 2007: 203–5). Usually these roles occur in the order Actor Event Undergoer (if there is one), as in:

(13-3)  BABY   CRY
        Actor  Event
        ‘The baby is crying.’

(13-4)  CAT    LOVE   DOG
        Actor  Event  Undergoer
        ‘The cat loves the dog.’

However, the order of grammatical roles in Auslan is less rigid than in English, and sometimes the Undergoer NP precedes the Event VP. This alternative order is especially likely when knowledge of the world indicates which NP is Actor and which is Undergoer, as in example (13-5), and when the verb shows agreement, like GIVE (see p. 316 above).

(13-5)  BOY    CAKE       EAT
        Actor  Undergoer  Event
        ‘The boy eats cake.’

Non-manual features are used in many sign languages to express certain types of grammatical meaning. Auslan distinguishes declarative, interrogative, imperative and exclamative clause types (see p. 146) non-manually. Declaratives use no particular non-manual marking.

Interrogatives in Auslan come in two main variants, polar (which request confirmation or disconfirmation, as in Are you energetic this evening?) and content (which request information on something involved in the event, as in Who is energetic this evening?). Polar interrogatives are signalled with raised eyebrows and head tilted forwards throughout the production of the clause; in addition, the final sign may be held for longer than normal. Content interrogatives employ signs for WHERE, WHO, WHEN, WHAT, HOW, HOW-MANY, HOW-MUCH and HOW-OLD, and are typically produced with furrowed eyebrows, and head and body tilting forwards.

Interrogatives in other sign languages follow similar patterns. In ASL polar interrogatives are marked by raised eyebrows and wide-open eyes, and content interrogatives are marked by lowered eyebrows and slightly squinted eyes. Figure 13.4 shows how a content interrogative can be expressed in Finnish Sign Language. Observe that in addition to the non-manual gestures of lowered brows and head tilt, a PALM-UP manual interrogative sign is used at the end of the clause. A similar sign can be used optionally in polar interrogatives in ASL.

Figure 13.4 A content interrogative in Finnish Sign Language. The horizontal lines indicate the extent of the utterance over which the accompanying non-manual movements apply. (Source: Zeshan 2004: 33. Reprinted from Language 80 (2004), p. 33, by permission of the Linguistic Society of America. Thanks to Ulrike Zeshan for the digital images.)

Imperatives in Auslan are typically accompanied by direct eye-gaze at the addressee, and frowning extended over the entirety of the clause. Exclamatives use different non-manual features depending on the type of emotion they invoke. For instance, exclamations of surprise use raised eyebrows and frowning.

Negation in Auslan is also expressed non-manually (as mentioned above), by shaking the head from side to side throughout the clause, or sometimes over just a part of it. This is optionally accompanied by frowning, squinting or pouting. A manual sign of negation representing NOT, NOTHING, NOT YET or NEVER may also be used in Auslan in addition to the non-manual gestures. These manual signs usually occur before the verb, though occasionally they follow it.

Complex sentences of various types are found in sign languages. In (13-6) two Auslan clauses are simply joined together by the conjunction BUT. (Note that the equal signs in the two personal names indicate that they are fingerspelled.)

(13-6)  K=I=M LIKE CAT BUT P=A=T PREFER DOG
        ‘Kim likes cats but Pat prefers dogs.’

The two clauses in (13-6) are conjoined, and have equal status. However, in (13-7), also in Auslan, the first clause is subordinate to the second, and indicates a condition that needs to be met for the occurrence of the event specified in the second clause. The subordinate status of the first clause is indicated by the non-manual features of raised eyebrows and head tilt that extend over its duration. In Auslan a fingerspelled I=F ‘if’ can also be used at the beginning of a subordinate clause of this type.

(13-7)  raised eyebrows + backwards head tilt
        HOT TOMORROW I GO-TO BEACH
        ‘If it is hot tomorrow, I will go to the beach.’

Another type of subordinate clause – called a relative clause – is illustrated in the ASL sentence (13-8). Again, the non-manual features that are spread over the duration of the first clause indicate its status in the sentence.

(13-8)  raised eyebrows + backwards head tilt + tensed upper lip
        RECENTLY DOG CHASE CAT COME HOME
        ‘The dog that chased the cat came home.’

Sign languages in linguistics

Linguistic investigations of sign languages are relatively recent, and only began seriously in the 1970s in the wake of pioneering work of Bernard Tervoort (1953) and William Stokoe (1960). Earlier in the twentieth century, linguists tended to regard the gesture systems of the deaf as derivative, developments from ordinary gestures and spoken languages; they were not considered to be languages in their own right (see e.g. Bloomfield 1973/1933: 39).

In the past forty or so years, interest in sign languages has grown exponentially. An increasing number of linguists are involved in describing and documenting sign languages from different parts of the world, including village sign languages. As in the case of spoken languages, this research is partly motivated by the endangerment of a number of sign languages in the modern world (see §7.5). The interest in sign languages has also been stimulated by fundamental questions they raise concerning the nature of human language. Similarities and differences between signed and spoken languages may help us to better understand the human capacity for language, and better illuminate the properties that all languages share, regardless of their medium of expression.

In this section we have discussed various similarities in the fundamental structural organization of signed and spoken languages. Whether spoken or signed, languages show phonological, morphological, lexical and syntactic structures. There are many other similarities. Like spoken languages, sign languages such as ASL, BSL, Auslan and others come in different dialectal, age, social, gender and religious variants, as well as idiolectal variants that reflect the speaker’s background.

Learning of sign languages by children is similar to learning of spoken languages, given similar circumstances – in particular, that the sign language is learnt as a first language. Deaf children exposed to sign language from birth typically go through the same stages and milestones at the same approximate times as hearing children. For instance, both generally enter the two-word stage at around 18 months of age, and have completed basic learning of the language by about five years of age.

Psycholinguistic investigations reveal similar patterns in the comprehension and production of signed and spoken languages, although there appear to be some differences. Neurolinguistic investigations of deaf signers who have suffered brain damage reveal similar patterns to those observed in hearing people.
The left hemisphere is normally dominant in sign language processing, as in spoken language processing, and Broca’s and Wernicke’s aphasias appear to be associated with the same brain regions in deaf signers. However, there is a dearth of psycholinguistic and neurolinguistic research on sign languages in comparison with spoken languages, and one must be cautious of drawing conclusions too hastily. Early research on primary sign languages was characterized by a felt need to show that they are real languages, with properties of spoken languages. Attention thus focused on identifying commonalities between signed and spoken languages. Now that they have been accepted as full human languages, linguists have turned attention to differences between signed and spoken languages, and how these differences might be accounted for. Moreover, the viability of certain analytical tools devised for spoken languages and similarities in grammatical structure and categories between signed and spoken languages have been increasingly called into question. For instance, some investigators (following Meier 1990) have suggested that pronouns of sign languages such as ASL and BSL do not distinguish between second and third persons, but in contrast with all known natural spoken languages distinguish just first vs. non-first persons. Differences in the visual-gestural and spoken-auditory mediums and how information in these mediums is processed in the human brain may be responsible for some differences in structure and use of signed and spoken languages. This may account for some of the grammatical differences that exist between sign languages and spoken languages, as well as some of the striking grammatical similarities among sign languages – for instance, in their use of space in reference and indicating temporal relations (see box on p. 318).

The signs of sign languages often show a greater degree of iconicity than do the words and morphemes of spoken languages. This may be a consequence of the differences in the mediums. It might also be a reflection of the extreme differences in age of spoken and signed languages: there is evidence that as signs – whether spoken or gestural – age they tend to become less iconic.

The relation between the form and meaning of native lexical signs in sign languages ultimately remains conventional, and in most cases it is impossible to predict the meaning of a sign from its form. For instance, it would be difficult to predict the meanings of the Auslan lexemes shown in Figure 13.3, although (d) EXIT is clearly iconic, and (c) CAT has iconic and indexical components (the action of patting is depicted iconically, and this indexically represents a cat by association with patting). (Given the two meanings ‘exit’ and ‘cat’ you would probably guess correctly which of (b) and (d) is the corresponding lexeme.)

Some researchers have suggested that the greater degree of iconicity in the signs of sign languages may facilitate learning of signs by children. Evidence from psycholinguistic investigations does not, however, support the notion that iconicity is a significant factor in sign processing – that is, there is no compelling evidence that iconic signs are processed in different ways to less iconic or non-iconic signs.

Another striking difference between signed and spoken languages is a consequence of the fact that most deaf children are not the offspring of deaf parents: the majority of adult sign language users did not learn the language in their family context as a first language. In the case of Auslan, for instance, it seems that over 95 per cent of signers are not first-language signers in this sense, and the transmission of the language is not typically from parent to child. (This feature is usually considered to be a factor relevant to language endangerment – see §7.5.) Learning ASL after the critical period (see §12.3) is believed to result in a variety that is not identical with the form of ASL used by native signers, at least in terms of some of the grammatical structures.

13.3 Alternate sign languages

The gestures hearing people use in communicative interaction are, as mentioned already, sometimes codified as signs that can be used as alternatives to spoken expressions. Sometimes these codified gestures form a large set, and can be employed instead of speech in situations where that medium is inappropriate or impractical. Such systems of gesture are known as alternate sign languages. Alternate sign languages are used primarily by hearing people, who already have a spoken language.

Alternate sign languages differ enormously in terms of their degree of elaboration. Some are highly elaborated and can be used in a wide range of communicative contexts to express a diverse range of meanings, comparable with the range expressed by spoken language. This is the case for the sign languages used by certain Aboriginal groups in Central Australia and by American Indians of the Great Plains of the USA. Others are more restricted in context of use and expressive range, as appears to be the case for the sign systems used in some monasteries, and by various occupational groups such as saw-millers in British Columbia. While these systems can be used alone in discourse they may not satisfy all of Hockett’s design features: for instance, not all systems show reflexivity. Not all of these gestural systems, that is, are sufficiently elaborate to warrant the label language.

Sign languages in Central Australia

Systems of codified gestures for animals, directions and kin relations are common across the Australian continent. These systems differ markedly in degree of elaboration, ranging from restricted systems with fairly small inventories of signs to fully fledged sign languages. In Central Australian Aboriginal communities differences are found in the control and use of sign languages according to region, age and gender. Full sign languages are mainly used by women: widows traditionally observed a ban on speech for a period of about a year following the death of their husband. Sign language may be used as a replacement for speech during this time of mourning, and entire discourses could be conducted in the sign language. Men know some signs, and may use them to accompany speech or instead of speech when hunting, when providing directions in a vehicle, or when interlocutors are out of earshot. Even those with quite limited sign vocabularies are generally able to put signs together to make (in limited ways) sentence-sized utterances.

All signs of the sign systems of Central Australian Aborigines, including the Warlpiri and Arrernte, are manual. Non-manual features such as facial expressions, eye-gaze, posture and so on are not contrastive, although they can play a role in discourse, and facial expressions are sometimes employed to make the meaning of vague or polysemous signs more precise. Most signs are one-handed, as illustrated by the sign for NGAPA ‘water’ in Warlpiri Sign Language, in which the active hand taps the centre of the upper chest twice. Two-handed signs are usually asymmetrical, as in the case of the sign for WIRLINYI ‘hunting’, in which the side of the active hand taps the palm of the subordinate hand twice. (See Kendon 1988: 102, 109 for photographs.) Parameters of handshape, location and movement are contrastive.
In Warlpiri Sign Language thirty-five handshapes are distinctive, including the B handshape with flat hand (fingers extended and together), the G handshape with index finger fully extended (other fingers fully flexed) and the horned handshape (index and pinkie fingers fully extended, middle and third fingers fully flexed). These handshapes come in various allophonic forms; for instance, in the B handshape the thumb may be adjacent to the hand or drawn away to various degrees. In addition, eighteen locations and twenty-three movement patterns are phonemic. In contrast with the situation for primary sign languages, lexical signs of Central Australian alternate sign languages represent lexemes of the corresponding spoken languages. Not all lexemes of the spoken language, however, are represented by separate signs, and frequently more than one spoken language lexeme corresponds to a given sign. For instance, SEE in Arrernte Sign Language covers a range of perceptual events specified by distinct spoken language lexemes, including ‘look for’, ‘go hunting for’, ‘look after’, ‘watch’ and others. Compounding is sometimes used to construct signs corresponding more closely in meaning with spoken lexemes. The lexicon is not closed, and there are signs for modern things and activities such as introduced animals (e.g. cattle) and driving a car.

Many signs are iconic, depicting a salient feature of the referent thing or action. For instance, the signs for many animals are based on features of their tracks or movements. The sign for ‘policeman’ in Yuendumu Warlpiri Sign Language involves arms held out horizontally with hands crossed at the wrists, depicting being handcuffed (Kendon 1988: 108). On the other hand, numerous signs lack an obvious iconic basis, as in the case of the sign NGAPA ‘water’ mentioned on the previous page. Grammatical morphemes are few, and many grammatical categories that are distinguished in the spoken languages of the region are not indicated in the alternate sign languages. For instance, case marking of nouns is absent from Arrernte Sign Language, as is tense and other inflectional marking of verbs. In Warlpiri Sign Language no case markers are used on Actor and Undergoer nominals in transitive clauses (see p. 121), although these roles are marked differently in spoken Warlpiri. There are, however, signs that represent other case relations (e.g. possession and spatial relations), though these do not correspond precisely with the case markers of spoken Warlpiri. Three persons are distinguished in Central Australian sign languages, and a distinction is made in the first person non-singular according to whether or not the addressee is included: that is, ‘we including you’ is a distinct sign from ‘we excluding you’. The structured use of space for grammatical purposes is less well developed than in the typical primary sign language. The morphological structure of signs shows some intriguing similarities to the morphological structure of spoken words. Reduplication of nominals in Warlpiri is a widely used word-formation process, and often indicates plurality of referents. Reduplication is used in a similar way in Warlpiri Sign Language. Additionally, lexical roots of Warlpiri sometimes involve reduplication of meaningless component forms. 
In some cases these reduplicated meaningless forms are homophonous with genuine roots of the language. For instance, the root wantawanta ‘red ant’ has a repeated bisyllable wanta that is homophonous with the word for ‘sun’.4 Correspondingly, the sign for ‘red ant’ is WANTA-WANTA: it involves reduplication of the form of the SUN sign. Warlpiri Sign Language effectively represents – albeit imperfectly – the spoken language in the medium of sign. It is thus like Warlpiri represented via a different modality, like written varieties of, say, English. The more practised the sign user the closer the representation of the spoken language. Nonetheless, it retains some features reminiscent of telegraphic speech (see p. 287). According to Kendon (1988: 406), deaf Aborigines in Central Australia do not generally become highly proficient users of the local sign language, but rather employ home signing which may include some elements of the alternate sign language. By contrast, in north-east Arnhem Land, Yolngu Sign Language is used on Elcho Island by speakers of Djambarrpuyngu (Pama-Nyungan, Australia) as an alternate sign language as well as by the handful of deaf people on the island as a primary sign language.

Plains Indian Sign Language

Since before European contact, signed languages have been used as alternatives to spoken languages among many North American Indian groups, sometimes as the first language of deaf members of the communities. Plains Indian Sign Language (PISL) is the best known and described of these sign languages, though others were used in other cultural areas. PISL served as a lingua franca among Great Plains Indians, permitting communication between groups that did not share a spoken language. It is still in use today in traditional storytelling, rituals, legends and prayers, although it is considered endangered.

The phonological and morphological features of PISL are comparable with those of primary sign languages, and PISL shows similarities with other sign languages including use of space to express grammatical information and use of classifiers (see p. 317 above). Derivational and inflectional morphological processes are present, and PISL has bound (affixes) and free morphemes (content words and function words). The PISL lexicon consists of over a thousand signs, and compounding is productive. In contrast with Central Australian sign languages, there is little relation to any of the spoken languages of the groups using PISL. This is presumably because PISL arose as a lingua franca independent of the spoken languages, which differ significantly from one another in lexicon and grammar.

Ts’ixa sign language

Many hunting-gathering societies have systems of signs that are used primarily in hunting, where speech is inappropriate, to indicate directions and the presence of certain animals. These systems are often merely vocabularies of signs for animals, although in some cases they have expanded into alternate sign languages. Such systems can be found among some former hunter-gatherer groups in the Okavango Delta in northern Botswana, including the Ts’ixa and ǁAni Khoe. The system used by the Ts’ixa, Tshàúkák’ùí (hand-INS-speak), is still in use in hunting by members of the Mababe community; it is also used when speaking to the one deaf person in the village. The ǁAni sign language, however, is no longer used in hunting, but exclusively in narrative and performance contexts.

Handshape, location and orientation are phonemic in Tshàúkák’ùí; movement is also, though it is rarely employed. (a) and (b) in Figure 13.5 illustrate contrastive handshapes, straight vs. bent fingers; (c) and (d) show a sign involving movement. Non-manual features appear not to be phonemic, and there is no evidence of use of facial features to mark grammatical categories. The majority of signs (about 80 per cent) are one-handed. By contrast, in the ǁAni sign language about half of the signs are two-handed. Mohr and Fehn (2013) suggest that the difference may relate to the fact that the former is still used in hunting, while the latter is not.

The majority of signs of Tshàúkák’ùí are monomorphemic, with only about 2 per cent compounds. An example of a compound sign is MBIRI ‘honey badger’, consisting of the classifying sign for small animals and the sign for ‘angry’. By contrast, in the ǁAni sign language about half of the lexical items for animals are compounds. In both systems the generic element occurs first in compounds. There is a high degree of iconicity in the Tshàúkák’ùí signs for animals, which represent salient features such as horns, fur patterns and the like.
This can be seen in the signs ǀXOO ‘gemsbok’ and K’ARA ‘impala’, illustrated in Figure 13.5 a and b, respectively, which depict the salient horns of these species, straight for the former, and curved for the latter.

Figure 13.5 Three signs in Tshàúkák’ùí, signed by Maxwell Kebuelemang. (© William McGregor)

Monastic sign languages

Since the tenth century sign languages have been used in Cistercian, Cluniac and Trappist monasteries, which observed the rule of silence originally established by St Benedict (c. 480–c. 550). These monastic sign languages are not considered replacements for speech but rather a means of helping to preserve the silence needed for communion with God, and their usage is expected to be restrained; they are not intended for use in idle chatter. They are limited systems that are used for unavoidable communication within the monasteries for essential needs. Following Vatican II (1962–5), proscriptions on speech have been reduced, and in some cases the monastic sign language is now only remembered; however, such languages are still used in some monasteries, including one in Japan.

Handshape, location and movement parameters are phonemically distinctive. The lexicon is small; official sign lists produced for novices over the past millennium have consistently shown fewer than 500 items, mostly nominals. Each monastery had a somewhat different system, and in some instances more elaborate unofficial systems emerged. Thus the unofficial sign language of St Joseph’s Abbey in Massachusetts (USA) had over 1,200 signs. Compounding is used to produce new lexical signs, and in post-war Japanese monastic sign language new signs were invented for communication about the war, planes, atomic bombs and communists. In St Joseph’s Abbey most of the unofficial signs are compounds, which may involve three or more signs. Syntax is borrowed from the syntax of the spoken languages of the sign users, though it is quite limited.

Summing up

Linguistic uses of the visual-gestural medium range from codified systems of gestures with the expressive potential of spoken languages to autonomic bodily events associated with emotional arousal. It is impossible to draw a sharp boundary between gestures and language, which form a unified communicative system. Some researchers believe that a single mental production system underlies each. Gestures are either imagistic or non-imagistic; the former include iconic and metaphoric subtypes, the latter, pointing gestures and beats.

Primary sign languages or deaf sign languages are natural human languages that emerged in deaf communities. Within a few generations the sign system may become a full language, and serve as the first language of a generation of signers. Sign languages may also arise in small isolated communities with high proportions of deaf people. Such village sign languages are usually known and used by the entire community, deaf and hearing.

Primary sign languages are structured grammatically in similar ways to spoken languages, and show phonological, morphological, lexical and syntactic organization. Signs may be manual, non-manual or multi-modal. Phonological parameters relevant to manual signs are handshape, location, movement and orientation. In any sign language only a subset of the possible handshapes, locations, movements and orientations are phonemically contrastive. The morphology of sign languages tends to be simultaneous rather than sequential, and is generally better described in process terms than as morphemes in a sequence. Compounding and reduplication are common in sign languages. Primary sign languages have large lexicons that can be divided into parts-of-speech. Aside from the native lexicon of gestures, many sign languages also have a non-native lexicon of fingerspelled borrowings from spoken languages. A hierarchy of syntactic units is probably recognizable in many sign languages, with levels of word, phrase, clause and sentence. Non-manual gestures are often used to express grammatical meanings.

Alternate sign languages are sign systems used primarily by hearing people as alternatives to speech in certain contexts. These systems differ considerably in complexity, and not all are full human languages; many represent restricted systems that can convey a limited range of meanings in a restricted range of social contexts. Alternate sign languages are sometimes used as the primary sign language of deaf community members.

The study of sign languages is of interest because of the light these languages may throw on the nature of language as a system, abstracted from the medium in which it is produced. It is also of interest in suggesting properties that may be the result of differences in the visual-gestural and auditory-vocal mediums, and differences in how visual and auditory stimuli are encoded and decoded.

Guide to further reading

Abner et al. (2015) is a good overview of gesture. More technical are Kendon (2004) and McNeill (1992, 2000); Kita (2003) is an edited collection of papers on pointing. Kendon (2013) is a good overview of the history of the study of gesture.

Sandler and Lillo-Martin (2017) provides an overview of the field of sign language linguistics and some of the main characteristics of sign languages. Vermeerbergen and Leeson (2011) is a readable overview of some of the features of the sign languages of Europe. More advanced is Brentari (2010), which contains some twenty-five articles covering a range of issues in sign language linguistics. A nice outline of the history of sign language linguistics can be found in Woll (2013).

Of descriptions of particular sign languages I strongly recommend Johnston and Schembri’s (2007) book on Auslan; Johnston and Schembri (2003) is a short dictionary of Auslan. Also good are Valli, Lucas and Mulrooney (2005/1992) on ASL and Sutton-Spence and Woll (1999) on BSL. Basic information on the grammar, use and viability of some thirty sign languages, including primary sign languages and alternate sign languages, is provided in Bakken Jepsen et al. (2015). Numerous websites provide grammatical and sociolinguistic information on particular sign languages; there are also a number of online dictionaries or wordlists with illustrative video clips of signs, including of ASL (https://www.signingsavvy.com/), Auslan (http://www.auslan.org.au/), BSL (https://www.signbsl.com/) and DTS (http://www.tegnsprog.dk/), to name just a few. (Since web addresses change so rapidly, the best thing is to search the internet for the language you are interested in.)

The journal Language and Linguistics Compass has published a number of overview articles on various topics in sign language linguistics, including phonology (Sandler 2012), learning (Thompson 2011; Morford and Hänel-Faulhaber 2011; Marshall et al.
2021), processing (Thompson 2011; Carreiras 2010), variation (Lucas and Bayley 2011) and semantics (Zucchi 2012). The classic works on the emergence of new sign languages are Kegl, Senghas and Coppola (1999) and Senghas, Kita and Özyürek (2004), which treat the emergence of Nicaraguan Sign Language. A unique collection of articles on village sign languages is Zeshan and de Vos (2012). Sacks (2012/1989) is a very readable non-technical treatment of primary sign languages and cognition by an expert popularizer of science.

Kendon (1988) was the first book-length study of alternate sign languages used by Australian Aborigines; it contains a wealth of information on alternate sign languages in Central Australia. Green and Wilkins (2015) provides an excellent account of Arrernte Sign Language. Kendon (2008) overviews the history of the study of Aboriginal sign languages. Numerous articles and books deal with Plains Indian Sign Language; of these I recommend Davis (2010, 2015). Chapter 14 of Kendon (2004) provides an accessible treatment of a range of alternate sign languages.

Issues for further thought and exercises

1 What is Signed English or Manually Coded English? Find out about this system of manual signs, and write a brief description (about a page). Mention where the hand-signs come from, and the relation of the system to English – is the system similar to anything discussed in this chapter? Who uses this system, and where? Comment also on any advantages or disadvantages of this system in relation to primary sign languages such as ASL and BSL.

2 Repeat Question 1 for International Sign Language.

3 What is oralism? Write a brief overview, outlining its main notions and the historical development of the ideas. What is the relevance and impact of oralism to primary sign languages and research on them?

4 As is the case for spoken languages, many primary sign languages are under threat of extinction. One of the reasons is the emergence of cochlear implants. Do you think this technology will spell the end of sign languages? Explain why or why not. See what you can find out about reactions to cochlear implants by the deaf community; discuss the possible impact these attitudes might have on the survival of sign languages.

5 Go through each of Hockett’s six main design features of human languages (§1.3) and see if you can find empirical evidence that they are satisfied by ASL (or a sign language you know).

6 Which sign language or languages is ASL most similar to? What are the reasons for this similarity?

7 On pp. 319–21 above we mentioned a few types of grammatical information that are conveyed non-manually in ASL and Auslan. Find one other grammatical category that is conveyed non-manually in one of these languages, and give a brief description of it along with examples.

8 We mentioned in §13.2 that the phonology of sign languages concerns the pattern behind the system of gestures, and the features that distinguish gestures from one another in terms of their form. Explain how you would go about determining whether a particular feature (e.g. of location or handshape) is phonemic in a particular sign language.

9 Go to the online Auslan dictionary (http://www.auslan.org.au/) and examine the pairs of lexemes PAPER and BROTHER; ON and TRUE; ALWAYS and TOMORROW; THREE and NINE; and TALK and WORK. How do the signs of each pair differ from one another? What would you conclude from each pair of examples?

10 Again in the online Auslan dictionary, examine the lexemes WASH and MAKE, focusing attention on the subordinate hand (the one that remains stationary). Given that it is the same phonemic handshape in both lexemes (referred to as the S handshape, a fist – see pp. 314–5 above), what conclusion can you draw?

11 Go to the online ASL dictionary (https://www.signingsavvy.com/) and examine the sign REMEMBER. This is a compound of KNOW and STAY. Explain the ways that the signs KNOW and STAY have been changed in the formation of the compound. (Note that the dictionary gives variants for each sign; in answering this question you will need to determine which are employed in REMEMBER.)

Research project

It was mentioned in the box on p. 318 that many deaf sign languages utilize space in one way or another. Write a description of how space is employed grammatically, semantically and/or pragmatically in one sign language – for example, ASL, Auslan, BSL or DTS. If possible, draw comparisons with the use of space in other sign languages, highlighting similarities and differences.

14 Writing

We turn in this chapter to the second alternative medium of human language, writing. We overview the main types of writing system employed in the languages of the world, remarking on the history of their evolution. Some salient differences between writing and speech are identified, as are some characteristics of writing in electronic media.

Chapter contents

Goals 333
Key terms 334
14.1 The visual-inscribed medium 334
14.2 Types of writing system 337
14.3 The English writing system 342
14.4 Writing systems in society 344
14.5 Linguistic features of some written varieties 346
Summing up 353
Guide to further reading 354
Issues for further thought and exercises 355
Research project 357

Goals

The goals of the chapter are to:
● identify some of the main similarities and differences between writing and speech;
● overview the history of writing systems, including their emergence and development;
● distinguish the major types of writing systems used in the world’s languages;
● comment on social issues concerning writing;
● discuss some of the reasons why the English spelling system is (as good/bad) as it is;
● discuss the influence of technology on writing;
● examine characteristics of some writing styles in electronic media.

Key terms

abjad
allograph
alphabet
blog
characters
cuneiform
demotic
digraphia
dyslexia
electronic media
emojis
emoticons
etymological spellings
Hangul (Han’gŭl)
hieratic
hieroglyphic
hiragana
hyperlink
instant messaging
internet
kanji
katakana
lexical density
Linear A
Linear B
logographic system
nominalization
pictogram
pinyin
rapid fade
rebus
Rongorongo
stickers
syllabary
transcribed speech
tweet
Twitter (X)
wiki
writing reform

14.1 The visual-inscribed medium

Differences from other mediums

We introduced (p. 12) the term visual-inscribed medium for a second visual medium for the expression of language, writing, on analogy with visual-gestural for sign languages. The term picks out two of the most salient features of writing, namely that it is perceived visually and is produced by making marks on surfaces of various types, including clay, parchment and paper. At least this has been the case until the advent of electronic media. Regardless of the recent technological developments it remains the case that writing differs in an important way from both spoken and signed languages. I refer here to one of Hockett’s design features of human languages not mentioned in §1.3, rapid fade: the feature that once a sign has been produced either vocally or gesturally it disappears rapidly; it is available to a recipient only momentarily. This feature is not shared by either ordinary or electronic writing, which typically persists over some duration of time: it can be viewed for as long as the recipient wishes, and returned to at a later time.

As mentioned in §1.2, writing is derivative from and secondary to speech in the sense that it represents the spoken language in a different medium.1 Writing is secondary to speech in at least three other important ways. First, there is a cognitive difference. Human beings are more or less born to speak; the normal hearing child cannot help mastering in a matter of a few years or so the language spoken around them (see §12.1). We are not born to write; writing is usually explicitly taught, and few children learn it through mere exposure. Second, the majority of the world’s languages have no tradition of writing, and are exclusively (or almost exclusively) spoken. In fact, as will be seen in the next subsection, writing is a recent phenomenon. If you were to travel back in time just 7,000 years you would find all languages were exclusively spoken – as they had been for tens of millennia previously. Third, speech (and signing) requires no special technology beyond the human body. Writing is impossible without external tools or mediums.

Systems of visual marks that bypass language and represent meanings directly are not regarded as writing systems. Thus the Hindu-Arabic numerals symbolize numbers directly, independently of any human language and do not, by themselves, constitute a writing system in our sense. Of course, they can be incorporated into texts written in English, French or any other language. In such cases we might say that signs from the other semiotic system have been borrowed into the writing system of the language, and that in these circumstances they do represent words of the language, and the numbers indirectly via those words.2 Iconic representations of narratives such as the Bayeux tapestry clearly do not count as writing (although the tapestry does employ the medium of writing as an accompaniment to the pictorial representation): they can be interpreted just as well in any language, and in each in many different ways in terms of lexical and grammatical choices.

Somewhat closer to writing are pictograms, stylized picture-like representations of things and ideas that bypass language. Examples of pictograms include conventional signs, emoticons such as :-) and :(, and emojis such as ☺ and ☹, which represent things, qualities and emotions rather than words. Indeed, they need not necessarily be read as words to be understood, and if they are, might be represented in a language by a range of words. By contrast, happy represents a particular lexeme in English, and cannot be read as glad.

It is not always easy to be certain of the status of systems employing the visual-inscribed medium. For instance, on Easter Island, Rongorongo, a system of inscriptions engraved on pieces of wood, was once employed. However, it went out of use soon after European contact, and few examples of it survive. It is not known for sure whether Rongorongo was a system of pictograms or a genuine writing system.


Linguistics

Emergence of early writing systems

Writing first began, as far as is known, in Mesopotamia (modern Iraq) in the fourth millennium BCE – just under 6,000 years ago. The first language to be written was Sumerian (see p. 16). It was written on clay tablets with the wedge-shaped end of a reed stylus, and hence the label cuneiform (from Latin cuneus ‘wedge’). The earliest cuneiform writings in Sumerian date to around 3100 BCE. The first written symbols most likely derive from pictograms.3 Indeed, the earliest inscribed signs, which predate cuneiform writing by a few centuries, depict animals, plants, body parts, artefacts and natural environmental phenomena. Some of these pictograms represented not just an entity, but also an associated event. For instance, a pictogram of a leg and foot might also denote ‘walk’ or ‘stand’. Over time, and with increased usage, some pictograms became more simplified and stylized, ultimately losing much of their iconic value. The advent of writing was facilitated by use of two strategies permitting representation of words that are not easily depicted. One is combinatorial: a combination of symbols is used to represent a meaning that is difficult to represent directly. Thus the Sumerian symbol for ‘eat’ was a combination of the symbols for ‘head’ and ‘food’. The second strategy is the rebus principle. Here a written symbol stands for the sound or approximate sound of a word. The rebus principle is employed today in text messaging and various other informal styles of writing, where for instance numerals 2 and 4 are used not to represent either numbers or the English number words two and four, but the prepositions to and for – and other things as well, including syllables within words, as in 2gether, 4tunate. Another example would be use of a pictogram for the sun, e.g. ☼, to represent the homophonous word son, or a pictogram of a chilli for chilly.
These two processes are evident in the development of the early writing systems discussed below. Cuneiform was subsequently developed in various ways and used for writing other languages – none related to Sumerian – in the Babylonian, Assyrian, Hittite and Persian empires. Cuneiform was employed for some three millennia, with the latest known inscriptions dating to 75 CE. Just slightly later than Sumerian writing came Egyptian writing, which probably dates from around 3000 BCE. This was the well-known hieroglyphic writing, usually carved into stone or wood. A few centuries later, a less formal hieratic variant was developed that was primarily written cursively in ink with a reed brush on papyrus; this coexisted with hieroglyphic writing. Hieroglyphic writing was primarily used in the religious and monumental domains, hieratic mainly in business and administration. Around 650 BCE the so-called demotic script developed from the hieratic. Whether Egyptian hieroglyphic writing was an independent invention or diffused from Mesopotamia is not known. In both regions pre-writing phases are evident, and the trajectories of development were rather different. Nevertheless, it is likely given the geographical and temporal proximity that there was at least some mutual influence. Dating from about 2500 BCE is the Harappan script, which was inscribed on seal stones in cities of the Indus Valley civilization. However, this script has not yet been deciphered, and it is not known for certain whether it represents a genuine system of writing. The origin of writing in China is more recent, dating to about 1200 BCE. The earliest instances of true writing appear on the so-called oracle bones, found at Anyang in northern China. Many of the signs on these bones resemble modern Chinese characters. Much older signs have been found on pottery dating to 5000–4000 BCE, but these are believed to be proto-writing. It seems likely that

Writing

the emergence and development of Chinese writing was independent of that in Mesopotamia, Egypt and the Indus Valley. However, there is disagreement among scholars, and it is not impossible given ancient influences and movement of people across Eurasia (e.g. along the Silk Route) that the development of writing in China was not fully independent. That it may have taken a couple of thousand years to reach China is not surprising given the distance (and time).4 More certain candidates for independent emergence are the Olmec writing system which emerged around 900 BCE in Meso-America and the Easter Island Rongorongo script mentioned above, presuming it was actually a system of writing. Both systems arose in places widely separated from regions in which writing existed, and evidence of human contact is non-existent. The Olmec system is undeciphered, though there is reason to believe that it was a fully fledged writing system. It predates the better-known (and deciphered) Mayan hieroglyphic system by a millennium. In Europe writing first appeared on Crete, where three different scripts were used prior to 1200 BCE; these were inscribed on clay tablets. The earliest, known as Cretan Hieroglyphic, is poorly attested and generally regarded as proto-writing. The next oldest, Linear A, has been only partly deciphered, and the language is unknown. The youngest of the scripts, Linear B, dating from about 1450 BCE, was deciphered by Michael Ventris in the 1950s. Ventris showed that the language was an archaic variety of Greek. It appears to have been used exclusively for administrative purposes. The Greek alphabet did not emerge until some 400 years after the last Linear B inscriptions, and is unrelated to Linear B. Like languages (see §7.5), writing systems also go out of use and die. Many of the systems mentioned above are no longer in active use. Writing systems in use today have their roots in these earlier systems, but aren’t direct descendants of them. 
An exception is the modern Chinese writing system, which is a historical development from the ancient one. Many of its characters have changed significantly over the 3,000 or so years of their history, simplifying in form and becoming less iconic.

14.2 Types of writing system

Four different kinds of writing system can be distinguished according to which attribute of the spoken language is represented: logographic systems represent words or morphemes; syllabaries or syllabic systems represent syllables; abjads represent consonant phonemes; and alphabets represent both vowels and consonants. These are idealizations, and actual systems almost always represent their targeted units inadequately and inconsistently; often systems involve aspects of more than one type of system. In this section we discuss the four types in turn.

Logographic systems

Logographic systems represent the words and/or morphemes of a language more or less directly by means of visual forms called characters. Sumerian writing was logographic. Some characters were used by the rebus principle to represent homophones or near homophones. Thus the character ti ‘arrow’ was used for ti(l) ‘life’ as well. Confusion was sometimes avoided by adding a supplementary character indicating something about the meaning or pronunciation. For example, prefixing the character gis ‘wood’ to the ‘plough’ character distinguished apin ‘plough’ (the noun) from uru ‘to plough’ (the verb). In other complex signs the gis ‘wood’ character specified the phonetic value [iz]. The writing system used in most Sinitic languages (see p. 429 below) is also a logographic system; the same system is shared by Mandarin, Cantonese, Hakka, Gan and other languages. Chinese makes extensive use of complex characters. Over 90 per cent of the modern characters are combinations of components indicating something about the meaning or the pronunciation of the word. The characters shown in (14-1) all represent words concerning cooking and associated notions of burning and heating; all involve the ‘fire’ radical 火 huǒ. Those shown in (14-2) all involve the same phonetic element, 尧, yáo; this gives a rough guide to the pronunciation of the word, that it rhymes with ao, roughly as in the vowel sound in the English word loud.5

(14-1) 烧 shāo ‘cook, burn’
       灰 huī ‘ash’
       灶 zào ‘oven’
       炕 kàng ‘heated brick bed’
       煤 méi ‘coal’
       燃 rán ‘ignite’

(14-2) 饶 ráo ‘forgive’
       浇 jiāo ‘pour’
       烧 shāo ‘cook, burn’
       晓 xiǎo ‘dawn’

Logographic systems obviously need a considerable number of characters to represent the range of words in a language. There are some tens of thousands of Chinese characters. No user knows all of them, and research has shown that functional literacy requires a knowledge of three to four thousand. As we have seen, it is not the case that the characters are all completely distinct from one another. Rather, there are partial systematic relations among them.

Syllabaries

Syllabaries represent the syllables of a language in the visual-inscribed medium, ideally representing each different syllable by a different graphic form. It is not difficult to imagine how a syllabic system might develop from a logographic one. The logogram for a monosyllabic word might come to represent the sound shape of that syllable rather than – or as well as – the word itself. For instance, the character for ti ‘arrow’ in Sumerian might have been used also for the syllable /ti/, even in a polysyllabic word where it was meaningless. If, as was the case in Sumerian and Chinese, many words were monosyllabic, a large range of different syllable shapes could be represented. The set of syllables represented could be extended by various processes, resulting in a syllabic writing system. Indeed, it seems that some Sumerian logograms did ultimately come to represent syllables. Linear B was a syllabic writing system. It had almost ninety symbols representing just open V and CV syllables. For example, the symbol 𐀀 represented the syllable /a/, 𐀏 represented /ka/, 𐀅 represented /da/, 𐀁 represented /e/, 𐀐 represented /ke/, and 𐀆 represented /de/. Most likely the early Greek language Linear B recorded had closed CVC syllables; these, however, are not distinguished by distinct symbols. Syllables with initial consonant clusters were represented as though bisyllabic. Thus /pra/ was written 𐀞𐀨 – as though /pa$ra/. (Recall that the $ sign indicates a syllable boundary.)
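The treatment of initial consonant clusters can be illustrated with a small Python sketch. It is a toy model rather than a description of actual scribal practice: the function name `spell_open_syllable` is invented here, syllables are given in a rough Latin transliteration, and the assumption that each consonant of the cluster is paired with a copy of the syllable's own vowel is generalized from the single example /pra/ written as though /pa$ra/.

```python
VOWELS = "aeiou"  # assumed vowel inventory for this toy transliteration


def spell_open_syllable(syllable: str) -> list[str]:
    """Return the V or CV units used to write one open syllable.

    Each consonant of an initial cluster is paired with a copy of the
    syllable's vowel (an assumption generalized from /pra/ -> /pa$ra/).
    """
    # Locate the vowel that forms the nucleus of the syllable.
    for i, ch in enumerate(syllable):
        if ch in VOWELS:
            break
    else:
        raise ValueError("syllable has no vowel")

    onset, vowel, coda = syllable[:i], syllable[i], syllable[i + 1:]
    if coda:
        # Closed syllables are outside the scope of this sketch.
        raise ValueError("only open syllables are handled here")
    if not onset:
        return [vowel]                     # a bare V syllable, e.g. /a/
    return [c + vowel for c in onset]      # one CV unit per onset consonant


# Join the units with the $ syllable-boundary sign used in the text.
print("$".join(spell_open_syllable("pra")))
```

A plain CV syllable such as /ka/ comes back unchanged as a single unit, so only clusters are expanded.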


Japanese is written with three sets of characters, kanji, katakana and hiragana. Kanji comprises about 2,000 symbols, mostly borrowed Chinese characters, although some are of indigenous origin; this is a logographic system. The two other systems, katakana and hiragana, are syllabaries. Each comprises about fifty symbols mostly representing V and CV syllables; katakana uses a number of diacritics in addition. One symbol in each system represents a single consonant phoneme, a syllable final nasal, ン in katakana and ん in hiragana. The graphic symbols in katakana and hiragana derive from Chinese characters with similar phonetic values, though they show considerable simplification. For instance, the syllable /ka/ is represented by カ in katakana and by か in hiragana; both derive from the Chinese character 加. The hiragana syllabary is used primarily for words and grammatical inflections that are not covered by kanji. By contrast katakana is mainly used for writing borrowed words and in the transcription of words from foreign languages. It is also used for technical and scientific terms, for emphasis and to represent onomatopoeic forms. In the nineteenth century a Cherokee man named Sequoyah invented a syllabary for his mother tongue. It seems that he was impressed by the power of the white colonizers’ writing, and was determined to develop a means of representing Cherokee (Iroquoian, USA) on the medium of paper. Sequoyah began working on his system around 1809, and completed his work in 1821. His system was officially adopted by the Cherokee Nation in 1825, and was used in the first newspaper of the Cherokee Nation. Sequoyah’s syllabary has just over eighty graphic symbols. Some symbols are shared with the Latin alphabet; however, they have different sound values. For instance W symbolizes the syllable /la/, and C the syllable /tli/. Not all phonemically distinct syllables have a separate symbol, and there are inconsistencies in representation.
Thus /g/ initial syllables are not distinguished from /k/ initial syllables, whereas /d/ initial syllables are mostly distinguished from /t/ initial syllables. Nor are syllables with long vowels generally distinguished from those with short vowels, and tones are not indicated; consonant clusters are not represented consistently. Syllables that end in vowels, /h/ and /ʔ/ are represented by the same symbols. For example, Ꮡ represents /suː/ in ᏑᏓᎵ /su:dali/ ‘six’, but /suh/ in ᏑᏗ /suhdi/ ‘fishhook’.

Abjads

Abjads and alphabets represent the minimal phonemic segments of a language. But whereas alphabets represent (ideally) all contrasting segments, abjads represent the consonant phonemes, totally or largely ignoring the vowels. (Systems that represent some information about vowels are sometimes called abugidas; we do not draw this distinction here.) A number of Semitic languages (see p. 426), including the extinct Phoenician language, Aramaic, Arabic and Hebrew, were or are written with abjads. The Phoenician system seems to have been the source for the other Semitic abjads. It was normally written from right to left, though sometimes it was written boustrophedon – that is, right to left, then left to right, and so forth. Twenty-two symbols make up the inventory; all represent consonants; vowels were not marked at all. For instance 𐤀 represents the glottal stop /ʔ/, 𐤁 represents /b/, 𐤂 represents /g/, 𐤃 represents /d/, and 𐤄 represents /h/. Many of the Phoenician letters are simplifications of Egyptian hieroglyphs: 𐤀 can be traced back to the hieroglyph for ‘ox’, and 𐤁 to the hieroglyph 𓉐 ‘house’.

The youngest of the Semitic abjads, dating from the late fourth century CE, the Arabic system is written from right to left, and comprises twenty-eight letters. It is a cursive script, with most letters normally being directly joined with adjacent letters within a word. The letters generally take on different forms – that is, have different allographs (analogized on the terms allophone and allomorph) – according to their position in a word, whether initial, medial, final or isolated. For example, allographs of the consonant letter representing IPA /x/ are: ﺦ (final) ~ ـﺧـ (medial) ~ ﺧ (initial) ~ ﺥ (in isolation). Arabic script represents not just consonants but also long vowels, the symbols for which derive from consonant symbols. Short vowels are not usually indicated except in educational materials and in the Qur’ān. Where short vowels are indicated it is by means of diacritics either above or below the letter representing the preceding consonant.

Alphabets

The Phoenician abjad was widely used throughout the Mediterranean area, spread by Phoenician merchants. It was adopted in modified form by many societies of the region, including the Greeks, who adapted it to their language in the ninth century BCE. Differences in the phonologies of Ancient Greek and Phoenician led to some significant innovations. Ancient Greek lacked the glottal stop and voiced pharyngeal fricative as phonemic segments; the corresponding Phoenician letters 𐤀 and 𐤏 were instead deployed – in the modified forms Α α and Ο ο (capital and lowercase forms) – to represent the vowels /a/ and /o/, respectively. Other letters denoting consonants were also redeployed to represent vowels in Greek, including 𐤄, with original consonantal value /h/, which in the forms Ε and ε came to indicate the vowel /e/. The Greek alphabet was subsequently borrowed and adapted to write Etruscan (isolate, Italy), and the Etruscan alphabet was later adapted for Latin. The Latin alphabet is the source for most modern European alphabets, and for the alphabets of numerous languages from other parts of the world that have only been written in the past few decades. One of the most interesting alphabets is Korean Hangul (Han’gŭl), invented by King Sejong (1397–1450). King Sejong seems to have devised the system himself, making it public in 1443 in a document Hunmin Chŏng’ŭm ‘Correct sounds for the instruction of the people’. He followed this three years later with a document that provided commentary on the system and explanation of the linguistic principles behind it. At the time literacy was in Chinese, which only a small elite class had command of. Moreover, the Chinese logographic system was rather unwieldy for Korean, and Sejong wanted something more appropriate to Korean and easier to learn. This would facilitate more widespread literacy.
Hangul is still used today in both North Korea and South Korea, though over a thousand Chinese characters, known as hanja, are currently taught to schoolchildren in South Korea. Hangul has official status (along with Chinese characters) in the Yanbian Korean Autonomous Prefecture of Jilin Province in China, where signs in Hangul are visible – see Figure 14.1.


Figure 14.1 A trilingual sign in Yanbian Prefecture. (© William B. McGregor)

The Hangul system is unique among alphabets in that the letters show another level of structure and are not arbitrarily connected to the phonemes they represent, as they are in the Latin alphabet. The system, that is, shows a degree of transparency: the letters show a degree of iconicity in their representation of the articulatory properties of the consonants. Five basic forms are distinguished in terms of place of articulation: ㄱ for velars (representing the shape of the back of the tongue), ㄴ for apico-alveolars (again representing the tongue shape), ㅁ for bilabials (representing the lips), ㅅ for dentals (representing a tooth), and ㅇ for glottals (depicting the throat). Modifications to these shapes indicate different manners of articulation. Thus the unmodified ㄱ represents /k/, an additional horizontal line gives ㅋ representing aspirated /kʰ/, and reduplicated ㄲ a tense stop /k̥/ (the small circle under the letter represents tense articulation). Not all is plain sailing, however, and the basic letter ㅁ represents /m/, with modified forms ㅂ /p/, ㅍ /pʰ/ and ㅃ /p̥/. Unlike the consonant letters the vowel letters do not represent articulatory features – at least, not transparently. Vowel symbols include ㅏ /a/, ㅓ /ə/, ㅗ /o/ and ㅜ /u/. Hangul letters do not follow one another sequentially as in the Latin alphabet, but rather are grouped together in syllabic blocks. Thus the syllable /ka/ is represented as 가, /kə/ as 거, /ko/ as 고, /ku/ as 구 and /ki/ as 기. In fact, Korean has a large number of distinct syllable shapes, including closed syllables, also represented in blocks, as in e.g. /kak/ 각, /kan/ 간 and /kam/ 감.
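The grouping of letters into syllabic blocks can be made concrete with the arithmetic that the Unicode standard uses to number precomposed Hangul syllables. The Python sketch below is offered as an illustration only: the function name `compose` and the three tables are introduced here for the example, and the tables are assumed to list the modern letters in Unicode order.

```python
# Modern Hangul letters, in the order the Unicode standard assigns them.
LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"        # 19 syllable-initial consonants
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"    # 21 vowels
# 27 syllable-final consonants, preceded by "" for "no final consonant".
TAILS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

FIRST_SYLLABLE = 0xAC00  # code point of the first precomposed block


def compose(lead: str, vowel: str, tail: str = "") -> str:
    """Combine an initial consonant, a vowel and an optional final
    consonant into a single precomposed syllable block."""
    lead_index = LEADS.index(lead)
    vowel_index = VOWELS.index(vowel)
    tail_index = TAILS.index(tail)
    code = (FIRST_SYLLABLE
            + (lead_index * len(VOWELS) + vowel_index) * len(TAILS)
            + tail_index)
    return chr(code)
```

Under these assumptions `compose("ㄱ", "ㅏ")` yields the block for /ka/, and adding a final consonant, as in `compose("ㄱ", "ㅏ", "ㄴ")`, yields the closed syllable /kan/. The blocks are numbered systematically from their component letters, not listed one by one.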


14.3 The English writing system

As already mentioned, the types distinguished in §14.2 are ideals, which real systems approximate; many systems show features of more than one type. No writing system represents the spoken language precisely. Indeed, writing systems typically ignore certain features completely (e.g. stress is phonemic in English, but is not shown in writing), and if a particular phenomenon is indicated it may be represented partially or inconsistently. Alphabetic scripts ideally represent words according to their phonological structure, as a sequence of letters each of which corresponds one-to-one with a phoneme. Some systems do this well, and you can make a good guess at the pronunciation of a word from its written form, if you know the correspondences between letters and phonemes. This is the case for Spanish and Hungarian. But some writing systems, including the English, French and Danish ones, are notorious for their poor and unsystematic representations of the phonemic shape of words. In the remainder of this section we discuss the mixed nature of the English writing system, and why it is as it is.

Problems begin with the size of the phoneme inventory of English, which comprises around twenty-four consonants and fifteen to twenty vowels depending on the dialect. Given that the Latin alphabet has just twenty-six letters, the writing system simply cannot be an ideal alphabetic one. One way in which the fit is improved is by using sequences of letters to represent some phonemes, for example by representing /ŋ/ by ⟨ng⟩ and /ʃ/ by ⟨sh⟩. (By convention, angle brackets enclose letters.) Even so, the correspondence between phonemes and letters and letter sequences is messy. It has been estimated that the typical vowel can be represented in twenty different ways; the different written forms typically also represent different vowel phonemes. For instance, /iː/ can be written ⟨ee⟩ (e.g. seed), ⟨ea⟩ (e.g. bean), ⟨ie⟩ (e.g. piece), ⟨i⟩ (as in police), ⟨y⟩ (as in charity), ⟨e⟩ (e.g. me, be) and ⟨ei⟩ (as in receive). Most of these forms can also represent different vowel phonemes. For instance, ⟨ea⟩ also represents /ɛː/ (as in bread), /ɜː/ (e.g. learn), /eɪ/ (as in steak), /ɛ/ (e.g. head) as well as the sequences /iːeɪ/ (as in create) and /iːʌ/ (as in Korea).

Old English made use of the Latin alphabet, augmented by runic letters (a system of letters used for carvings in wood and stone) including ⟨æ⟩ (called ash), ⟨þ⟩ (thorn), ⟨ð⟩ (eth) and ⟨ƿ⟩ (wynn). The writing system of Old English was closer to the alphabetic ideal than the modern system, and was a good way along the pathway to standardization. The Norman Conquest of England in 1066 brought considerable French influence not just in the form of borrowing of words, but also to the writing system. French-speaking scribes were uninterested in the conventions of writing Old English. They used their own conventions not just in representing French borrowings but also often for spelling native Old English words, ousting established spellings. Borrowings from Latin and Greek were spelt according to the Latin model. Later, in the colonial period, borrowings from other languages sometimes reflected spelling conventions of the donor language. For instance, many borrowings ultimately from languages of the Americas retain the Spanish or Portuguese spellings of the intermediate languages they were actually borrowed from.


Some inconsistencies in English spelling reflect different origins of words; that is to say, etymology (the origin and history of words) is to some extent reflected in spelling. In such circumstances we speak of etymological spellings. In some cases it is presumed etymology, which linguists call folk etymology. Numerous French loan words were respelled to reflect Latin origins. For instance, the spelling was changed to to reflect Latin although the word was borrowed from French not Latin. The spelling ⟨author⟩ – borrowed from Latin auctor – reflects a false Greek etymology. And the ⟨s⟩ of ⟨island⟩ was inserted in Old English on the basis of Latin insula. The Great English Vowel Shift (which began around 1350 and is ongoing), which affected the qualities of the long vowels, either raising or diphthongizing them, resulted in further inconsistencies in the writing system of Middle English, these changes being represented inconsistently in writing. Other changes in the phonology that were not reflected in spelling changes led to further discrepancies. Lost word final /b/ after /m/ in words such as dumb and womb accounts for spellings ⟨dumb⟩ and ⟨womb⟩. The velar fricative /x/ of Old English was completely lost, but its written representation by ⟨gh⟩ remains in words such as right. The advent of the printing press, introduced to England in 1476 by William Caxton, also had consequences. Many printers were Dutch; they introduced Dutch spellings in some words. For example, the spelling of word initial /g/ as ⟨gh⟩ in ghost was introduced by Dutch-speaking typesetters; many other such spellings of typesetters failed to catch on, e.g. for girl, and for goat. Interestingly, the early days of printing saw the emergence of many alternative spellings of words rather than standardization. In order to right justify lines printers often added or omitted letters. Caxton often added an ⟨e⟩ for this reason, spelling English sometimes as , sometimes as .
Ultimately, however, the printing press did contribute to the standardization of English spelling. One additional factor contributing to its poorness in phonological representation is that the English writing system shows some logographic tendencies. Thus there are a number of homophonous lexemes in English that are spelled differently. Examples include (in my dialect): rite, right, write and Wright; pore, paw, pour and poor; dune and June; and by, buy and bye. Another logographic tendency is the trend for morphemes to be spelt in the same way regardless of their phonological shape. This holds for grammatical morphemes like possessive and plural , which are never spelt with a , despite the fact that two allomorphs of each end with a /z/. Similarly we find a tendency for root morphemes to be spelt in the same way. An example is critic and criticize, where the root is spelt the same despite the fact that it ends in /k/ in the first form and /s/ in the second. Another example is provided by the pair erect and erection. The writing system for English is imperfect to say the least. It is not, however, entirely random, and there are a number of tendencies that permit explanation for spellings, even if they are of limited value in prediction. The English writing system also has certain advantages. One is that it does not favour any one variety: it is about equally good – and bad – for all varieties. The alternative of a perfect alphabet representing the phonemic segments would require either representation of a single variety – which? – or different systems for different varieties. Both alternatives come with problems.


14.4 Writing systems in society

Until recently writing in all literate societies was the prerogative of a select few. In the ancient world, in Mesopotamia, Egypt, China and Meso-America, special classes of professional scribes arose whose task it was to write documents of various types. In Ancient Egypt writing was perhaps limited to one per cent of the population; the vast majority of the population remained illiterate. In some of these societies scribes were held in high esteem (e.g. Egypt and China); in others their social position was lower (e.g. Mesopotamia). These days literacy is regarded as a universal human right. However, universal (or almost universal) literacy is a very recent achievement, one that is restricted to certain societies in the modern world. In fact, according to a 2010 UNESCO estimate, there are some 790 million illiterate adults in the world today, mainly living in African and Asian countries. The figures remain about the same a decade later. But even in affluent countries a significant fraction of the population is functionally illiterate, with literacy skills inadequate to societal demands – ranging from around 5 per cent to over 20 per cent depending on the country and on how functional literacy is defined. Many such individuals suffer from dyslexia – a disorder characterized by difficulty in reading and writing despite education and intelligence – and other reading disorders. Writing occupies a central place in modern societies; it is visible all around us, not just in books, journals and magazines, but also in the many signs that make up the landscape of modern cities, labels on products, instructions on electronic devices, advertisements, manuals, travel guides, forms, graffiti, and elsewhere. Indeed, it might even be argued that civilization is impossible without writing.
Many investigators have commented on the effects of writing and literacy on human society and on the human mind, and it is not uncommon for a distinction to be drawn between (primary) oral societies (in which writing is not used, or is used in restricted domains by a small portion of the population) and literate societies in terms of fashions of thinking. In fact, it is hard to imagine history, linguistics or science in the absence of writing. (Why might this be?) Writing changes much more slowly than speech. Even if there is a good match between writing and speech at a particular point in time, disparities soon emerge. Thus languages with long histories of writing often show significant discrepancies between written and spoken varieties. In such circumstances the need for writing reforms may be felt by some users. But writing reforms are often subject to strong resistance from users. Of the many reforms that have been proposed to the modern English writing system none have been implemented. It is true that Noah Webster (1758– 1843) made a number of successful changes to the spellings of certain words in American English, so that today there are slightly different orthographic standards for British English and American English. However, not all of his recommendations were adopted, and the resulting system is at best marginally more regular than British spelling. Writing as much as spoken language expresses social values and identity. Thus the primary motivation for Webster’s spelling changes were symbolic, to express an American social identity. Identity considerations are some of the most important reasons why people persist in using the


writing systems they do, and resist changes even to notoriously complex writing systems such as the English and Japanese ones. There have, to be sure, been some successful writing reforms, though in many cases they are less comprehensive than intended. In the nineteenth and twentieth centuries a number of attempts were made to reform Chinese writing. In the early twentieth century the Beijing dialect of Mandarin Chinese was chosen as the official written dialect, replacing the ancient classical language. Later reforms, dating to 1956 and 1964, involved the introduction of simplified characters (as shown in examples (14-1) and (14-2) above). The simplifications are based primarily on a reduction in the number of strokes for each character. For political reasons these reforms were not adopted outside of the People’s Republic of China. It was also decreed in the 1950s in mainland China that Chinese would be (standardly) written horizontally from left to right, rather than vertically from top to bottom. Some writing reforms have been more thoroughgoing, and have involved replacement of one system by another; the successful cases have all been politically motivated. Vietnamese used a system of writing based on Chinese characters for about a thousand years; in the seventeenth century French missionaries developed a version of the Latin alphabet, which came to be progressively more used in schools as Vietnam fell increasingly under French influence. By the late nineteenth and early twentieth centuries the writing used both a Chinese-based character system and a modified Latin-based script. With independence from France in 1945 the Latin-based alphabet was adopted as the official writing system. Another example is Turkish, which, until the early twentieth century, was written in a version of the Arabic abjad, called the Ottoman Turkish script. 
As part of his efforts to modernize Turkey and establish closer links with Europe, Mustafa Kemal Atatürk issued, in 1928, a decree that the Arabic-based script would be replaced by a version of the Latin alphabet. This system has been used ever since. Turkish written in the Ottoman Turkish script can now be read only by scholars. In some diglossic situations (recall §7.4) the ‘high’ (H) variety is associated with writing, the ‘low’ (L) variety with speech. For example, in Medieval Europe diglossia was widespread. People generally spoke their vernacular language (English, French, Spanish, etc.) but mainly wrote in Latin. In the case of the diglossic German-speaking community in Switzerland, Standard German is normally used in writing, Swiss German in speech. In China until 1919 most writing was in Classical Chinese, which was quite unlike the spoken languages. Sometimes two different writing systems are used for the same language. Such a situation is referred to as digraphia. For example, Hindi and Urdu are linguistically speaking dialects of one language. However, Hindi is written in a Devanagari script, while Urdu uses an abjad brought by Muslim invaders. The different writing systems reinforce the different social and religious identities of the users. Digraphia is not uncommon. Chinese is written in both characters and the romanized alphabet pinyin. Pinyin is frequently used by speakers of Chinese languages when writing on computers and smartphones, as an interface for character selection, but is otherwise little used. (It is sometimes used for special effects such as humour in messages otherwise in characters.) Korean and Japanese writing systems may also be regarded as digraphic: both employ Chinese-based character systems along with the indigenous systems, hangul for Korean, and the two syllabaries for Japanese.


14.5 Linguistic features of some written varieties

Up to now we have discussed systems of writing and their characteristics, including the types of linguistic element represented, their origins and development, and broad social features. We now turn to writing systems in use: to linguistic features of written language and how writing is employed in creating texts. We begin by outlining some differences between spoken and written language in use, some of which presumably result from differences in the mediums. These differences are not categorical, but rather are matters of degree: they are quantitative rather than qualitative differences. Ranges of variation are found in both spoken and written language, including registerial variation (see §7.3). Furthermore, differences in technologies of writing correlate with linguistic differences. In the following subsections we discuss some features of writing in electronic mediums. Before we begin, it is worth stressing that writing is not transcribed speech; it is not speech written down, and does not aspire to be. Transcribed speech is of interest to linguists, but is not useful for the purposes writers put writing to. For example, it is difficult to imagine what possible use the representations of hesitations, false starts and the like would be to ordinary readers of a document.

Differences between speech and writing

Table 14.1 lists some differences between speech and writing, a number of which we have already commented on. To begin with, all of the differences are matters of degree. For example, some instances of writing (e.g. on clay tablets) are thousands of years old, while others are ephemeral and disappear without trace in a few weeks or less (e.g. many letters written on paper). And while social and individual characteristics of a writer may not be evident in printed documents,6 in the case of handwritten documents they may be. Furthermore, some instances of writing may display features associated with speech to a greater extent than some instances of speech, and the reverse. For example, in the case of the reform in Turkish writing mentioned in §14.4, there was a quite rapid and significant change to the system, which was probably faster than any contemporaneous change in the spoken language. On the other hand, a formal lecture on linguistics might display a higher frequency of nominalizations – nouns derived from words of other parts of speech by the addition of derivational affixes, as in the nouns nominalization from nominalize (verb) and difficulty from difficult (adjective) – than a personal letter to a friend. It might also show a higher lexical density – that is, a higher frequency of use of lexical words (see further exercise 9 below).

Table 14.1 Some differences between speech and writing

Speech | Writing
rapid fade (Hockett 1960) | persistent (slow fade)
rapid rate of change | slow rate of change
occurs in an interpersonal interactive context | occurs in a constructed or imagined interpersonal context
rapid online processing of speech and hearing | slow and planned in production (writing), rapid in comprehension (reading)
gender, age and other social features of both interlocutors generally apparent | gender, age and social features often not apparent
individuals typically identifiable and known personally | individuals not so readily identified and may not be known
emotional meanings expressed by prosodic features | emotional meanings limited in expression
space not used systematically to convey meaning (except in accompanying gestures) | systematic use of space to convey meanings (e.g. to separate words, sentences and paragraphs; visual displays such as tables)
low level of lexical density | high level of lexical density
infrequent use of nominalizations | more frequent use of nominalizations

The development of the printing press had an enormous impact on writing. The world’s first movable type printing technology was invented by the Han Chinese printer Bi Sheng sometime in the years 1041‒8. It was not until some four hundred years later that the technology appeared in Europe, first developed by Johannes Gutenberg, in Mainz. Printing was possible only with the advent of paper, another Chinese invention. The technology spread rapidly throughout Europe, and permitted the production of numerous identical copies of a single work. This had not been possible previously when medieval scribes had had to copy by hand – a very time-consuming and error-prone process. The printing press was perhaps a prerequisite for universal literacy.

The next technological development with comparable social, political and linguistic impacts was the development of electronic media, particularly mobile phones, computers and the internet. These technologies permitted new possibilities for writing. As with the development of the printing press, many commentators have expressed concern about possible negative consequences of these new technologies on both people and the writing system of their language – even the language itself. Many have predicted the demise of writing on the medium of paper, the emergence of the paperless office, and the end of hardcopy books and journals. The reality seems to be somewhat different, and just as the printing press did not spell the end of handwriting it seems unlikely that electronic media will completely oust paper-based writing. Indeed, paper usage in the office appears to have increased over recent years, and even though many journals and books are produced electronically, many readers prefer to read them on paper rather than on a computer screen. Jabr (2013) presents psychological evidence for not just preference of paper over the electronic medium, but also better memory for content, even among those who have grown up with electronic media. Although electronic writing almost certainly does not have the permanence of clay tablets (it seems improbable that writing on electronic media will be readable or even decipherable 5,000 years hence), it is surprisingly versatile, and we tend to underestimate just how long lasting it can be. Differences – including supposed differences – in the electronic medium are doubtless responsible for some of the distinctive features of writing in that medium.
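The notion of lexical density used in this section can be made concrete with a small calculation. The following is only a rough sketch of the measure suggested in exercise 9 (lexical words divided by total words). The list of grammatical words, the function names and the invented "spoken" sample are all assumptions made for illustration; a serious study would need a principled way of separating lexical from grammatical words.

```python
import re

# Hypothetical, deliberately incomplete list of English grammatical (function) words.
# Any word not in this set is treated as a lexical word.
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "if", "of", "in", "on", "at", "to",
    "for", "with", "by", "from", "as", "than", "that", "this", "these", "those",
    "it", "its", "i", "you", "he", "she", "we", "they", "me", "him", "her", "us",
    "them", "my", "your", "his", "our", "their", "is", "am", "are", "was", "were",
    "be", "been", "being", "do", "does", "did", "have", "has", "had", "will",
    "would", "can", "could", "may", "might", "shall", "should", "must", "not",
}


def tokenize(text):
    """Split text into lower-case word tokens, keeping internal apostrophes."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())


def lexical_density(text):
    """Return lexical words divided by total words (0.0 for empty text)."""
    words = tokenize(text)
    if not words:
        return 0.0
    lexical = [w for w in words if w not in FUNCTION_WORDS]
    return len(lexical) / len(words)


# One sentence from this section and one invented, speech-like sentence.
written = "The development of the printing press had an enormous impact on writing."
spoken = "well it was the press that did it"
print(round(lexical_density(written), 2), round(lexical_density(spoken), 2))
```

Because the word list is so short, contracted forms such as I’m are counted as lexical words, so the figures are indicative at best. The same function can be applied to a page of writing and to a transcribed stretch of speech of similar length to compare the two.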

Writing on the internet

A range of new styles of interaction and discourse are enabled by the internet, including email, chat, blogging, tweeting, instant messaging, wikis, online games, social networking and many others. Language is fundamental to most of these; in many, writing plays a central role, though in some speech and video are also possible or even the norm. The linguistic and discourse characteristics of writing differ significantly across the interaction types, just as they do in ordinary writing. Some writing is identical with paper-based writing, as in the case of many online journals, the content of which may be identical with that in printed copies of the journal; other forms are more speech-like. The interaction between reader and writer in electronic media is different from the interaction between reader and writer in paper-based writing. Thus, writing on the screen is different from writing on a physical page; it is both written and read differently (recall also the remarks on the previous page). Differences from paper-based writing include hypertext links (which are more or less comparable with footnotes and cross-references in paper writing), and lack of persistence in the appearance and content of web pages. Lack of house styles, copy-editing and often even proofreading means that much public writing on the internet differs from the writing of hard-copy publications, and as such may more closely resemble informal private writing. The interaction between the electronic writer and reader is also different from that between a speaker and a hearer. For instance, even in more speech-like systems such as instant messaging time lags may have effects on turn-taking and adjacency of elements belonging to a single exchange. Moreover, messages are sent only after a key press, and thus typically after being fully composed, rather than from the beginning of their production.
This permits editing by the sender, but effectively precludes joint construction of instant messages by sender and receiver. In chatrooms a person can be involved in more than one conversation at a time, which is usually impossible in face-to-face conversations. The internet is responsible for the emergence of numerous neologisms (e.g. twictionary, tweologism), mostly by the processes we discussed in Chapter 4. Indeed, the internet is lexically very active and volatile. There is less evidence of grammatical novelty, however. English is not, of course, the only language used on the internet. Unicode facilitates the use of many languages, including many that do not use a Latin-based alphabet. It is difficult to reliably measure the linguistic diversity of the internet. (Why?) According to 2010 figures cited in Crystal (2011: 79), English speakers account for a bit over a quarter (27.5 per cent) of internet users, Chinese speakers for a little under a quarter (22.6 per cent). The top ten languages account for over 80 per cent of users. More recent figures given on the website https://www.statista.com/statistics/262946/share-of-the-most-common-languages-on-the-internet/ are just slightly lower, indicating an increased presence of other languages. In what follows we look in turn at some features of the language and structure of instant messages, Twitter, blogs and wikis. We restrict attention to English and acknowledge that the genres of discourse in each of the four cases are still fairly young, and may change radically in the future (assuming the survival of the systems).


Instant messaging

Instant messaging is a technology that allows messages to be sent and received over the internet in close to real time. Instant messaging typically takes place between two users who engage in private exchanges of messages with one another, in contrast with chatrooms where many users carry on multiple and overlapping interactions. Instant messaging is somewhat similar to text messaging (txtng, short message service (SMS)) which allows users to exchange messages without an internet connection using the mobile phone network. Instant messaging has now largely replaced text messaging in the private domain, and the latter technology is now primarily used in communications between service providers (such as banks, software companies, delivery companies) and users. See the website for this chapter for a brief description of the language of text messaging. In 1996, ICQ (I Seek You) was the first instant messaging system to be made available, although there were earlier non-internet-based systems. Since then a number of other systems have emerged, including AOL (America Online) Instant Messenger, MSN (Microsoft Network) Messenger, Telegram Messenger, Skype, WhatsApp, WeChat, among others. (Some of these, e.g. WeChat, have much wider functionalities including social media and mobile payment.) Instant messaging is currently very popular, with many billions of messages being sent daily. WeChat alone has over a billion regular users. Instant messaging systems permit one to set up a contact list, which may indicate the status of the persons on the list (e.g. whether they are logged on the system, busy, want to remain invisible, or whatever). Clicking on a person’s name will initiate an attempt to communicate with them, and a window opens into which one can input text. This has to be actually sent by clicking on the ‘send’ button.
The message then appears in another small window, usually in the shape of a speech bubble, typically on the right-hand side of the sender’s screen. It simultaneously appears on the left-hand side of the recipient’s screen, along with a notification to the recipient alerting them to the message, if they are not actively using the system. At the extreme left or right of the screen appears a small representation of the interactant, chosen by the user. The recipient can choose to respond by entering their own message in their input window. As the interaction progresses a scrolling dialogue appears on the respective screens of the interactants. If there is a break in the interaction, a time stamp typically precedes the move. A record is typically kept of the interaction, which can be scrolled through. Most instant messaging systems permit sending of more than just written messages. Emoticons, emojis, stickers (detailed and elaborate cartoon-like representations of emotions and actions, often animated), photographs, files, video and audio clips, and location information (usually shown on a map) can be sent on many systems. Most of these have to be sent separately: each sticker, photograph, video clip or whatever must be sent as a single message in the interaction. Emoticons and emojis, however, can be incorporated alongside written text in a move. Many systems also permit the participants to rapidly switch to voice or video calls within the same app. (14-3) is an excerpt from an instant message conversation, from my own small corpus. The text of the message, including capitalization and punctuation, is as in the actual messages; icons and stickers have been replaced by descriptions which are enclosed in square brackets.


(14-3) on the [icon of a train]

A

I’m on the way to the Asian shop

B

[sticker depicting a stationary steam train releasing steam] Rushing to Aarhus

[sticker depicting turtle moving slowly]

A

[sticker depicting fast moving train]

B

I will get to Aarhus at 17:12

[interruption of about 5 minutes] [OK sticker]

A

Hot pot dipping sauce ok? S [photograph of the sauce] [another interruption of 5 minutes or so] B

I’ve never tried

Probably ok That’s all I can see B

We have had some dips, not so spicy I think

A


This brief excerpt illustrates a number of common features of instant messages. To begin with, instant messages are typically very short: the longest in (14-3) is 10 words, and the average is 4.6 words per written move (4.7 if the icon in the first line is counted as a word). This figure compares quite well with the average length of 5.81 words per move mentioned in Crystal (2006: 251) for a somewhat larger corpus comprising 178 turns. The small number of words per message is the result of a number of choices by the writer. A number of moves in (14-3) are made up of elliptical sentences, beginning with the first line, a highly elliptical question (are you on the train?) the interpretation of which depends on earlier messages concerning delays to the train that B had planned to take. Some of the turns are divided into separate messages. For instance, A uses two messages in their first turn, separating it into a question about what the addressee is doing followed by a statement of what they are doing. This style of division into messages dealing with different topics is common in instant message interactions. Also common to instant messages is the reduced punctuation. Full stops at the end of sentences are completely absent in (14-3), and only one question mark is used, at the end of one of the two questions. Apostrophes and sentence initial capitals are consistently used, though these were almost certainly all added in by the spellchecking system. Most users do not bother to type in apostrophes or to use the shift key – and are not infrequently annoyed by the spellchecker’s corrections. (14-3) shows a number of messages made up of stickers; in fact, four of the fifteen messages – almost a third – are stickers. This is common in my corpus of instant messages, which shows many uninterrupted sequences of between two and ten stickers. There is also an icon incorporated into a text message in (14-3). In this instance the icon is clearly a replacement for the word train. 
There are no instances in this excerpt of icons expressing emotions. They are, however, not infrequently used in instant message interactions, where they fulfil various functions. Unlike the train icon, emoticons and emojis cannot be effectively replaced by words and are in some ways comparable with gestures in speech (see §13.1). As has been remarked already, instant messages often unfold in real time, like face-to-face conversations. Sometimes there are delays in the internet or mobile phone network that interrupt the pattern of turn-taking. Some delays are caused by participants doing other things simultaneously, as is probably the case in the two 5-minute interruptions in example (14-3) – in the first A is clearly busy shopping, while in the second it seems that B has evidently been distracted by other things, and was probably engaged in an instant message conversation with someone else. In fact, in some cases interactants do not expect (or even want) an immediate response. They may use an instant message much like an email, perhaps sending it at a time they expect the other user to be busy doing something else. The line between emails and instant messages can be blurred in the other direction, as when one is able to carry out an email conversation in more or less real time.
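The kind of count reported above (words per written move) is easy to compute once moves are stored as text. The sketch below is an illustration only: the function names are hypothetical, bracketed descriptions of icons and stickers are assumed to count as zero words, and the sample uses just four moves from (14-3), so it does not attempt to reproduce the averages reported for the full excerpt.

```python
import re

# Bracketed descriptions of icons, stickers and photographs, as used in (14-3).
BRACKETED = re.compile(r"\[[^\]]*\]")


def word_count(move):
    """Count the written words in one move, ignoring bracketed descriptions."""
    return len(BRACKETED.sub(" ", move).split())


def mean_words_per_written_move(moves):
    """Average word count over moves that contain at least one written word."""
    counts = [word_count(m) for m in moves]
    written = [c for c in counts if c > 0]
    return sum(written) / len(written) if written else 0.0


# Four moves taken from excerpt (14-3); the sticker-only move is skipped.
sample = [
    "on the [icon of a train]",
    "I’m on the way to the Asian shop",
    "[sticker depicting turtle moving slowly]",
    "Rushing to Aarhus",
]
print(mean_words_per_written_move(sample))
```

Counting an icon as a word, as the alternative figure in the text does, would only require replacing each bracketed description with a placeholder token instead of a space.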

Twitter

Twitter, created in 2006 (rebranded X in 2023), is a micro-blogging platform for sending and receiving text-based posts, called tweets. It experienced an explosion of users from around half a million in 2007 to 100 million in April 2010 (Crystal 2011: 54). Tweets were originally restricted to 140 letters (longer messages were truncated, indicated by ellipses, . . .), with an extra 20 reserved for usernames and the like. The messages are displayed on the author’s profile page, and are automatically sent to anyone who follows that author, though they can be read by anyone else unless they have been excluded. Messages consist of author identification followed by the message text, and data concerning the tweet. An example is the following, from Crystal (2011: 37):

(14-4) stagewatch: just seen an excellent production of Macbeth at Shakespeare’s Globe
4 days ago from web – Reply – View Tweet

The system provides various types of functionality, including the possibility of linking to another tweet by use of @ followed by the username: (14-4) could be responded to with @stagewatch followed by a comment message. Messages can be cc’d (retweeted) to one’s followers. Almost 20 per cent of the tweets in the corpus used by Crystal were retweets (Crystal 2011: 40). The majority of tweets fell below the original 140-letter limit. In Crystal’s corpus their average length was 100.9 letters, comprising 14.7 words; most consisted of a single sentence or two. As expected, similar strategies are used by tweeters for shortening messages as are used by texters, including contractions, abbreviations, logograms, etc. Ellipsis is also common, especially in noninitial sentences. Tweets sometimes display grammatical complexity. Many contain cohesive devices (recall pp. 191–5) such as conjunctions and connective adverbs (like so, well, etc.), anaphoric forms, interjections (such as ok, yeah), and address forms (e.g. hey, man). According to Crystal, the majority of tweets in his corpus provided observations and opinions; advertisements and phatic communion (see p. 198) were also common.
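The original length limit and the corpus averages mentioned above can be illustrated with a short sketch. It is not a description of how the platform itself worked: the function names are hypothetical, the limit is applied to characters, and it is an assumption here that the ellipsis counts towards the limit.

```python
LIMIT = 140  # original limit on tweet length mentioned in the text


def truncate_post(text, limit=LIMIT, marker="..."):
    """Return text unchanged if it fits; otherwise cut it and append the marker.

    Assumption: the marker counts towards the limit.
    """
    if len(text) <= limit:
        return text
    return text[: limit - len(marker)].rstrip() + marker


def length_profile(posts):
    """Return (mean characters, mean words) per post; (0.0, 0.0) if empty."""
    if not posts:
        return 0.0, 0.0
    chars = sum(len(p) for p in posts) / len(posts)
    words = sum(len(p.split()) for p in posts) / len(posts)
    return chars, words


# The message text of example (14-4) fits within the limit and is left unchanged.
tweet = "just seen an excellent production of Macbeth at Shakespeare’s Globe"
print(truncate_post(tweet) == tweet)
print(length_profile([tweet]))
```

Applied to a collection of tweets, length_profile gives averages of the same kind as those cited from Crystal's corpus, although the results depend on how words and characters are counted.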

Blogs

Blogs (from web logs – what word-formation processes are involved?) became popular in 2004. Writing is central to blogs, although videos, audio files and pictures can be included. Blogs tend to be personal in style. Hyperlinks form the backbone of blogs; indeed, they are sometimes essential to understanding a blog. The extent of hyperlinking makes the boundary of blogs somewhat fuzzy. There is a wide range in frequency of use of hyperlinks ranging from the odd link up to almost exclusively links. The links may be to other blogs, to websites other than blogs, and to mainstream media. Links may be from web addresses, titles or quotations in the blog; these are sometimes incorporated into the text itself – for example, a clause, phrase or word. Hyperlinks serve a wide range of purposes, including: providing additional information; providing evidence for claims; giving credit for information; soliciting action; solving puzzles resulting from lack of information in the linking text; and providing different information or perspectives. The effect of the hyperlink is sometimes humorous. The media frequently portrays bloggers as ranters shouting at an empty internet. This is a far from apt portrayal. First, bloggers use a variety of linguistic means to engage with their audience, including modes of address and reference, use of moves that demand a following move (e.g. questions soliciting answers), and enactments of conversational interactions by means of representation of exchanges in the text itself (e.g. a question together with the author’s answer). Bloggers also use politeness and face-saving devices such as modal modifiers like maybe and perhaps, and hedges. They also construct an in-group by use of shared lexical peculiarities and allusion to shared knowledge. Second, bloggers use a range of lexical and grammatical devices to express attitudes and feelings, judgements and comments on the proposition expressed. These include use of modals like can and may, reported speech and thought complements (I think (that), you know and the like) and conversational particles such as well, so, totally, etc. Such devices are also used to specify the factual basis for claims, indicating whether they are derived by induction, deduction, hearsay, belief or whatever.

Wikis

Wiki software dates to the mid-1990s; Wikipedia, which we focus on here, made its debut in 2001. The wiki platform permits many authors to collaborate on a single text: everyone has the right to modify, edit and proof the text. This is of course not a unique feature of wikis, but an extension of what can happen with ordinary writing, where many authors may contribute to a text; the difference is that authorship is open in the case of Wikipedia. Wikipedia includes a history tab (showing all previous versions of the text, along with date and other information concerning the edit), an edit tab, as well as a talk tab (permitting editors to discuss changes and proposed changes to the text of an entry). Wikipedia language tends to be impersonal. We tend to forget that many written texts, especially those intended for the public domain, are the product of many revisions. In the case of hard-copy published texts only the final version is usually available, which is often very different to the first version. The history pages of Wikipedia provide useful information on the types of input authors and editors make to texts. In the years since the appearance of Wikipedia a complex process of collaboration has emerged, partly the result of explicit procedures and principles, but also implicit ones concerning how one should interact. Myers (2010) observes the following types of edit in a selection of Wikipedia articles: adding information; changing information, including revision of claims and possible compromises (where differences of opinion are admitted); formatting changes to fit the conventions of Wikipedia and to the structure of the article; proofing changes; and vandalism (which can be difficult to distinguish from the expression of strongly held opinions). In some cases a committed group of editors make most of the edits, in others a more varied range of editors is involved.
Myers (2010) proposes the following sequential development of an article (of course, as distinct from hard-copy writing, there is no final form): a beginning as a basic stub (rather than an extended piece by a single author); relatively few edits at the beginning, and a clustering of edits around certain periods of activity – some of the edits survive, others are quickly rejected; early edits set up wording and categories that often survive throughout many edits; corrections of vandalism and obvious spelling errors usually happen quickly (though some can persist for a long time).

Summing up

Writing employs the visual-inscribed medium, and unlike both speech and signed languages, does not show rapid fade. Genuine writing is derivative from speech, and represents its sounds to some degree. However, writing is not transcribed speech, and numerous differences exist between writing and speech, including differences in lexical density and frequency of nominalizations. Some differences are consequences of differences in the mediums; some are due to differences in the uses of language in the different mediums. Writing probably originated in pictographs; the rebus principle permits representation of abstract concepts visually. The first known writing was cuneiform, used in Mesopotamia around 3100 BCE. Slightly later, hieroglyphic writing appeared in Egypt; other early writing systems of China and Meso-America emerged later, possibly independently. The earliest writing in Europe dates from around 1200 BCE with hitherto undeciphered Linear A; Linear B is attested from somewhat later, and represents Mycenaean Greek. Four types of writing system are distinguished according to the linguistic unit ideally represented: logographic systems, syllabaries, abjads and alphabets. These are ideal types; actual systems often involve aspects of more than one type. Some languages are written in more than one system; this is called digraphia. As an alphabetic system, the English writing system is notorious. But spelling is not random and many inconsistencies can be explained. Some result from the fact that speech changes more rapidly than writing; others are consequences of social history. Another factor is etymology: words are often spelled as in the source (etymological spelling), or presumed source language (folk etymological spelling). English writing also shows a tendency to logographic representation. Written language has, like spoken language, a symbolic value relating to expression of the user’s identity. This limits the success and extent of reforms to writing systems and spelling.
Differences exist in the linguistic and discourse properties of writing in electronic and paper-based mediums; the former shows some characteristics that make it more speech-like than the latter. However, significant differences exist in the language of different types of electronic mediums, making it pointless to speak of the characteristics of electronic writing in general. Instant messages are typically two-party exchanges that unfold in real time, and show some similarities to face-to-face conversations; the messages tend to be very short and to show reduced use of punctuation, especially full stops. Tweets show non-standard features in the representation of words, including non-standard spellings, acronyms and other abbreviations, and reduction in use of capitals and punctuation. Blogs tend to show linguistic choices making them somewhat personal in nature. Wikis permit the collaboration of many authors on the production of a single text, and show linguistic choices that mark them as rather impersonal.

Guide to further reading

There are many good articles and books introducing the writing systems of the world’s languages. I recommend Coulmas (1989), Nakanishi (1990/1980), Daniels (2017), Rogers (2005), Robinson (2009b), Fischer (2001) and Powell (2009). Daniels (2013) is also worth reading. Somewhat more technical is Coulmas (2003). Gnanadesikan (2009) is interesting, but must be read critically. Encyclopaedic treatments include Coulmas (1996) and Daniels and Bright (1996). Coulmas (2013b) provides in-depth discussion of the sociolinguistics of writing. One fascinating topic we did not deal with in this chapter is decipherment of ancient scripts. There are many intriguing works on this topic, including: Coe (1999) on Mayan script; Robinson (2002) on Linear B; Robinson (2012) on Egyptian hieroglyphics; Robinson (2009a) on undeciphered scripts. Sproat (2010) Chapter 4 and Pope (2003) give nice general accounts of the principles of decipherment. On linguistic differences between speech and writing, see Halliday (1983). Goody (1977) and Ong (1982) are standard references on literacy and orality, and how the transition from orality to literacy influenced culture and changed human consciousness. Crystal (2012) is a very readable account of English spelling; Sebba (2007) is more detailed and scholarly. Crystal (2006, 2011) and McCulloch (2019) discuss the language of the internet. Myers (2010) provides an account of the discourse of blogs and wikis. The second edition of this book had a section on text messaging, which is replaced in this edition by the section on instant messaging. (The text of the earlier section can be found on the website for this book.) Crystal (2008) is an entertaining story of texting; more linguistically substantial is Tagg (2012).

Issues for further thought and exercises

1 You should have noticed that we did not mention any type of writing system comparable with an abjad, but that represents just the vowels of a language. Do you think such a system would be possible? Explain why or why not.

2 It has been suggested that ghoti is a possible way of spelling the word fish (i.e. [fɪʃ]) in English orthography. It is possible to find words in which gh represents [f], o represents [ɪ] and ti represents [ʃ]. See if you can find examples of these representations, and then see if you can argue that in fact ghoti is not a possible spelling of fish after all.

3 Describe the allography of the letter in your handwriting: that is, identify the range of allographs and their conditioning factors. Are there any instances of free variation among allographs? (You will need to collect examples of your written versions of various words involving this letter in different positions.) Are there any other letters (graphs) that form suspicious pairs with , and if so, can you produce an argument for distinctive (or emic) status of the two letters?

4 Hangul is unusual for an alphabet in that letters are grouped into syllable blocks, rather than follow one another linearly. Based on the following data, formulate rules that account for the arrangement of letters into syllable blocks, and comment on any allography you observe (see also p. 341 above for the values of the letters):

[Table of Hangul syllable blocks with IPA values. The Hangul blocks are not recoverable here; only the IPA values are shown, and two values in the first column are missing.]

Open (IPA) | Closed (IPA) | Closed (IPA)
ka | kak | kɨps
[missing] | kan | kiks
ko | kam | kilk
ku | kat | kilm
ki | kap | kops
ma | mak | maks
[missing] | man | malph
mo | moŋ | məlph
a | ɨm | məps
ɨ | us | mols
i | om | molk
o | ot | moks

5 A number of homophones in English involve grammatical and lexical words which are spelt differently, e.g. by vs. buy and bye. Collect as many of these as possible. Can you make any generalizations about the spellings of the homophones?

6 Trace the development of a selection of Latin letters from their Phoenician origins. (You will find plenty of information on the internet.)

7 What are some possible social advantages of the Chinese writing system, which would not be available if it were written with an alphabetic script?

8 Identify and discuss three reasons for and three against reforming the notoriously bad English spelling system.

9 One way you might measure lexical density is by counting the number of lexical words in a stretch of text, and dividing that by the total number of words. Use this measure to determine the lexical density of a piece of writing (about a page) and a spoken piece of roughly the same number of words. Is the lexical density of the spoken piece lower than that of the written piece? Compare also the lexical density of an informal email message (or better, set of email messages amounting to a comparable number of words) or a corpus of instant messages with the figure you got for the page from this book. Which appears to be closer to speech in terms of this measure? Do your findings agree with your expectations?

10 If you are a speaker of a language other than English, to what extent do you think that the features of instant messaging in English apply to your language? Test your hypotheses against some examples of instant messages in your language.

11 Find examples of attitudes to the language of instant messaging (or some other form of writing in electronic media) in the public and private domain. What concerns are voiced, and to what extent do you think there is foundation to them?

12 What does the term eye dialect mean, and where does it come from? Give some examples of eye dialect forms employed on the internet and elsewhere.

13 It has been observed (Tagg 2012: 52) that many common words occur in one or more respelled forms in text messages. Is the same true in instant messaging? If so, can you explain why common words show a greater tendency to respelling than infrequent ones?

Research project Collect a manageable corpus of email messages, say in the order of thirty or so of them. (Don’t forget to ask the authors for permission to use them.) Can you say anything about their overall structure (recall §8.2), including the presence of greetings and closings? Do the emails show evidence of language use comparable with oral and non-electronic written language such as letters? Discuss examples. Do the emails show use of features such as non-standard spellings, abbreviations, word choice, emojis and emoticons? Can you make any generalizations about the use of these features, e.g. when and why the author might choose to employ them?


15 Unity and Diversity in Language Structure

The main theme running through the remainder of the book is variety and variation. In this chapter we deal with the range of variation in the grammatical structures of the world’s languages. We explore the extent to which languages can differ phonologically, morphologically, syntactically and semantically; we also investigate the parameters of variation, limitations on variation, and correlations between variable characteristics. Some remarks are also made on possible explanations of the similarities and differences.

Chapter contents
Goals
Key terms
15.1 Preliminaries to the study of the unity and diversity of languages
15.2 Universals of language
15.3 Typology
15.4 Explaining unity and diversity of language structure
Summing up
Guide to further reading
Issues for further thought and exercises
Research project


Goals

The goals of the chapter are to:
● introduce the notions of linguistic universals and typology;
● discuss empirical requirements for investigations of universals and typology;
● distinguish and exemplify four main types of linguistic universal;
● describe two widely used typologies of human languages;
● discuss some important typologies of the phonology, morphology, syntax and lexicon of the world’s languages;
● introduce the notion of markedness and illustrate its relevance to linguistic typology; and
● discuss reasons for the similarities and differences among human languages.

Key terms

absolute universals
absolutive case
accusative case
agglutinating languages
alienable possession
animacy hierarchy
argument structure
case systems
ditransitive
ergative case
fixed word order languages
free word order languages
fusional languages
implicational universals
inalienable possession
intransitive
isolating languages
markedness
motion verbs
neutralization
nominative case
non-absolute universals
non-implicational universals
number
polysynthetic languages
prefixing languages
suffixing languages
tone systems
transitive
typology
universals


15.1 Preliminaries to the study of the unity and diversity of languages

Two complementary perspectives on variation in language

It is difficult not to be impressed by the extraordinary diversity within the world’s languages. Some languages distinguish just three vowels, while others distinguish a score or more. There are languages with no more than ten phonemes, and languages with over ten times as many. English and French use only the pulmonic airstream mechanism in ‘ordinary’ lexemes, while Goemai uses the glottalic airstream as well. In Mandarin Chinese words are morphologically simple, while Yup’ik words show great complexity, and a single word may express what requires a multi-word sentence in Mandarin Chinese.

Yet the variation is not unlimited. There are certain properties that most or all languages share; these are known as language universals. To give a simple example, all languages use egressive pulmonic air; no language uses only ingressive pulmonic air, only velaric or only glottalic air. Furthermore, although there are languages that use glottalic but not velaric air, it seems that all languages that use velaric air also use glottalic air (though sometimes just as an accompaniment to clicks). No language contrasts sounds produced with any of these airstreams and sounds made with oesophageal air (air from the stomach). These are universal properties of the sound systems of human languages. Discovering language universals is an important task for modern linguistics; so also is explaining them.

Figure 15.1 depicts the situation graphically, and shows that we can group languages into types according to whether they use only the pulmonic airstream contrastively, or the glottalic airstream as well, or both the glottalic and velaric airstreams as well. Language typology deals with grouping together or classifying languages in ways like this.
Some linguists have classified languages into click languages (in which sounds produced on velaric air contrast with sounds produced on pulmonic air in ordinary lexemes) and non-click languages. Thus Shua, Zulu and Xhosa would be click languages, English, Mandarin Chinese, Saliba and Warlpiri (Pama-Nyungan, Australia) non-click languages. However, you might reasonably question whether presence vs. absence of contrasting airstream mechanisms in a language really is a useful

Figure 15.1 Language types according to airstream mechanisms used contrastively.


way of grouping languages. Perhaps this is like classifying animals according to whether they have fur, hair, bristles or spines – possible, but of no great scientific significance. Typologists also investigate the classification of component systems of languages. For instance, one could classify vowel systems into nasal (having a phonemic contrast between vowels with nasalization and vowels without) and oral (without such a contrast). This approach is sometimes referred to as linguistic typology, to maintain a terminological distinction from the classification of entire languages (language typology). Typologists today mostly do linguistic typology. Typology and universals are really just different perspectives on the same thing: how to get a handle on the limitations on variation among languages. Typology views it from the perspective of variation within commonality; universals, from the perspective of unity within variation. For convenience, I will use the term typology as a cover term for both perspectives; it is usually clear which sense is relevant.

Requirements

Claims about cross-linguistic variation and similarity must be based on and evaluated in reference to not just one single language, but many languages. Ideally you might say they should be based on all languages of the world. Otherwise, perhaps the one language you omitted is the exception. But given that the world’s languages number around 7,000 and most of them have not been described in a great deal of detail, the ideal is beyond our present reach.

Granted that we must be selective, how do we choose the languages? To begin with, it will depend on the sort of question one is asking: there is no single all-purpose procedure that works for all types of question. For instance, we can ask questions about languages of the world generally, the languages of a particular region, the languages of a family (see §17.2), or languages that have a particular property (e.g. have clicks, or nominal cases). Your sampling procedure will be different for each of these questions: if you are enquiring into the languages of a region it would make no sense to include a large number of languages outside of that region. It will also differ according to size of the targeted group, and the extent to which the languages have been studied. For instance, a typological study of the Nyulnyulan family (comprising around a dozen languages spoken in the far north-west of Australia) might potentially include all described Nyulnyulan languages, while a comparable study of the Pama-Nyungan family (comprising a couple of hundred languages spoken over most of the continent) would almost certainly be more selective.

Let us suppose we are interested in questions about the world’s languages generally. Two considerations are pertinent:

(a) A selection of languages cannot be judged as good or bad merely on how many languages are included. A selection of fifty very different languages is more representative than a selection of a hundred quite similar ones. We can make a selection that maximizes diversity by choosing languages that are widely distributed geographically and belong to different families (see §17.2).

(b) Considerations of diversity alone are insufficient. You obviously need to select languages for which you have access to good data, which generally means a language for which a comprehensive grammar is available. What is comprehensive will depend partly on what you are investigating. For some studies (e.g. possession of nasal vowel phonemes) even a brief sketch grammar might be adequate. On the other hand, if you are investigating noun cases or parts-of-speech systems you will need more comprehensive grammars. Even if you are able to find languages with excellent grammars, your investigation can be hampered by terminological and theoretical differences. For instance, descriptive linguists use the term subject in a range of ways, and to disregard this could be fatal to your study.
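The two considerations above (diversity first, then availability of good data) can be illustrated with a small Python sketch. It is only one possible way of operationalizing them, not a procedure proposed in this chapter: the function name diversity_sample, the record layout and the has_grammar flag are hypothetical, and the flag values in the toy catalogue are set purely for illustration.

```python
from typing import NamedTuple


class Language(NamedTuple):
    name: str
    family: str
    region: str
    has_grammar: bool  # hypothetical flag: is a comprehensive grammar available?


def diversity_sample(catalogue):
    """Keep at most one well-described language per (family, region) pair."""
    chosen = {}
    for lang in catalogue:
        key = (lang.family, lang.region)
        if lang.has_grammar and key not in chosen:
            chosen[key] = lang
    return list(chosen.values())


# Toy catalogue; the has_grammar values are illustrative, not factual claims.
catalogue = [
    Language("Warlpiri", "Pama-Nyungan", "Australia", True),
    Language("Arrernte", "Pama-Nyungan", "Australia", True),
    Language("Zulu", "Bantu", "Africa", True),
    Language("Xhosa", "Bantu", "Africa", True),
    Language("Hungarian", "Uralic", "Europe", True),
    Language("Finnish", "Uralic", "Europe", False),
]

sample = diversity_sample(catalogue)
print([lang.name for lang in sample])
```

With six candidate languages from three families, the sketch keeps three: one per family and region, which is the sense in which a smaller but more varied selection can be more representative than a larger, more homogeneous one.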

15.2 Universals of language

A distinction can be drawn between characteristics that are shared by all languages (so far as we know) and characteristics that are exhibited by many though not all languages. Thus we can talk – with some poetic licence – of absolute and non-absolute universals. The shared characteristics can be either specific linguistic features such as vowels, or relationships between features such as ‘if a language uses the velaric airstream, it also uses the pulmonic airstream’. (Note that the converse implication does not hold: if a language uses the pulmonic airstream it need not use the velaric airstream.) Correspondingly we can distinguish non-implicational from implicational universals. Table 15.1 shows the four types of universals these distinctions give rise to, and summarizes their distinguishing attributes. In the following subsections examples will be given of each of the four types.

Table 15.1 Four types of language universal

Non-implicational universals
Absolute: A characteristic shared by each and every language without exception. All languages have X.
Non-absolute: A characteristic shared by many (or most) languages, a tendency. Languages tend to have X.

Implicational universals
Absolute: A logical relation of implication between two characteristics that is found in every language. In all languages if X then Y.
Non-absolute: A logical connection between two characteristics that is found in most/many languages. If a language has X it tends to have Y.

Absolute non-implicational universals

Some examples of absolute non-implicational universals are:1

● All languages have syllables, consonants and vowels.
● All languages have at least one stop phone.
● All languages have lexical words and distributional words (minimal free forms).
● All languages distinguish between grammatical units of at least three sizes, word, phrase and clause.


These probably seem quite unexciting. Nevertheless, they are not logical necessities, and systems can easily be imagined that do not display the properties. Some are also contested: the existence of phrases in some languages has been questioned; indeed, according to some theoreticians, no such unit exists in any language. These universals might therefore be more significant than first appears.

Non-absolute non-implicational universals

Non-absolute universals are robust tendencies that admit some exceptions. Here are a few:

● Most languages have CV syllables.
● Most languages have nasal phones (and phonemes).
● Most languages have an alveolar stop phone or phoneme.
● Most languages have at least three vowel phonemes.
● In most languages a part-of-speech distinction can be drawn between nouns and verbs.

These generalizations hold for a high proportion of the world’s documented languages. A few exceptional languages have no CV syllables: Breen and Pensalfini (1999) argue that in Arrernte (Pama-Nyungan, Central Australia) all consonant phonemes occur in syllable final position. Some Lakes Plain languages (Papuan, Papua) lack nasal phones entirely, while in some Asmat languages (also Papuan, Papua) nasals and corresponding stops are in allophonic variation. Hawaiian (Austronesian, Hawaii) lacks alveolar stop phones. Some languages lacking the noun-verb distinction were mentioned in §4.1.

Absolute implicational universals

Absolute implicational universals are not easy to find, but here are three:

● If a language has phonemic clicks it also has phonemic ejectives and/or implosives.
● If a language has voiceless nasals it also has voiced nasals.
● If a language distinguishes dual number (a grammatical category indicating ‘two’) in pronouns it also distinguishes plural number in pronouns.

There are three things to be wary of when formulating absolute implicational universals like these, which connect properties within languages. First, if the ‘if’ clause expresses an absolute universal, then the consequence must also be an absolute universal. (Can you see why?) It is preferable to simply state the latter as an absolute universal. Thus rather than ‘if a language has vowels, it has consonants’ it is preferable to state the absolute non-implicational universal ‘all languages have consonants’, since even though the implicational universal is valid, the non-implicational universal makes a stronger claim. Second, if the consequence is an absolute universal, then it makes little sense to formulate an implicational universal. Occasionally, one sees an implicational universal like ‘if a language has voiceless vowels, it also has voiced vowels’, or ‘if a language uses the velaric airstream, it also uses the pulmonic airstream’ (p. 361). Since we have already seen that all


languages have vowels and use pulmonic air, it is preferable to state the stronger absolute non-implicational universals that all languages have voiced vowels and that all use the pulmonic airstream. Logically, (almost) anything could serve as the condition for an absolute truth: ‘if a language has duals, it has voiced vowels’! (Even ‘if pigs could fly, a language has voiced vowels’!) Note that our second absolute implicational universal above almost runs foul of this: it is saved by the fact that there are a small number of languages lacking nasals. Third, if the ‘if’ and ‘then’ clauses are connected by a logical relation it makes little sense to state the implicational universal: reason alone will tell you. Thus it is unnecessary to state as an implicational universal ‘if a language has accusative case marking of nouns it has case marking’.
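The cautions above can be made concrete with a short Python sketch that tests an implication ‘if a language has X, it has Y’ against a sample and flags the case where the consequence is found in every language, so that the implication holds but says nothing new. The function name check_implication and the feature labels are hypothetical, and the four feature sets are drastic simplifications chosen only to mirror statements made in this chapter.

```python
def check_implication(sample, x, y):
    """Evaluate 'if a language has x, it has y' over a dict of feature sets."""
    has_x = [name for name, feats in sample.items() if x in feats]
    exceptions = [name for name in has_x if y not in sample[name]]
    # If y occurs in every language, the implication is true but uninformative.
    y_everywhere = all(y in feats for feats in sample.values())
    return {"holds": not exceptions, "exceptions": exceptions, "trivial": y_everywhere}


# Illustrative toy sample; each set lists only the features needed here.
sample = {
    "Zulu": {"clicks", "glottalic", "voiced vowels"},
    "Goemai": {"glottalic", "voiced vowels"},
    "English": {"voiced vowels"},
    "Hawaiian": {"voiced vowels"},
}

print(check_implication(sample, "clicks", "glottalic"))
print(check_implication(sample, "clicks", "voiced vowels"))
print(check_implication(sample, "glottalic", "clicks"))
```

In this toy sample the first implication holds and is informative, the second holds only trivially because every language has voiced vowels, and the third (the converse of the first) fails, with Goemai as the exception.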

Non-absolute implicational universals

Non-absolute implicational universals are abundant. Here is a sample:

● If a language has front rounded vowels, it will usually have front spread and back rounded vowels.
● If a language has phonemic affricates, it usually has phonemic fricatives as well.
● If one of the two number categories singular and plural is marked by an affix to a noun, it tends to be the plural.
● If a language has bound morphemes marking number and case in nouns and either both follow or both precede the noun, the marker of number almost always comes between the noun and the case morpheme. In other words, if they are adjacent the number-marker is closer to the noun.

There are exceptions to each of these implications, though they are few. In such circumstances it pays to look carefully at the exceptions to see whether they really are exceptions, or to try to specify the condition in another way. For instance, some grammars fail to make the distinction between affixes and clitics, and this could be the source of certain exceptions. Alternatively, it might be that the markers in question are not genuine markers of the category. For example, in some languages a marker that at first glance appears to be a plural number-marker turns out on closer examination to be instead a collective marker, specifying that the referent set is both plural and a group acting as a unit.

15.3 Typology

The bulk of this section deals with morphological typology, one of the best-studied areas of typology. We begin in the first subsection with two morphologically based classifications of languages. Following this we explore (in the second subsection) the typology of three morphological categories: number, possession and case. Syntactic typology has also attracted a considerable amount of attention. However, our treatment (in the third subsection) is brief, partly because a greater knowledge of syntax is required than presented in Chapter 5, and partly due to analytical and theoretical problems that render the empirical basis somewhat shaky. Actually, the division between morphological and syntactic typology is made here for expository purposes, and many phenomena can be equally treated under either heading: what is represented morphologically in one language may be expressed syntactically in another. Somewhat less attention has been devoted by typologists to phonetic/phonological typology and lexical/semantic typology, and I restrict myself to a few brief remarks on these underdeveloped but fascinating domains.

Two morphological typologies of languages

Morpheme integrity

A widely used typology of the world’s languages, with roots in the nineteenth century, distinguishes four morphological types:

● Isolating languages have no (or few) bound morphemes. Every word tends to be monomorphemic. Haitian Creole (a French-based Creole spoken on the island of Haiti) is an isolating language, as the following example illustrates:

(15-1) Haitian Creole
m pa konprann sa l(i) ap di m lan
1sg not understand what 3SG PROG say 1sg DET
‘I don’t understand what he/she is telling me.’

● Agglutinating languages show morphologically complex words that are usually easy to segment into morphemes. The boundaries between morphemes are typically clear-cut: it is obvious where one morpheme ends and the next begins. Hungarian, Finnish and Turkish are agglutinating languages; so also is Guaraní (Tupi, Bolivia and Paraguay):

(15-2) Guaraní
pe-mitã-kuña o-u-va hína che-nupã kuri
that-child-woman 3SG:ACT-come-REL PROG 1sgINACT-hit-VOL RPST
‘That young woman who is coming wanted to hit me.’

● Fusional or inflectional languages also have morphologically complex words in which it can be difficult to separate morphemes from one another: the boundaries between them are blurry. In contrast with agglutinating languages words are not easily analysed into morphemes that follow one another like beads on a string. The extinct Indo-Aryan language Pali (India) was a fusional language. Consider the following case forms of the noun kaññā ‘girl’:

(15-3) Pali
               singular              plural
Nominative     kaññā                 kaññā, kaññāyo
Accusative     kaññaṃ                kaññā, kaññāyo
Instrumental   kaññāya               kaññāhi
Genitive       kaññāya               kaññānaṃ
Dative         kaññāya               kaññānaṃ
Ablative       kaññāya               kaññāhi
Locative       kaññāya, kaññāyaṃ     kaññāsu
Vocative       kaññe                 kaññā, kaññāyo

Although it is easy to distinguish a root kaññā ‘girl’ that remains almost invariant, the affixes are not easily segmented into morphemes separately marking number and case.

● Polysynthetic languages are morphologically rich languages with long and complex word forms that often convey information requiring a multi-word clause in other types of language. Yup’ik is a polysynthetic language; so also are many other languages of North America, including Koyukon (Athabaskan, Alaska):

(15-4) Koyukon
to-ts’eeyh-ghee-ø-tonh
water-boat-PRF-CL-put:long:object
‘He launched a boat.’

The above examples illustrate some reasonably clear-cut instances of languages of the four morphological types. However, the reality is messier, as can be seen from the data and discussion in Box 15.1, which gives translations of ‘I’ll bring back the honey’ – in some cases slightly modified because of absence of a word for ‘honey’ – in twenty-three languages. It is better to see the four types as ideal points along a continuum between the extremes of isolating and polysynthetic languages. Based on our example clause, the languages in Box 15.1 can be placed roughly as shown in Figure 15.2. Of course, one would want to take more than a single sentence into account in locating languages in this morphological ‘space’. With more data, Mandarin Chinese would doubtless end up closer to the isolating end of the scale than English, and West Greenlandic would certainly be further towards the polysynthetic end than Gooniyandi.

Figure 15.2 Location of the 23 sample languages on two dimensions of morpheme integrity.


Box 15.1 ‘I’ll bring back the honey’ in 23 languages

Each entry gives the language, the clause with its gloss, the ratio Morphs: Words, and the ratio Fused morphs: Morphs.

Gun-djeihmi (Gunwinyguan, Australia)
nga-yiuk-yi-rrurnde-ng
1SG-honey-APP-return-NPST
Morphs: Words 5/1=5; Fused morphs: Morphs 0

Gooniyandi
ngalinya barn-ja-wi-l-arri
honey return-SUB-FUT-1SG>3SG/CL
Morphs: Words 6/2=3; Fused morphs: Morphs 1/6=0.17

Kwaza
nũ’ty waje-’nā-da-ki
honey bring/get.back-FUT-1SG-DEC
Morphs: Words 5/2=2.5; Fused morphs: Morphs 0

Sabaot (Nilo-Saharan, Africa)
mā-ā-ket-u beenyto
FUT-1SG-bring-hither meat
Morphs: Words 5/2=2.5; Fused morphs: Morphs 0

Warrwa
warna nguy ka-na-ngka-ya-ngany
honey return 1SG/FUT-CL-FUT-say-APP
Morphs: Words 7/3=2.3; Fused morphs: Morphs 1/7=0.14

Michif (Mixed language, Canada)
li myel ni-wii-ashee-peet-aw
the honey 1SG-FUT-back-bring-3SG
Morphs: Words 7/3=2.3; Fused morphs: Morphs 0

Latin
re-fer-ebo mel
back-carry-1SG/FUT honey/ACC/NEUT
Morphs: Words 4/2=2; Fused morphs: Morphs 2/4=0.5

Ku Waru (Papuan, New Guinea)
po-yl lyi-p me-b ya-d o-bu
sugarcane-DEF get-NF/1 carry-NF/1 here-DAT come-FUT/1SG
Morphs: Words 10/5=2; Fused morphs: Morphs 3/10=0.3

West Greenlandic (Eskimo-Aleut, Greenland)
neqi oqquti-ssa-ara
meat bring:home-FUT-1SG>3SG/IND
Morphs: Words 4/2=2; Fused morphs: Morphs 1/4=0.25

Taba
k-mul-ak madu
1SG-return-APP honey
Morphs: Words 4/2=2; Fused morphs: Morphs 0

Hungarian
vissza-hoz-om a méz-et
back-return-1SG DEF honey-ACC
Morphs: Words 6/3=2; Fused morphs: Morphs 0

Swahili
ni-ta-let-a asali hapa/huko
1SG-FUT-bring-IND honey here
Morphs: Words 6/3=2; Fused morphs: Morphs 0

Finnish
tuo-n hunanja-n takaisin
bring-1SG honey-ACC back
Morphs: Words 5/3=1.7; Fused morphs: Morphs 0

Kuot
eba inə t-ana-ŋ ilumə
FUT again 1SG/FUT-bring:back-3SG/FUT honey
Morphs: Words 6/4=1.5; Fused morphs: Morphs 2/6=0.3

Mandarin Chinese
我 要 把 蜂蜜 拿回来
wǒ yào bǎ fēngmì ná-huílái
1SG FUT CON honey take-back
Morphs: Words 7/5=1.4; Fused morphs: Morphs 0

Danish
jeg tage-r honning-en med tilbage
1sgNOM take-NPST honey-N.ART with back
Morphs: Words 7/5=1.4; Fused morphs: Morphs 0

Japanese
watashi ga hachimitsu o tot-te ku-ru yo
1SG NOM honey ACC get-GER come-NPST PART
Morphs: Words 9/7=1.3; Fused morphs: Morphs 0

English
I’ll bring back the honey
1SG-FUT bring back the honey
Morphs: Words 6/5=1.2; Fused morphs: Morphs 0

Goemai
hen t’ong mang nshi wa n-ni
1SG IRR take/sg honey return:home/SG comitative-3SG
Morphs: Words 7/6=1.17; Fused morphs: Morphs 2/7=0.29

Lao (Tai-Kadai, Laos)
kuu3 si0 qaw3 nam0-pheng5 khn2 maa2
1SG IRR take liquid-bee return come
Morphs: Words 7/6=1.17; Fused morphs: Morphs 0

Shua
ta: ke dane ʔōʔāː
1sgNOM PROG honey bring:back
Morphs: Words 4/4=1; Fused morphs: Morphs 0

Kisi
ì có lìáŋ cùùwó nánùn
1SG FUT honey bring here
Morphs: Words 5/5=1; Fused morphs: Morphs 0

Laven (Austro-Asiatic, Laos)
ʔaj ma cɔk tbɨh daak sut
1SG FUT take bring water bee
Morphs: Words 6/6=1; Fused morphs: Morphs 0

Qualifications

i. A single example is given for each language, although there will always be other ways of expressing basically the same meaning. For example, in Gun-djeihmi the word ‘honey’ could appear outside of the verb; the free pronoun ‘I’ could also occur. The clauses given are minimal expressions of the meaning in the language, in the absence of context.

ii. Zero morphemes (see §3.6) have been excluded.

iii. Some word boundaries are uncertain: the nominative and accusative postpositions in Japanese, for instance, are sometimes treated as bound morphemes.

Discussion

The fuzziness of the boundaries between the four morphological types is apparent. For example, English and Mandarin Chinese both have a single complex word made up of easily separated pieces. Thus they are not strictly isolating, but have some agglutinative tendencies. And while much of Gooniyandi’s morphology is agglutinative, there are fusional tendencies: the form -l- in the verb means ‘I (acted on) it’, 1sgNOM/3sgACC; it cannot be divided into two separate morphemes.


In each language the clause has between one and seven words, with the two extremes instanced just once. It seems reasonable to use the ratio of morphs per word to give a rough idea of the degree to which a language is isolating or polysynthetic. These are shown in the second column; the figures suggest that this ratio varies continuously rather than takes discrete values; with a larger set of languages we would expect to find many more intermediate values. The languages are listed in the table in order of decreasing value for this ratio. In the third column are figures suggesting how fusional the language is. This figure is the proportion of morphs that are fused: that is, the ratio of the complex or fused morphs to the total number of morphs. However, since 1SG is fused in every language – often also with NOM – 1SG and 1SG.NOM were treated as single morphs; to do otherwise would result in losing the distinction between the languages that combine just these categories, and those that combine one or more others with them. By this index, Latin emerges as a good example of a fusional language, both grammatical morphs expressing complex components of meaning.
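The two ratios just described are simple enough to recompute. The following Python sketch does so for five of the languages in Box 15.1, using the morph, word and fused-morph counts given there; the function names synthesis_index and fusion_index are labels introduced here for convenience and are not terms used in the box.

```python
# (morphs, words, fused morphs) for five of the languages in Box 15.1
counts = {
    "Gun-djeihmi": (5, 1, 0),
    "Gooniyandi": (6, 2, 1),
    "Latin": (4, 2, 2),
    "English": (6, 5, 0),
    "Laven": (6, 6, 0),
}


def synthesis_index(morphs, words):
    """Morphs per word: higher values lie nearer the polysynthetic end."""
    return morphs / words


def fusion_index(morphs, fused):
    """Proportion of morphs that are fused."""
    return fused / morphs


# List the languages in order of decreasing morphs-per-word ratio.
ranking = sorted(counts, key=lambda name: synthesis_index(*counts[name][:2]), reverse=True)
for name in ranking:
    morphs, words, fused = counts[name]
    print(f"{name:12} {synthesis_index(morphs, words):.2f} {fusion_index(morphs, fused):.2f}")
```

Sorting on the first ratio reproduces the ordering used in the box, while the second ratio picks out Latin (0.5) as the most fusional of the five.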

Affixing typology

The examples in Box 15.1 reveal an unexpected pattern. Within a word, grammatical morphemes are much more likely to follow a lexical root than to precede it. Using terminology loosely, suffixes outnumber prefixes by a considerable margin: there are some thirty-one readily segmentable suffixes, but only twelve prefixes, less than half the number. This pattern is not an accident of the small and unrepresentative selection of languages in our sample. Bybee et al. (1990) found suffixes outnumbered prefixes by almost three to one in a larger and more representative corpus of languages.

The difference between prefixes and suffixes is so striking that some linguists have proposed it as a typological parameter, though of course it is limited to languages that are not strictly isolating. The exact manner of defining the two categories varies, but what seems to work best (and appears to have been first formulated by Arthur Capell in 1938) is: suffixing languages have suffixes only; prefixing languages have prefixes, with or without suffixes. A range of other characteristics tend to correlate with this distinction. The other types of affix – infixes, circumfixes and suprafixes (i.e. prosodic ‘affixes’) – are far less frequent than prefixes or suffixes.

Morphological typology

Grammatical number

We mentioned in §15.2 an absolute implicational universal that if a separate dual number is distinguished in the pronoun system of a language there will also be a plural category. No pronominal system distinguishes dual against an undifferentiated non-dual (i.e. everything else, singular and plural more than two). Duals only emerge if there is already a category contrast between singular and non-singular pronouns. We can extend this generalization with the observation that if a language has the trial category (specifying ‘three’) or the paucal category (specifying ‘a few’) in its pronouns then it will also have the dual category. A useful way of summarizing these linked generalizations is in terms of a hierarchy, as shown in (15-5).

(15-5)

singular/non-singular
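The linked generalizations about pronominal number stated in the prose above can be written down as a small set of prerequisites and checked mechanically. The Python sketch below is only an illustration under that reading: the table REQUIRES and the function violations are hypothetical names, and the pronoun systems passed to the function are invented test cases rather than descriptions of particular languages.

```python
# Prerequisites as stated in the prose: a dual presupposes a singular/plural
# contrast; a trial or a paucal presupposes a dual.
REQUIRES = {
    "dual": {"singular", "plural"},
    "trial": {"dual"},
    "paucal": {"dual"},
}


def violations(categories):
    """Return (category, missing prerequisite) pairs for a pronoun number system."""
    found = []
    for cat in sorted(categories):
        for needed in sorted(REQUIRES.get(cat, ())):
            if needed not in categories:
                found.append((cat, needed))
    return found


print(violations({"singular", "plural"}))
print(violations({"singular", "dual", "plural", "trial"}))
print(violations({"singular", "plural", "trial"}))
```

A system is consistent with the generalizations when the function returns an empty list; the third invented system has a trial without a dual, so it is reported as a violation.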


θ          three, thou                       trois, tu
*k > h     heart, hound                      cœur, chien (initial /ʃ/ from /k/)
*d > t     two, tooth                        deux, dent
*g > k     knee (/ni/ from /kni/), corn      genou (initial /ʒ/ from /g/), grain

The three sets of sound changes in Grimm’s law are interrelated, and must have been simultaneous. If the voiced aspirated stops first changed to plain voiced stops, and then later these changed to voiceless stops, then these to fricatives, all Proto-Indo-European stops would show up as fricatives in Germanic languages. Since this did not happen, the changes must have been linked together in a chain shift, as depicted in Figure 16.1. We can imagine either the top change dragging the other changes along after it, as it were, filling the spaces left over as a result of the changes; or alternatively the bottom change could be imagined as pushing the other changes ahead of it, preventing massive collapse of phonemic contrasts.
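The argument that the changes must have been linked can be checked with a few lines of Python. The sketch below takes just the dental series as an example and compares applying the three changes together with applying them one after another; the names SHIFT, simultaneous and sequential are introduced here for illustration only.

```python
# The dental series under Grimm's law: voiced aspirate > plain voiced,
# plain voiced > voiceless, voiceless > fricative.
SHIFT = {"dʰ": "d", "d": "t", "t": "θ"}


def simultaneous(segments):
    """Every segment is looked up once, in the original system."""
    return [SHIFT.get(seg, seg) for seg in segments]


def sequential(segments):
    """The changes apply one after another, each feeding the next."""
    for old, new in SHIFT.items():
        segments = [new if seg == old else seg for seg in segments]
    return segments


stops = ["dʰ", "d", "t"]
print(simultaneous(stops))  # ['d', 't', 'θ']
print(sequential(stops))    # ['θ', 'θ', 'θ']
```

Applied together, the three changes keep the three-way contrast intact; applied in sequence, every stop ends up as the fricative, which is exactly the collapse that did not happen in Germanic.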

Figure 16.1 Grimm’s law chain shift.


Some common types of sound changes

This section discusses and exemplifies five common types of sound change.

Loss or deletion

Loss or deletion of segments is a not uncommon type of sound change. English examples include loss of voiced stops following nasals at the end of a word: the final /b/ of lamb and womb has been lost, as has the final /g/ of strong (compare stronger). Loss of word final segments is called apocope. Loss of a segment from the beginning of a word is aphaeresis; word initial /k/ preceding an /n/ in English has been lost, as in knife and knee, the spellings of which reflect previous pronunciations. Segments can also be lost within a word – this is called syncope – as happened to the middle consonant of many three-member consonant clusters in Scandinavian languages. For example, the Swedish place name Väsby derives from väst ‘west’ and by ‘town’.

Insertion

Insertion or epenthesis involves the addition of phonological segments into a word. The stops in thunder and number are epenthetic; speakers of English often add an epenthetic schwa between the two final consonants of film. Latin words beginning with an s followed by a stop began in the second century CE to add an initial short i vowel: scola became iskola (cf. French école ‘school’), and stabula ‘stable’ became istabula.

Assimilation

Assimilation is the process whereby a segment changes to become more like a nearby segment, usually adjacent to it. This is a very common type of sound change. Latin octo ‘eight’ became otto in Italian, noctem ‘night’ became notte, and factum ‘done’ became fatto. (The letter c in Latin represents /k/.) The first stop in the consonant cluster in these words has changed to become more like – indeed, identical with – the following stop by adopting its place of articulation. This is called regressive assimilation because it is as though a property – in this case the place of articulation – of the second stop moves backwards (regresses) to the preceding stop in the cluster.

An example of progressive assimilation, where the phonetic characteristic shifts forward, is the change in the history of Icelandic from *findan ‘find’ to finna, and *munθ ‘mouth’ to munn. Here the nasal component of articulation has shifted forward from the nasal to the following stop or fricative, resulting in a long nasal consonant.

The above examples illustrate total assimilation – one segment becomes identical with another. Partial assimilation occurs when one segment becomes more like, but not identical with, another. Regressive partial assimilation is illustrated in the change from the velar stop /k/ to the alveopalatal affricate /ʧ/ before front vowels in the history of English. For instance, *kinn changed to chin, and *kɛ:si to cheese.

Do you understand why the change from /k/ to /ʧ/ in these two examples is assimilation? Recall that front vowels have the high point of the tongue forward in the mouth, whereas velar stops have the back of the tongue relatively high in the mouth, making contact with the velum. The articulation of the stop changed so as to more closely approximate the following vowel by anticipating its tongue position. (Remember that when velars precede front vowels they tend to shift their point of articulation forward to the pre-velar region, as in words like key. Alveo-palatal articulation is a more extreme form of assimilation.)

Dissimilation Dissimilation is the reverse of assimilation: neighbouring segments become less alike. Dissimilation is most frequent with laterals and rhotics. The second rhotic in the Latin words arbor ‘tree’ and rôbur ‘strength’ changed to a lateral in Spanish arbol and roble.

Metathesis Metathesis is the inversion of the order of adjacent or nearby phones. English examples include ask from Old English acsian (in fact, aks was regular until the seventeenth century, and is still found in some dialects), bird from Old English brid, and horse from Old English hros (later sound changes have resulted in the loss of the rhotic in the last two words in many dialects). Metathesis is not common and often affects only certain words. But occasionally it is systematic. In Ilocano (Austronesian, Philippines) word initial /t/ and word final /s/ have been fairly consistently switched. Corresponding to Tagalog taŋis ‘cry’ and tigis ‘decant’, which preserve the original sequence of these two phonemes, are Ilocano sa:ŋit and si:git.
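The systematic Ilocano pattern can be stated as a simple rule over strings. The Python sketch below is illustrative only: the function name is hypothetical, and the rule ignores the vowel lengthening seen in the attested Ilocano forms sa:ŋit and si:git, so its output matches those forms only in the position of the two consonants.

```python
def swap_initial_t_final_s(word: str) -> str:
    """Swap word-initial /t/ with word-final /s/; leave other words unchanged.

    Simplifying assumption: vowel length is not modelled.
    """
    if len(word) >= 2 and word.startswith("t") and word.endswith("s"):
        return "s" + word[1:-1] + "t"
    return word


# Tagalog forms cited in the text, which preserve the original order.
for tagalog in ["taŋis", "tigis"]:
    print(tagalog, "->", swap_initial_t_final_s(tagalog))
```

A word that does not both begin with /t/ and end with /s/ is returned unchanged, which mirrors the fact that the switch is tied to one specific pair of positions.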

Generality of sound changes Sound changes can be limited to particular phonological environments, though they are sometimes unconditional, and apply everywhere. For instance, at some point in time the palatal lateral /ʎ/ in Hungarian changed unconditionally to /j/. Sound changes are generally regular. If /k/ changes to /ʧ/ before front vowels, this usually happens to every /k/ in this environment, and not just to some of them. Metathesis and dissimilation are the most frequent exceptions to the regularity of sound change.
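The difference between a conditioned and an unconditional change, and the idea of regularity, can be sketched by treating each sound change as a rewrite rule applied to every form that meets its environment. The Python code below is a minimal illustration under stated assumptions: the function names and the small set of front vowels are hypothetical choices, and the forms marked as made-up are not attested words.

```python
import re

# Assumption: a small illustrative set of front vowels, not a full inventory.
FRONT_VOWELS = "iɪeɛ"


def palatalize(form: str) -> str:
    """Conditioned change: /k/ becomes /ʧ/ only before a front vowel."""
    return re.sub(f"k(?=[{FRONT_VOWELS}])", "ʧ", form)


def delateralize(form: str) -> str:
    """Unconditional change: every /ʎ/ becomes /j/, whatever surrounds it."""
    return form.replace("ʎ", "j")


# Forms cited in the text (asterisks omitted).
for form in ["kinn", "kɛ:si"]:
    print(form, ">", palatalize(form))

# Made-up forms, used only to show the scope of each rule.
print(palatalize("kiki"))   # both /k/ meet the environment
print(palatalize("kat"))    # no front vowel follows, so nothing changes
print(delateralize("ʎaʎ"))  # applies in every position
```

Because each rule is applied to the whole form, every /k/ in the stated environment is affected, not just some of them, which is what regularity amounts to in this simplified model.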

If sound changes are so regular, why do we have exceptions to Grimm’s law such as genuflect and pedicure, which are obviously related to the French words genou ‘knee’ and pied ‘foot’ cited in Table 16.1? Shouldn’t these words show up with initial k and f, respectively? (See end of chapter for the answer.)

16.3 Morphological change Acquisition and loss of bound morphemes One way of acquiring new bound morphemes is through borrowing of words, and later factoring out of common components. Following the Norman invasion of England in 1066 English acquired a large number of lexical items from French, many of which had derivational affixes attached to them. As the borrowings became increasingly numerous and integrated into the language, these affixes were identified, and came to be attached to native English words as well. Derivational morphemes such as -able, -ment, dis- and re- entered the English language in this way. This is not the only way new bound morphemes can appear. Sometimes bound morphemes are borrowed as such, and not as parts of lexical items. Jeffrey Heath’s seminal study (1978) identifies a number of bound morphemes borrowed between languages of Arnhem Land. Among his examples is the verbal derivational suffix -thi ‘become’, borrowed into Ngandi (non-Pama-Nyungan) from its unrelated neighbour Ritharrngu (Pama-Nyungan). And in the Kimberley region, the bound comitative postposition -ngarri ‘with’ has been borrowed independently as a bound morpheme across languages belonging to at least three different families. Another important process of morpheme acquisition is grammaticalization, discussed in §16.5. Morphemes can also be lost over time, going slowly out of use until they completely disappear, or only relics remain. Old English verbs took a third person plural agreement marker -n as in stodon ‘they stood’, which was gradually lost over time as a result of a regular sound change.

Analogical change Languages often regularize their morphology, replacing irregular forms by regular forms, thus making the regular morphological rules more productive. For instance, shoe once had plural shoen, which is now completely replaced by shoes; and in Middle English the plural of cow was kine, which has completely gone in most dialects, being replaced by the regular cows. What is going on here resembles the process of solving a simple mathematical equation involving proportions: shoe is to x as loo is to loos. Clearly x must be shoes:

shoe : x = loo : loos, so x = shoes

The new form is constructed on analogy with another morphological opposition; this is called analogical change. The term analogical levelling is used for regularizations in which an irregular morphological opposition is replaced by another opposition modelled after a more regular pattern. Another type of analogical change is analogical extension, whereby a minor morphological pattern is used as the basis for analogical remodelling. Otto Jespersen (1922: 131) cites a now famous example of a Danish child ‘who was corrected for saying nak instead of nikkede [“nodded”], [and] immediately retorted “stikker [“prick”], stak [“pricked”], nikker, nak,” thus showing on what analogy he had formed the new preterit’. In some dialects of American English the past tense of dive is dove rather than the regular dived. This form was created by analogy with a minor pattern, as in drive ∼ drove.
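The proportional reasoning behind both levelling and extension can be sketched as solving a : b = c : x over spellings. The Python function below is a hypothetical illustration, not a model of how speakers actually compute analogies: it assumes the model pair differs only after a shared beginning, and that the target word ends in the same material as the first member of the pair.

```python
def solve_proportion(a: str, b: str, c: str):
    """Solve a : b = c : x for x, or return None if the pattern does not fit.

    Simplifying assumption: a and b share a beginning and differ only in
    what follows it; that differing tail is swapped on c.
    """
    # Length of the longest shared beginning of the model pair.
    i = 0
    while i < len(a) and i < len(b) and a[i] == b[i]:
        i += 1
    old_tail, new_tail = a[i:], b[i:]
    if not c.endswith(old_tail):
        return None
    return c[:len(c) - len(old_tail)] + new_tail


# Levelling: the regular plural pattern is the model.
print(solve_proportion("loo", "loos", "shoe"))
print(solve_proportion("loo", "loos", "cow"))

# Extension: a minor pattern is the model.
print(solve_proportion("drive", "drove", "dive"))
print(solve_proportion("stikker", "stak", "nikker"))
```

The same function covers both types of change; what differs is only whether the model pair belongs to the regular pattern or to a minor one.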

The word mouse has recently been extended to an item of computer equipment. The most usual plural of this type of mouse seems to be not mice, but the regular mouses – in a Google search for mouses (23 November 2013), the first hit was Shop for mouses on Google; nearly a decade later (13 October 2022), the first hit was almost the same, Ads – shop “mouses”. Not far below was an article entitled The Very Best Wireless Mouses (https://nymag.com/strategist/article/best-wireless-mouse.html). However, mice remains the standard plural of mouse in its ordinary sense, as well as an alternative plural for the computer item. This is the beginning of the process of bifurcation. Both regular and irregular plurals remain, but they have acquired different meanings.

Reanalysis Reanalysis is the process by which a word with a certain morphological structure comes to be analysed differently over time (see also §4.3). Quince was originally the plural of quin, but the plural suffix was reanalysed as a part of the stem; the plural is now quinces – which historically has two instances of the plural suffix! Likewise for the Dutch word schoenen ‘shoes’, the earlier plural form of which was schoen. The auxiliary verb have, pronounced as /əv/ in many environments, may be being reanalysed in some varieties of English as of following a modal auxiliary (e.g. may, could, etc.). Thus it appears as /əv/ – even /ɔv/ – when stressed in expressions like would have gone; indeed, it is sometimes written with the preposition of, as in would of gone. The verbal suffix -ing is in some environments reanalysed as the participial -en. In some Austronesian (see §17.3) languages of the Pacific, ordinary nouns are preceded by a free morpheme a that indicates the following word is a noun. It seems that the proto-language also had a similar morpheme *a. But in Paamese it has been reanalysed as part of the root form of some nouns, as in atas ‘sea’ from *tansik, and ani ‘coconut’ from *niu.

16.4 Syntactic change Changes in word order Latin had relatively free word order. The three words of our earlier Latin example (3-2) (repeated as (16-4) below) can occur in any order, and the result is an acceptable sentence with the same experiential structure and meaning (see pp. 120–2). This is illustrated by examples (16-4)–(16-9).

(16-4) serv-ī          cōnsul-em       audi-unt        SOV
       slave-PL:SUB    consul-SG:OBJ   hear-3PL:PRS
       ‘The slaves hear the consul.’

(16-5) serv-ī          audi-unt        cōnsul-em       SVO
       slave-PL:SUB    hear-3PL:PRS    consul-SG:OBJ
       ‘The slaves hear the consul.’

(16-6) cōnsul-em       serv-ī          audi-unt        OSV
       consul-SG:OBJ   slave-PL:SUB    hear-3PL:PRS
       ‘The slaves hear the consul.’

(16-7) cōnsul-em       audi-unt        serv-ī          OVS
       consul-SG:OBJ   hear-3PL:PRS    slave-PL:SUB
       ‘The slaves hear the consul.’

(16-8) audi-unt        serv-ī          cōnsul-em       VSO
       hear-3PL:PRS    slave-PL:SUB    consul-SG:OBJ
       ‘The slaves hear the consul.’

(16-9) audi-unt        cōnsul-em       serv-ī          VOS
       hear-3PL:PRS    consul-SG:OBJ   slave-PL:SUB
       ‘The slaves hear the consul.’
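That three constituents yield exactly six possible orders can be checked mechanically. The short Python sketch below simply enumerates the permutations of the three Latin words from examples (16-4)–(16-9); the names WORDS and all_orders are hypothetical, and the code says nothing about how frequent or pragmatically neutral each order was.

```python
from itertools import permutations

# Words from examples (16-4)-(16-9), keyed by role: Subject, Object, Verb.
WORDS = {"S": "serv-ī", "O": "cōnsul-em", "V": "audi-unt"}


def all_orders(words: dict) -> dict:
    """Map each order label (e.g. 'SOV') to the corresponding word string."""
    return {
        "".join(roles): " ".join(words[r] for r in roles)
        for roles in permutations(words)
    }


orders = all_orders(WORDS)
for label, sentence in orders.items():
    print(label, sentence)
```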

In the modern daughter languages, the Romance languages, however, word order is fixed SVO, as shown by (16-10).

(16-10) les      esclaves   entend-ent   le    consul      French
        the:PL   slave:PL   hear-3PL     the   consul
        ‘The slaves hear the consul.’

This rigidification of word order doubtless resulted from loss of noun cases in modern Romance languages. With the loss of cases, word order was apparently pressed into service for distinguishing between subjects and objects. A similar thing happened in the history of English.

Saying that languages like Latin and Old English had ‘free word order’ should not be interpreted literally. First, in these languages it was phrase order rather than word order as such that was ‘free’: words belonging to the same phrase usually remained together. Second, different word orders conveyed different pragmatic nuances: they were not in free variation. Third, it is usually the case that not all word orders in a language are equally common; indeed, they may show different frequencies in usage, even if all are grammatically acceptable.

Word order can also be borrowed. In parts of Papua New Guinea, some Papuan languages have apparently borrowed SVO word order from neighbouring Austronesian languages, while some Austronesian languages have borrowed SOV word order from their Papuan neighbours. Thus it is believed that word order in the Austronesian languages of the Central and Milne Bay Provinces was originally SVO, but this changed to SOV under the influence of nearby Papuan languages.

Changes in grammatical constructions Grammatical constructions can, like morphology, change as a result of borrowing, reanalysis and extension.

Borrowing of grammatical constructions Pipil has borrowed the Spanish comparative construction, including the morphemes más ‘more’ and que (/ke/) ‘than’. Thus compare (16-11) with the Spanish counterpart (16-12). Prior to contact with Spanish, Pipil had different ways of expressing comparatives, all of which have been replaced by the Spanish-style construction.

(16-11) ne    siwa:t   mas    galá:na   ke     taha      Pipil
        the   woman    more   pretty    than   you
        ‘That woman is prettier than you.’

(16-12) esa   mujer   es   más    linda    que    tú      Spanish
        the   woman   is   more   pretty   than   you
        ‘That woman is prettier than you.’

Reanalysis of grammatical constructions Mandarin Chinese has a special grammatical construction called the bǎ-construction, a type of passive in which the word order is SOV instead of the normal SVO, and the object is marked by the preposition 把 bǎ, as in:

(16-13) 我    把    张三        打    了      Mandarin Chinese
        wǒ   bǎ   zhāng-sān   dǎ    le
        I    OBJ  Zhang-san   hit   PFV
        ‘I hit Zhang-san.’

Written records show that 把 bǎ was a verb meaning ‘take’ in Archaic Chinese. Gradually it developed into a preposition, while retaining its use as a verb until the Tang dynasty (618–907 CE). The modern construction comes from a sentence involving two verbs forming a so-called serial verb construction – in the case of (16-13), this would have had a meaning like ‘I took Zhang-san and hit him’. This also accounts for the unusual word order of the construction: the object is in the expected place for the erstwhile verb 把 bǎ ‘take’.

Extension of grammatical constructions A frequent type of meaning extension is for a reflexive construction to extend to cover passive senses. Old Spanish had a reflexive construction involving the reflexive morpheme se, as in (16-14).

(16-14) Juanito   se          vistió       Old Spanish
        Johnny    reflexive   dressed
        ‘Johnny dressed himself.’

Over time this construction came to be used in contexts where a passive interpretation was also possible, and ultimately in contexts where only the passive interpretation is possible, as in (16-15) – which does not mean that the 2,000 people captured themselves!

(16-15) cautiváron-se             quasi    dos   mil        personas      Spanish
        they:captured-reflexive   almost   two   thousand   persons
        ‘Almost two thousand persons were captured.’

16.5 Grammaticalization The English adverbial derivational suffix -ly derives from Middle English lic ‘like’, and ultimately from Old English gelic. This is an example of grammaticalization, a process by which a lexical word becomes a grammatical item. One of the Danish passive constructions has a similar developmental history to the Spanish passive discussed in the previous section. This passive is marked by the verbal suffix -s, as in example (16-16). This suffix derives from an earlier free reflexive form sig ‘self’, which later cliticized and extended to reciprocals (‘do to one another’) and middles (so-called because they are intermediate between actives and passives). The clitic ultimately reduced to the verbal inflectional suffix -s, and extended to cover passive senses.

(16-16) dørene      låse-s         klokken   seks      Danish
        the:doors   lock-passive   o’clock   six
        ‘The doors will be locked at six o’clock.’

Grammaticalization generally involves phonological reduction and semantic bleaching, reduction in the meaning of the item, whereby the meaning becomes less concrete. Both are illustrated in the grammaticalization of English derivational -ly and the Danish passive suffix -s. Some examples of common types of grammaticalizations are:

● Verbal derivational morphemes sometimes come from lexical verbs. For example, most varieties of the Western Desert dialect continuum (see p. 163) have a verbal derivational morpheme -rri ‘become’, which comes from a verb meaning ‘to fall’, which previously occurred in compounds.
● Complementizers (elements used as connectives in certain types of complex sentence, like that in I regret that he is sick) often come from the verb ‘say’, as in Ewe bé ‘that’, ‘say’.
● Copulas (words like to be that connect subject and predicate in clauses like John is sick) often derive from verbs of position or stance, like ‘stand’, ‘sit’, ‘lie’: Quechua tiya- ‘to be’ comes from *tiya ‘to sit’.
● Copulas frequently come from demonstratives or pronouns: the copula shì of Mandarin Chinese derives from shì ‘that, the afore-mentioned’.
● Auxiliary verbs often derive from main verbs, as in the case of English auxiliary verbs have and will which were once used only as main verbs meaning ‘to possess’ and ‘to desire’.
● Definite markers not infrequently come from demonstratives: the Danish definite markers -en (as in mand-en ‘the man’) and -et (as in bord-et ‘the table’) derive from postposed demonstratives.
● Indefinite articles often grammaticalize from ‘one’, as in the case of English indefinite a ∼ an.
● Relative pronouns (as in the child who was hit) often come from wh-question words, as in the case of English who, where and when.
● Future tense markers often have a source in verbs like ‘want’, ‘go’ and ‘have’: for example, the French and Spanish inflectional futures derive from the Latin verb habere ‘to have’.

Processes of grammaticalization are generally unidirectional: they proceed in one direction only. For example, verbs like ‘want’ and ‘go’ often become future tense markers, but the reverse does not occur. Free lexical words may become bound affixes, and show reduction in semantic content. The reverse process is rare, though not entirely unheard of: the English derivational suffixes -ism and -ish have recently become independent words meaning ‘doctrine, belief system’ and ‘to some extent’, respectively. Another possible example is the possessive enclitic -s ∼ -z ∼ -əz, which may derive from an inflectional genitive case marker. (Other sources have been proposed.) Such processes are the exception not the rule.

16.6 Semantic change It is not just the forms (signifiers) of linguistic signs that change over time, but also the meanings (signifieds); indeed, these can change drastically. As a result, cognates can be obscured, and appear unlikely. For instance, English silly is cognate with Danish salig and German selig both of which mean ‘blessed, blissful’. In fact, silly comes from Old English sǣlig ‘happy, blessed, blissful’, which took on the sense ‘humble, simple’ in Middle English, then ‘feeble, weak’, and then ‘weak-minded, stupid’ in Early Modern English. Semantic changes are not random, although they are not as regular as sound change, and are frequently restricted to individual lexemes. Like other changes, semantic changes can be classified into a number of recurrent types. We have encountered some of these already: extension (§4.3); narrowing (§4.3); bifurcation (§16.3); and bleaching (§16.5). Four other types of change are common; we discuss them in the following subsections.

Pejoration Pejoration is the process by which a word acquires negative connotations. Speakers come to evaluate the word less positively, ultimately giving it a more negative meaning. The changes in meaning of silly illustrate this. Pejoration is also involved in the development of the modern words moron, negro and midget. A considerable number of terms for women in English (and other languages) began as relatively neutral terms, and acquired increasingly negative connotations over time. Hussy was originally a shortened form of housewife; slut previously denoted a woman of untidy habits; mistress, a borrowing from Old French maistresse ‘woman in control’, at one time denoted a ‘woman who employed others in her service’; and madam began as a polite term of address to women. For the first two terms the pejorative senses triumphed; for the second two, both neutral and pejorative senses are still available.

Amelioration The reverse process, in which a word comes to acquire (more) positive connotations, is amelioration. Fond comes from the past participle of fonnen ‘to be silly, foolish’ in Middle English. Knight comes from Old English cniht ‘boy’, which shifted to ‘servant’, then ‘military servant’, and thence to its modern meaning ‘member of lower nobility’. Parallel developments are found in other European languages. Spanish caballero ‘knight, nobleman’ began as a term for ‘horse-rider’; caballo ‘horse’ in turn comes from Latin caballus ‘workhorse, nag’.

Hyperbole In hyperbole a word loses a strong aspect of meaning through frequent exaggerated use. Intensifying adverbs like terribly, awfully and horribly have, through overuse, in certain contexts lost the senses of the stems from which they are derived, terrible, awful and horrible. Thus, when used in adjectival phrases such as terribly sick, awfully sorry, they are now general intensifiers meaning little more than ‘very’. (Their earlier senses remain in other contexts, as in She died horribly.) Starve comes from Old English steorfan ‘to die’, and quell from cwellan ‘to kill, to slay, to put to death’.

Understatement Understatement is another type of exaggeration that can lead to semantic change through overuse. Verbs of killing sometimes derive from weaker verbs of violence that do not necessarily imply death, via understatement: kill derives from a verb meaning ‘to hit, strike, knock’; French meurtre ‘murder’ from a verb meaning ‘to bruise’. Another example is bereaved, from Old English be-rēafan ‘to rob, plunder’.

Direction of semantic change What usually happens in semantic change is that an additional meaning is acquired in a certain restricted context of use; this new meaning tends to increase in frequency of occurrence, until it takes over and the original sense goes out of use. English write can be traced back to a Proto-Germanic lexeme meaning ‘to cut, scratch’. The meaning was extended to include also ‘to write’, the context being runic writing, which was scratched on stone and wood. This is reflected in Old English wrîtan ‘to cut’ and ‘to write’, and Old Norse (Indo-European, Europe) ríta ‘to score’ and ‘to write’. In Modern English the original sense ‘to cut, scratch’ has been lost, and only the extended sense ‘to write’ remains. As this illustrates, to understand an extension in the meaning of a lexeme may require knowledge of previous technology and cultural practices.

We conclude this all too brief discussion of semantic change by mentioning a few instances of not infrequent changes that tend to go in a single direction, typically from more concrete to less tangible, and more abstract.

● Body-part terms often develop into spatial terms (e.g. at the foot of).
● Spatial terms frequently acquire temporal senses (e.g. after and before).
● Perception verbs often develop into verbs of comprehension (e.g. from see or hear to ‘understand’).
● Terms for handedness and/or sides of the body often develop into terms of moral evaluation or qualities; typically the left side develops in the negative direction to badness and evil (for instance, sinister derives from a Latin word meaning ‘left’), the right to goodness and virtue (think of right).
● Terms for obligation, ability and permission often develop into terms expressing degrees of probability (e.g. must, which originally indicated obligation, has developed a sense ‘necessarily true’).

16.7 Causes of language change Why do languages change? Numerous reasons have been put forward over the centuries, some plausible, others quite fanciful. Among the latter are anatomical, ethnic, racial and geographical factors. To give one example, it has been suggested that consonants in languages spoken in mountainous regions change more rapidly than they do in languages spoken in coastal regions because of the greater breathing effort required. Yet Danish has undergone extensive consonantal changes, although its primary speech community resides on very flat coastal terrain. Perhaps, as Otto Jespersen joked (1922: 257), it is due to the number of Danes holidaying in Switzerland and Norway (these days on Crete)! In the following sections we outline some of the more plausible reasons why languages change. In most cases, a change is likely to be motivated by a combination of factors, rather than just a single factor. Before we begin discussing the causes, it may be well to remark that changes over time are generally considered to emerge from variation that existed in earlier varieties of the language, prior to the change. Synchronic variation serves, as it were, as fuel for diachronic change, which does not happen instantaneously.

Physiological tendencies Simplification or ease of articulation has often been suggested as a reason for sound changes. Loss of segments results in shorter words, and less effort in production; assimilation reduces the difference between segments in sequence, and so also the effort in production. It is not far from this view to the idea that laziness, sloppiness and indolence are the major causes of sound change. Speakers of English will be familiar with these as everyday explanations of the contemporary changes in the language, habitually remarked on by media watchdogs of ‘correct’ English.

But there are problems with simplification and ease of articulation as explanations of sound change. To begin with, what is simple or easy? Crowley (1992: 201–2) observes that the two segments in the sequence /gl/ in Kuman (Papuan, New Guinea) were fused together to form the velar lateral /ʟ/. This ‘simplification’ results in a segment that is relatively unusual in the world’s languages, and which is far from easy for non-native speakers to articulate. A similar ‘simplification’, this time at the allophonic level, is found in some dialects of Australian English where the lateral /l/ is realized as the velar allophone [ʟ] preceding a velar stop, as in milk and elk. (In some dialects /l/ has become even more like the high back vowel in some syllabic positions, and lost its consonantal features entirely.) As in these examples, simplification in one place often leads to complexity elsewhere. Loss of final vowels or initial consonants in a language will result in shorter words, and less production effort. But it can lead to complexity in syllable shapes – for instance, the emergence of V and CVC syllables, where previously all syllables were CV. It is not that the simplification explanation is totally misguided. Rather, it needs reformulation in more explicit and physiologically appropriate terms. We can think of speech production as involving muscular gestures or movements that are coordinated in a complex way, rather like the instruments in a symphony. A variety of physiologically and psychologically natural processes affect the gestures when they are put together; these concern the nature of the gestures, their presence and their timing. We can see these in the emergence of the Kuman velar lateral: the velar gesture and the lateral gesture (i.e. lower the sides of the tongue) have been retimed from sequential to simultaneous, and along with this the stop gesture and the apical gesture have disappeared. Nasalized vowels arose in French in a similar fashion. 
Between the ninth and fourteenth centuries final /n/ began to be lost in words like bien ‘well’ and fin ‘end’. The lowering of the velum was retimed to occur during the vowel; the final gesture, the blockage of air through the oral cavity, eventually disappeared. A similar explanation accounts for the insertion of segments in some circumstances. It is difficult to coordinate the articulatory gestures in sequences such as [ml] and [mt] so that the velum is raised at precisely the same time that the bilabial contact is released and the apical contact initiated. Closure of the nasal cavity prior to the opening of the lips in the production of the [m] will result in insertion of a [b]. The English word bramble acquired its second bilabial stop in this way. This type of explanation is based on physiological processes for which no further explanations are proposed – we have not attempted to explain why some gestures were lost, some gained, and others became simultaneous: they just happened. The actual changes are not predictable like the motion of the planets; at best they are more or less explicable in hindsight. In most circumstances different outcomes could have eventuated, some more likely, others less likely. It is important to note that not all sound changes can be explained in this way.

Functional considerations Languages change to meet new needs and purposes. We have already seen illustration of one such process in the acquisition of new items of vocabulary for new and novel things and meanings (see §4.2). In a similar way lexemes can be lost – or undergo meaning change – as the objects they denote become outmoded. Thus a lure was originally a special pipe used to call back hawks in the medieval sport of falconry; now it refers to anything used as an enticement. A computer used to be a person whose job was to make numerical calculations. This sense of the word has now virtually completely disappeared, along with the job, and computer now refers to a machine that habitually makes such calculations. Morphological and syntactic change can also be motivated at least in part by functional considerations. The shift from free word order in Old English to more or less fixed word order in Modern English (see §16.4) has a functional motivation: the need to keep the subject and object distinct, in the face of loss of case marking on nouns. With endless repetitions of lexical items, they tend to lose whatever expressive value they may once have had. Some instances of morpho-syntactic change are motivated by considerations of expressiveness, which can be considered as a type of functional motivation for change. Many languages of northern Australia have compound verb constructions involving compounding of a morphologically almost invariant preverb and an inflection-taking verb, as in Miriwoong dilyb gema-n-tha (break he-it-get-past) ‘he broke it off’. There is no reason to suppose that these compound expressions were introduced into the languages because of lexical gaps in ancestor languages.2 Rather, it seems that they began life as constructions involving an ideophone (see pp. 87–8) paired with a verb. Over time this mode of expression came to dominate and eventually, in many instances, won out over plain verbs, which had become lifeless old ways of talking about events.

Identity An important motivation for language change is to establish and maintain group identity and cohesiveness on the one hand, and on the other hand, to signal distinctiveness from other groups. Youth and occupational groups often employ some lexical items peculiar to themselves, or give existing lexemes new senses. Youth ‘slangs’ or jargons distinguish members from older people. These styles are rather like fashions in clothing in that they serve to distinguish group members from outsiders: they are in effect fashions in speaking. Phonetic change can also be motivated by identity considerations. In a pioneering study of the centralization of the beginning of the diphthongs /aɪ/ and /aʊ/ in Martha’s Vineyard, a small island off the coast of Massachusetts, USA, William Labov showed that the extent to which the change is employed is correlated with social attitude. The change has taken greatest hold on speakers who identify themselves as islanders, and have the most negative attitudes to mainlanders. It has been adopted to a lesser degree by speakers with more neutral attitudes, or more positive attitudes to mainlanders. The idea here is that a minor variation in speech can be adopted by speakers as a badge of their identity as a group; the variation can then spread through the language variety of the speech community. The variant itself is effectively arbitrary; it is the acceptance of the variation leading ultimately to change in the variety that is explained, not the particular variant that emerged.

Foreign influence Extensive contact between speakers of different languages can result in language change. This is especially the case when speakers of one language are politically dominant, and there is widespread bilingualism or multilingualism in the speech community. The widespread changes that happened to English in the aftermath of the Norman invasion of England in 1066 resulted from the political dominance of French, and its high status in the public domain. In this case, (Old) English was the substratum language, the language of the politically subordinate group. In colonial times, English was usually the superstratum language. The new Englishes that arose in colonial contexts show features of the substratum languages. This is the case for Indian English, which shows phonetic characteristics of the substrate Indo-European and Dravidian languages. Some features of the English of African Americans that are not shared with standard American English have been put down to influences from the languages spoken by the slaves transported to America centuries ago, such as Ewe and Mandinka (Niger-Congo, Senegal). In situations of extensive community-level bilingualism it often happens that lexemes, morphemes, grammatical constructions and even phonemes are borrowed. Such borrowing has happened on a large scale in Aboriginal Australia, where multilingualism was the norm in precolonial times. In such multilingual environments, even bound morphemes are not infrequently borrowed between unrelated (or very distantly related) languages. In the small village of Kupwar in southern India three languages have been in close contact for some six centuries, Kannada (Dravidian), Urdu (Indo-European) and Marathi (Indo-European). Many villagers are bi- or tri-lingual. 
While the lexical items of the three languages have tended to remain separate, and relatively few lexical borrowings have occurred, their syntax has converged; the local varieties of the languages are rather different grammatically from the standard varieties, and more similar to one another.

Taboo Lexical replacement is sometimes motivated by phonological similarity to a taboo word. Proto-Uralic *kuńćɛ ∼ *kućɛ ‘star’ and *kuńće ∼ *kuće ‘urine’ merged together in Old Hungarian, becoming homophones húgy ‘star’ and húgy ‘urine’. The former lexeme then became obsolete, and was replaced by csillag; the latter lexeme remained. Something similar happened with the word coney ‘rabbit’ in Middle English. It came to be used as a term first of endearment then of abuse of women; ultimately it was used for the female genitals. Because of this association, it was dropped as a term for ‘rabbit’, and now remains only in the last sense. A similar thing has been happening to cock ‘rooster’ in many dialects of English, with its extension to ‘penis’. These examples illustrate a general tendency, namely that the term denoting the taboo or ‘touchy’ body part or product is retained at the expense of the other term, which is replaced by a new term. Thus some speakers of English tend to use rooster in preference to cock when referring to the bird. We mentioned in §4.5 the tabooing of names of recently deceased persons and similar-sounding words in Australian languages. While the tabooed word usually comes back into use within a few years, it is likely that in some cases the replacement term sticks, and the original tabooed term is dropped for good.

Social upheaval It is sometimes suggested that major linguistic changes correlate with periods of social upheaval. With rapid breakdown of the existing social system, and the communication networks constituting it, the language system might also show disruption and rapid change. There is doubtless some truth to this suggestion. The Norman invasion of England was such an event, and it did give rise to a number of quite substantial changes in English. The decimation of many Indigenous groups in Australia and the Americas led to the obsolescence and death of many languages; in some cases the languages were only partially learnt by children, and survived as varieties with simplified grammar and reduced lexicons. However, it is unlikely that all cases of rapid language change (the rate languages change is variable, as mentioned at the beginning of the chapter) can be put down to social upheaval, or that social upheaval inevitably leads to rapid language change, except perhaps in the lexicon. As we saw in Chapter 14, the so-called Information Revolution of recent years has given rise to numerous new lexical items, even new ways of using language. But it does not seem that the phonological and grammatical systems of English have simultaneously undergone substantial changes.

Regularization Languages often change so as to regularize their grammar, reducing the number of irregularities and partially regular patterns in the morpho-syntax. The processes of neatening and extending the patterns characteristically occur at certain points in first-language learning (see §12.1). Speech ‘errors’ of adult speakers sometimes result from over-regularization – for instance, when an adult says strived instead of strove or striven. Such ‘errors’, especially when they occur in infrequent words, may become the accepted forms, ousting the existing irregular form, and giving rise to analogical levelling (§16.3). The result is greater transparency in the system.

Structural pressure There is some tendency for paradigmatic systems within a language to be regular or symmetrical. That is to say, languages tend to prefer regular systems such as the Sanskrit system of stops shown in (16-17) over irregular ones such as the hypothetical one shown in (16-18).

(16-17) Sanskrit
p    pʰ   b    bʰ
t    tʰ   d    dʰ
ʈ    ʈʰ   ɖ    ɖʰ
c    cʰ   ɟ    ɟʰ
k    kʰ   g    gʰ

(16-18)
p    pʰ        bʰ
t
     ʈʰ   ɖ    ɖʰ
c         ɟ
k         g

Asymmetrical systems tend to become symmetrical by filling in gaps. This is what is meant by structural pressure as a factor in language change. This seems to have been at least part of the motivation for the emergence of /ʒ/ in English. In the eighteenth century, English had the irregular system of fricatives as shown in (16-19).

(16-19)
f    v
θ    ð
s    z
ʃ
h

In the nineteenth century a partner for /ʃ/ emerged through insertion of /j/ following the original /z/ in words like treasure and pleasure; this sequence subsequently fused into the single fricative segment /ʒ/. This leaves /h/ out on a limb. There is no evidence of any pressure in English for the emergence of its voiced counterpart /ɦ/. Instead, there is a tendency for /h/ to disappear from the phonological system of English: it has been lost completely in some varieties, and has been retained in others mainly through strong social pressures, including the influence of writing. Presumably at least part of the reason why English has not developed /ɦ/ is because the voiced glottal fricative is unusual in the world’s languages. The tendency towards regular paradigmatic systems is always balanced against such considerations. In many Australian languages we find a neat patterning of stops and nasals at each point of articulation. But laterals tend to break the pattern: bilabial and velar laterals are absent, and lamino-dental laterals are rare, even in languages that distinguish this place of articulation for stops and nasals. The best we can say is that there is some tendency for languages to fill in structural gaps. On the other hand, changes can result in gaps in what were perfectly regular systems. Proto-Indo-European *p was lost in proto-Celtic, resulting in a less regular system in the stop consonants. The gap was filled in differently in different branches of Celtic. Motu (Austronesian, New Guinea) has recently lost its velar nasal, creating an irregularity in the otherwise regular system of nasals and stops that distinguished three points of articulation. There is as yet no evidence of any change that might lead to filling this gap.
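The notion of a structural gap can be made concrete with a small sketch. The Python below lays out the cells a fully symmetrical system would have and reports which ones a given inventory lacks. It uses the stop systems in (16-17) and (16-18) and the fricatives in (16-19); the place-of-articulation labels, the names `find_gaps`, `SANSKRIT_GRID`, `FRICATIVE_GRID` and the idea of treating /ʒ/ and /ɦ/ as the expected voiced partners are illustrative assumptions, not part of the text's analysis.

```python
# Each grid lists the cells a fully symmetrical system would have:
# one row per place of articulation, one column per series.
# The place labels are illustrative assumptions, not taken from the text.
SANSKRIT_GRID = {
    "bilabial":  ["p", "pʰ", "b", "bʰ"],
    "dental":    ["t", "tʰ", "d", "dʰ"],
    "retroflex": ["ʈ", "ʈʰ", "ɖ", "ɖʰ"],
    "palatal":   ["c", "cʰ", "ɟ", "ɟʰ"],
    "velar":     ["k", "kʰ", "g", "gʰ"],
}

# The hypothetical irregular stop system of (16-18).
HYPOTHETICAL = {"p", "pʰ", "bʰ", "t", "ʈʰ", "ɖ", "ɖʰ", "c", "ɟ", "k", "g"}

# Voiceless/voiced fricative pairs; the second column is the voiced partner.
FRICATIVE_GRID = {
    "labiodental":  ["f", "v"],
    "dental":       ["θ", "ð"],
    "alveolar":     ["s", "z"],
    "postalveolar": ["ʃ", "ʒ"],
    "glottal":      ["h", "ɦ"],
}

# The eighteenth-century English fricatives of (16-19).
ENGLISH_1700S = {"f", "v", "θ", "ð", "s", "z", "ʃ", "h"}


def find_gaps(grid, inventory):
    """Return (place, symbol) pairs for grid cells missing from the inventory."""
    return [
        (place, symbol)
        for place, row in grid.items()
        for symbol in row
        if symbol not in inventory
    ]


if __name__ == "__main__":
    print(find_gaps(SANSKRIT_GRID, HYPOTHETICAL))
    print(find_gaps(FRICATIVE_GRID, ENGLISH_1700S))
```

For the fricatives the sketch reports two empty cells, the partners of /ʃ/ and /h/. As the discussion above makes clear, only the first of these was actually filled, so a gap is at most a predisposition to change, not a prediction of it.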

Summing up No living language remains static for long, and all aspects of language are subject to change, though some features are more resistant to change than others. And, overall, language change occurs at very different rates. Common processes of sound change are: loss or deletion, insertion, assimilation, dissimilation and metathesis. Sometimes a group of sounds is affected by a linked set of changes in a chain shift;


Grimm’s law is an example. An important property of sound change is its regularity. This permits us to identify cognates, words in related languages that can be traced back to the same word in an ancestor language. Morphological change can happen through borrowing, analogical change (including levelling and extension) and reanalysis. In some cases both the new analogized form and the original form coexist, and take on different meanings; this is called bifurcation. The same three types are also found in syntactic change. In addition, word order changes sometimes result from morphological changes such as loss of case marking. Semantic change tends not to be as regular as sound change. A variety of processes can result in semantic change, including: bifurcation, tabooing, euphemism, dysphemism, metaphor, metonymy and synecdoche. Types of semantic change include extension, narrowing, bleaching, pejoration, amelioration, hyperbole and understatement. Grammaticalization is the process by which new grammatical words and morphemes emerge in a language, often from lexical items. Grammaticalization is normally accompanied by semantic bleaching and phonological reduction. It is normally unidirectional: lexical items become grammatical items, but the reverse process is rare. Causes of language change are numerous and varied. They include physiological and psychological tendencies, functional and structural pressures, maintenance of identity, foreign influences, social upheaval, and taboo.

Guide to further reading Aitchison (2013) is an excellent and entertaining introduction to language change. The best introductory textbook is, in my opinion, Crowley and Bowern (2010); Campbell (2021) is also good. More advanced textbooks include Anttila (1972) and Hock (1991). Luraghi and Bubenik (2013/2010) is an accessible collection of articles covering the major topics in language change. Joseph (2017) provides a good overview. See the Guide to further reading for Chapter 4 for works on the history of English; of these, Burridge (2004) is recommended for the variety of examples it provides of each type of semantic change, and its lively style. Etymological information on English words can be found in dictionaries of etymology, such as Ayto (1990) and Onions (1966). Simpson and Weiner (1989) also includes a good deal of etymological information, as well as extensive exemplification of word usage from written sources since the earliest times. Many useful (and not so useful) etymological resources can be found on the web, including a free online etymological dictionary of English at http://www.etymonline.com/. Also useful is Eugene Cotter’s Roots of English: An Etymological Dictionary, which can be downloaded from https://www.thefreewindows.com/downloads/default.asp?h=https://onedrive.live.com/download?resid=61014F748757CEE4%21863&ttl=RootsOfEnglish; this dictionary, however, is restricted to terms of Latin and Greek origins. Numerous recent works deal with grammaticalization, though few are suitable for beginners. With some reservations, I recommend Hopper and Traugott (2003), Heine and Kuteva (2002) and


the comprehensive Narrog and Heine (2011). A general overview can be found in Smith (2011); three articles in Joseph and Janda (2003) deal with grammaticalization: Bybee (2003), Heine (2003) and Traugott (2003).

Issues for further thought and exercises

1 Choose a passage of about 100 words from a piece of popular scientific writing, for instance from the pages of Scientific American, or from a book (e.g. the first paragraph of Charlesworth and Charlesworth 2003: 1). Attempt to identify which lexical words are inherited from proto-Germanic (i.e. are ‘native’ English), and which have been borrowed or involve borrowed elements. What proportion of the lexical items are borrowings? Identify the source language for each borrowing. What fraction of the borrowed items has each language contributed? (You should consult an etymological dictionary – see above.) Now do the same with a similar length passage from a novel. Comment on any differences you observe between the two excerpts.

2 Sometimes chimney is pronounced as chimbley or chimley. What sound changes have occurred? In many dialects of English words like due, duty, dubious, dual and duke are pronounced with an initial /ʤ/; they can be traced back to forms with initial /d/. How do you think the sound change occurred, and what type of sound change is involved? (Hint: recall the emergence of /ʒ/.)

3 Spellings often provide information about earlier pronunciations of words. But this is not always so in English spelling. Find out how the bolded letters in the following words became part of the spelling: psychology, photograph, enough, ye olde shoppe, doubt, ghost, sneeze, knave, caught, fault and which.

4 Below are some words in Banoni (Austronesian, Solomon Islands) and the proto-forms they derive from. What sound changes have occurred, and what types of change are they?

Proto-form    Banoni     Gloss
*mpaɣa        bara       ‘fence’
*mpunso       busa       ‘fill’
*tipi         tsivi      ‘a traditional dance’
*makas        maɣasa     ‘dry coconut’
*pekas        beɣasa     ‘faeces’
*koti         kotsi      ‘cut’
*mata         mata       ‘eye’
*matua        matsua     ‘rise’

5 Below are some words in Portuguese (spelt phonemically) and their Latin sources. What sound changes are represented in this data?

Latin                Portuguese    Gloss
contrā               kõtra         ‘against’
grandis              grãdi         ‘big’
septem               sɛci          ‘seven’
tantus               tãtu          ‘so much’
focus (‘hearth’)     fɔgu          ‘fire’
decem                deʒ           ‘ten’
fēmina               femea         ‘woman’
lūna                 lua           ‘moon’
nōn                  nɔ̃            ‘no’

6 Here are some cognates from two Indo-Iranian languages, Sanskrit and Pali. The forms in one language have undergone a sound change, while the forms in the other have remained unchanged. Which language do you think has changed? What type of phonological change is exemplified? Justify your answer. (Note that y represents the palatal glide.)

Sanskrit    Pali       Gloss
bhartum     bhattum    ‘carry’
patra       patta      ‘wing, leaf’
sahasra     sahassa    ‘thousand’
varsati     vassati    ‘it rains’
ārya        ayya       ‘noble’

7 Find out the etymology of the following English words: marshal, giddy, pioneer, coach, husband, wife, bowdlerize, woman, cretin, pen, avocado, assassin, phony, love (as the zero score in tennis), barbeque, grog and lord. What sound changes and semantic changes have occurred in the documented history of these words?

8 It was mentioned at the beginning of §16.7 that changes in a language can usually be traced back to variants existing at a single point in time. Such variants can emerge for humorous purposes. For instance, from a letter to Scientific American: ‘Freud said a few absurd things, but to ignore all his ideas would be a “phallusy” ’ (Scientific American, September 2004, p.7); some years ago I gave a lecture on ‘phorensic phonetics’; and from the Clinton-Starr affair is fornigate. What process is involved in these inventions? Can you think of other examples in modern English (or another language you know well) that illustrate this process?

9 What was the Great English vowel shift? Find out when it occurred, and which vowels were affected and how. What sort of change was it? (See, for example, Campbell 2021, Anttila 1972, Hock 1991 and the internet.)

10 The following sentences taken from Shakespeare’s plays illustrate the way negatives were constructed in Early Modern English. Describe the negative constructions as illustrated by these examples; compare these constructions with the modern counterparts, and describe how the syntax has changed.


a. Be it so she will not here before your Grace Consent to marry with Demetrius (A Midsummer Night’s Dream, Act I, Scene 1)
b. I know not by what power I am made bold (A Midsummer Night’s Dream, Act I, Scene 1)
c. Whether, if you yield not to your father’s choice . . . (A Midsummer Night’s Dream, Act I, Scene 1)
d. My soul consents not to give sovereignty. (A Midsummer Night’s Dream, Act I, Scene 1)
e. Why should not I then prosecute my right? (A Midsummer Night’s Dream, Act I, Scene 1)
f. Demetrius thinks not so; He will not know what all but he do know. (A Midsummer Night’s Dream, Act I, Scene 1)
g. Nay, faith, let not me play a woman; I have a beard coming. (A Midsummer Night’s Dream, Act I, Scene 2)
h. For, being not propp’d by ancestry, whose grace Chalks successors their way, nor call’d upon (King Henry the Eighth, Act I, Scene 1)
i. Ladies, you are not merry. (King Henry the Eighth, Act I, Scene 3)
j. Was it not she and that good man of worship, Antony Woodville, her brother there, That made him send Lord Hastings to the Tower, From whence this present day he is delivered? (King Richard III, Act I, Scene 1)
k. Heard you not what an humble suppliant Lord Hastings was, for her delivery? (King Richard III, Act I, Scene 1)
l. Didst thou not kill this king? (King Richard III, Act I, Scene 2)
m. Is not the causer of the timeless deaths Of these Plantagenets, Henry and Edward, As blameful as the executioner? (King Richard III, Act I, Scene 2)
n. The saddler had it, sir; I kept it not. (The Comedy of Errors, Act I, Scene 1)
o. Dost thou not know? (The Comedy of Errors, Act I, Scene 2)
p. May he not do it by fine and recovery? (The Comedy of Errors, Act I, Scene 2)
q. Saw’st thou not, boy, how Silver made it good At the hedge corner, in the coldest fault? I would not lose the dog for twenty pound. (The Taming of the Shrew, Act I, Scene 1)
r. Would not the beggar then forget himself? (The Taming of the Shrew, Act I, Scene 1)
s. Trouble us not. (The Tempest, Act I, Scene 1)
t. He misses not much. (The Tempest, Act I, Scene 2)

11 In the following semantically related pairs the words in the first column are native English stock, those in the second are borrowings from French. How would you characterize the semantic difference? Can you suggest an explanation for why the words differ in meaning in this way?

sheep      mutton
calf       veal
pig        pork
cow        beef

Other pairs of semantically related inherited items and French borrowings include:

clothes    attire
gown       negligee
ask        question
climb      mount
arse       derrière

Can you modify your explanation for the first set of terms to cover these additional pairs? What other similar pairs can you find?

Research project What are evidentials – what is the nature of evidentiality as a grammatical category, and what are some languages that have evidentials? Write an essay on the grammaticalization of evidentiality, identifying the main sources from which evidentials develop, and the diachronic processes that have been proposed for their development from the sources. Discuss the role of metaphor in the grammaticalization processes you find. Good places to begin are the chapter on evidentials in Narrog and Heine (2011) and a search of the internet.


Answer to question on p. 393 These words were borrowed into English after Grimm’s law ceased to apply. The existence of words like this, that fail to observe established sound laws, is one factor that permits linguists to identify borrowings and their timing relative to sound changes.


17 Languages of the World

This chapter is an introduction to the diversity of the world’s languages. It surveys the number, distribution and viability (states of ‘health’) of the languages, and how they are related to one another. The notions of genetic relatedness and language family are explained, and the methods for establishing them are outlined. Some of the major families are overviewed. We also discuss three types of language that do not fit into families.

Chapter contents
Goals
Key terms
17.1 Number and variety of the world’s languages
17.2 Relations among the languages
17.3 Seven (putative) language families
17.4 Contact languages
Summing up
Guide to further reading
Issues for further thought and exercises
Research project


Goals
The goals of the chapter are to:
● overview the linguistic diversity of the world;
● show the enormous discrepancies in the numbers of speakers of the world’s languages;
● explain what it means for languages to be genetically related;
● introduce the comparative method, the principal method of establishing genetic relations among languages, and two other less reliable methods;
● overview seven of the world’s major (putative) families; and
● identify three types of contact language that do not fit neatly into genetic groups: pidgins, creoles and mixed languages.

Key terms
Afroasiatic
Austronesian
basic vocabulary
cognate sets
comparative method
contact languages
creoles
expanded pidgins
families
family trees
genetic relations
groups
Indo-European
Khoisan
language isolates
lexicostatistics
mass comparison
mixed languages
mutual intelligibility
Niger-Congo
pidgins
proto-languages
reconstruction
Sino-Tibetan
sound correspondences
stocks
subgroups
Trans-New Guinea


17.1 Number and variety of the world’s languages How many languages are spoken in the world today? Current estimates put the number of languages somewhere around 7,000. The latest (25th) edition of Ethnologue (see p. 436) lists some 7,151 languages, including 157 sign languages. Somewhat smaller estimates are found in other recent sources. For instance, Pereltsvaig (2012: 11) estimates 6,500–7,000, while Ostler (2005: 7) estimates somewhat over 6,000. Twentieth-century sources tend to give significantly smaller numbers. Thus the 11th edition of Ethnologue (dated 1988) lists almost a thousand fewer languages than the current edition, 6,170, and Ruhlen (1987: 3) speaks of around 5,000 languages. Is this evidence that the languages of the world are multiplying rapidly? No. In fact, quite the contrary is the case: the world’s linguistic diversity is, like its bio-diversity, rapidly declining (recall §7.5). The increases in numbers are partly due to improved knowledge of the linguistic diversity of the world, as more and more languages spoken in remote and inaccessible (from the Westerner’s perspective) regions come to light. Also significant is the question of what is being counted – what constitutes a single language? The word language is used in a variety of different senses. In popular parlance, a political sense is usually invoked: people often speak of languages as the forms of speech that are associated with nations. Different nations, according to this view, have different languages. Italian is the language of Italy, German of Germany, French of France and so on. Different varieties of speech found within a nation would be regarded as dialects of the language. In other parts of the world the relevant political unit might not be a nation, but a ‘tribe’, or some other unit. Linguists often use the term in a different way, and employ the criterion of mutual intelligibility. 
If speakers of one form of speech can understand the speakers of another without having to learn it, the varieties are said to be mutually intelligible, and they are dialects of a single language. British English and Australian English are mutually intelligible, and so are dialects of a single language, English, according to this definition. Sometimes linguists use the term language to refer to forms of speech that share a certain percentage of common (or very similar) words and morphemes, and the term dialect for forms that share a higher percentage of common words. For example, American English, Australian English and British English share a considerable number of words, and would count as dialects according to this conception. These three senses do not always coincide. The mutually intelligible varieties of speech spoken in Britain and Australia are associated with different nations, and would represent different languages by the political criterion – and indeed they are sometimes spoken of as different languages: e.g. ‘the Australian language’ as distinct from ‘the British language’. Hindi and Urdu are mutually intelligible, but often regarded as distinct languages, Hindi being the language of Hindu people mostly living in India, Urdu the language of the Muslim people of Pakistan. By contrast, Mandarin Chinese and Cantonese are frequently spoken of as ‘dialects’ of Chinese, not only in the West, but also in China itself, even though they are not mutually intelligible. This is because of the long political unity of the communities of speakers and the shared writing system.


The number of languages one distinguishes in the world depends on which sense of the term language one deploys. To make things more problematic, linguists (and others) often do not make it clear which sense they are using, and even mix senses. So someone making a list of languages cannot always be sure what sort of things the named varieties are, and be certain they are not counting chalk and cheese.

Distribution of languages Languages are not spread evenly across the globe. There is an especially high density of languages in the equatorial to tropical region, between the Tropic of Cancer and the Tropic of Capricorn, even though this spreads across continents and islands separated by expanses of sea. The density in more northerly regions is significantly lower. It is not difficult to guess that the low linguistic density in Greenland and Siberia is a result of low population densities. But population density is not the only consideration. Mainland China and Europe are much more densely populated than Australia and the island of New Guinea, but their language densities are considerably lower. In fact, over 1,000 languages are spoken in New Guinea and on nearby islands, making it the most linguistically dense region of the world.

Why so many languages? It is generally agreed that anatomically fully modern human beings, Homo sapiens sapiens, emerged in east Africa around 200,000 years ago. They spread out from there to the rest of Africa, Asia, Europe, Australia (arriving at least 40,000 years ago), and later into America (arriving perhaps as recently as 13,000 years ago, possibly over 20,000 years ago). Languages change rapidly, and the physical separation of populations over time would result in the division of languages into dialects, and ultimately into mutually unintelligible languages. Even if the earliest populations in Africa spoke a single language, the social and geographical separation of human populations during the past two hundred millennia could account for the modern diversity of languages. Perhaps we should put the question the other way around: why so few languages?

Numbers of speakers of the world’s languages It is impossible not to be struck by the enormous discrepancies among languages in terms of their numbers of speakers. A small number of languages are spoken by enormous numbers of speakers: over 40 per cent of the world’s population have as their mother tongue one of nine languages, each with more than 100 million speakers: Mandarin Chinese, Spanish, English, Hindi-Urdu, Arabic, Portuguese, Bengali, Russian (Indo-European, Russia) and Japanese.1 At the other end of the spectrum, around 3,500 languages (i.e. around half the total number of languages) have fewer than 10,000 speakers each. Strikingly, the speakers of these languages together comprise less than 0.3 per cent of the world’s population.


Of the known languages, almost 1,000 are now extinct, and over 300 are nearly extinct, with just a few elderly speakers (according to Glottolog https://glottolog.org/langdoc/status). This trend has intensified since the beginning of the colonial period, and continues today. For example, in the more than four decades during which I have been working on Kimberley languages, at least six have lost all their fluent speakers, and are effectively dead. A couple of others have reached the critical stage, and have at best a handful of speakers and part speakers.

17.2 Relations among the languages How are the world’s languages related? Some languages belong together in the sense that they derive from a single ancestor language – called a proto-language – that was spoken long ago, and that subsequently split into varieties that over the passage of time became mutually unintelligible. Languages that derive from a single proto-language are said to be genetically related, and to belong to a single language family. It is not known for sure whether or not all languages of the world ultimately come from a single ancestor language spoken in the very distant past – say, at the dawn of the emergence of modern human beings. Languages change so rapidly that convincing evidence of relatedness does not remain for more than about 10,000 years (many would put the limit much lower than this, to about 6,000 years) – beyond that length of time, it is increasingly difficult to separate chance similarities among languages from similarities shared from a common ancestor language. The term stock is sometimes used for a hypothetical grouping of more or less well established families into a larger and more tentative set. We now discuss some of the methods linguists use to establish language families. It should be cautioned that genetic relatedness of languages has, in principle, nothing to do with the biological-genetic relatedness of their speakers. Speakers of genetically related languages need not be closely related biologically; a child will acquire the language spoken in its social environment, not the language spoken by its biological parents, if they are not present in the social environment. Thus, English is spoken as a mother tongue by humans of diverse biological ancestry. On the other hand, speakers of genetically unrelated languages may belong to the same genetic groups. Hungarian is not genetically related to the neighbouring languages, although the speakers of Hungarian are not distinguishable as a population in terms of biological-genetic features from speakers of nearby languages. Nonetheless, some large-scale statistical correlations between genetic groupings of languages and biological-genetic groups do seem to exist: the two are not totally independent (see Cavalli-Sforza 2001). However, the correlations are imperfect, and striking mismatches do occur.

Methods for establishing language families Various methods have been developed and used by linguists to establish language families and thus the genetic relatedness of languages. Among them the comparative method is a well-honed and stringent set of techniques that provides convincing evidence for genetic relatedness. If languages


can be shown to be genetically related by careful application of this method, few linguists would question their relatedness. We discuss this method in the next subsection. The following subsections then describe two more contentious and less reliable methods.

Genetic relatedness is not the same thing as typological similarity. Languages sharing typological characteristics need not be genetically related; Hungarian and Walmajarri are both agglutinating languages, though there is no reason to believe they are genetically related (if they are, the connection may go back to proto-world, the original language of human beings!). On the other hand, languages deriving from a common ancestor language need not be very similar typologically; the Indo-European languages show considerable typological diversity: for instance, while most are accusative, there are ergative languages (see §15.3) in the Indo-Iranian branch.

Comparative method The idea underlying the comparative method is that genetic relatedness of a set of languages can be established by reconstructing a proto-language that could plausibly serve as an ancestor of each of the languages, and showing in detail how the modern languages could have developed from this proto-language through a credible series of changes. Reconstruction involves hypothesizing what the proto-language might have been like by attempting to undo the changes that occurred between the proto-language and its descendants. To reconstruct a proto-language you begin by compiling sets of cognates among the languages – that is, you gather together lexical and grammatical items that are similar in form and meaning, that can be assumed to have derived from a common ancestor. From these cognate sets you identify recurrent correspondences in the forms of the cognates and propose a form in the proto-language from which the modern forms could have derived by plausible sound changes. Consider the words cited in Table 17.1, from four languages from the far north-west of Australia. (The words are spelt in the orthography for the language, which in each case is phonemic: j indicates a palatal stop, y a palatal glide, rl a retroflex lateral; oo in Bardi and Nyikina orthographies is the high back vowel (u in the other languages), and rr an apical tap; double vowel letters (other than oo) indicate long vowels.) The forms in each row appear to be cognates; they are similar enough in phonological form and in meaning to be plausibly traced back to single words in an ancestral proto-language. Not only are the word forms phonologically similar but also there are systematic correspondences in the phonemes that comprise them. Aside from the cases of identity of phonemes in the corresponding places in the words, we have:
● where Bardi, Nyikina and Warrwa have a final vowel, Nyulnyul has none;
● where Nyulnyul, Nyikina and Warrwa have a palatal or bilabial stop (/j/ or /b/) between vowels, Bardi has a glide (/y/ or /w/);
● where Bardi and Nyulnyul have the long high front vowel /ii/, Nyikina and Warrwa have a short high front vowel.


Table 17.1 Some basic words in four languages of the north-west of Australia

                 Bardi       Nyulnyul    Nyikina     Warrwa
‘boomerang’      jiiwa       jiib        jiba        jiba
‘camp’           booroo      bur         booroo      buru
‘down’           jimbin      jimbin      jimbin      jimbin
‘two’            kooyarra    kujarr      koojarra    kujarra
‘be sitting’     miyala      mijal       mijala      mijala
‘(his) mouth’    ni-lirr     ni-lirr     nilirr      nilirr
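The non-identical correspondences in Table 17.1 can be tabulated mechanically. The Python sketch below splits each form into segments, lines the four languages up position by position, and counts every column in which they differ. It rests on several simplifying assumptions that are not part of the comparative method itself: orthographic oo is rewritten as u, the digraphs ii, rr and rl are each treated as one segment, shorter forms are padded at the end with ∅, and ‘(his) mouth’ is left out because of its prefix. The names `segments` and `correspondences` are hypothetical. Simple positional alignment works only because these cognates differ by substitution or by loss of a final segment; real data usually needs proper alignment by hand or by algorithm.

```python
import re
from collections import Counter

LANGS = ["Bardi", "Nyulnyul", "Nyikina", "Warrwa"]

# Forms from Table 17.1, in the order given in LANGS.
COGNATES = {
    "boomerang":  ["jiiwa", "jiib", "jiba", "jiba"],
    "camp":       ["booroo", "bur", "booroo", "buru"],
    "down":       ["jimbin", "jimbin", "jimbin", "jimbin"],
    "two":        ["kooyarra", "kujarr", "koojarra", "kujarra"],
    "be sitting": ["miyala", "mijal", "mijala", "mijala"],
}


def segments(word):
    """Split an orthographic form into phoneme-sized segments."""
    word = word.replace("oo", "u")  # oo spells the high back vowel /u/
    return re.findall(r"ii|rr|rl|.", word)


def correspondences(cognates):
    """Count each column of aligned segments that is not identical across languages."""
    counts = Counter()
    for forms in cognates.values():
        segs = [segments(form) for form in forms]
        width = max(len(s) for s in segs)
        # Pad shorter forms at the end; assumes any missing material is word-final.
        padded = [s + ["∅"] * (width - len(s)) for s in segs]
        for column in zip(*padded):
            if len(set(column)) > 1:
                counts[column] += 1
    return counts


if __name__ == "__main__":
    print(LANGS)
    for column, n in correspondences(COGNATES).most_common():
        print(column, n)
```

On this data the sketch finds five distinct correspondence sets, and they fall into the three patterns listed before the table: a final vowel matching nothing in Nyulnyul, a Bardi glide matching a stop elsewhere, and long /ii/ matching a short vowel in Nyikina and Warrwa.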

We can now guess what original sounds in the proto-language might have given rise to the phonemes in the modern languages, bearing in mind the principle that the sound changes that give rise to the modern forms should be credible. First, we would guess that what remains constant across the languages – including initial /b/ and /k/ – was identical in the proto-language. No sound change is required to explain the modern forms, and there would be no reason to propose that, for instance, the recurrent initial /b/ comes from some other segment – for instance, an initial prenasalized stop. (Of course, we cannot rule out the possibility that the initial /b/ did come from a prenasalized stop; but there is no evidence for this, and it is pointless to make such unwarranted and untestable speculations.) Second, it is natural to guess that the proto-language had final vowels where Bardi, Nyikina and Warrwa have final vowels, and that these were lost in Nyulnyul. This is more likely than that the other languages gained final vowels, especially in the light of the words for ‘down’ and ‘(his) mouth’ – which should have final vowels in Bardi, Nyikina and Warrwa if these languages had gained their final vowels. Third, it is reasonable to guess that the correspondence between glides in Bardi and stops in the other languages goes back to stops in the proto-language. It is more likely that stops between vowels would weaken to glides than that glides would strengthen to stops. This involves assimilation, and is attested in many other historical cases. Fourth, the final correspondence we would naturally guess goes back to a long high vowel in the proto-language. In fact, vowel length in Nyikina and Warrwa is not phonemically contrastive for high vowels, and it is most natural to guess that it was lost in these two languages rather than gained in the other two. 
With these observations in mind, we can reconstruct the six words in the proto-language as follows (recall that in historical linguistics the star before the word indicates it is a reconstructed form):

‘boomerang’   ‘camp’   ‘down’    ‘two’      ‘be sitting’   ‘mouth’
*jiiba        *buru    *jimbin   *kujarra   *mijala        *-lirr
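As a rough illustration of how reconstructed forms relate to the modern ones, two of the sound changes proposed above (loss of final vowels in Nyulnyul, and weakening of the stop j to the glide y between vowels in Bardi) can be sketched as rewrite rules applied to the reconstructions. The function names and the three-letter vowel set are assumptions made for this sketch. The rules also ignore spelling conventions: the attested Bardi form kooyarra is written with oo where the rule below outputs u.

```python
import re

VOWELS = "aiu"  # assumed vowel letters for this sketch


def nyulnyul_final_vowel_loss(proto: str) -> str:
    """Drop a single word-final vowel (hypothetical rule name)."""
    form = proto.lstrip("*")
    return form[:-1] if form and form[-1] in VOWELS else form


def bardi_lenition(proto: str) -> str:
    """Weaken the stop j to the glide y between vowels (hypothetical rule name)."""
    form = proto.lstrip("*")
    return re.sub(f"(?<=[{VOWELS}])j(?=[{VOWELS}])", "y", form)


for proto in ["*kujarra", "*mijala", "*jimbin"]:
    print(proto,
          "> Nyulnyul", nyulnyul_final_vowel_loss(proto),
          "| Bardi", bardi_lenition(proto))
```

Applied to *kujarra and *mijala, the first rule yields kujarr and mijal, and the second turns *mijala into miyala, matching the forms in the comparison table. A word such as *jimbin is unaffected by either rule, since it has no final vowel and its j is not between vowels.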


The reconstruction of ‘mouth’ as *-lirr is based on the observation that we can also reconstruct a prefix *ni- meaning ‘his, hers, its’. This subsequently became part of the root form in Nyikina and Warrwa.

The comparative method is ultimately based on the assumption of the arbitrariness of the linguistic sign. Occasional resemblances in words are not unexpected between any pair of languages – for instance, compare Kaqchikel (Mayan, Guatemala) mes ‘mess, disorder, garbage’ and English mess. But a large number of similarities in forms and meanings between a pair of languages is unlikely to be accidental, except in onomatopoeic words. For this reason, one initially excludes obvious onomatopoeic words from cognate sets when applying the comparative method.

A large number of lexical similarities between two languages does not necessarily mean that they are genetically related, and that the words can be traced back to a proto-language. One language might have borrowed heavily from the other. In applying the comparative method, it is important to determine whether apparent cognates are genuine, or borrowings; this can be very difficult. One additional assumption – for which there is much independent evidence – is helpful in this context. It is that basic everyday words (such as terms for the major parts of the body, everyday artefacts, low numerals, primary kinship terms, and basic observable phenomena of the world) are less likely to be borrowed than less basic words (like technical vocabulary, words for high numbers and for unusual plant and animal species). For this reason one first begins to apply the comparative method to basic vocabulary, as we did in our four-language sample.

The demand that sound correspondences be recurrent further reduces the likelihood of lexical similarities being accidental. Nevertheless, even as stringent a method as the comparative method can’t provide absolute proof of genetic relatedness. There remains a small chance that two languages could show numerous recurrent similarities in their basic vocabularies by accident, just as it is possible (though very unlikely) that you will throw a straight sequence of a hundred, or even a thousand, heads when tossing a coin.
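The coin-tossing analogy can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that each item on a basic wordlist has a fixed and independent probability p of resembling its counterpart by accident. The function name, the 100-item list and the 5 per cent figure are hypothetical, not values taken from the text.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability of k or more chance matches among n independent items,
    each matching by accident with probability p (binomial upper tail)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


# A straight run of 100 heads with a fair coin.
print(prob_at_least(100, 100, 0.5))

# Hypothetical case: 100-item wordlist, 5% accidental resemblance per item.
for k in (5, 10, 20, 30):
    print(k, prob_at_least(k, 100, 0.05))
```

Under these assumptions the chance of many accidental matches falls away rapidly as k grows, which is the intuition behind demanding numerous recurrent correspondences. Real vocabularies are not independent coin tosses, so the figures should not be read as actual estimates.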

Reconstructed proto-languages are idealizations; reconstructions are limited by accidents of what survives in the descendant languages, and indeed which daughter languages survive. One of the few cases where we have extensive written evidence of a ‘real’ ancestral language is the Romance languages, which are known to be descendants of Latin. Proto-Romance as reconstructed from the modern languages is not the same as Latin. For instance, we know from written sources that Latin distinguished cases for nouns; however, none of the modern daughter languages do, and cases can’t be reconstructed for proto-Romance nouns.

Mass comparison

Applying the comparative method is an exacting process, requiring a detailed knowledge of the languages being compared, not to say a considerable amount of time and effort. As a first step in determining whether languages are genetically related one might relax the criteria somewhat. The method of mass comparison is a way of getting an initial idea of the relatedness of a number of languages by comparing basic vocabulary items, excluding onomatopoeic forms. A good deal of phonetic and semantic similarity among the languages – in other words, a fair number of potential cognates – is indicative of possible genetic relatedness.

To give an illustration of the method, consider the short list of basic words in six languages of Africa presented in Table 17.2. Which languages would you group together as likely members of a family? Write down your suggested groupings before reading on.

Table 17.2 A selection of basic words in six African languages

           Afrikaans   Bemba          Kanuri        Chichewa   Shona     Swahili
‘woman’    vrou        úmwaanakashi   kámú          mkazi      mukádzí   mwanamke
‘man’      man         úmwaaúmé       kwâ, kwângâ   mwamuna    murúmé    mwanamme
‘sun’      son         ákasuba        kə̀ngâl        dzuwa      zúvá      jua
‘fish’     vis         ísabi          búnyì         nsomba     hóvé      samaki
‘dog’      hond        ímbwa          kə̀ri          galu       imbwá     mbwa
‘bird’     voël        icúúní         ngúdò         mbalame    shiri     ndege
‘three’    drie        -tatu          yàskə̀         -tatu      -tatú     tatu
‘water’    water       ámeenshí       njî           madzi      mvúrá     maji
‘big’      groot       -kulu          kúrà          -kulu      -kúrú     kubwa
‘good’     goed        -suma          ngə̀là         -bwino     -naka     nzuri
‘tree’     boom        úmutí          kə̀ská         mtengo     mutí      mti

Glancing through the list reveals very few similarities between the Afrikaans words and any others. Nor are there many resemblances between the Kanuri words and words of any other languages, with the exception of the word for ‘big’. But there are many similarities among the words in the other four languages. Particularly striking are similarities of the words for ‘tree’ and ‘three’ (the latter identical in the four languages, except that the word is free in Swahili, but bound in the other three languages). Less obvious, but nevertheless discernible (if you make some intermediate chains of linkages), are the similarities in the forms of the words for ‘sun’, ‘woman’ and ‘man’. The words for ‘dog’ and ‘big’ are each also very similar in three of the four languages. It thus seems reasonable to tentatively group these four languages together as members of a single family.

This type of evidence is less convincing than evidence obtained by application of the comparative method. Many linguists regard mass comparison as a useful heuristic tool in initial hypothesis generation, but insist that it should be followed by application of the comparative method. Nevertheless, a number of language families are supported by no more than this sort of evidence – and some by much less!
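The eyeballing exercise on Table 17.2 can be imitated crudely by machine. The sketch below is only an illustration, not a method described in the text: it strips tone marks and hyphens, measures how alike two word forms are using edit distance, and averages that score over the eleven glosses for each pair of languages. The function names and the choice of edit distance as the similarity measure are assumptions made for this sketch.

```python
import unicodedata
from itertools import combinations

LANGS = ["Afrikaans", "Bemba", "Kanuri", "Chichewa", "Shona", "Swahili"]
WORDS = {  # forms copied from Table 17.2, one list per gloss
    "woman": ["vrou", "úmwaanakashi", "kámú", "mkazi", "mukádzí", "mwanamke"],
    "man": ["man", "úmwaaúmé", "kwâ, kwângâ", "mwamuna", "murúmé", "mwanamme"],
    "sun": ["son", "ákasuba", "kə̀ngâl", "dzuwa", "zúvá", "jua"],
    "fish": ["vis", "ísabi", "búnyì", "nsomba", "hóvé", "samaki"],
    "dog": ["hond", "ímbwa", "kə̀ri", "galu", "imbwá", "mbwa"],
    "bird": ["voël", "icúúní", "ngúdò", "mbalame", "shiri", "ndege"],
    "three": ["drie", "-tatu", "yàskə̀", "-tatu", "-tatú", "tatu"],
    "water": ["water", "ámeenshí", "njî", "madzi", "mvúrá", "maji"],
    "big": ["groot", "-kulu", "kúrà", "-kulu", "-kúrú", "kubwa"],
    "good": ["goed", "-suma", "ngə̀là", "-bwino", "-naka", "nzuri"],
    "tree": ["boom", "úmutí", "kə̀ská", "mtengo", "mutí", "mti"],
}


def normalize(word: str) -> str:
    """Keep the first variant; drop tone marks, hyphens and spaces."""
    first = word.split(",")[0]
    decomposed = unicodedata.normalize("NFD", first)
    # Combining marks, hyphens and spaces are not alphabetic, so they drop out.
    return "".join(ch for ch in decomposed if ch.isalpha()).lower()


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def similarity(w1: str, w2: str) -> float:
    """1.0 for identical normalized forms, 0.0 for entirely different ones."""
    a, b = normalize(w1), normalize(w2)
    return 1 - edit_distance(a, b) / max(len(a), len(b))


def mean_similarity(lang1: str, lang2: str) -> float:
    i, j = LANGS.index(lang1), LANGS.index(lang2)
    return sum(similarity(row[i], row[j]) for row in WORDS.values()) / len(WORDS)


if __name__ == "__main__":
    for l1, l2 in combinations(LANGS, 2):
        print(f"{l1:10s} {l2:10s} {mean_similarity(l1, l2):.2f}")
```

Pairs drawn from Bemba, Chichewa, Shona and Swahili should score noticeably higher than pairs involving Afrikaans or Kanuri. String similarity of this kind shares the weaknesses of mass comparison discussed in the text: it knows nothing of sound correspondences, borrowing or chance resemblance.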
One serious problem with mass comparison is that putative cognate sets obtained by eyeballing wordlists for items with similar forms and meanings will include ‘false friends’, words that resemble one another in form and meaning but are not genuine cognates, and will exclude real cognates whose forms have diverged through phonological change. Thus, one would group together French feu ‘fire’ and German feuer ‘fire’ in applying mass comparison, although the French word comes from Latin focus ‘hearth’, while the German word derives from proto-Indo-European *pūr ‘fire’ which became *fūr-i in proto-Germanic. Feu ‘fire’ and feuer ‘fire’ are not cognates: French f derives from proto-Indo-European *bh but German f comes from proto-Indo-European *p by Grimm’s law (see §16.2). (One might hope that the weight of numbers will ultimately even things out, false friends adding where distant cognates subtract. But this is surely wishful thinking.)

Lexicostatistics

Lexicostatistics is a statistical method for distinguishing groups and subgroups in language families, i.e. sets of genetically related languages that are particularly closely related because they derive from proto-languages that are daughters or granddaughters of the proto-language of the family. Lexicostatistics is based on the idea that basic vocabulary is relatively resistant to change, and will be renewed rarely compared to non-basic vocabulary. If the rate of replacement of basic vocabulary is roughly constant regardless of the language, the proportion of shared basic words between a pair of related languages can give an indication of how long the languages have diverged from one another, provided borrowings are excluded.² From this, it is possible to determine groupings and subgroupings of the languages within the genetic set.

Application of this method depends on having first established the genetic relatedness of the languages, and on being able to distinguish reliably between borrowings and shared retentions from a proto-language. This means that the comparative method has already been employed to reconstruct the proto-language, its lexicon, and the historical sound changes giving rise to the modern forms.

Variants of the lexicostatistical method have been used in a number of regions in order to gain an initial picture of language relatedness, well before any application of the comparative method is feasible. One such variant was applied extensively in Australia in the 1960s, on the presumption – which still remains a hopeful (or, rather, in my opinion, hopeless!) guess – that the languages form a genetic unity. Pairwise counts were made of shared apparent cognates between languages, obvious borrowings being excluded. With no independent evidence of genetic relatedness, the known varieties were grouped into stocks, families, groups, subgroups, languages and dialects. Recent work by some Australianists has shown that application of this quick and dirty version of lexicostatistics sometimes gives a quite good picture of groups and subgroups, one that is in relative accord with the results obtained by the comparative method, when subsequently undertaken.
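The step from a shared proportion of basic vocabulary to a time of divergence can be sketched numerically. The text gives no formula; the sketch below assumes the relation commonly cited in the glottochronological literature, t = ln c / (2 ln r), where c is the proportion of shared basic words and r the proportion retained per millennium. The default retention rate of 0.86 and the function name are assumptions for illustration, and the constant-rate premise is itself contested.

```python
from math import log


def divergence_time(shared: float, retention: float = 0.86) -> float:
    """Estimated millennia since two related languages split, assuming both
    retain basic vocabulary at the same constant rate per millennium."""
    if not 0 < shared <= 1 or not 0 < retention < 1:
        raise ValueError("shared must be in (0, 1], retention in (0, 1)")
    return log(shared) / (2 * log(retention))


for shared in (0.9, 0.74, 0.5, 0.25):
    print(f"{shared:.0%} shared -> about {divergence_time(shared):.1f} millennia")
```

The factor of 2 reflects the assumption that both languages have been replacing vocabulary independently since the split. The lower the shared proportion, the earlier the estimated split, which is what allows pairs of languages to be ranked into groups and subgroups.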

17.3 Seven (putative) language families

The languages of the world can be divided into a number of families of related languages, possibly grouped into larger stocks, plus a residue of isolates, languages that appear not to be genetically related to any other known languages – in other words, languages that form one-member families of their own. The number of families, stocks and isolates is hotly disputed. The disagreements centre around differences of opinion as to what constitutes a family or stock, as well as the criteria and methods for reliably establishing them.


Linguists are sometimes divided into ‘lumpers’ and ‘splitters’ according to whether they lump languages together into large stocks, or divide them into numerous family groups. Merritt Ruhlen is an extreme lumper: his classification of the world’s languages (1987) identifies just 19 language families or stocks, and 5 isolates. More towards the splitting end is Ethnologue, which identifies some 153 top-level ‘families’, including one constructed language, 157 sign languages, 92 creoles, 16 pidgins, 25 mixed languages, 14 isolates and 48 unclassified languages. Aside from the fact that many of these groupings (e.g. sign languages, creoles, pidgins, mixed languages) are not genetic, in terms of what has actually been established by application of the comparative method, the Ethnologue system is wildly lumping! Some families – for instance, Austronesian and Indo-European – are well established, and few serious doubts exist as to their genetic unity (though some uncertainties may remain). Others are highly contentious. Both Ruhlen (1987) and Ethnologue identify an Australian family, although (as mentioned above) there is as yet no compelling evidence that the languages of the continent are all genetically related. At least as contentious is Joseph Greenberg’s (1987) putative Amerind stock of Native American languages. In the following subsections, we present an overview of six major families: Indo-European, Austronesian, Afroasiatic, Niger-Congo, Sino-Tibetan and Trans-New Guinea. Each of these families has over 300 languages, and together they account for almost two-thirds of the world’s languages, and over 80 per cent of the speakers. In the final subsection we discuss Greenberg’s Khoisan grouping, which is believed by experts to comprise at least three independent genetic lineages. The website for this chapter includes a brief survey of the world’s languages organized geographically.

Indo-European

The Indo-European languages have been recognized as forming a family since at least the late seventeenth century, when Andreas Jäger observed in 1686 that Persian and many of the languages of Europe are descendants of a single language. Since Jäger’s time, many more languages have been shown to belong to the family. Indeed, Indo-European languages are spoken throughout most of Europe, across Iran, through Central Asia, and into India. With the European colonial expansions of the fifteenth to nineteenth centuries, they spread into the Americas, Australia, New Zealand, Africa and Asia, in the process diversifying into numerous dialects. They have become major languages in many of the former colonies, and are spoken by over three billion (i.e. 3 × 10⁹) speakers.

The family consists of just over 400 languages (448 according to the 25th edition of Ethnologue), which can be grouped together into a number of subfamilies or branches, as shown in the family tree representation of Figure 17.1, which shows the major branches and some of the smaller ones. Map 17.1 shows the approximate locations of some of the main ones in pre-colonial times.

Figure 17.1 The Indo-European family tree

Map 17.1 Location of the main groups of the Indo-European family prior to the sixteenth century.

More historical-comparative work has been done on Indo-European than any other language family, and many lexemes have been reconstructed for proto-Indo-European, as well as some of its grammar. Proto-Indo-European was an inflecting language (like ancient Indo-European languages such as Sanskrit, Hittite and Ancient Greek), with a complex verbal system inflecting for person and number of the subject, tense, aspect and mood, as well as case-marking for nouns.

A range of proposals have been made for the origins and early spread of the Indo-European family. The Kurgan or steppe hypothesis is the currently favoured hypothesis. It identifies the speakers as members of a Kurgan culture, and places the homeland of Proto-Indo-European in the steppe region north of the Black Sea about 6,000 years ago. From there the language may have spread out with the domestication of the horse and the invention of the wheel (Anthony 2007), fragmenting into numerous mutually unintelligible languages as it spread to the east and west. Supporting this hypothesis is genetic evidence of a substantial population movement out of the steppe region at around the predicted time.

The major alternative scenario has it that Proto-Indo-European was spoken further to the south, in the region of present-day Turkey, some 6,000–8,000 years ago. The archaeologist Colin Renfrew proposed (1987, 1989) that the Indo-European languages spread with agriculture from a centre in Anatolia, beginning at this time. The current consensus is that this hypothesis is incompatible with the linguistic and genetic evidence.

Austronesian

Austronesian is the largest universally accepted language family in the world with over 1,200 languages, spoken by almost 400 million people from Madagascar in the west to Easter Island in the east, Taiwan in the north and New Zealand in the south, with the exception of Australia and much of the island of New Guinea. (The Niger-Congo family (see pp. 427–9) is the only larger family, but it is more contentious.) As is the case for Indo-European, a good deal of Proto-Austronesian has been reconstructed.

There are, however, differences of opinion concerning how the family is structured. One view is that it is divided into four groups, three of which – Atayalic, Tsouic and Paiwanic – are located exclusively on the island of Taiwan. Other proposals identify up to nine groups on Taiwan. Just one branch, Malayo-Polynesian, accounts for the bulk of the languages of the family, and includes all Austronesian languages spoken outside Taiwan. Malayo-Polynesian is subdivided into four groups: Central Malayo-Polynesian, South Halmahera-West New Guinea, Oceanic (eastern group) and Western Malayo-Polynesian.

Regardless of the actual structure of the family, it is clear that there is considerably greater diversity in the languages of Taiwan than in all of the rest of the languages. It is generally assumed that the region of greatest diversity is the most likely homeland, the region where the proto-language was spoken, since it is in the region where the languages have been spoken longest that they have had the most opportunity to diversify. Taiwan is thus the most likely homeland for Austronesian.

Evidence from archaeology is largely in agreement with linguistic evidence that Taiwan was the homeland of Austronesian, and that the languages began spreading from there some 5,500 or so years ago. The languages spread via migrations of people travelling over the sea, and taking farming with them. The island of New Guinea was reached about 2000 BCE, Polynesia around 1200 BCE, Hawaii and Easter Island around 500 CE, and New Zealand about 1250–1300 CE.

It has recently been proposed that the Austronesian languages are genetically related to the Sino-Tibetan languages (see p. 429), forming a large Sino-Tibetan–Austronesian family. Laurent Sagart (2005) makes a plausible – though not widely accepted – case for this macro-group, identifying some sixty cognates in basic vocabulary among Austronesian and Sino-Tibetan languages, as well as recurrent sound correspondences. He avers that there is archaeological evidence in agreement with his proposals, and that the initial spread of the proto-language for this family was from mainland China to Taiwan, accompanying a migration of agriculturalists driven by population expansion. The archaeologist Peter Bellwood (2005) is in basic agreement, though he places the original mainland China homeland in a different location.

Afroasiatic

Afroasiatic consists of some 381 languages (according to the 25th edition of Ethnologue) spoken in northern Africa and south-west Asia by almost 500 million people – see Map 17.2. It is regarded as the best established of the four families that African languages are often divided into, following Greenberg (1963); the other three families are Niger-Congo (on which see next subsection), the more contentious Nilo-Saharan and the highly dubious Khoisan (see final subsection). Nonetheless, reconstruction of Proto-Afroasiatic is largely absent.

Afroasiatic is generally divided into six groups: Berber (consisting of around 30 languages spoken in Morocco, Algeria, Tunisia, Mali, including Tamazight, Zenaga and Kabyle); Chadic (made up of nearly 200 languages spoken in Nigeria, Chad, Cameroon, including Hausa, Miya and Ngizim); Cushitic (with about 50 languages in Ethiopia, Eritrea, Somalia, Kenya and Tanzania, including Somali, Dahalo and Afar); Egyptian (one language, Coptic, which became extinct in the fourteenth century, though it is still used as a language of religion); Semitic (consisting of some 80 languages spoken in Ethiopia and the Middle East, including Arabic, Hebrew, Aramaic, Amharic and Tigré); and Omotic (with 30 or so languages spoken mainly in Ethiopia, including Dizi, Bench and Ganza).

Semitic is the only group spoken widely outside of Africa. It is also the best-studied group. A notable feature of Semitic languages is a root structure consisting of three consonants; grammatical information is expressed largely through intervening vowels and their modifications. For instance, the root form for ‘book’ in Arabic is k-t-b; thus kitab ‘book’, and kutub ‘books’.

There is no consensus on when or where Proto-Afroasiatic was spoken. Estimates of when the language was spoken vary widely from 12,000 to 18,000 years before the present, making it a significantly more ancient language than Proto-Indo-European.
It is generally agreed that the homeland of Proto-Afroasiatic was somewhere in north-east Africa, though where in this vast region is a matter of disagreement. An alternative view, not so widely accepted, is that the homeland was instead in the Levant (on the Eurasian continent), and that the languages spread with agriculture into Africa. This would be consistent with the later date for Proto-Afroasiatic. However, against this hypothesis is the lack of shared agricultural vocabulary across the family, and the fact that northeast Africa is the region of greatest diversity.
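The Semitic root-and-pattern structure mentioned in this subsection lends itself to a small sketch: a consonantal root is slotted into a vowel template. The template notation (C for a consonant slot) and the function name are assumptions introduced for illustration; only the root k-t-b and the forms kitab and kutub come from the text.

```python
def interdigitate(root: str, pattern: str) -> str:
    """Fill the C slots of a vowel pattern with the consonants of a root.
    'C' marks a consonant slot; every other character is copied as is."""
    consonants = iter(root.split("-"))
    out = []
    for ch in pattern:
        out.append(next(consonants) if ch == "C" else ch)
    return "".join(out)


print(interdigitate("k-t-b", "CiCaC"))  # kitab 'book'
print(interdigitate("k-t-b", "CuCuC"))  # kutub 'books'
```

Keeping the root and the vowel pattern as separate inputs mirrors the observation that the consonants carry the lexical meaning while the vowels carry grammatical information.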


Map 17.2 Approximate locations of the putative language families of Africa.

Niger-Congo

Consisting of something over 1,500 languages, the Niger-Congo family is the largest language family in Africa, indeed in the world. This must be tempered by the observation that it is hypothetical, and a number of linguists have expressed doubt concerning its status as a genetic unit. It is accepted as a genetic unit by Ethnologue, but not by Glottolog, which distinguishes nine separate family groups: Atlantic-Congo (the core of Niger-Congo), Mande, Dogon, Ijoid, Lafofa, Katla-Tima, Heiban, Narrow Talodi and Rashad. Disagreements arise partly because proto-Niger-Congo has not been reconstructed, and thus the genetic unity of the languages is not an established fact.


Niger-Congo languages are spoken over a vast area of the African continent, as shown in Map 17.2, and by at least 600 million speakers, possibly as many as 700 million. The composition of the putative family is controversial, and has been revised more than once. An idea of the structure of the family is shown in the tree of Figure 17.2. Note that some nodes on this tree represent individual languages (e.g. Pre/Bɛrɛ), some represent small groups of languages (like Dogon), while others represent enormous groups (e.g. Bantoid).

Figure 17.2 Some major groupings within the Niger-Congo family

The well-known Bantu languages are a subgroup of the Bantoid group (bottom right of Figure 17.2). They comprise between 400 and 700 languages (including Swahili, Xhosa, Fang, Setswana, Zulu, Southern Sotho, Luganda and Shona), with perhaps 350 million speakers. It is believed that Bantu is a relatively young group that began diverging when speakers spread out from Cameroon perhaps 4,000–5,000 years ago. (Some sources suggest a much later date, 2,500–3,000 years before present.) Bantu-speaking people migrated eastwards and southwards, taking West African yam agriculture with them. Today Bantu languages are spoken across a third of the African continent.

One well-known characteristic of Niger-Congo languages is their possession of an elaborate system of noun classes (see note 2, Chapter 7), distinguishing humans, animals, plants, masses and liquids, abstracts and so on. The classes are marked by affixes, usually prefixes, that occur sometimes on the noun, but usually on adjectives and verbs in agreement with the noun they apply to, as shown by the following example, where ki- and -ki are the class markers:

(17-1) ki-tu      hi-ki     ki-kubwa   ki-lianguka     Swahili
       ki-thing   this-ki   ki-large   ki-fell
       ‘This large thing fell.’
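The agreement pattern in (17-1) can be sketched as a function that attaches one class marker to every word in the clause. This is an illustrative simplification with hypothetical function and parameter names. The hyphens follow the segmentation used in the example, and the demonstrative takes the marker as a suffix, as in hi-ki.

```python
def agree(marker: str, noun: str, demonstrative: str,
          adjective: str, verb: str) -> str:
    """Attach the same noun-class marker to each word of the clause:
    prefixed to noun, adjective and verb, suffixed to the demonstrative."""
    return " ".join([
        f"{marker}-{noun}",
        f"{demonstrative}-{marker}",
        f"{marker}-{adjective}",
        f"{marker}-{verb}",
    ])


print(agree("ki", "tu", "hi", "kubwa", "lianguka"))
```

Calling the function with a different marker changes every word at once, which is the sense in which the adjective, demonstrative and verb agree with the noun.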

Sino-Tibetan

Comprising over 400 languages, Sino-Tibetan is the second-largest language family of the world in terms of numbers of speakers, with a bit under half the number of speakers of Indo-European. The Sino-Tibetan family includes Mandarin Chinese, the language with the largest number of native speakers.

Sino-Tibetan falls into two subgroups. One, Sinitic, consists of around 16 languages, including Mandarin Chinese, Cantonese (Yue), Hakka, Northern Min, Southern Min and Gan. The other group, Tibeto-Burman, has almost 450 languages, mainly spoken in China, Nepal and India. Groupings within Tibeto-Burman include, according to the traditional classification: Baric (e.g. Meithei in India), Bodic (e.g. Tibetan), Burmese-Lolo (e.g. Burmese), Karen (various Karen languages spoken in Myanmar and Thailand, the most widely spoken being S’gaw Karen), Nung (e.g. Norra, Nung) and Qiang (e.g. Northern and Southern Qiang, spoken in China). Map 17.3 shows the approximate location of the family and some of the groups.

With the exception of Baric languages, Sino-Tibetan languages are mainly tone languages. Tone cannot, however, be reconstructed for proto-Sino-Tibetan. Rather, certain syllabic endings of the proto-language gave rise to the tones of most modern languages.

Map 17.3 Location of the Sino-Tibetan family.

Trans-New Guinea

As already mentioned, the island of New Guinea is one of the most linguistically diverse regions in the world, populated by some 1,200 languages. These are usually divided into two groups, Austronesian and Papuan, where the term Papuan refers collectively to the non-Austronesian languages of the region. Papuan languages fall into thirty or more distinct genetic families and some two dozen isolates. Most of these families are quite small, with an average of twenty-five languages, each of which is spoken by fewer than 3,000 speakers on average.

The largest Papuan family (or stock), Trans-New Guinea, consists, according to the 25th edition of Ethnologue, of almost 500 languages belonging to dozens of groups and spoken mainly along the mountainous cordillera of New Guinea. Proto-Trans-New Guinea has not been reconstructed; indeed, there is considerable diversity of opinion as to the structure and composition of the putative family. The Ethnologue grouping is a ‘lumping’ one that basically follows Wurm (1975). More recently, Foley (2000) has suggested that it consists of around 300 languages, and Ross (2005) has proposed a version with 400 or so languages. Glottolog identifies Nuclear Trans-New Guinea, a family of 317 languages. Many Papuanists accept the core of the Trans-New Guinea grouping, albeit with a good deal of disagreement and uncertainty as to which languages and groups belong to it. Even in its most reduced form Trans-New Guinea fits into the category of large language family.

Khoisan

The term Khoisan (also spelt Khoesan) is a cover term for a group of languages of Africa that have clicks as part of their normal phoneme inventory, but are not Bantu (Niger-Congo) or Cushitic (Afroasiatic). Greenberg’s classification of African languages (1963) identified Khoisan as one of four genetic macro-units covering the continent. However, there is no evidence that the languages form a genetic unit, and specialists use the label Khoisan as a convenience label for a residue class of languages that don’t fit into the better-supported family units.

Those spoken in southern Africa are generally accepted by experts as comprising three distinct genetic lineages (e.g. Vossen 2013: 3): Khoe-Kwadi (about fourteen languages, including Haiǁom, Nama-Damara, Khwe, Shua, Ts’ixa and Kwadi),³ Kx’a (around half a dozen languages, including Juǀ’hoan, N!aqriaxe and ǂKx’auǁ’ein) and Tuu (about six to ten languages, including !Xóõ, Nǁng and ǁXegwi). Map 17.4 shows the locations of many southern African Khoisan languages. Whereas earlier versions of Ethnologue identified a Khoisan family, the latest edition recognizes the above three genetic lineages.

Map 17.4 Location of Khoisan languages of southern Africa.

Khoisan also includes two languages spoken far to the north, in Tanzania: Hadza and Sandawe. Current consensus among experts is that Hadza and Sandawe are language isolates (Ethnologue agrees); there is, however, some evidence that Sandawe may be very distantly related to Khoe-Kwadi.

Khoisan languages are famous for their possession of click consonants. Some languages – for instance, !Xóõ, the language with allegedly the largest known consonant inventory in the world, with well over 100 distinct consonant phonemes – distinguish five different click phonemes: bilabial (ʘ), dental (ǀ), (post)alveolar (ǃ), palatal (ǂ) and lateral (ǁ). Each of the clicks may be accompanied by some further modification by changes to the manner of the velar or uvular closure. (Check your understanding of click articulation, p. 40.) In !Xóõ each click admits up to sixteen accompaniments, including voicing, aspiration, nasalization and glottalization.

Vowel systems in Khoisan languages may distinguish as many as five phonemic vowel qualities. Almost all Khoisan languages show distinctive nasalization of vowels; in addition, glottalized, breathy and pharyngealized vowels are commonly phonemic.

17.4 Contact languages

According to the simple model of language diversification alluded to in §17.1, new languages emerge as the eventual result of geographical, temporal or social separation of speakers of a single original language. This model underlies the notion of genetic relatedness of languages, the notion that each language has a single parent language from which it ultimately separated.

The reality is more complex. Languages are not like biological species, which cannot normally interbreed; rather, languages do interact significantly with other languages spoken in the same environment. We have already mentioned one way this can happen, namely through borrowing of lexical items or grammar. Sometimes borrowing between languages is so extensive that it obscures the genetic picture. Indeed, it can render family tree diagrams inappropriate and misleading. For some language situations linguists have proposed that, instead of the tree model, a bush model is preferable, in which there are many complex interconnections between languages.

Another thing that can happen is that new languages come into being as the result of interaction between two or more languages – or, rather, between speakers of two or more languages. The new languages, called contact languages, thus can’t be traced back to a single parent language that either split or incorporated many characteristics of a neighbouring language, and neither the tree nor the bush model is appropriate. This section focuses on cases of this type, which, though in some sense exceptional, are not rare – there are hundreds of them.

Pidgins

Pidgins are rudimentary or simplified forms of speech that sometimes arise in contact situations, when speakers of mutually unintelligible languages come into contact with one another in a limited range of social interactions. These interactions might be for economic purposes such as trade or labour, including slavery, on plantations, on boats, or in mines. In keeping with their reduced range of circumstances of use, pidgins show structural reductions compared to ordinary human languages. They typically have smaller lexicons, which are restricted in terms of the semantic domains they cover, though they may be specialized in some domains. Their grammars show reduction, and their stylistic ranges are diminished.

Many pidgins arose in the wake of European colonialism, in the Pacific region, the Americas and Africa. The lexical items of these pidgins often derive from the language of the colonizers, but are usually pronounced according to the sound systems of the languages of the colonized, who represent the majority of speakers. Pidgins often show considerable variation across speakers in vocabulary and pronunciation, depending on the speaker’s mother tongue.

Fanagalo, a pidgin spoken in South Africa, arose in interactions between European settlers and Zulu people, and was later used in mines. Fanagalo is somewhat unusual for a pidgin in that most of its words (about 70 per cent) come from Zulu, rather than from the languages of the colonizers (24 per cent come from English and 6 per cent from Afrikaans). Words are phonologically simplified in Fanagalo, with Zulu clicks replaced by k, and English interdentals replaced by either an apical stop or the labiodental fricative (e.g. bath appears as baf). Examples (17-2) and (17-3) illustrate the syntax. Note the presence of a subject pronoun between the subject and the verb (a characteristic shared with many English-based pidgins in the Pacific), the SVO word order, and the placement of the question word in initial position (as in English and Afrikaans – in Zulu it appears in the usual position for an NP in that grammatical role).

(17-2) lo    foloman   yena   funa   lo    nyuzipepa   na    lo    ti      Fanagalo
       the   foreman   he     want   the   newspaper   and   the   tea
       ‘The foreman wants the newspaper and tea.’

(17-3) yinindaba   wena   hayikona   shefile   nambla      Fanagalo
       why         you    not        shave     today
       ‘Why haven’t you shaved today?’

Not all pidgins arose in contexts of European colonization. Pidgin Yimas, a pidginized form of Yimas (Papuan, New Guinea), for instance, arose in pre-contact times in the context of trade along the Arafundi River. And Hiri Motu, a pidginized variety of Motu, was once used on annual trading expeditions (called hiri) by Motu speakers into areas occupied by speakers of Papuan languages. The term Hiri Motu (or Police Motu) is now used for a different pidgin Motu that arose around 1900 in the predominantly Motu-speaking Port Moresby area, where speakers of many different languages were brought into contact primarily in the police force.

Creoles

Sometimes pidgins become useful in a wider range of interactive contexts, and may take on the role of auxiliary languages and perhaps even be given official status. Such pidgins gain extra words and grammar to cope with the additional uses they are put to; being more complex than pidgins they are called expanded pidgins. An example is the English-based pidgin Tok Pisin, now an official language in Papua New Guinea, and frequently used in parliament and in commerce. Tok Pisin is spoken by as many as 4 million people as a second (or later) language, and is often used between speakers of mutually unintelligible Papuan languages. However, it has also become the mother tongue of a small number of people (perhaps 50,000) living mainly in urban areas of Port Moresby. When this happens – when a pidgin language acquires mother tongue speakers – it is said to have been creolized, and to be a creole.4

Unlike pidgins, creoles are full languages, structurally and functionally comparable with ordinary human languages. The process of creolization is associated with increases in the range and depth of vocabulary and in the structural complexity of the former pidgin (e.g. by adding subordinate clause constructions, tenses and so on), as well as expansion in stylistic range. Some investigators hold that creoles share more with one another grammatically than they do with other natural languages, and that these similarities are indicative of general linguistic abilities shared by all people.


Also based on English is Torres Strait Creole or Broken, spoken by some 18,000 Torres Strait Islanders and Aboriginal people living on northern Cape York Peninsula, Australia. Broken creolized from Pacific Pidgin English around the turn of the twentieth century. The following are a few illustrative clauses. Notice that the ‘predicate marker’ in (17-4) derives from English he, and connects the subject to the following verb, like yena ‘he’ in (17-2) above.

(17-4)  ai  luk  wan  gel   i                 kam                              Broken
        I   see  a    girl  predicate:marker  come
        ‘I can see a girl approaching.’

(17-5)  em   pinis       skras-e            koknat   lo            madu        Broken
        she  completive  scrape-transitive  coconut  instrumental  scraper
        ‘She has already scraped out the coconut with the scraper.’

(17-6)  dem              piknini  go      luk    ama     blo         dempla    Broken
        plural:definite  child    future  visit  mother  possessive  they
        ‘The children will visit their mother.’

Almost 100 creoles are spoken by many millions of people in Africa, the Americas, Asia and the Pacific region. Many, like Tok Pisin, are based on European languages – that is, they derive from former pidgins that drew their lexicon mainly from a European language. A number are based on other languages. Cutchi-Swahili and Asian Swahili are spoken in Kenya and Tanzania, and are based on Swahili (Niger-Congo). Tetun Dili, also called Tetum Prasa, is spoken by around 50,000 people in East Timor, and based on Tetun (Austronesian). Africa is the home of three Arabic-based creoles: Babalia, with about 4,000 speakers in Chad; Sudanese Creole Arabic or Juba Arabic, with about five times as many speakers in Sudan; and Nubi, with about 26,000 speakers in Kenya and Uganda. Six Malay-based creoles are spoken in Indonesia and Malaysia.

Mixed languages

The third group of contact languages is mixed languages: hybrid languages, components of which come from different sources. Some aspects (e.g. the lexicon) may come from one language, while others (e.g. the grammar) come from another.5 Perhaps the paradigm mixed language is Michif, an endangered language spoken by a few hundred elderly people in Canada and northern USA. Michif nouns come mainly from French (about 90 per cent), and noun phrases follow the grammatical structure of French. But the verbs almost all come from Cree, an Algonquian language, and the complex verbal morphology of Cree is largely retained. Even more strikingly, two phonological systems coexist in Michif with little if any influence on one another. The Cree component retains Cree phonology, while the French component retains French phonology. Michif syntax is closer to Cree syntax than French syntax, except in the NP. (17-7), from the beginning of a short Michif text, is illustrative. It is given in the standard orthography for each language; French morphemes are bolded.

(17-7)  un  vieux  opahikê-t  ê-nôhcihcikê-t,  êkwa  un  matin          Michif
        an  old    trap-he    COMP-trap-he     and   a   morning
        ê-waniskâ-w      ahkosi-w,   but  kêyapit  ana
        COMP-wake:up-he  be:sick-he  but  still    this:one
        wî-nitawi-wâpaht-am   ses  pièges
        want-go-see:it-he:it  his  traps
        ‘There was an old trapper who was trapping. One morning he woke up sick, but still wanted to go and look at his traps.’ (Bakker 1994: 28–30)

It is not known how old Michif is, although it seems that it has been around since at least the early decades of the nineteenth century. Peter Bakker (1994: 23) argues that it could not have arisen as a contact pidgin between speakers of French and speakers of Cree. He suggests instead that the first speakers were fluent in both French, which they learnt from their fathers, and Cree, which they learnt from their mothers; he proposes further that Michif may have been invented by adolescents.

Another mixed language is Ma’a, spoken in Tanzania by Mbugu people. The Mbugu speak two languages, Ma’a and Mbugu, the latter a Bantu language. Ma’a has basically Bantu morphology and syntax, but possesses a considerable number of non-Bantu lexemes, the majority of which come from Southern Cushitic languages. There are also a handful of phonemic segments in Ma’a that do not occur in Mbugu. Maarten Mous (1994) regards Ma’a as a special register created by speakers of Mbugu in order to set themselves off as distinct from their Bantu neighbours. Other explanations have been proposed. For instance, Thomason and Kaufman (1988) suggest it is a Cushitic language that borrowed extensively from a Bantu language.

Some varieties of Romani (Indo-European, Europe to Near East) appear to be mixed languages, including in the Near East Qirishmal (Eastern Persia) and Armenian Romani, and in Europe, Basque Romani, Norwegian Romani and the now-extinct Dortika in Greece. These varieties preserve Romani lexicon, but employ the grammatical structures of the surrounding languages.

Summing up

It has been estimated that around 7,000 languages are spoken in the world today. This is not a precise figure, due in part to lack of accurate information especially on languages spoken in areas remote from European habitation and in part to the different criteria used in identifying languages. The distribution of languages across the globe is not uniform, and there is a concentration of languages in the band between the two Tropics. The Pacific region is the most linguistically diverse region of the world. Languages also differ strikingly in their numbers of speakers. A few languages are spoken by large numbers of speakers, while many languages have very small numbers of speakers.

The languages of the world can be grouped into families of genetically related languages, plus isolates. The languages of a family can be traced back to a single proto-language, which, over time and perhaps migrations of the speakers, differentiated into separate languages. Within a language family it is often possible to identify groups and subgroups of more closely related languages. The structure of a family can thus be represented by a tree model. Ideally the tree represents also the history of the separation of the languages. Hypothetical groups of families or stocks have been proposed by some linguists; all of these are quite tentative.

Sometimes new languages, contact languages, arise through contact between two or more languages, rather than divergence of a single language over time. In such circumstances the language cannot always be traced back to a single parent language, raising problems for the tree model. Pidgins, creoles and mixed languages are examples of this sort of ‘hybrid’ language.

The most reliable method of establishing genetic relations among languages is the comparative method, which reconstructs a proto-language and shows how the modern languages could have developed from it via a series of plausible changes. Reconstruction is done by setting up sets of cognates in the languages on the basis of which a proto-form is proposed, together with regular sound changes. Two other methods sometimes used to establish genetic relations, mass comparison and lexicostatistics, are less reliable.

The number of language families in the world is hotly contested. Some linguists have suggested as few as 19, others around 100. Even the latter figure involves many highly dubious groupings, like Khoisan, Australian, Papuan and Amerind. Among the best-supported families are Indo-European, Afroasiatic, Sino-Tibetan and Austronesian. In between lie a number of intermediate cases of likely families, such as Trans-New Guinea, the extent and composition of which remain uncertain.

Guide to further reading

Ethnologue: Languages of the World (Eberhard, Simons and Fennig 2022, https://www.ethnologue.com/) is intended to provide a comprehensive listing of the known living languages of the world. It provides basic information on the languages, including alternative names and spellings of the names, numbers and locations of speakers, main references and family membership. This information is of varying degrees of reliability, depending on the information available on the language, and what is conveyed to the editor by experts. Unfortunately, however, a subscription is required to access more than the most basic information. Glottolog (Hammarström, Forkel, Haspelmath and Bank 2022, http://glottolog.org) also aims to provide a comprehensive catalogue of the world’s language varieties and their classification. A good overview of the linguistic diversity of the world is Comrie (2017), which provides basic information on many language families. There are a number of more comprehensive book-length treatments of the world’s linguistic diversity. My recommendation is Pereltsvaig (2012), which provides much intriguing information on a range of languages and their speakers. Anderson (2012) is an accessible overview. Garry and Rubino (2001) contains basic information on 191 languages from all over the world, including the viability of the language, use in education, genetic classification and basic grammar. Each description is by a linguist with some knowledge of the language, and it is a very useful resource. An older, but still useful survey, is Lyovin (1997), which gives grammatical information on a selection of languages. Ruhlen (1991) is a highly speculative classification of the world’s languages, and few of the proposed groupings are accepted by experts.


Numerous textbooks discuss the comparative method. Among them Crowley and Bowern (2010) stands out for its lucid style and absence of Indo-European bias. An excellent article on the comparative method is Rankin (2003). Numerous attempts have been made in recent years to link archaeological, biological and linguistic reconstructions of the human past, including Renfrew (1987, 1994), Cavalli-Sforza (1991, 2001) and Anthony (2007). Bakker and Matras (2013) contains readable and up-to-date articles on contact languages; Velupillai (2015) is a good introduction to the field. Siegel (2008) discusses the emergence of pidgins and creoles; Mufwene (2016) suggests an alternative account according to which creoles do not always come from pidgins. Bhatt and Veenstra (2013) is a collection of articles exploring whether creoles are typologically distinct from other languages. Information on Fanagalo given in the text comes from Childs (2003: 207–10), which should be consulted for further details; the CDROM accompanying that book contains a short stretch of spoken Fanagalo. The best sources on mixed languages are Bakker and Mous (1994), and Matras and Bakker (2003), which cover mixed languages from all over the world.

Issues for further thought and exercises

1 We have said that many of the world’s languages are currently highly endangered; some hundreds are likely to go out of use in the next century. On the other hand, there are a few languages with extremely large numbers of speakers. Do you think that eventually the entire population of the world will speak a single language, perhaps Mandarin Chinese or English – if so, which do you consider more likely? To what extent do you think modern digital technology, including translation technologies such as Google Translate, and future elaborations (e.g. to the domain of simultaneous interpretation) might counteract this trend and contribute to strengthening of currently endangered languages? Discuss your reasons.

2 Below is some additional data (slightly simplified) on the four languages we discussed on pp. 418–20. (The same orthographical conventions are employed.) Reconstruct each word in the proto-language, and state the phonological rules required to give the forms in the modern languages. How would you account for the Warrwa words for ‘sun’ and ‘path, road’, and the Nyikina word for ‘emu’?

                Bardi             Nyulnyul   Nyikina       Warrwa
‘brother’       borla             babarl     babarla       babarla
‘club’          nola              nawul      nawoola       nawula
‘man’           amba              wamb       wamba         wamba
‘sun’           alka              walk       walka         kidiyi
‘wattle type’   irrola (‘spear’)  yirrakul   yirrakool     yirrakulu
‘path, road’    morr              makirr     makoorr       kaadi
‘emu’           inini             winin      karnanganyja  winini (‘emu chick’)


3 Below is a list of basic lexical items (written phonemically) in seven different languages of the Pacific region. Apply the method of mass comparison to this lexical data, and suggest how the languages might be related to one another: how many families does it appear they form, and which languages seem to be genetically related?

          Hawaiian   Shan                Tahitian   Māori      Pangasinan  Tay-Nung  Samoan
‘woman’   wahine     kón jíŋ, mɛ jíŋ     vahine     wahine     bií         nhình     fafine
‘man’     kaane      kón sáaj, phu sáaj  taane      kaane      toó         chài      taane
‘sun’     laa        kăaŋ wán            raa        raa        ágew        tha vàn   laa
‘fish’    iʔa        păa                 iʔa        ika        ikan        pja       iʔa
‘dog’     ʔiilo      mă                  ʔuri       kurii      asó         ma        maile
‘bird’    manu       nôk                 manu-rere  manu       manok       nô̩c       manu
‘three’   kolu       sạm                 toru       toru       talo        slam      tolu
‘water’   wai        nâm                 vai        wai        danum       nâ̩m       vai
‘big’     nui        jáɯ, lôŋ            nui        nui        báleg       cai       tele
‘good’    maikaʔi    lĭ                  maitaʔi    pai        maong       ʔdây      lelei
‘tree’    laaʔau     ton mâj             raʔau      raakau     kiew        ma̩y       laaʔau
‘long’    loa        jáaw                roa        roa        andukey     rì        loa
‘small’   liʔi, iki  ʔɔ̀n, nɔj, lêk       riʔi, iti  riki, iti  melág       eng       liliʔi, itiiti

4 Do a preliminary and rough ‘lexicostatistical’ investigation of the data in Question 3. To do this, you will need to count the number of apparent or likely cognates in each pair of languages, and fill the figures in on the table below. What do the figures suggest about the way the languages are related? In particular, are they in agreement with your proposals using mass comparison, and what (if anything) do they suggest about subgrouping in the families? (You could also do a similar investigation of the six African languages discussed on p. 421.) If you are knowledgeable in statistical methods, you could do a cluster analysis of the results to see how the languages might be grouped together hierarchically.

            Hawaiian  Shan  Tahitian  Māori  Pangasinan  Tay-Nung
Shan
Tahitian
Māori
Pangasinan
Tay-Nung
Samoan

5 An alternative to the tree model for representing historical relations among languages is the wave model. Find out about this model, and write a paragraph description mentioning who first proposed it, and its main characteristics including assumptions.


6 Critique the following excerpt from a speech to the Royal Society of St George by the conservative British politician Enoch Powell, as reported in The Independent on 23 August 1988. Identify the attitudes embodied in the quote, and discuss the notions expressed in relation to what you know about the history of English.

    Others may speak and read English – more or less – but it is our language not theirs. It was made in England by the English and it remains our distinctive property, however widely it is learnt or used.

7 Below is a short excerpt from a story in Kriol, an English-based creole spoken in northern Australia, given in the standard orthography, which is phonemic. Read it through first – without looking at the free English translation given below – and see how much you can understand. Now read the free translation. List all of the Kriol words; use the free translation to attempt to give each Kriol word an approximate gloss. It is obvious that most of the words come from English. Which words do not come from English? Try and explain the way in which each such word is formed. Describe as much of Kriol grammar as you can based on this excerpt; you should be able to say something about word order, tense marking and complex sentence constructions.

    Gardiya bin pikimap mipala en teik mipala langa mishin longtaim en deya wen mipala bin lil-il kid mipala yusdu tokin Walmajarri. Samtaim gardiya bin gib mipala haiding fo tokin Walmajarri. From deya mela bin lisining sampala kid bin tokin Kriol. Mela bin lisining en pikimap lilbit.

    ‘White people picked us up and took us to the mission a long time ago, and there, when we were little children we used to speak Walmajarri. Sometimes the white people would give us a hiding for speaking Walmajarri. Later, we heard some children speaking Kriol. We listened to it, and picked up a little of it.’
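For the count in Question 4, the bookkeeping can be handed to a short program once you have made your own cognate judgements. The following Python sketch is only an illustration under stated assumptions: the function name shared_cognate_counts, the layout of the judgements table, and the three invented languages X, Y and Z are all hypothetical, and the numeric labels simply stand for whatever cognate sets you decide on. The program does not itself decide which words are cognate.

```python
from itertools import combinations


def shared_cognate_counts(judgements):
    """Count, for every pair of languages, the meanings judged cognate.

    `judgements` maps each meaning to a dict {language: cognate-set label}.
    Two languages share a cognate for a meaning when their labels are equal.
    (This data layout is an assumption made for the sketch.)
    """
    languages = sorted({lang for row in judgements.values() for lang in row})
    pairs = list(combinations(languages, 2))
    counts = {pair: 0 for pair in pairs}
    for row in judgements.values():
        for a, b in pairs:
            # Skip a pair if either language lacks a word for this meaning.
            if a in row and b in row and row[a] == row[b]:
                counts[(a, b)] += 1
    return counts


# Hypothetical judgements for three invented languages (not real data).
TOY_JUDGEMENTS = {
    "sun":   {"X": 1, "Y": 1, "Z": 2},
    "fish":  {"X": 1, "Y": 1, "Z": 1},
    "water": {"X": 1, "Y": 2, "Z": 2},
}

if __name__ == "__main__":
    for (a, b), n in shared_cognate_counts(TOY_JUDGEMENTS).items():
        print(f"{a}-{b}: {n} shared cognates")
```

Replacing TOY_JUDGEMENTS with your own judgements for the seven languages would give the figures needed for the table in Question 4; the resulting counts could also serve as input to a cluster analysis.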

Research project

Find out about a language – select a language from the following list, and find out basic information on it: Acehnese, Adamorobe Sign Language, Basque, Blackfoot, British Sign Language, Cantonese, Chamorro, Georgian, Etruscan, Ewondo, Kalkatungu, Ket, Kwaza, Lahu, Lango, Lavukaleve, Lezgian, Martha’s Vineyard Sign Language, Mongsen Ao, Mundari, Navajo, Nivkh, Paamese, Slave, Squamish, Tauya, Ts’ixa, Warrwa, Yapese and Yuaalaraay. Write a short description of the language, giving information on the following: where it is spoken; approximate number of speakers and its status (healthy, endangered, dead etc.); its role in education; existence of an orthography and/or tradition of literacy; its genetic classification; basic facts about its grammar, including phonology, morphological and syntactic type. You can start by looking in Glottolog, which includes references to modern grammars (many of which are available on the internet) and other works on the languages.


Glossary

abjad An alphabet-like writing system in which, ideally, just the consonant phonemes are represented.

ablative case The case of a noun or pronoun that indicates the place or direction from which an object or motion event begins. See also case.

absolute universal A generalization that holds true for all languages (although in practice usually restricted to spoken languages) without exception – for example, all (spoken) languages have vowels. See also non-absolute universal.

absolutive case When in a language a noun or pronoun has the same case for subject of an intransitive clause and object of a transitive clause, but a different case for subject of a transitive clause, this is called absolutive case. For example, if the word for ‘window’ is in the same case in ‘The boy broke the window’ and ‘The window broke’ – but different in ‘the window stopped the stone’ – the noun ‘window’ is in absolutive case. See also case, ergative case, intransitive, transitive.

accent A variety of speech differing phonetically or phonologically from other varieties. See also dialect.

accommodation Adjustments speakers make in their speech to adapt it to features of their interlocutor’s speech.

accusative case The case of a noun or pronoun when it is object of a transitive clause, when this is different from its case as a (transitive or intransitive) subject. For example, in English the accusative case of the first person singular pronoun is me. See also case, intransitive, nominative case, transitive.

acoustic phonetics The study of the physical properties of speech sounds.

acronym A word formed from the initial letters of a sequence of words – for example, Qantas from Queensland and Northern Territory Aerial Services.

Actor The grammatical role of a noun phrase the referent of which performs the action or engages in the state designated by the clause. For example, the farmer in the farmer is kissing the duckling and the farmer is sitting in his favourite chair. See also Subject.

adjectival phrase (AdjP) A grammatical or syntactic unit made up of an adjective and possibly an accompanying modifier, that occurs within a clause or noun phrase indicating a quality of some object – for example, most difficult in the most difficult problem.

adjective A part-of-speech consisting of words that typically refer to qualities or properties of things, and occur as modifiers in noun phrases – for example, bright in the bright light.

AdjP adjectival phrase

adult language learning see second language learning

adverb A part-of-speech consisting of words that normally qualify a verb, indicating the manner in which an action was performed, e.g. quickly in she ran quickly; the frequency of the event, e.g. often in she runs often; or the time or location of an event, e.g. soon in she’ll come soon and here in she’ll come here.


adverbial phrase A grammatical or syntactic unit consisting of an adverb and a modifier, that specifies something about the manner, time, location, frequency of an event, as in very quickly.

affix A bound morpheme attached to a root or stem, modifying its meaning in some way, and forming a lexical or grammatical word with it – for example, dis- and -ed in displaced. See also prefix, suffix, infix, root, stem.

affricate A complex sound made up of a stop followed by slow release accompanied by friction noise – for example, the first segment of chap, written ch, IPA [ʧ].

Afroasiatic A family of languages spoken mainly in northern Africa and the Middle East, including Semitic (e.g. Arabic, Hebrew), Chadic, Cushitic and other groups.

agglutinating language A language like Turkish or Hungarian in which many words are morphologically complex and consist of a root plus one or more affixes the boundaries between which are quite clear-cut.

airstream mechanisms The means of producing a stream of air for the production of speech sounds – for example, the egressive pulmonic airstream, the stream of air produced by forcing air out of the lungs.

alienable possession A grammatical category indicating a type of possession in which the possessor and possession are not linked by intrinsic ties – for example, in my dog, her car, your bus, my street. See also inalienable possession.

allative case The case of a noun or pronoun that indicates the intended goal or direction towards which a motion event is oriented. See also case.

allograph One of the alternative shapes of a graphical form used in a writing system – for example, , , and are allographs.

allomorph One of the alternative phonemic forms of a morpheme – for example, the prefix in- in English has allomorphs /ɪn/, /ɪm/ and /ɪŋ/ depending on the first segment of the root to which it is attached, as in inexplicable, implausible and incredible, respectively.

allophone One of the alternative phonetic realizations of a phoneme – for example, [t] and [tʰ] are allophones of /t/ in English. See also phoneme.

alphabet, alphabetic writing A system of writing that uses a set of symbols each ideally representing a phonemic segment.

alternate sign language A sign language used by hearing people as an alternative to speech in certain circumstances. See also primary sign language.

alternation The correspondence between two or more allophones of a phoneme or allomorphs of a morpheme – for example, between [t] and [tʰ] as allophones of /t/ in English.

alveolar A speech sound produced by bringing the tip or blade of the tongue towards or against the alveolar ridge – for example, [t], [n].

alveolar ridge The ridge on the hard palate just behind the upper front teeth.

alveopalatal A sound produced with constriction in the region just behind the alveolar ridge – for example, the initial phone [ʃ] of she.

ambiguity, ambiguous The term used to describe the situation in which a word, phrase or larger unit has multiple meanings that are linguistically distinct from one another. Ambiguity is not the same thing as vagueness (see that entry).

amelioration The process by which a word comes to acquire more positive connotations – for example, fond in Modern English comes from the past participle of fonnen ‘to be silly, foolish’ in Middle English.

Amerind The most contentious of the three groupings (stocks) of the languages of the Americas proposed by Joseph Greenberg.

analogical change A process of change whereby an old form, usually irregular, is replaced by a new form constructed by extension of another pattern, usually the regular one. For example, the English plural cows was formed by analogical change, replacing the earlier plural kine.

anaphoric reference The term used to describe the situation in which a referential expression such as a pronoun refers to something already mentioned in a discourse. See also reference.

animacy hierarchy A scale of pronouns and nouns extending from the first and second person pronouns (considered as the most animate) to the least animate entities (trees, stones, clouds, etc.). This hierarchy is used in the formulation of certain generalizations about languages; for example, if a language has accusative case marking of inanimate nouns, it will have accusative case marking of animate nouns and pronouns. The animacy hierarchy is also called Silverstein’s hierarchy.

annotation In corpus linguistics an annotation refers to a piece of additional linguistic information added to a corpus to facilitate interpretation. For instance, some corpora are annotated for part-of-speech membership of the words.

anomic aphasia A type of aphasia in which the patient shows inability to produce words for things or events. See also aphasia.

anticipatory error A speech error in which the speaker anticipates a subsequent word, morpheme or sound, and puts it earlier in their utterance – for example, kindler and gentler for kinder and gentler. See also exchange error.

antonymy The relation of oppositeness in some component of the meaning of a pair of words – for example, hot and cold both concern temperature, but are at opposite ends of the scale.

aphasia A language loss or disorder following brain damage. This may be a disorder of either production or comprehension; problems resulting from paralysis to the vocal organs due to brain damage are excluded.

applied linguistics The branch of linguistics concerned with practical applications – for example, to second-language learning, language maintenance, translation, machine generation of speech and so on.

approximant A speech sound involving narrowing at some point in the vocal tract, but insufficient to produce fricative noise.

arbitrariness The property of linguistic signs whereby there is no intrinsic or necessary relation between the signifier (form) and signified (meaning).

arcuate fasciculus The bundle of neurons connecting Broca’s area with Wernicke’s area.

articulatory phonetics The study of how speech sounds are produced by the vocal apparatus.

aspirated A feature of a voiceless stop in which a puff of air follows its release, caused by a brief delay between the release of the stop and the beginning of voicing of a following vowel. See also unaspirated, voice onset time.

assimilation The modification of a sound that makes it more like a nearby sound – for example, when the vowel in pin is nasalized due to the following nasal consonant. Assimilation can be progressive (when the sound becomes more like a preceding one) or regressive (when it becomes more like a following one). See also dissimilation.

auditory phonetics The study of the perception of speech sounds.

auditory-vocal medium The primary medium or channel over which language is conveyed, i.e. speech, which involves both the vocal apparatus (production) and the hearing apparatus (perception). See also medium, visual-gestural medium, visual-inscribed medium.

Austronesian Name of a large family of related languages spoken mainly on islands in the Indian and Pacific Oceans from Madagascar to Easter Island.


auxiliary A verb that normally accompanies other verbs, and expresses purely grammatical information, like was in He was going.

babbling An early stage of language learning that infants go through from about four to six months of age. Babbling may involve a wide range of speech sounds, though it typically consists of simple syllables (e.g. ba, ma); over time, the range of sounds tends towards that of the language being learnt. Deaf children also babble with hand gestures.

backformation Process whereby a new word is created by removing what is mistakenly analysed as an affix from an existing word – for example, edit from editor.

back vowel A vowel produced by moving the body of the tongue towards the back of the mouth, so that its high point is towards the back of the mouth, as in [o] and [ʊ].

balanced corpus A corpus that contains roughly equal numbers of sample (pieces of) texts of approximately equal sizes, in each of the categories represented (e.g. genres, registers, mediums). See also representative corpus.

basic vocabulary The lexical items in a language expressing meanings of a basic type, that would be expected to be found in all languages, including lexemes for major parts of the body (e.g. ‘head’, ‘hand’), fundamental human and animal categories (e.g. ‘boy’, ‘girl’, ‘dog’), basic qualities (e.g. ‘big’, ‘little’), common states (e.g. ‘sit’, ‘stand’) and events (e.g. ‘hit’, ‘say’), etc.

beat A simple rhythmic movement that marks the boundary of a segment of discourse.

bee dance A set of bodily movements used by some species of honey bee to indicate the location of a nectar source.

bilabial A phone produced at the lips – for example, [m], [b].

bilingualism The ability of a person to speak two or more languages. A range of types of bilingualism are distinguished depending on the time of learning of the languages, the person’s competence in each, the contexts in which the languages are used, etc.

binomial A more or less fixed syntagm comprising a pair of words, often linked by a conjunction, as in e.g. ups and downs and ins and outs. See also cluster.

bird calls Brief vocalizations by birds conveying information about the immediate environment, including danger, feeding and flocking.

birdsong A complex pattern of vocalizations used by birds for attracting mates and marking territory.

blade of tongue The part of the tongue immediately behind the tip.

blend A new word created by putting together parts of two existing lexical items – for example, smog is a blend of smoke and fog.

body of tongue The main bulk of the tongue.

borrowing The incorporation of a word or other item from one language into another – for example, English borrowed the words government and science from French.

bottom-up processing The analysis of linguistic input beginning with the smallest units, the phones, and moving upwards step by step to larger and larger units such as words, phrases and clauses, until the complete utterance is interpreted. See top-down processing.

bound morpheme A morpheme that cannot occur as a separate word by itself, but must be attached to another item – for example, the English morphemes -ly and -ed. See also affix, free morpheme.

brain scanning Technologies used for studying the human brain in operation, including Electroencephalography (EEG), Magnetoencephalography (MEG), Functional Magnetic Resonance Imaging (fMRI) and Positron Emission Tomography (PET).

broadening A process of semantic change whereby the meaning of a word becomes wider – for example, bludger in Australian English used to mean ‘pimp, someone living off the earnings of a prostitute’, but now means ‘scrounger’ and ‘lazy person’. See also narrowing. broad transcription A transcription of a spoken utterance that indicates the major phonetic features, usually using a limited range of basic symbols. See also narrow transcription. Broca’s aphasia A language disorder often resulting from damage to Broca’s area, which is characterized by problems in speech production and the use of grammatical morphemes. Broca’s area An area of the frontal lobe of the left hemisphere of the brain that is believed to play a significant role in language production. It is named after Paul Broca, the nineteenth-century French scientist who first observed its role in language. calque Also called a loan translation, this is a type of borrowing in which the morphemes making up the word in the source language are translated one by one into the borrowing language – for example, English power politics from German Machtpolitik. caretaker speech A special form of speech used by adults (especially mothers) and older children when talking to infants, that is characterized by exaggerated articulation and intonation. Also referred to as baby talk, motherese and child directed speech. case A morphological category of nouns and/or pronouns that indicates the grammatical role of a noun phrase in a clause or another noun phrase. For example, us is the accusative form of the first person plural pronoun, used when it serves in the object role, as in they saw us. cataphoric reference The relation between a referential expression such as a pronoun and something introduced later rather than previously in a discourse. See also reference. categorical perception The perception of speech sounds in terms of phonemic categories.
central vowel A vowel produced with the high point of the tongue in the centre of the mouth on the front-back axis. cerebral cortex The thin layer of neurons forming a covering of the two hemispheres of the brain. chain shift A series of two or more linked sound changes by which one sound changes to another sound, which in turn changes to another, and so on. character A graphic symbol used in a logographic writing system such as Chinese. classifier handshape A term used in sign language linguistics for the handshape employed in certain verbal signs, that varies depending on the nature of the entity involved in the event. clause A syntactic unit that is like a minimal or reduced sentence, typically consisting of one main verb and accompanying noun phrases and other items – for example, the farmer kissed the duckling with puckered lips. click A speech sound produced by a velaric airstream mechanism. The back of the tongue makes a closure near the velum and a second contact is made further forward. This space is next enlarged, rarefying the enclosed air; the anterior closure is then released, and air flows inwards with a clicking noise. English tut! tut! is made up of clicks; clicks are part of the regular phonology of some African languages. See also velaric airstream mechanism. clipping The deletion of a part of a word resulting in a new and shorter word – for example, fax from facsimile. clitic A bound grammatical morpheme that behaves like an independent word, and is at best loosely related to the word it is attached to: it does not give rise to a new form of a lexical item or a new lexical item. cluster A sequence of words that is repeated a number of times in a corpus, such as salt and pepper. See also binomial.


coarticulation The simultaneous production of a speech sound at two places of articulation (e.g. the labio-velar /w/ of English) or with two manners of articulation (e.g. affricates). code-switching Switching from one language or dialect to another within a single speech interaction or even turn of speech. cognate Words in different languages that come from the same word in an ancestor language – for example, English man and Danish mand ‘man’. coherence The characteristic of a text that it represents a chunk of knowledge concerning a real or imaginary world that holds together in the view of members of a culture. cohesion A linguistic manifestation of coherence comprising the features of a text that concern links among different parts of clauses or larger units. cohesive device A linguistic means of establishing textual cohesion by constructing links among the sentences of a text. There are five main types of cohesive device: reference, conjunction, substitution, ellipsis and lexical cohesion. cohesive link/tie A link or tie between units in the linguistic realization of a text that is constructed by a cohesive device. coinage A lexical item that is a pure invention, and not created through use of any of the regular patterns of lexeme formation; possible examples are nerd and barf. collocate The relation between separate lexical items that are often found in proximity to one another – for example, blond collocates with hair and beer, coffee with black and white. collocation The frequent co-occurrence of two or more words within the environment of one another, typically within a window of four to five words on either side. For instance, there is a relation of collocation between the words blond and hair. comparative method The method of comparing languages to determine if they have developed from a common ancestor.
Lexical and grammatical items are compared in order to discover correspondences relating sounds in the languages; if these are sufficiently numerous and regular, the most reasonable hypothesis is that the languages come from a common ancestor. complementary distribution When two speech sounds do not share any environments of occurrence they are said to be in complementary distribution – for example, in English [p] and [pʰ] are in complementary distribution. complex sentence A sentence composed of more than one clause – for example, When danger threatens your children, call the police. componential analysis A semantic theory that analyses the semantics of lexemes into a small set of meaning components or ‘semantic features’ taking + and – values. For example, boy would have the features [+male] and [–adult], whereas girl would be [–male] and [–adult]. compounding A process of forming new lexical items by putting together a pair of words, as in washroom and handbook. concordance A listing of the instances of a particular word or phrase in a corpus together with some context, generally a specific number of preceding and following words. conditioning factor A circumstance that, when met, leads to the choice of one allophone or allomorph – for example, one conditioning factor for the unaspirated allophone [p] is that it follows a word initial [s], as in [spɪn]. conduction aphasia A type of aphasia that may result from damage to the arcuate fasciculus. Patients often experience difficulties in repeating words spoken to them, and in monitoring their own speech. conjunction A grammatical word whose primary function is to connect linguistic units – for example, and, but and or. Conjunction is also used in reference to the cohesive link established by a conjunction.
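The concordance entry above describes a listing of each instance of a word together with a fixed number of preceding and following words. A minimal Python sketch of such a listing is given here for illustration only. The function name concordance, the width parameter, the tokenizing pattern and the sample sentences are all assumptions made for this sketch; real concordancers handle tokenization, case and punctuation with far more care.

```python
import re


def concordance(text, target, width=3):
    """List each hit of `target` with up to `width` words of context per side.

    Illustrative only: tokens are lower-cased runs of letters/apostrophes.
    """
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    lines = []
    for i, token in enumerate(tokens):
        if token == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            # Mark the node word with square brackets between its contexts.
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


# Hypothetical two-sentence "corpus" built from the book's own example clause.
sample = ("The farmer kissed the duckling. "
          "The duckling waddled away, and the farmer slept.")

for line in concordance(sample, "duckling", width=2):
    print(line)
```

Widening the window to four or five words on either side gives the kind of span within which collocations are typically counted.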

connotation A term used in semantics for the associations, typically emotional, of a linguistic unit. consonant A speech sound produced with some obstruction, a narrowing or blockage, to the airstream at some point in the vocal tract. constituent analysis Hierarchical analysis of a syntactic construction into units. Two main types are Immediate Constituent Analysis and String Constituent Analysis. contact language A language such as a pidgin, creole or mixed language that comes into being as the result of frequent interactions among the speech communities of two or more different languages. contextual meaning Part of the meaning of an utterance that is not encoded by the signs making it up, but which is engendered by the context of the utterance. continuer A word, usually short, like mhm, yeah and right that is used by a speech interactant to signal that they are attending to what is being said. contour tone system A tone system in which the direction of tonal movement is significant; Mandarin Chinese has a contour tone system, with high level, high rising, low falling rising, and high falling. contralateral control This refers to the control of one side (left/right) of the body by the opposite hemisphere (right/left) of the brain. Many bodily sensations are also experienced contralaterally. conventionality The idea that the form and meaning of a sign are linked by convention rather than by necessity. Conversation Analysis (CA) The field of linguistics that studies the structure of conversations, e.g. the way turn-taking is organized. cooing A very early stage in language acquisition in which the infant produces cooing-sounds, typically made up of syllables consisting of velar consonants and back vowels. cooperative principle The principle formulated by H. P. Grice that speech interactants assume that each is behaving rationally and cooperatively; this underlies the way people understand the intended meaning of an utterance.
corpus (plural corpora or corpuses) A compilation of texts in a language, these days usually in electronic format, that has been prepared for the purposes of linguistic research. corpus callosum The major bundle of nerve fibres connecting the left and right hemispheres of the brain. corpus linguistics The field of linguistics that is concerned with the construction and analysis of corpora in a language or set of languages. creole A language that began as a pidgin but eventually became the first language of its speech community. In the process of creolizing, the earlier pidgin becomes more complex, and shares the major properties of other human languages. See also contact language, pidgin, mixed language. critical period hypothesis The idea that there is a biologically determined window of time, between infancy and puberty, within which full native-speaker competence in one’s first language is possible. Outside of this period it is believed that it is impossible to achieve native-like fluency in a language. cuneiform An early type of writing that was done with the wedge-shaped end of a reed stylus on the medium of clay tablets. dative case The case of a noun or pronoun when it is an indirect object or recipient as in she gave the book to me, if the first person pronoun were to occur in a distinct case form. Dative case usually covers a range of meanings similar to the prepositions to and for in English. See also case. deaf sign language see primary sign language deictic, deixis A means of establishing the reference of linguistic elements by situating them relative to speaker, hearer, and time and place of the speech interaction. Tense is deictic because it locates an event with respect to the time of speaking.


deletion see loss dental phone A consonant employing the teeth as the place of articulation. derivational morpheme A bound morpheme added to a root or stem to form a new stem – for example, the suffix -er in English. See also inflectional morpheme. descriptive linguistics The branch of linguistics that aims to describe the facts of a language as it is actually spoken, as distinct from how (some) speakers believe it ought to be spoken. dialect A variety of language characterized by a particular set of words, grammatical structures, and phonetic or phonological characteristics and that is associated with a geographical region, as in the New Zealand dialect of English. The term dialect is sometimes used for varieties associated with age, social class, gender, religion, etc.; thus we could talk of a middle-class dialect. See also accent. dichotic listening task An experimental method used in neurolinguistics in which subjects hear different sounds in the left and right ears. diglossia A situation in which two distinct varieties of a language that differ in terms of formality are used throughout a speech community; thus one, the high variety, is associated with formal situations, the other, the low variety, with informal situations. The term is also used for bilingual situations in which the languages differ in terms of formality. digraphia A situation where two different systems are used for writing the same language. diphthong A vowel sound involving significant movement of the tongue from one vowel position to another – for example, [aɪ], [æʊ]. direct speech act A speech act in which the grammatical form directly indicates the type of act – for example, in English a question would be expressed as a direct speech act by use of a grammatical form like Is she going? discourse The language component of a complete interactive event such as a job interview. Discourses are units primarily concerned with doing things with words, with language as a form of action.
discourse analysis The field of linguistics that studies the structures and regularities in discourse. One theory is called Discourse Analysis (see p. 200). displacement One of Charles Hockett’s design features of language, according to which language can be used in reference to things that are not present in the immediate speech situation. dissimilation The modification of a sound to make it less like a nearby sound – for example, the second rhotic of Latin arbor ‘tree’ was changed to a lateral in Spanish arbol. See also assimilation. ditransitive clause A clause that in its full form requires three noun phrases – for example, clauses of giving in English (the farmer gave the duckling some bread). duality or duality of patterning A design feature of language referring to the simultaneous organization of language on the level of form and the level of meaning. dyslexia A condition characterized by difficulty in reading and writing despite education and intelligence. dysphemism An expression employing direct or harsh terms, usually with offensive overtones – for example, shithouse is a dysphemistic expression for toilet in Australian English. See also euphemism. EEG electroencephalogram egressive airstream mechanism An airstream produced by forcing air out of the vocal tract. Most sounds of most languages are produced on the egressive pulmonic airstream. See also ingressive airstream mechanism. ejective A speech sound produced on an egressive glottalic airstream. The air in a closed cavity above the larynx is compressed by raising the glottis; this pent-up air is then released. See also implosive, glottalic airstream mechanism.

electroencephalogram (EEG) A record of the electrical activity in the brain resulting from the firing of neurons as detected by electrodes placed on the scalp. ellipsis A type of cohesive device or tie characterized by the omission of something that is required by the grammar. For example, in Some of his classmates got the answer right. Most _ didn’t _ phrases are omitted at the points in the second sentence marked by the underlines; these omissions establish links with the corresponding phrases in the first sentence. embedding Inclusion of a unit in another of the same type – for instance, of an NP in an NP. enclitic A type of clitic attached to the end of a word. See clitic. endophoric reference The term used to describe the situation in which a referential expression such as a pronoun refers to something mentioned somewhere in a discourse, either previously or subsequently. See also reference. epenthesis see insertion ergative case In some languages (e.g. Basque, West Greenlandic) nouns and/or pronouns are in one case when they are subject of a transitive clause but in a different case when they are subject of an intransitive clause or object of a transitive clause. In such systems the case of the subject of a transitive clause is called ergative. Ergative case is somewhat akin to the preposition by in the building was struck by lightning. See also absolutive case, accusative case, case, nominative case, intransitive, transitive. etymology The study of the origins and history of the form and meaning of words. etymological spelling The situation in which a word is spelled as in the source language or presumed source language. Many irregularities in English spellings are the result of etymological spellings. euphemism An expression used instead of one considered offensive – for example, pass away instead of die. See also dysphemism, taboo word.
Event Term for the grammatical role generally served by a verb phrase in a clause – for example, the role of kissed in the farmer kissed the duckling. evolutionary linguistics An area of linguistics concerned with the origins and development of human language. exchange A sequence of moves by different speakers that go together as complementary in speech act value, e.g. question and answer. See also move. exchange error A speech error in which two elements switch places in an utterance, as in slicely thinned for thinly sliced, where the two lexical items thin and slice have swapped places. See also metathesis. exophoric reference The situation in which a referential expression such as a pronoun refers directly to something in the ‘world’ constructed by a discourse. See also reference. experiential meaning The type of meaning relating to the construal and understanding of our world of experience; also called representational meaning. experiential role A grammatical role that encodes experiential or representational meaning. explicit performative A sentence that indicates its speech act value explicitly – for example, I resign. exposition A type or genre of text that explains or describes some aspect of the world. extension see meaning extension family A set of genetically related languages, comprising all languages that derive from a common ancestor. family tree A tree diagram representation of the relations between languages of a family. felicity condition A condition that an utterance must meet in order to be appropriate or successful as a speech act.


fingerspelling Systems for spelling words with manual signs of hand alphabets such as the one-handed system used in ASL or the two-handed system used in Auslan. flap A sound produced by a single rapid movement of one articulator against another. The most common flap is the apico-alveolar flap, [ɾ]. fMRI functional magnetic resonance imaging form The perceivable aspect of the linguistic sign – for example, the form of a lexical sign is its phonemic representation. See also meaning, sign. formal grammar, formal linguistics One of the two major divisions of linguistic theory, the formal approach puts focus on language as an algebraic system of symbols manipulated according to rules. Formal theories tend to see meaning as peripheral, and do not normally recognize the linguistic sign as a fundamental unit. See also functional grammar. free morpheme A morpheme that can occur alone, as a separate word. free variation Where one sound can replace another in a given environment without giving rise to a new word – for example, [p˺] can be replaced by [pʰ] at the end of the word stop. Allomorphs can also be in free variation. fricative A consonant produced with a narrow but incomplete obstruction in the vocal tract, resulting in a friction sound as the airstream passes through. functional grammar, functional linguistics One of the two major divisions of linguistic theory, the functional approach focusses on language as it is used. Meaning occupies a central place in functional linguistics; in extreme varieties, form is marginalized or even has no place. See also formal grammar. functional magnetic resonance imaging (fMRI) A brain-imaging technology in which brain activity is determined indirectly through changes in oxygen levels in the bloodstream, measured by different magnetic properties of oxygenated and deoxygenated blood. See also brain scanning.
fusional language A language in which words are typically morphologically complex, and it is difficult to determine where the boundaries between the morphemes lie. Latin and Sanskrit are examples of fusional languages. See also agglutinating language, isolating language, polysynthetic language. garden path sentence A sentence the beginning of which suggests a particular analysis, but by the end it is obvious this analysis cannot work. A well-known example is The horse raced past the barn fell. gender A grammatical category in which the nouns of a language are divided into groups according to the forms of syntactically related items such as verbs, demonstratives and adjectives. Standard Danish distinguishes two genders according to the form of the article (en or et) and agreement patterns of determiners and adjectives. general corpus A corpus that aims to provide a snapshot of a language or language variety as it is spoken or written at some point in time. A general corpus aims to give a good coverage of the range of usages of the targeted language in roughly the proportions in their uses. genetic relation The relation between languages that developed from a common ancestor – for example, there is a genetic relation between French and Spanish, both of which derive from Latin. genre A technical term for a type of text generally defined in accordance with structural characteristics. Examples of genres are narrative and exposition. gesture A visible movement of a body part that is used in discourse as either an utterance or as a component of an utterance. Gestures convey meanings, e.g. the manual gesture meaning OK, or shaking the head in denial. glide A vowel-like consonant sound produced with minimal obstruction to the passage of air at its place of articulation. Also called a semivowel.

global aphasia A type of aphasia involving disturbance to all language functions. This is usually associated with damage to large parts of the left frontal and temporal lobes. glottal A sound produced with constriction in the glottis, e.g. with complete closure a glottal stop [ʔ] results. glottalic airstream mechanism An airstream produced by forming a closed cavity above the larynx, which pocket of air is compressed or rarefied by raising or lowering the glottis; then the outermost obstruction is released. See also ejective, implosive. glottis The opening between the vocal folds. grammatical, grammaticality A sequence of words that satisfies the grammatical patterns of a language is grammatical. grammatical category A category distinguished in the grammar of a language; tense, gender, case and number are grammatical categories in many languages. grammatical morpheme A morpheme that provides information about the grammatical properties of a linguistic unit, and has little or no lexical meaning – for example, English the and a. grammatical relation Any syntactic function that a linguistic unit can serve – for example, a noun phrase can serve in grammatical relations such as Subject, Actor and Theme. grammaticalization The process by which grammatical morphemes in a language emerge over time, often from lexical items. The term is also used for the emergence and development of grammatical categories and other grammatical phenomena. Gricean maxims Four maxims – principles governing the inferences conversational partners draw – that were formulated by H. P. Grice and comprise the cooperative principle. Grimm’s law The description of a systematic set of sound changes in consonants in an ancestor of the Germanic languages that was formulated by Jacob Grimm. group A subset of the languages of a genetic family consisting of the languages that derive from a proto-language that was a daughter of the proto-language of the family.
handshape A component feature of the phonetic and phonological structure of manual signs in sign languages concerning the shape taken by the hands, e.g. whether it is in a fist shape or flat shape. hierarchical structuring The grouping and subgrouping of units making up a sentence – for example, the hierarchical structuring of the duckling waddled is [[[the] [[duck][ling]]] [[waddle][d]]], where square brackets enclose units. See also tree. hieroglyphic writing system A writing system that visually resembles the early system of writing employed in Egypt that was usually carved on stone and wood. Hieroglyphic systems typically combine logographic and other types of representation, and many of the glyphs depict entities or actions. high vowel A vowel with the high point of the tongue relatively high in the oral cavity, e.g. [i], [u]. historical corpus A corpus that presents snapshots of a language in use at different points in time. historical linguistics The branch of linguistics that studies how languages change over time. holophrastic stage A stage in language learning that is typically reached around 12 to 18 months in which the infant produces one-word utterances that convey complex messages similar to those conveyed by phrases or clauses in adult speech. home signs A system of conventionalized signs that often emerges in the communication between a deaf child and those they interact with, when the child is not exposed to a sign language. Home signs are usually idiosyncratic and restricted to single families; they show some, but not all, characteristics of human language.
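The bracket notation in the hierarchical structuring entry above can be read mechanically as a tree. The following Python sketch is illustrative only: the function name parse_brackets and the nested-list representation are assumptions introduced here, and the sketch assumes that every word or morpheme is enclosed in its own pair of square brackets, as in the entry's example.

```python
def parse_brackets(notation):
    """Turn a bracketed string into nested lists, with leaf units as strings.

    Assumes every leaf is wrapped in its own [ ]; whitespace is ignored.
    """
    stack = [[]]   # stack[-1] is the unit currently being built
    word = ""
    for ch in notation:
        if ch == "[":
            stack.append([])          # open a new unit
        elif ch == "]":
            if word:                  # close off a pending leaf
                stack[-1].append(word)
                word = ""
            unit = stack.pop()
            # A unit holding only a single leaf is stored as that leaf itself.
            if len(unit) == 1 and isinstance(unit[0], str):
                unit = unit[0]
            stack[-1].append(unit)
        elif not ch.isspace():
            word += ch
    return stack[0][0]


tree = parse_brackets("[[[the] [[duck][ling]]] [[waddle][d]]]")
print(tree)
```

In the resulting structure the first sub-list groups the with duck and ling, and the second groups waddle with d, mirroring the grouping and subgrouping that the brackets express.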


homophone, homophony, homonymy Different words that share the same phonological form – for example, threw and through. hyperbole The process by which a word loses a strong aspect of meaning through overuse, as happened to intensifying adverbs like terribly and awfully when modifying nouns or adjectives. hypernym, hyperonym see superordinate hyponym, hyponymy A word with a more specific meaning than another, which it is an instance of – for example, blue and green are hyponyms of colour. icon, iconic sign A sign in which the form bears some resemblance to the meaning – for example, the manual gesture for ‘two’, ✌. idiom An expression whose meaning is not predictable from the meaning of its component parts, e.g. kick the bucket for ‘die’. illocutionary force The speech act performed by a speaker in making an utterance – for example, promise, command, request, warning. imagistic gesture A gesture, typically iconic, that depicts a feature of an object or event in terms of shape, size, movement pattern, etc. implicational universal A generalization about human languages that is expressed in the form of an implication between two properties – for example, ‘if a language has voiceless nasals it also has voiced nasals’. See also non-implicational universal. implosive A speech sound produced on an ingressive glottalic airstream. The air in a closed cavity above the larynx is rarefied by lowering the larynx, and the closure in the oral cavity is then released, allowing air to be sucked in. See also ejective. inalienable possession A grammatical category indicating a type of possession in which the possessor and possession are linked by intrinsic ties – for example, the possession of parts of the body (my ear, her breasts), and kindred (your mother, my brother). See also alienable possession. indirect speech act An utterance the linguistic form of which does not reflect its communicative purpose – for example, I have no money used as a request.
Indo-European A well-established family of languages spoken throughout most of Europe, across Iran, and into Central Asia and India; in recent times Indo-European languages have expanded into the Americas, Australia, New Zealand and elsewhere. Branches of Indo-European include Germanic (German, Danish, Dutch, etc.), Romance (French, Spanish, Italian), Celtic (Breton, Welsh, Cornish), Balto-Slavic (Russian, Polish). infix An affix that is inserted within a root. inflectional morpheme A bound grammatical morpheme that gives rise to a form of a word expressing some grammatical category, such as past tense as in walk-ed, or plural number as in dog-s. See also derivational morpheme. ingressive airstream mechanism An airstream produced by drawing air into the oral or nasal cavity. Ingressive airstream may be used when speaking while taking a breath; it is also used with glottalic and velaric airstream in the production of implosives and clicks. See also egressive airstream mechanism. innateness The idea that children are biologically predisposed to learn language, that they are born with knowledge of an abstract universal grammar that underlies the grammars of all human languages. insertion The addition of one or more phones into a word, as when speakers of English add a schwa between the two final consonants of film. intension Defining properties of a lexical item, that must be met for it to be used appropriately. interdental A sound produced with the tip or blade of the tongue between the upper and lower teeth – for example, the initial segment of the.

interference see transfer interjection A word that expresses an emotional attitude (e.g. yuck!, erk!) or is used as a warning or call for attention (e.g. hey!). International Phonetic Alphabet (IPA) The alphabet of the International Phonetic Association, designed to represent all sounds of the world’s languages. interpersonal The type of meaning that concerns the establishment and maintenance of social relations; also used of a grammatical relation that encodes this type of meaning. interrogative A grammatical construction that directly expresses a question – for example, Are you going? intonation The pitch contour of a phrase or sentence. intransitive clause A clause with one obligatory noun phrase – for example, clauses of state or motion in English (the farmer slept, the duckling waddled). IPA International Phonetic Alphabet isogloss A line drawn on a map to show the boundary of an area in which a linguistic feature is found. isolating language A language in which words tend to comprise a single morpheme. Mandarin Chinese and various other languages of South-East Asia are isolating; English is also fairly isolating. See also agglutinating language, fusional language, polysynthetic language. keyness A statistical measure of the relative strength of a word as a keyword in a target corpus compared with a reference corpus: a larger keyness value indicates greater significance as a keyword. There are different statistical measures that give different keyness values. keywords Words in a particular text or corpus that are unusually frequent (positive keywords) or infrequent (negative keywords) in comparison with a reference corpus. Khoisan A genetically disparate residue class of African languages that have clicks but are neither Bantu nor Cushitic. At least three distinct genetic lineages make up the Khoisan languages of southern Africa; two other member languages are potential isolates spoken in east Africa. L1 A person’s first language or mother tongue.
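The keyness entry above notes that several statistical measures are in use. As one illustration, the Python sketch below computes the log-likelihood statistic, a measure commonly used for keyness in corpus linguistics; choosing it here is an assumption, not something the entry prescribes. The function name log_likelihood, its parameter names and all the counts are hypothetical.

```python
import math


def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Log-likelihood keyness of one word (illustrative sketch).

    freq_* are the word's counts; size_* are total token counts per corpus.
    """
    combined = freq_target + freq_ref
    total = size_target + size_ref
    # Expected counts if the word were equally common in both corpora.
    expected_target = size_target * combined / total
    expected_ref = size_ref * combined / total
    score = 0.0
    for observed, expected in ((freq_target, expected_target),
                               (freq_ref, expected_ref)):
        if observed > 0:   # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


# Hypothetical counts: a word occurring 50 times in a 1,000-token target
# corpus and 10 times in a 1,000-token reference corpus.
print(round(log_likelihood(50, 1000, 10, 1000), 2))
```

The statistic itself is unsigned. Whether a word counts as a positive or a negative keyword has to be read off by comparing its relative frequency in the target corpus with that in the reference corpus.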
L2 A language learnt by a person after their L1. L2 learning see second-language learning labial A sound articulated with the lips. labiodental A sound articulated with the bottom lip in contact with the upper teeth. LAD language acquisition device language acquisition device (LAD) The genetically encoded biological faculty enabling a person to learn and use a language. This is a highly controversial notion held by linguists who believe in the innateness of language. language isolate A language that forms a family on its own, having no known genetic relatives. language death The process whereby a language loses its entire community of speakers. language endangerment, language obsolescence The process by which the community of speakers of a language reduces significantly, and few children acquire it. language family see family language maintenance, language revival Strategies developed to maintain and support the use of an endangered or dying language. language shift The process in which habits of using a language in a bilingual community change over time increasingly favouring usage of one of the languages, disfavouring the other(s). Language shift can result in the endangerment and ultimately death of a language. language universal A property that holds for all or most human languages.


larynx The part of the throat or windpipe lying behind the Adam’s apple that holds the vocal folds. lateral A manner of articulation of a consonant whereby the air escapes via one or both sides of an obstruction in the oral cavity, as for [l]. lateralization The tendency for certain cognitive functions to be performed in one or the other hemisphere of the brain. learner corpus A corpus that provides samples of language use by those who are not yet entirely proficient users. lemma The citation form for a set of word forms representing a single lexeme. lexeme, lexical item A linguistic sign of any size (morpheme, word or larger expression) that expresses meaning that is not entirely predictable. See also idiom. lexical cohesion A cohesive device or tie formed between a lexical item and another that is related semantically to it. lexical morpheme A morpheme that expresses content rather than grammatical meaning. See also grammatical morpheme. lexicon A list of all lexical items of a language. The full lexicon of a language will contain not just morphemes and words but also idioms. linguistic determinism The notion that the structure of a language determines the way its speakers think about and perceive the world. See also linguistic relativity. linguistic relativity The idea that there is a correlation between the structure of the language you speak and the way you conceptualize the world. See also linguistic determinism. linguistic typology The branch of linguistics that is concerned with classifying linguistic phenomena. Linguistic typology is sometimes distinguished from language typology, which is concerned with classifying languages into structural types. loanword A word in one language that has its origins in another language – for example, kangaroo is a loanword in many languages, deriving ultimately from a word in the Australian language Guugu Yimithirr. See also borrowing.
location A phonetic or phonological feature of manual signs of sign languages concerning the position of the hand on or near the body or in sign space. localization The theory that different areas of the brain are responsible for different cognitive functions. locative case The case of a noun or pronoun that specifies it as a location for an object or event; the meaning is like that expressed by the prepositions at, on and in in English. See also case. logographic writing system A system of writing in which each symbol ideally represents a word or morpheme, e.g. the system of Chinese characters. loss A fairly common type of sound change in which a segment disappears – for example, the final stop of womb was lost at some point in the history of English. low vowel A vowel in the production of which the high point of the tongue is low in the mouth, and the body of the tongue is lowered from its neutral position. magnetoencephalogram (MEG) A record of brain activity by the measurement of magnetic fields. MEGs provide better spatial resolution than EEGs. manner of articulation The way the airstream is obstructed and modified as it passes through the constriction in the vocal tract in the production of a consonant. Manners of articulation include stop, nasal and fricative. See also place of articulation. manual sign A gestural sign made with just the hands. marked category A category that is less natural than another it is in opposition to. For example, plurals of nouns in English are marked with respect to singulars. Marked categories tend to be expressed

by larger forms, and tend to be less frequent than unmarked categories in language use and across languages. See also markedness, unmarked category. markedness The notion that some grammatical categories are less natural or commonplace (more marked) than others. See also marked category, unmarked category. markup Enriching information inserted into a corpus concerning the format and other relevant features of a text. mass comparison A quick way of gaining an idea of the relatedness of a number of languages by comparing basic vocabulary items (excluding onomatopoeic words) in the languages. Maxim of Manner The maxim or convention formulated by H. P. Grice that a speaker’s contribution to conversation should be orderly, and should avoid obscurity and ambiguity. Maxim of Quality The maxim or convention formulated by H. P. Grice that a speaker’s contribution to conversation should be truthful and not make unsupported claims. Maxim of Quantity The maxim or convention formulated by H. P. Grice that a speaker’s utterance should be no more nor less informative than required at that point in the conversation. Maxim of Relevance The maxim or convention formulated by H. P. Grice that a speaker’s utterance should be relevant to the topic being discussed at that point in the conversation. meaning The idea or concept that is conveyed by a linguistic form or utterance, its content. See also sense and reference. meaning extension The process by which the meaning of a word is extended or broadened to embrace new senses. meaning narrowing The process by which the meaning of a word is reduced so that it covers a smaller range of senses. medium The physical means or channel through which communication is effected. Three mediums of natural human language are auditory-vocal (speech), visual-gestural (sign languages) and visual-inscribed (writing). See also under these entries.
MEG magnetoencephalogram mental lexicon The internal lexicon that speakers of a language have in their minds. meronymy The part–whole relation – for example, hand and face are meronyms of clock. metaphor Non-literal meaning in which an expression that means one thing is extended on the basis of similarity – for example, grasp is used metaphorically in he grasped the idea. metathesis The exchange of position of phonological segments – for example, bird derives by metathesis from Old English brid. metonymy Broadening of meaning whereby the sense of an expression is extended to another concept it is habitually associated with – for example, crown for ‘king’. mid vowel A vowel in the production of which the high point of the tongue is in a relatively neutral position in the mouth, neither high nor low. minimal pair Two words that are identical except for a single phonetic characteristic or segment in some position; pin and bin are minimal pairs in English. mixed language A language in which particular parts of the lexicon and grammatical system come predominantly from different sources. An example is Michif, whose nouns and nominal grammar come from French, while its verbs and verbal grammar are from Cree. morph Any minimal meaningful form in a language, including morphemes and allomorphs. morpheme The smallest type of linguistic sign – for example, unlikely consists of three morphemes, un-, like and -ly. morphology The branch of linguistics that studies the structure of words.


morphophonemic form An abstract invariant form postulated for phonological allomorphs, that is operated on by morphophonemic rules to derive the phonological forms of the allomorphs. morphophonemic rule An explicit rule that accounts for the phonological allomorphs of a morpheme. motherese see caretaker speech move A single sentence, or a sequence of sentences cohering in speech act value, that represents a speaker’s contribution to the discourse at a given point. A move often corresponds to a conversational turn; however, a turn may be made up of more than one move. movement A feature of the phonetic and phonological structure of a manual sign of a sign language concerning how the hands move. Two main types are primary (in which one or both hands change their location) and secondary (in which the shape of the hands changes but the hands remain stationary). multi-channel sign A visual-gestural sign involving a combination of manual and non-manual components. multilingual corpus A corpus comprising texts from a number of different languages, ideally in equal amounts and in the same genres. mutual intelligibility The main criterion used by linguists for recognizing different forms of speech as dialects of a single language, this refers to the ability of speakers of one variety to understand speakers of the other without prior experience. narrative A genre of text that constructs a coherent sequence of events that tell a story. narrative structure The sequence of stages of various types (e.g. setting, complication, resolution) that narratives satisfy. narrow transcription A detailed phonetic transcription. See also broad transcription, transcription. nasal A sound produced by lowering the velum, permitting air to pass into the nasal cavity which acts as a resonating chamber. nasal cavity The chamber behind the nose through which air passes when the velum is lowered.
neurolinguistics The neuroscience of language; neurolinguistics is concerned with the brain functions underlying speech and the learning of language. neuron A nerve cell, the type of cell found in the brain and nervous system. Niger-Congo Perhaps the largest language family in the world with upwards of 1,500 languages spoken over much of sub-Saharan Africa. The status of the grouping as a family is widely accepted, though its membership and division into branches is contentious. nominal A term sometimes used instead of noun in languages that do not distinguish adjectives and nouns as distinct parts-of-speech. nominalization A noun that is derived from another kind of unit, such as a verb, by use of a derivational morpheme. For example, eater is a nominalization, being a noun that derives from the verb eat by means of the derivational suffix -er. The term is also used of the corresponding morphological processes. nominative case The case of a noun or pronoun when it is subject of a clause, provided that this is different from its case as an object. For example, in English the nominative case of the first person singular pronoun is I. See also accusative case, case, intransitive, transitive. non-absolute universal A generalization that holds good for many (though not all) languages, a tendency; e.g. most languages have nasal phonemes. See also absolute universal. non-imagistic gesture A non-depictive gesture such as a pointing gesture or beat. non-implicational universal A universal generalization that expresses a characteristic held by most or all languages; e.g., all (spoken) languages have vowels. See also implicational universal.

non-manual sign A sign of a sign language that does not involve use of the hands in its production, but instead uses the face, eyes, mouth, head or torso. An example is the BSL and Auslan sign NO which is expressed by a headshake. non-Pama-Nyungan A residual grouping of languages spoken in the Kimberley and Arnhem Land regions of the northern part of the Australian continent that do not belong to the Pama-Nyungan family. These languages fall into more than a score of genetic families that have not yet been shown to be related to one another. noun A part-of-speech comprising words that serve as the main lexical item in noun phrases, and in some languages show grammatical alternations for case, number and/or gender. Nouns typically denote concrete or abstract things – for example, dog, town, dishonesty. See also verb, part-of-speech. noun phrase (NP) A syntagmatic grouping of words that typically functions as a referential expression, and serves in a grammatical relation such as Subject, Object, Actor, Agent, etc. Noun phrases are generally made up of a noun or pronoun, optionally together with modifying words such as adjectives and determiners, as in the moon, the ugly duckling, my pet pig. NP noun phrase Object The grammatical relation traditionally associated with the Undergoer (or patient) of an action; the role of the duckling in the farmer kissed the duckling. onomatopoeia Where the phonetic form of a word is suggestive of the meaning – for example, meow, woof. oral cavity The mouth. orientation A phonetic and/or phonemic feature of manual signs of sign languages concerning the direction of the palm and fingers during the production of the sign, e.g. upwards, downwards, left, right, or towards or away from the signer’s body. overextension of meaning Where a child learning a language generalizes the meaning of a word beyond the sense it has in adult language – for example, using doggy to refer to all four-legged hairy animals.
overgeneralization of regular forms Where a child learning a language uses a regularly constructed form instead of the irregular form of the adult language – for example, feets or foots instead of feet as the plural of foot. palatal A consonant produced with constriction in the region of the palate. palate The hard part of the roof of the mouth behind the alveolar ridge, and in front of the velum (sometimes called the soft palate). Pama-Nyungan Term used for a family of languages spoken over most of the Australian continent (except for much of the Kimberley and Arnhem Land). Papuan A set of some 800 languages spoken on New Guinea and neighbouring islands that embraces all languages that are not Austronesian. paradigmatic relation A relation between a linguistic unit and other units that can occur in the same position in a construction – for example, /p/ and /b/ are in paradigmatic opposition in English. (/p/ and /æ/ are not since they can’t occur in the same position in a syllable.) See also syntagmatic relation. parsed corpus A corpus the sentences (and other linguistic units) of which have been analysed grammatically (i.e. parsed). A parsed corpus is usually understood as one in which tree structures (or the equivalent) have been provided for each sentence. See parsing. parsing The process of dividing a sentence or smaller linguistic unit into its component parts and assigning a structure to it.


part-of-speech A categorization of the morphemes of a language into types according to their grammatical behaviour. Parts-of-speech frequently identified include noun, verb, adjective, adverb and pronoun. pejoration Where a word takes on negative connotations – for example, abo in Australian English is a clipping of Aborigine, and has acquired negative connotations that the full form does not have. PET positron emission tomography pharyngeal A consonant sound with the pharynx as its place of articulation; the Danish rhotic is a pharyngeal approximant. pharynx The tubular cavity in the vocal tract located above the larynx and oriented roughly at right angles to the oral cavity. phonaesthesia The association between certain sounds and meanings in a language – for example, between initial sl in English and uncontrolled sliding movements. phone Smallest phonetic segment that can be isolated in a stream of speech – for example, [p], [æ]. phoneme A minimal unit in the phonology of a language that is capable of differentiating between words; a distinctive phone. See also allophone. phonetic realization The realization of a phoneme as an articulated sound. phonetics The scientific study of speech sounds. See acoustic phonetics, articulatory phonetics, auditory phonetics. phonological rule An explicit rule that accounts for the allophonic realization of a phoneme as a phonetic segment. phonology The sound system of a language, including the inventory of phonemes and their paradigmatic and syntagmatic patterning; also the study of the sound systems of languages. phrase A group of words smaller than a clause, such as an NP (e.g. the dog). pictogram A stylized picture-like representation of a thing, event or idea that is used to accompany writing or as a precursor of writing. Pictograms do not represent words of a language. An example of a pictogram is ř.
pidgin A contact language with simple grammar, a small lexicon and a small stylistic range that is no one’s mother tongue. See also contact language, creole, mixed language. pitch The frequency of vibration of the vocal folds. pitch accent system A system in which pitch differences are used to distinguish between accented and unaccented syllables; Japanese has a pitch accent system. place of articulation The location in the vocal tract of the constriction of airflow in the articulation of a consonant – for example, dental, palatal. See also manner of articulation. point(ing) A non-imagistic indexical gesture, often using the index finger, that draws attention to or locates an entity. polysemy The situation in which a single lexical or grammatical item has a range of different though related meanings; foot has polysemies including ‘a part of the body at the extremity of a limb used for locomotion’, ‘lower part (e.g. of a hill)’ and ‘part of an object that serves for support (e.g. of chair or building)’. polysynthetic language A language in which words are morphologically complex and typically made up of many morphemes, such that a single word may correspond to a full sentence in English. Many languages of North America are polysynthetic. See also agglutinating language, fusional language. positron emission tomography (PET) scanning A brain-scanning technology used to detect the location of brain activity in which a radioactive isotope is injected in the bloodstream. See also Functional Magnetic Resonance Imaging.

possession See alienable possession, inalienable possession. postposition A grammatical word or morpheme that follows a noun phrase and indicates its relation in a clause or another noun phrase. See also preposition. postpositional phrase A phrase consisting of a noun phrase followed by a postposition. poverty of the stimulus The idea that the child learning a language is exposed to unstructured and noisy data, full of hesitations, incomplete utterances, slips of the tongue, and insufficient evidence on which to build a mental grammar. PP prepositional phrase or postpositional phrase pragmatics The study of meaning that is inferred from what is said rather than encoded. prefix An affix attached to the beginning of a root or stem, e.g. un- of unlike. preposition A grammatical word or morpheme that precedes a noun phrase and specifies its relation in another noun phrase or clause, e.g. to, from, at. prepositional phrase A phrase consisting of a noun phrase preceded by a preposition, as in to the woodshed. presequence A sequence of preparatory exchanges managed by a speech interactant with a view to preparing the ground for the joint pursuance of a new discourse goal. presupposition Something that must be assumed true for a sentence to be appropriately uttered – for example, Have some more tea presupposes the addressee has already had some tea. primary sign language A language used primarily by deaf people in which the lexical and grammatical units are represented by gestures of the hands, lips, eyes, head, etc. American Sign Language (ASL), Auslan and British Sign Language (BSL) are primary sign languages. proclitic A clitic that is attached to the beginning of a word. See also clitic, enclitic. productivity A design feature of language referring to the ability of speakers to make new meanings by putting together linguistic elements in new ways to form novel expressions.
pronoun A grammatical morpheme that is used to index a referent within or external to the speech situation – for example, I, him. proposition That which is expressed by a clause and may be either true or false. prosody A phonetic quality that is spread over a phone or sequence of phones – for example, stress, intonation, tone, loudness. proto-language The hypothetical language from which all languages of a family ultimately derive. Proto-Indo-European is the hypothetical parent language of all Indo-European languages. psycholinguistics The branch of linguistics concerned with the mental and cognitive processes involved in production and comprehension of speech, and in the acquisition of language. pulmonic airstream mechanism The airstream produced from the lungs; this is the most common airstream used in human languages. range In corpus linguistics the range of a word or expression is a measure of its frequency in terms of the relative number of texts in which it occurs rather than its absolute number of occurrences in the corpus. rapid fade One of Charles Hockett’s design features of language, the characteristic of spoken and gestured signs that once produced they disappear rapidly. reanalysis A type of morphological change in which a word with a certain structure comes to be analysed differently. rebus The principle employed in writing whereby a symbol representing a certain word is used to represent the sound or approximate sound shape of that word rather than the word itself. For instance, 1 might be used in written English by the rebus principle to represent the homophonous word won.


reduplication The morphological process involving repetition of all or part of a morpheme to produce a new word, as in wishy-washy, teeny-weeny. reference The relation between a linguistic unit and something that it identifies – for example, between the sun and a certain celestial object. reflexivity A design feature of language referring to the property that language can be used to communicate about itself. register tone system A tone system in which the relative height of the tone is crucial, not its direction of movement; such systems are found in many African languages, e.g. Twi. register, registerial variation Speech varieties or variations in speech that are associated with different contexts of use, e.g. scientific English, legalese, bureaucratese. regular expression A sequence of standardized characters that specifies a search pattern in a text or corpus. Most corpus software permits searches by regular expressions. regularization Any process by which irregular or partially regular constructions or patterns in a language are replaced by more regular ones. For example, the plural of ox is in the process of regularization to oxes in some varieties of English, replacing the irregular oxen. See analogical change. representative corpus A corpus that covers an appropriate range of categories such as genres, discourse types, registers, modalities and so on, for a corpus of that particular type and purpose. See also balanced corpus. respect variety A speech register used to show respect to an interlocutor or someone being spoken about. rhotic An r-like speech sound. A consonant characterized by the fact that it is written with some variant of the Latin letter r. Rhotics form a disparate group of consonants that do not share significant articulatory features. root The base form of a lexical item that cannot be further analysed morphologically, e.g. happy in unhappily. rounded vowel A vowel accompanied by rounding of the lips, e.g. [u] and [y].
Sapir-Whorf hypothesis A hypothesis about the relation between language structure and thought made popular by the American linguists Edward Sapir and Benjamin Whorf. It is sometimes referred to as simply the Whorfian hypothesis since Whorf advocated the most extreme version of the hypothesis. See also linguistic relativity, linguistic determinism. second-language learning The learning of one or more languages after the first language has been fully or almost completely learnt. It is also called L2 learning. secret variety A speech register used by a subgroup of speakers of a language to exclude outsiders, and to underline the separate social identity of the members. semantic change A change in the semantics of a linguistic item over time. semantic compositionality The idea that the semantics of a sentence (or other complex grammatical unit) can be accounted for by putting together the semantics of the components of the complex unit. semantic bleaching The process by which the lexical meaning of an item is lost or attenuated as it becomes more grammatical. semantics The study of the linguistic meanings of morphemes, words, phrases, sentences and grammatical relations. Semantic meaning is the type of meaning that is encoded in a linguistic sign. semivowel see glide sense The inherent meaning of a linguistic sign. sentence The largest unit of syntax; larger phenomena are not structured grammatically (although they may be structured in other ways).
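To make the regular expression entry above concrete, here is a minimal sketch using Python's standard `re` module. The sample sentence and the pattern are invented for illustration, and corpus tools may use slightly different regular expression syntax.

```python
import re

# A tiny invented "corpus" for illustration.
text = "The farmer slept. The duckling waddled and the farmer walked."

# \b marks a word boundary; \w+ed matches a run of word characters
# ending in the letters "ed" (a crude search for regular past tenses).
pattern = re.compile(r"\b\w+ed\b")

matches = pattern.findall(text)
print(matches)
```

The pattern finds waddled and walked but misses the irregular slept, which shows why a purely orthographic search pattern is only a first approximation to a grammatical category.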

sentence comprehension The cognitive and brain processes involved in understanding sentences. sentence meaning The linguistic meaning of a sentence, the meaning that remains constant throughout all instantiations of the sentence. See also semantics, utterance meaning. sentence production The psychological and neurological processes involved in creation of sentences. sign A fundamental unit made up of two inherent components, a form (signifier) and a meaning (signified). sign language see alternate sign language, primary sign language, village sign language sign space The space within which most signs of a sign language are articulated. This space extends from just above the head to about the waist, and elbow to elbow when the arms are loosely bent. Sino-Tibetan A large genetic family of languages consisting of some 400 languages, that is second in size to the Indo-European family in terms of numbers of speakers. It is divided into two primary groups, Sinitic (14 languages, including Mandarin Chinese) and Tibeto-Burman (the remainder, including e.g. Tibetan). SLI Specific Language Impairment slip of the tongue An unintended divergence from the intended utterance, also called a tip of the slongue or speech error. sociolinguistics The field that studies language in its social context. social variety and variation Varieties of a language or variations in a language that are associated with different social groups, such as ages, geographical regions, social classes, religions. sound change Changes in the sounds and sound system of a language over time. sound correspondence A systematic correspondence between sounds in cognates in genetically related languages, e.g. the correspondence between /f/ and /p/ in English and French cognates. specialized corpus A corpus designed to represent a specific variety of a language (e.g. a dialect) or a specific genre of texts (e.g. narratives, advertisements) in the language.
Specific Language Impairment (SLI) A cognitive disorder believed by some to be specific to language, SLI is characterized by difficulties in articulation and grammatical impairments, among other things. speech act The act a speaker accomplishes by using an utterance in a particular context – for example, it’s cold could be used as a request to turn on the heater. speech community A group of people who share a language or language variety and the norms for its use in social contexts. split-brain patient Someone whose corpus callosum has been surgically severed to separate the two hemispheres of the brain. This medical procedure used to be used in the treatment of epilepsy, but is no longer undertaken. spoonerism A type of speech error involving the exchange (metathesis) of initial segments of lexical words in a sequence, as in our very queer dean when our very dear queen was meant. Spoonerisms are named after the nineteenth-century Oxford don Reverend William Spooner who is said to have regularly produced such errors. stage Functional elements in the structure of a text or discourse. For example, stages in a narrative include setting, complication and resolution. standard dialect A dialect of a language that is accepted by speakers as the most correct form, is promoted in schools, and is used in public writing and speech. stop A speech sound in which the airflow is completely blocked for a brief time at some point in the vocal tract, e.g. [ʔ], [g], [b]. stem A word form (a root, root plus derivational affixes, or compound) to which inflectional affixes are attached.


stress, stressed syllable A syllable perceived as prominent due to greater length, loudness and/or higher pitch than other syllables in a word. structuralism Any approach to linguistics that focuses on the interrelatedness of linguistic units, the ways they form structures and systems of oppositions. style (of speech) A variety or manner of speech associated with certain interpersonal contexts, and usually differing from other styles in degree of formality. See also register. subgroup A subset of the languages of a genetic family consisting of languages that derive from a proto-language that is a granddaughter of the proto-language of the family. A subgroup is a group within a group. See also group. Subject The grammatical relation traditionally associated with the doer or performer of an action, as in The farmer kissed the duckling. According to some linguists, subject is a meaningless category; others aver that it has a meaning relating to how the proposition is presented. substitution A cohesive device or tie formed when a general word is used as a type of counter, replacing words that have already been used in a text. In English one is used in this way. For example, in First you make one pile with the coloureds. Then you should make a new one with the whites, the second instance of one ties to pile by substitution. suffix An affix attached at the end of a root or stem – for example, -ed in finished. superordinate A general term that is an inclusive term in a relation of hyponymy. For example, colour is a superordinate for blue and green. suppletion, suppletive forms Allomorphs of a morpheme that are not phonologically related – for example, the irregular past tense went of the verb go involves root suppletion. suprasegmental see prosody suspicious pair A pair of phones that are sufficiently similar to be potentially allophones of a single phoneme, e.g. [p] and [b] (both are bilabial stops).
syllabary A type of writing system in which, ideally, the graphic symbols represent syllables of the spoken language. The writing system for Cherokee devised by Sequoyah is a syllabary. syllable A minimal unit of speech production, normally composed of a vowel or vowel-like consonant that is optionally preceded and/or followed by a consonant; [ba], [ab] and [a] are syllables, though [b] is not. symbol, symbolic sign A sign in which the association between the form and the meaning is not motivated. See icon. synecdoche A type of meaning extension where the sense is extended from a part to a whole meaning – for example, the extension of tit ‘nipple’ to mean ‘whole breast’. synonymy The relation of similarity of meaning – for example, seat and chair are synonyms. syntactic bootstrapping The use of syntactic knowledge by a language learner in order to determine the meaning of words; experiments have shown that knowing a word is a verb (from its syntactic context) informs the child that it denotes an event. syntactic change A change in the syntactic patterns of a language, such as from SVO to SOV word order. syntagmatic relation A relation between linguistic items or categories that are present in an utterance. syntax The study of the structure of sentences in a language. taboo word A word considered inappropriate in certain social contexts, e.g. fuck and cunt on formal occasions. telegraphic speech The stage in first-language learning following the two-word stage, in which utterances consist primarily of lexical items.

tense A grammatical category, usually marked in verbs, that indicates the relative time of occurrence of an event, e.g. past, present, future. text A unified stretch of language that is primarily oriented to structuring and conveying information about a real or imaginary world. textural That which provides texture to an utterance, linking the component parts together. Theme A textural relation, the Theme of a clause anchors its message down, serving as a hook on which the message can be hung. Themes usually indicate what clauses are about. tone The contrastive pitch or pitch contour on a syllable in a tone language. Minimal pairs may exist that differ only in tone; for example, in Cantonese the syllable [si] with high falling tone is the word for ‘poem’, but with mid level tone is the word for ‘to try’. tone language A language in which tone is phonemic; many languages of Africa, America and South-East Asia are tone languages. top-down processing Language processing that takes into account the larger linguistic and extralinguistic environments, that generate expectations about what will be said. See also bottom-up processing. transaction A stage in the structure of a discourse. Transactions are oriented to the achievement of some intermediate subgoal, such as enquiry about the capabilities of a computer in a sales interaction in an electronics shop. transcription The representation of a spoken or signed utterance in the written medium. See broad transcription, phonemic transcription, narrow transcription. transfer The carrying over of grammatical patterns from a person’s L1 to L2. The ‘foreign accent’ of most second-language learners results from transfer of the phonetic and phonemic systems of the first language. Transfer can also occur in the opposite direction. transition relevance place (TRP) A point where an utterance is potentially complete, such as the boundary of a grammatical or intonation unit. See also turn-taking.
transitive clause A clause which, in full form, has two obligatory noun phrases, e.g. a clause of caused movement (the sergeant marched the soldiers) or violence (the duckling bit the farmer). Trans-New Guinea A contentious genetic grouping of 300–500 Papuan languages mostly spoken on the mountainous spine of the island of Papua New Guinea. tree A diagrammatic representation of the hierarchical structure of a sentence. Trees can be labelled at nodes (indicating the type of unit) and branches (indicating the grammatical category). See also family tree. trill A speech sound involving the vibration of one articulator, often the tip of the tongue, against another, usually immobile. An example is the rhotic in the Spanish word perro ‘dog’. turn-taking The system regulating how conversation partners organize the exchange of speaker and hearer roles in a discourse. See also transition relevance place. two-word stage A stage in the child’s learning of their first language, usually beginning around 18 months, in which words are put together to form two-word utterances. unaspirated A voiceless stop which is not followed by a puff of air; if followed by a vowel, the vocal folds begin vibrating at the same time as the stop is released. See also aspirated. underextension Where the child assigns a narrower meaning to a word than it has in the adult language – for example, if doggy applies just to a pet dog. Undergoer The grammatical role of a noun phrase the referent of which suffers the action designated by a transitive clause – for example, the farmer in the duckling licked the farmer. See also Object. ungrammatical A syntactic form that does not conform to the grammatical patterns of a language.

unit Any stretch of language that behaves as a unified whole. Units range in complexity from the smallest indivisible units (e.g. morphemes, phonemes) to the largest (sentences).
universal see language universal
unmarked A category that is more natural than another category it contrasts with; the singular category for nouns in English is unmarked with respect to the plural. Unmarked categories tend to be expressed by shorter forms and to be more frequent than marked categories in usage and across languages. See also marked category, markedness.
unrounded vowel A vowel produced without rounding of the lips, e.g. [ɪ], [e].
utterance A stretch of speech corresponding approximately to the sentence in grammar.
utterance meaning The meaning of an utterance in its context of occurrence, which may be different from its meaning in different contexts; pragmatics studies utterance meaning.
uvula The small appendage hanging down at the back of the soft palate or velum.
uvular phone A speech sound made with the tongue making contact with, or approximating to, the uvula, e.g. the stop [q] and nasal [ɴ].
vagueness Lack of specificity in the meaning of a linguistic sign; for example, wrong is vague between the senses ‘immoral’, ‘inappropriate’, ‘incorrect’. See also ambiguity, polysemy, homophone, homophony.
velar A consonant produced with constriction in the region of the velum.
velaric airstream mechanism An airstream produced by placing the back of the tongue against the velum and making a second closure further forward in the oral cavity. The enclosed space is then enlarged, rarefying the air within; the anterior closure is then released, and air flows inwards. See click.
velum The soft part of the roof of the mouth behind the hard palate.
verb A part-of-speech found in most languages containing words that serve as the main lexical item in a verb phrase; in some languages verbs display grammatical categories like tense, aspect, mood. Verbs typically denote events, states, processes, happenings and so on, e.g., kiss, sit, break. See also noun, part-of-speech.
verb phrase (VP) A syntactic unit consisting of a verb together with syntagmatically related words (such as adverbials and auxiliary verbs) that typically serves in the grammatical relation Event. Examples are was eating, might have been watching closely. In formal grammar verb phrases are larger units also containing the object and other noun phrases and prepositional phrases – basically everything except the subject.
village sign language A sign language that arises in an isolated community with a high proportion of hereditarily deaf people; village sign languages are normally used by both deaf and hearing community members. An example is the Kata Kolok Sign Language of Bali.
visual-gestural medium The medium or channel over which sign language is conveyed, which involves gestures made by the hands, face, eyes, head and torso (production) and the eyes (perception). See also auditory-vocal medium, medium, visual-inscribed medium.
visual-inscribed medium The medium or channel deployed in writing, which involves technology such as pen and paper (production) and the eyes (perception). See also auditory-vocal medium, medium, visual-gestural medium.
vocal folds A set of muscles in the larynx resembling a pair of flaps that can be brought together more or less tightly to modify the stream of air passing through. Also called vocal cords.
vocal tract The body organs that are involved in the production of speech sounds, including the lungs, glottis, pharynx, and oral and nasal cavities.

vocalization Any sound produced by the vocal apparatus of an animal, e.g. barking and birdsong.
voice onset time (VOT) The time between the release of a stop and the onset of voicing in a following vowel. Voice onset time can be negative, zero or positive.
voiced phone A speech sound produced with regular vibration of the vocal folds.
voiceless phone A speech sound produced with the glottis open, without vibration of the vocal folds.
VOT voice onset time
vowel A resonant speech sound that is produced without significant constriction in the oral cavity.
vowel height The relative height of the highest point of the tongue in the mouth in the production of the vowel; [i] is a high vowel because the highest point of the tongue is high in the mouth (cf. [æ] where the high point is lower).
VP verb phrase
Wada test A test for determining which hemisphere is dominant in language processing by injecting sodium amytal into the carotid arteries. The ipsilateral hemisphere is deactivated, and if this is the language dominant one, speech is affected.
Wernicke’s aphasia The type of aphasia normally resulting from damage to Wernicke’s area and characterized by difficulties in comprehension of speech.
Wernicke’s area A classic language area located in the posterior (back) portion of the left hemisphere of the brain. It is named after Carl Wernicke, the German neurologist who observed that damage to this area correlates with certain language disorders.
word A fundamental unit of grammar intuitively recognized by speakers of a language. The term is difficult to define, and is used in a variety of different ways in linguistics. According to a famous definition by the American linguist Leonard Bloomfield, a word is a minimal free form.
zero morph, zero morpheme A morpheme or allomorph that has no phonetic form. For example, in many languages the third person singular form of a bound pronominal is a zero.

Notes

Chapter 1 1. A perhaps apocryphal story has it that Winston Churchill, the British prime minister during the Second World War, once received a minute from a civil servant who objected to his ending a sentence with a preposition in an official document. Churchill pencilled in the margin This is the sort of pedantry up with which I will not put. 2. We adopt the convention of indicating in brackets following the language name, on its first mention in the text, the name of the family to which it belongs (see Chapter 17) and its main country of origin. See Map 1 (pp. xxiv–xxv) for the approximate locations of the main languages mentioned in the book. 3. The concepts of form and meaning are both difficult to define. Form does not refer to the actual physical features of a particular instance of a sign. Any instance of a sign will always be slightly different from other instances, if you examine it closely enough; for example, minor variations in the paper will mean that the examples of the sign ∞ in each copy of this book are slightly different. (The actual physical instance is sometimes referred to as the token, sometimes the substance.) The notion of form is an abstraction from the instances, ignoring variations that make no difference – which are perceived or regarded as the same by users, who can’t tell them apart. In a way, the form is the sign-user’s concept of its shape. Meaning is also an abstraction from the specific meaning of an instance of use of a sign: for instance, in Saussure’s example, the concept ‘tree’ ignores the variation among different trees and types of tree. 4. Sometimes the paradigmatic dimension is described as vertical, in contrast with the syntagmatic dimension which is horizontal. Thus we could represent some of the syntagmatic and paradigmatic relations in our example sentence as follows (where the braces enclose items in paradigmatic relations):

⎧ I          ⎫ ⎧ will     ⎫ ⎧       ⎫ ⎧ forget   ⎫ ⎧ that ⎫ ⎧ amazing    ⎫ ⎧ night     ⎫
⎪ He         ⎪ ⎪ won’t    ⎪ ⎪ never ⎪ ⎪ remember ⎪ ⎪ this ⎪ ⎪ splendid   ⎪ ⎪ afternoon ⎪
⎨ She        ⎬ ⎨ might    ⎬ ⎨       ⎬ ⎨ recall   ⎬ ⎨ a    ⎬ ⎨ nasty      ⎬ ⎨ year      ⎬
⎪ John       ⎪ ⎪ mightn’t ⎪ ⎪ ever  ⎪ ⎪ describe ⎪ ⎪ the  ⎪ ⎪ remarkable ⎪ ⎪ winter    ⎪
⎩ My brother ⎭ ⎩ shall    ⎭ ⎩       ⎭ ⎩          ⎭ ⎩ one  ⎭ ⎩ terrible   ⎭ ⎩ day       ⎭

Notice that not all of the sentences predicted from this display are acceptable: choice of one word can have implications for the selection of another. For instance, if you choose will or might, the following choice needs to be never, but if you choose won’t or mightn’t, it will be ever. 5. Hockett (1960) lists thirteen design features, the first of which is use of the vocal-auditory channel (or medium, in the terminology of the previous section). Sign languages, of course, do not use this medium. Accordingly, in this book we exclude this feature as a design feature of human language. 6. The reality is somewhat more nuanced than this. Those who are interested in the variation of their moggie’s meow are referred to Chapter 2 of Losos (2023).
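The restriction mentioned in note 4 can be made concrete with a short sketch. The Python below is only an illustration, not part of the analysis in the text: the slot lists are copied from the display, whereas the names SLOTS, REQUIRED_ADVERB, is_acceptable and sentences are hypothetical. The check encodes just the one restriction stated in the note; leaving shall unconstrained is an assumption, since the note does not mention it.

```python
from itertools import product

# Slots in syntagmatic (left-to-right) order; each list holds the
# paradigmatic alternatives shown in the display.
SLOTS = [
    ["I", "He", "She", "John", "My brother"],
    ["will", "won't", "might", "mightn't", "shall"],
    ["never", "ever"],
    ["forget", "remember", "recall", "describe"],
    ["that", "this", "a", "the", "one"],
    ["amazing", "splendid", "nasty", "remarkable", "terrible"],
    ["night", "afternoon", "year", "winter", "day"],
]

# The only restriction stated in the note. "shall" is not mentioned there,
# so treating it as compatible with both adverbs is an assumption.
REQUIRED_ADVERB = {
    "will": "never",
    "might": "never",
    "won't": "ever",
    "mightn't": "ever",
}


def is_acceptable(words):
    """Return True if the modal (slot 2) and the adverb (slot 3) are compatible."""
    modal, adverb = words[1], words[2]
    # Modals without an entry accept whichever adverb was chosen.
    return REQUIRED_ADVERB.get(modal, adverb) == adverb


def sentences():
    """Yield every word string the display predicts that passes the check."""
    for words in product(*SLOTS):
        if is_acceptable(words):
            yield " ".join(words)


# Number of strings predicted by free choice in every slot.
total = 1
for slot in SLOTS:
    total *= len(slot)

# Number of strings that survive the modal/adverb restriction.
accepted = sum(1 for _ in sentences())
print(total, accepted)
```

Under these assumptions free choice in every slot predicts 25,000 strings, of which 15,000 pass the check. Other restrictions, such as the form of the article before a vowel-initial adjective, are deliberately ignored in this sketch.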

Chapter 2 1. Strictly speaking there is a difference between post-alveolar and retroflex sounds, though the terms are often used interchangeably. Both are made with the tongue behind the alveolar ridge, but with retroflex sounds the tip is turned back. Many languages of the Indian subcontinent (e.g. Hindi-Urdu (Indo-European), Tamil (Dravidian) and Telugu (Dravidian)) have retroflex consonants. 2. This is the realization in Cultivated New Zealand English; in Broad New Zealand English a further change has occurred, and the vowel has lowered to the mid central [ə]. The other front vowels have correspondingly shifted upwards from their position in other dialects. For instance, bed is pronounced [be̞d] or [be̝d] (the diacritics indicate lowering and raising respectively) and bad as [bɛd]. 3. In the phonetic alphabet used specifically for Danish (e.g. in dictionaries), the last of these vowels is represented by the symbol [æ], which in the IPA represents a rather lower front vowel. 4. Speakers of languages which, like English, have orthographies that represent sounds poorly are apt to identify syllables from the written forms of words rather than from their spoken form. You should be careful not to do this, and always work from the pronunciation. 5. Did you correctly guess gi.ri.li [gɪ.ɻɪ.li]? 6. It is important to understand that it is not that these six different tones give different meanings of the word si in Cantonese, as you will sometimes read. There is no word si in the language, but rather six different words, each with a different form and meaning. 7. Absolutely free variation is rare. Usually there are reasons for the variation, perhaps the speaker’s dialect, perhaps the emphasis the speaker wants to put on the word. Free variation is ‘free’ in that it never results in a different word. 8. Sometimes a pair of phones are in partial complementary distribution in a language – there are some circumstances in which one of the phones occurs but the other does not, although this is not across the board. This can be illustrated by the unreleased [p˺] and the released and aspirated [pʰ] in English. They are not in complementary distribution because both can occur at the end of a word. But within words, in non-final position, they share no environment of occurrence: [p˺] occurs only before another stop consonant or a nasal (as in apt and one-upmanship), while [pʰ] cannot occur in this environment. 9. This is true for most words of English. In fact, however, there are a small number of exceptions such as interjections like huh. One not infrequently encounters phonetic and phonological irregularities in these small though not unimportant words. Can you think of any other phonetic or phonological irregularities of such words?

Chapter 3 1. Note that in this conceptualization the word is a component of a sentence, and division into words means that each of the components identified must satisfy the properties of being minimal and free. 2. However, there is a word ducked (as in He ducked to avoid the javelin) that is phonologically identical with duct, and which can be divided into two morphemes, /dʌk/ ‘move rapidly downwards to evade something’ and /t/ ‘past time’. This example also illustrates that two distinct morphemes duck (the bird) and duck (the movement) can share the same phonological form. Such words are called homophones (see §6.2). 3. The three component corpora were: Australian Corpus of English (written), The Lancaster-Oslo/Bergen Corpus of British English (written) and The Machine-Readable Corpus of Spoken English (spoken).

Chapter 4 1. It must be emphasized that the semantic characteristics mentioned here are not decisive in the part-of-speech classification of a word. Rather, it is the grammatical characteristics – which will be different in different languages – that are criterial. The semantic characteristics indicate tendencies among words of the particular part-of-speech, which motivate the label. 2. The meanings of the glosses for the grammatical morphemes may be disregarded here as they do not affect the point being made. 3. Likewise speakers of English do not always recognize borrowings from their own language. I know of at least one anthropologist who thought the words widjilbrig and widjilkok ‘subincision, subincised penis or person’ were words of an Indigenous Australian language. He failed to recognize in them the English words whistle, prick and cock, the last two being everyday terms for ‘penis’. 4. The reader is reminded that in talking of associations such as these, which have some iconic basis, there is no claim of necessity of connection. It is easy to find words in English which do not show the associations. The point is merely that some association exists as something that speakers of a language experience. If given the choice between feeny and mung as words to describe two objects, one a small pointed star-like object, the other a larger roundish glob-like object, most speakers would feel that feeny is more appropriate to the former object. There is experimental evidence supporting this claim. 5. The m is a prefix indicating the noun is human; the prefix is irrelevant here.

Chapter 5 1. If you doubt this, consider the interpretation of her if (5-3) were followed by She was sitting there beside the fence. Unusual, perhaps, but certainly not impossible. This is a device not infrequently employed in literature.

2. At least this is the standard claim. It has been disputed: there is at least one poem that uses this sentence as a line to show that it can be interpreted. 3. By obligatory I do not mean that the role must be realized by an NP actually present in the clause. The role might be present but the NP omitted because it is predictable. For instance, in What did you do yesterday? – Worked all day the final clause has no NP denoting the speaker, I. The NP can be omitted because it is clear from the circumstances; yet the role remains there in the grammatical structure.

Chapter 7 1. The stød is a prosodic feature associated with certain syllable types. It is often realized as creaky voice, which is produced by slow and irregular vibrations of just one end of the vocal cords, sounding a bit like the noise of a door swinging on unoiled hinges. However, it is more complex than this, and there is a range of phonetic correlates of the stød, segmental as well as non-segmental (Goldshtein 2023: 6–7). 2. Genders or noun classes are systems in which the nouns of a language are divided into different groups according to the forms taken by syntactically related items such as demonstratives and adjectives in the NP. For example, standard Danish distinguishes two genders (sometimes called common and neuter) that are indicated by the form of articles, determiners and adjectives. Thus the article en ‘a’ goes with nouns such as mand ‘man’, kvinde ‘woman’, øl ‘beer’, indicating that these nouns are of the common gender; by contrast the article et ‘a’ goes with nouns like land ‘country’, hæfte ‘notebook’ and tog ‘train’. 3. More recently, it has been argued that the variety now spoken by children at Daguragu and Kalkaringi, Gurindji Kriol, is a new language variety, a mixed language (see §17.4) involving components of Gurindji and Kriol, rather than a simplified form of Gurindji (McConvell and Meakins 2005).

Chapter 8 1. Labels on food packages often contain much more information than just the type of food, including details of composition, producer and many other things. Not all of this necessarily belongs to the text of the label as we are using the term; such information is typically presented in a small font and in less central positions on the label. 2. The versions differ enormously in plot, but all share the theme of the recycled paper that is marked differently on different occasions. For instance, in a 1987 episode of the Canadian television series Degrassi Junior High entitled ‘The Experiment’, one boy tries to improve his grades by turning in old term papers written by someone else. He is given higher grades than the author received for the same papers but is eventually exposed for submitting someone else’s work. Other versions can be found on the internet; search for The resubmitted term paper.

Chapter 9 1. A century earlier Samuel Johnson had also based his dictionary partly on usage, albeit that of the great scholars and literary figures. 2. A few computerized corpora for grammatical investigations dating to the 1960s and 70s stored the language data on punched-cards that were sorted through by machine. A variant of this methodology was also sometimes used by linguists themselves to facilitate sorting of their data cards. 3. This usage is sometimes said by watchdogs of style to be an error. The fact, however, is that ‘dithering’ is one of the senses of the word, and this is a relevant fact to be noted by a descriptive linguist – or lexicographer.

Chapter 10 1. For simplicity we will henceforth drop the qualifier other and, following everyday parlance, use the term animal in the sense ‘non-human animal’. 2. The same thing happens in human beings: at puberty the larynx of males increases in size and lowers. The resulting sexual difference has no analogue in our primate relatives, which have by contrast a larger difference in overall body size between the sexes. 3. This effect is named after the German horse Clever Hans that lived around the turn of the twentieth century. Clever Hans appeared to be capable of reading, spelling and performing arithmetical sums shown on a blackboard by his trainer. He tapped out the answer with a hoof. It was eventually demonstrated that he was responding to subtle visual cues provided unintentionally by his trainer: on calculations that were concealed from the trainer, Hans’s performance was no better than chance. 4. This name is not very appropriate to Pinker’s theory. It is, rather, suggested by the standard evolution-by-natural-selection story whereby language was selected for as a way of attracting a mate, which is reminiscent of the old ‘sing-song’ theory favoured by Jespersen. According to this theory, language is to human beings as long-elaborate-tail is to peacocks. It functions to attract a mate – the better you are at talking, the better your chances at finding a mate. 5. Given the relatively small number of genes making up the human genome, it is unlikely that a single gene would be dedicated to language. On the other hand, people with the FOXP2 mutation show other types of impairment.

Chapter 11 1. Heider’s interpretation of her data has been challenged by Roberson et al. (2005). 2. Transposition of the endings (‘rhymes’) of syllables (e.g. V, VC) does sometimes occur. Thus Jacques Hadamard mentions writing will she instead of we shall, which mistake he puts down, probably correctly, to an error in phonological processing that was subsequently written down (Hadamard 1996/1945: 79). (Note that there is a further change in the vowel quality of the first item. Can you suggest an explanation?) 3. Note that the retina of each eye receives information from both the left and right visual fields.

Chapter 12 1. Telegrams (not to be confused with the Telegram instant messaging app) were written messages transmitted over telephone lines, printed out or written on paper and hand-delivered to the recipient; they were a popular mode of communication in the 1930s and 40s, being cheaper than long-distance phone calls. Telegrams were costed by the number of words, and so function words were kept to a minimum. Thus We arrived safely and are having a good time in telegram form might be Arrived safely having good time. Who would want to pay for we, and, are and a? 2. Basic-level categories are the most natural levels in taxonomies, the levels that are most cognitively prominent. Examples are the categories of dogs, cats, horses – in most everyday circumstances one refers to members of these species with the terms dog, cat and horse rather than the more generic animal or more specific terms like poodle or labrador.

Chapter 13 1. The circular and jerky movements of the hand towards the end of its movement presumably indicate the speaker’s uncertainty about the distance to measure out. At the end of the movement trajectory of the arm her replacement of the index by the middle finger as the pointer may underline her decision as to the distance. 2. More technically, sign language morphology can usually be more appropriately described in terms of item and process models (which identify lexical roots as the fundamental units and describe morphology in terms of processes of changes to their shape) than item and arrangement models (which identify morphemes and describe words as sequences of morphemes). (The distinction is due to Hockett 1954.) 3. In fact, there is some disagreement among sign language researchers as to the status of pronouns. While many concur with the proposition that pronouns exist and form systems comparable with those of spoken languages, there are some dissenters. It is beyond the scope of the present text to discuss the pros and cons of the various positions, and I have adopted the analysis that appears to me to be most viable. 4. It is important to observe that the reduplicated part is not the lexeme wanta ‘sun’, but in this instance a meaningless form that just happens to resemble a lexeme of the language. (Compare the English lexeme climax, which is obviously not a compound of climb and axe, though both phonological forms are identifiable in the shape of the word.)

Chapter 14 1. In what follows I restrict attention to written forms of spoken languages. As it happens, no deaf sign language has an accepted and widely used written system. Various systems of representing the gestures of sign languages have been developed, but these, like the IPA, are restricted to use by researchers. 2. The situation is somewhat less clear-cut than suggested here, and some elements found in writing as defined here do not represent linguistic units. These are reminiscent of gestures and paralinguistic features accompanying speech. Nonetheless, if a system of visual marks has no component elements representing linguistic elements directly, it will be excluded from writing by definition. 3. Another possible source of cuneiform is clay tokens of various shapes that were used for millennia prior to the emergence of writing, and were sometimes pressed into clay envelopes. This possibility is supported by similarities between these impressions and early Sumerian inscriptions. 4. For comparison, consider the fact that our so-called Arabic system of numerals arose in India around the sixth century CE (if not earlier). It was not until some four centuries later that Europeans started using the system, and it was many centuries after that that the system began to be widely used in Europe. 5. The first line gives the standard characters of the 简体 simplified style in use in mainland China. The second line gives the standard Pinyin spelling of each word; the diacritics (marks) over the vowels indicate different tones (recall §2.5). 6. To be sure, linguistic features of a printed document may give some indication of authorship (e.g. authorship of literary works and written extortion demands is sometimes established by frequency of use of certain lexical and grammatical items) and of social group membership of the writer.

Chapter 15 1. For simplicity, here and throughout the rest of the chapter we restrict attention to spoken languages, excluding sign languages. 2. We know that they are singular in form not because they do not end in /s/ or /z/, but because there are regular plural forms that can be used in referring to multiple types – there are many sheeps in New Zealand means ‘many sheep varieties’. 3. What is meant by the expression ‘colloquially conflate’ is that the conflation is not necessarily found in all lexical items and expressions, but rather is found in the most common, informal and conversational means of expressing the meaning.

Chapter 16 1. In the study of language change, the term law is used to describe a change that happened at some point in time to a particular language. It is not a universal generalization applying to all times and places, as in the case of the laws of physics or chemistry; nor is it a stricture that a person can decide to disobey, as in legal contexts. 2. Later, after European colonization, this construction facilitated the creation of verbal expressions for new activities brought with the colonizers, such as working, branding cattle, shooting and so on.

Chapter 17 1. The estimates given in the first two paragraphs of this section are according to the 17th edition of Ethnologue, dated 2014. It is unlikely that the figures have changed significantly. 2. The reason why borrowings are excluded is that otherwise the proportion of shared items would be inflated between a pair of languages. A borrowed item does not reflect a shared retention. However, it has been shown that at least in some cases it does not matter greatly to the results of lexicostatistics whether borrowings are included or excluded from the counts. (The false positives of shared borrowings might be offset by shared negatives where the languages borrow different forms.) 3. Kwadi is an extinct language of which very little is known. Available evidence suggests that it belongs in a genetic group with Khoe languages (Güldemann 2004). 4. This is the standard view of the origin of creole languages. Not all scholars agree, and some argue that creole languages do not necessarily arise from previous pidgins. Glottolog adopts such a position, and classifies creoles according to the major lexical donor language; Tok Pisin is thus assigned to the Indo-European family. 5. In fact, most languages can probably be described as ‘mixed’ to some extent, due to borrowings. Pidgins and creoles can be described as mixed; so also can English, given the number of borrowings it has admitted during its history, both into basic and non-basic lexicon, and grammatical morphemes; indeed, some phonemic distinctions could be regarded as borrowings from French (or at least the result of extensive lexical borrowings). But the technical term mixed language excludes cases like these. A crucial feature of a mixed language is that the contribution of each of the two major source languages is to a certain domain of the lexicon or grammar; they are not mixed in a hodge-podge. Nevertheless, doubtful and intermediate cases do exist.
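The point made in note 2 of Chapter 17 about borrowings can be illustrated with a small calculation. The following Python is a sketch under stated assumptions rather than a standard lexicostatistical procedure: the names Comparison, shared_percentage and TOY are hypothetical, and the six comparisons are invented placeholders, not data from any real pair of languages.

```python
from dataclasses import dataclass


@dataclass
class Comparison:
    """One meaning compared across two languages (hypothetical structure)."""
    meaning: str
    same_form: bool  # the two languages show matching forms for this meaning
    borrowed: bool   # at least one of the two forms is a known borrowing


def shared_percentage(comparisons, exclude_borrowings=True):
    """Percentage of counted meanings for which the two languages share a form."""
    counted = [
        c for c in comparisons
        if not (exclude_borrowings and c.borrowed)
    ]
    if not counted:
        raise ValueError("no items left to compare")
    shared = sum(1 for c in counted if c.same_form)
    return 100.0 * shared / len(counted)


# Invented placeholder data, for illustration only.
TOY = [
    Comparison("meaning 1", same_form=True, borrowed=False),
    Comparison("meaning 2", same_form=True, borrowed=False),
    Comparison("meaning 3", same_form=False, borrowed=False),
    Comparison("meaning 4", same_form=False, borrowed=False),
    Comparison("meaning 5", same_form=True, borrowed=True),   # shared borrowing
    Comparison("meaning 6", same_form=False, borrowed=True),  # different borrowings
]

print(shared_percentage(TOY))                            # borrowings excluded
print(shared_percentage(TOY, exclude_borrowings=False))  # borrowings included
```

In this toy list the shared borrowing (a false positive) is offset by a meaning for which the two languages borrowed different forms, so the percentage is the same whether borrowings are counted or not. A list containing only shared borrowings would instead be inflated, which is the reason borrowings are normally excluded.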

References

Abley, M. (2003), Spoken Here: Travels among Threatened Languages, London: William Heinemann.
Abner, N., K. Cooperrider and S. Goldin-Meadow (2015), ‘Gesture for Linguists: A Handy Primer’, Language and Linguistics Compass, 9 (11): 437–51.
Ahlsén, E. (2006), Introduction to Neurolinguistics, Amsterdam and Philadelphia: John Benjamins.
Aikhenvald, A. Y. and R. M. W. Dixon, eds (2017), The Cambridge Handbook of Linguistic Typology, Cambridge Handbooks in Language and Linguistics, Cambridge: Cambridge University Press.
Aitchison, J. (1996), The Seeds of Speech: Language Origin and Evolution, Cambridge: Cambridge University Press.
Aitchison, J. (2003), A Glossary of Language and Mind, Edinburgh: Edinburgh University Press.
Aitchison, J. (2011), The Articulate Mammal: An Introduction to Psycholinguistics, Routledge Classics, Abingdon and New York: Routledge.
Aitchison, J. (2012), Words in the Mind: An Introduction to the Mental Lexicon, 4th edn, Malden, Oxford and Chichester: Wiley-Blackwell.
Aitchison, J. (2013), Language Change: Progress or Decay? 4th edn, Cambridge: Cambridge University Press.
Allan, K. and K. Burridge (1991), Euphemism and Dysphemism: Language used as Shield and Weapon, Oxford: Oxford University Press.
Allan, K. and K. Burridge (2006), Forbidden Words: Taboo and the Censoring of Language, Cambridge: Cambridge University Press.
Anderson, S. R. (2012), Languages: A Very Short Introduction, Oxford: Oxford University Press.
Anthony, D. W. (2007), The Horse, the Wheel, and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World, Princeton and Oxford: Princeton University Press.
Anttila, R. (1972), An Introduction to Historical and Comparative Linguistics, New York: Macmillan.
Arbib, M. A. (2003), ‘The evolving mirror system: a neural basis for language readiness’, in M. H. Christiansen and S. Kirby (eds), Language Evolution, 182–200, Oxford: Oxford University Press.
Arbib, M. A. (2011), ‘From mirror neurons to complex imitation in the evolution of language and tool use’, Annual Review of Anthropology, 40: 257–73.
Arbib, M. A. (2012), ‘Mirror systems: evolving imitation and the bridge from praxis to language’, in M. Tallerman and K. R. Gibson (eds), The Oxford Handbook of Language Evolution, 207–15, Oxford: Oxford University Press.
Aronoff, M. and J. Rees-Miller, eds (2017), The Handbook of Linguistics, 2nd edn, Hoboken: Wiley-Blackwell.
Ashcraft, M. H. (1993), ‘A personal case history of transient anomia’, Brain and Language, 44: 47–57.
Ayto, J. (1990), Dictionary of Word Origins, New York: Arcade Publishing.
Baker, M. (2017), ‘Syntax’, in M. Aronoff and J. Rees-Miller (eds), The Handbook of Linguistics, 2nd edn, 255–78, Hoboken: Wiley-Blackwell.

Baker, P. (2018), ‘Corpus methods in linguistics’, in L. Litosseliti (ed.), Research Methods in Linguistics, 2nd edn, 167–91, London and New York: Bloomsbury.
Bakken Jepsen, J., G. De Clerck, S. Lutalo-Kiingi and W. B. McGregor, eds (2015), Sign Languages of the World: A Comparative Handbook, Berlin: De Gruyter Mouton.
Bakker, P. (1994), ‘Michif, the Cree-French mixed language of the Métis buffalo hunters in Canada’, in P. Bakker and M. Mous (eds), Mixed Languages: 15 Case Studies in Language Intertwining, 13–33, Amsterdam: Institute for Functional Research into Language and Language Use (IFOTT).
Bakker, P. and M. Mous, eds (1994), Mixed Languages: 15 Case Studies in Language Intertwining, Amsterdam: Institute for Functional Research into Language and Language Use (IFOTT).
Bakker, P. and Y. Matras, eds (2013), Contact Languages: A Comprehensive Guide, Boston and Berlin: De Gruyter Mouton.
Bannan, N., ed. (2012), Music, Language, and Human Evolution, Oxford: Oxford University Press.
Baron, I. and M. Herslund, eds (2001), Dimensions of Possession, Amsterdam: John Benjamins.
Barth, D. and S. Schnell (2022), Understanding Corpus Linguistics, London and New York: Routledge.
Bartlett, T. (2011), ‘Review of William McGregor. Linguistics: an introduction. London: Continuum 2009, xx+384. (ISBN 978-1-8470-6366-3 (hb) / 978-1-8470-6367-0 (pb))’, Functions of Language, 18: 119–29.
Bauer, L. (1983), English Word-Formation, Cambridge: Cambridge University Press.
Bauer, L. (2003), Introducing Linguistic Morphology, Edinburgh: Edinburgh University Press.
Bauer, L. (2004), A Glossary of Morphology, Edinburgh: Edinburgh University Press.
Bauer, L. (2021), The Linguistics Student’s Handbook, 2nd edn, Edinburgh: Edinburgh University Press.
Beattie, G. W. and P. J. Barnard (1979), ‘The temporal structure of natural telephone conversations (direct enquiry calls)’, Linguistics, 17: 213–29.
Bellwood, P. (2005), First Farmers: The Origins of Agricultural Societies, Malden: Blackwell.
Berko, J. (1958), ‘The child’s learning of English morphology’, Word, 14: 150–77.
Berlin, B. and P. Kay (1969), Basic Color Terms: Their Universality and Evolution, Berkeley: University of California Press.
Bhatt, P. and T. Veenstra, eds (2013), Creole Languages and Linguistic Typology, Amsterdam: John Benjamins.
Bialystok, E. and K. Hakuta (1994), In Other Words: The Science and Psychology of Second Language Acquisition, New York: Basic Books.
Biber, D. (1995), Dimensions of Register Variation: A Cross-linguistic Comparison, Cambridge: Cambridge University Press.
Biber, D. and S. Conrad (2001), ‘Register variation: a corpus approach’, in D. Schiffrin, D. Tannen and H. E. Hamilton (eds), The Handbook of Discourse Analysis, 175–97, Malden: Blackwell.
Biber, D., S. Conrad and R. Reppen (1998), Corpus Linguistics: Investigating Language Structure and Use, Cambridge: Cambridge University Press.
Biber, D., S. Johansson, G. Leech, S. Conrad and E. Finegan (1999), Longman Grammar of Spoken and Written English, Harlow: Longman.
Biber, D., S. Johansson, G. Leech, S. Conrad and E. Finegan (2021), Grammar of Spoken and Written English, Amsterdam: John Benjamins.
Birdsong, D., ed. (1999), Second Language Acquisition and the Critical Period Hypothesis, Mahwah, NJ: Lawrence Erlbaum.
Blake, B. J. (2001), Case, 2nd edn, Cambridge: Cambridge University Press.
Blake, B. J. (2008), All about Language, Oxford: Oxford University Press.

Blakemore, D. (1992), Understanding Utterances: An Introduction to Pragmatics, Oxford: Blackwell. Bloom, P. ed. (1996), Language Acquisition: Core Readings, Cambridge, MA : MIT Press. Bloomfield, L. (1973/1933), Language, New York: Holt, Rinehart and Winston. Bolinger, D. (1975), Aspects of Language, New York: Harcourt Brace Jovanovich. Bolinger, D. (1980), Language – the Loaded Weapon: The Use and Abuse of Language Today, London and New York: Longman. Bond, Z. S. (1999), Slips of the Ear: Errors in the Perception of Casual Conversation, London: Academic Press. Boroditsky, L. and A. Gaby (2010), ‘Remembrances of times east: absolute spatial representations of time in an Australian Aboriginal community’, Psychological Science 21 (11): 1635–9. Botha, R. and C. Knight, eds (2009), The Prehistory of Language, Oxford: Oxford University Press. Bowerman, M. and S. C. Levinson, eds (2001), Language Acquisition and Conceptual Development, Cambridge: Cambridge University Press. Bradley, J. (1988), ‘Yanyuwa: “Men speak one way, women speak another” ’, Aboriginal Linguistics, 1: 126–34. Bragg, M. (2003), The Adventure of English: The Biography of a Language, London: Hodder & Stoughton. Bransford, J. D. and M. K. Johnson (1972), ‘Contextual prerequisites for understanding. Some investigations of comprehension and recall’, Journal of Verbal Learning and Verbal Behavior, 11: 717–26. Breen, J. G. and R. Penselfini (1999), ‘Arrernte: a language with no syllable onsets’, Linguistic Inquiry, 30: 1–25. Brennan, J. R. (2022), Language and the Brain: A Slim Guide to Neurolinguistics, Oxford: Oxford University Press. Brentari, D., ed. (2010), Sign Languages, Cambridge: Cambridge University Press. Bright, M. (1984), Animal Language, Ithaca, NY: Cornell University Press. Brown, R. (1957), ‘Linguistic determinism and the part of speech’, Journal of Abnormal and Social Psychology, 55: 1–5. Brown, R. (1973), A First Language: The Early Stages, Cambridge, MA : Harvard University Press. 
Brown, R. and J. Berko (1960), ‘Psycholinguistic research methods’, in P. H. Mussen (ed.), Handbook of Research Methods in Child Development, 517–617, New York: John Wiley & Sons. Burridge, K. (2004), Blooming English: Observations on the Roots, Cultivation and Hybrids of the English Language, Cambridge: Cambridge University Press. Burridge, K. and T. Stebbins (2020), For the Love of Language: An Introduction to Linguistics, 2nd edn, Cambridge: Cambridge University Press. Butler, C. S. (2003a), Structure and Function – A Guide to Three Major Structural-Functional Theories. Part 1: Approaches to the Simplex Clause, Amsterdam: John Benjamins. Butler, C. S. (2003b), Structure and Function – A Guide to Three Major Structural-Functional Theories. Part 2: From Clause to Discourse and Beyond, Amsterdam: John Benjamins. Bybee, J. (2003), ‘Mechanisms of change in grammaticalization: the role of frequency’, in B. D. Joseph and R. D. Janda (eds), The Handbook of Historical Linguistics, 602–23, Oxford: Blackwell. Bybee, J., W. J. Pagliuca and R. Perkins (1990), ‘On the asymmetries in the affixation of grammatical material’, in W. Croft, K. Denning and S. Kemmerer (eds), Studies in Typology and Diachrony: Papers Presented to Joseph H. Greenberg on his 75th Birthday, 1–42, Amsterdam: John Benjamins. Calvin, W. H. and G. A. Ojemann (1994), Conversations with Neil’s Brain: The Neural Nature of Thought and Language, Reading, MA : Perseus Books.

Campbell, L. (2021), Historical Linguistics: An Introduction, 4th edn, Cambridge, MA, and London: MIT Press. Capell, A. (1938), ‘The structure of Australian languages’, in A. P. Elkin (ed.), Studies in Australian Linguistics, 46–80, Sydney : The Australian National Research Council. Caplan, D. (2017), ‘Neurolinguistics’, in M. Aronoff and J. Rees-Miller (eds), The Handbook of Linguistics, 2nd edn, 323–43, Hoboken: Wiley-Blackwell. Carr, P. (1993), Phonology, Basingstoke and London: Macmillan. Carreiras, M. (2010), ‘Sign language processing’, Language and Linguistics Compass, 4: 430–44. Carroll, L. (1899), Through the Looking Glass and What Alice Found There, London: MacMillan. Carroll, L. (1927/1866), Alice’s Adventures in Wonderland, New York: D. Appleton and Company. Carstairs-McCarthy, A. (2017), ‘Origins of language’, in M. Aronoff and J. Rees-Miller (eds), The Handbook of Linguistics, 2nd edn, 3–19, Hoboken: Wiley-Blackwell. Carter, R. (2010), Mapping the Mind, revised and updated edn, Oakland: University of California Press. Carter, R., M. Page and S. Parker (2019), The Human Brain Book: An Illustrated Guide to its Structure, Function, and Disorders, 3rd edn, Delhi and London: DK Publishing. Cavalli-Sforza, L. L. (1991), ‘Genes, people and languages’, Scientific American, 265: 72–8. Cavalli-Sforza, L. L. (2001), Genes, Peoples and Languages, translated by M. Seielstad, Harmondsworth: Penguin. Chalmers, A. F. (2013), What Is This Thing Called Science? 4th edn, Indianapolis and Cambridge, MA : Hackett. Chappell, H. and W. B. McGregor, eds (1995), The Grammar of Inalienability: A Typological Perspective on Body Part Terms and the Part–Whole Relation, Berlin: Mouton de Gruyter. Charlesworth, B. and D. Charlesworth (2003), Evolution: A Very Short Introduction, Oxford: Oxford University Press. Cheney, D. and R. Seyfarth (1990), How Monkeys See the World: Inside the Mind of Another Species, Chicago: University of Chicago Press. Childs, G. T. 
(2003), An Introduction to African Languages, Amsterdam and Philadelphia: John Benjamins. Chittka, L. (2022), The Mind of a Bee, Princeton and Oxford: Princeton University Press. Chomsky, N. (1959), ‘Review of B. F. Skinner Verbal Behavior’, Language, 35: 26–58. Chomsky, N. (1986), Knowledge of Language: Its Nature, Origin and Use, New York: Praeger. Chrispin, L. and L. Fontaine (2023), ‘A cognitive-functional approach to watch as a verb of perception’, in C. Gentens, L. Ghesquière, W. B. McGregor and A. Van linden (eds), Reconnecting Form and Meaning: In Honour of Kristin Davidse, 209–36, Amsterdam and Philadelphia: John Benjamins. Christie, A. (1954/1920), The Mysterious Affair at Styles, London: Pan Books. Christie, A. (1967/1936), The ABC Murders, London and Glasgow: Fontana. Clark, E. V. (2009), First Language Acquisition, 2nd edn, Cambridge and New York: Cambridge University Press. Clark, J. and C. Yallop (1990), An Introduction to Phonetics and Phonology, Oxford and Cambridge, MA : Basil Blackwell. Coates, J. (1994), ‘No gap, lots of overlap: turn-taking patterns in the talk of women friends’, in D. Graddol, J. Maybin and B. Stierer (eds), Researching Language and Literacy in Social Context, 177–192, Clevedon: Multilingual Matters. Coates, J. (1997), ‘One-at-a-time: the organization of men’s talk’, in S. Johnson and U. H. Meinhof (eds), Language and Masculinity, 107–43, Oxford: Blackwell.

Coe, M. D. (1999), Breaking the Maya Code, New York: Thames & Hudson. Cohn, A. (2017), ‘Phonology: sound structure’, in M. Aronoff and J. Rees-Miller (eds), The Handbook of Linguistics, 2nd edn, 185–210, Hoboken: Wiley-Blackwell. Collins, B. and I. M. Mees (2003), Practical Phonetics and Phonology: A Resource Book for Students, London and New York: Routledge. Comrie, B. (2003), ‘On explaining language universals’, in M. Tomasello (ed.), The New Psychology of Language: Cognitive and Functional Approaches to Language Structure. Volume 2, 195–209, Mahwah, NJ and London: Lawrence Erlbaum. Comrie, B. (2017), ‘Languages of the world’, in M. Aronoff and J. Rees-Miller (eds), The Handbook of Linguistics, 2nd edn, 21–38, Hoboken: Wiley-Blackwell. Cook, V. (2017), ‘Second language acquisition: one person with two languages’, in M. Aronoff and J. Rees-Miller (eds), The Handbook of Linguistics, 2nd edn, 557–81, Hoboken: Wiley-Blackwell. Corballis, M. C. (2003), ‘From hand to mouth: the gestural origins of language’, in M. H. Christiansen and S. Kirby (eds), Language Evolution, 201–18, Oxford: Oxford University Press. Corballis, M. C. (2012), ‘The origins of language in manual gestures’, in M. Tallerman and K. R. Gibson (eds), The Oxford Handbook of Language Evolution, 382–6, Oxford: Oxford University Press. Coulmas, F. (1989), The Writing Systems of the World, Oxford: Basil Blackwell. Coulmas, F. (1996), The Blackwell Encyclopedia of Writing Systems, Oxford: Blackwell. Coulmas, F. (2003), Writing Systems: An Introduction to their Linguistic Analysis, Cambridge: Cambridge University Press. Coulmas, F. (2013a), Sociolinguistics: The Study of Speakers’ Choices, 2nd edn, Cambridge: Cambridge University Press. Coulmas, F. (2013b). Writing and Society: An Introduction, New York, Cambridge University Press. Coulthard, M. (1985), An Introduction to Discourse Analysis, 2nd edn, London and New York: Longman. Coulthard, M. (2000), ‘Whose text is it? 
On the linguistic investigation of authorship’, in S. Sarangi and M. Coulthard (eds), Discourse and Social Life, 270–89, London: Longman. Crain, S. and D. Lillo-Martin (1998), Language and Mind, Oxford: Blackwell. Croft, W. (2017), ‘Typology and universals’, in M. Aronoff and J. Rees-Miller (eds), The Handbook of Linguistics, 2nd edn, 21–38, Hoboken: Wiley Blackwell. Crothers, J. (1978), ‘Typology and universals of vowel systems’, in J. Greenberg, C. A. Ferguson and E. A. Moravcsik (eds), Universals of Human Language. Volume 2: Phonology, 93–152, Stanford: Stanford University Press. Crowley, T. and C. Bowern (2010), An Introduction to Historical Linguistics, 4th edn, New York: Oxford University Press. Crystal, D. (1987), The Cambridge Encyclopedia of Language, Cambridge: Cambridge University Press. Crystal, D. (2003), A Dictionary of Linguistics and Phonetics, 5th edn, Malden: Blackwell. Crystal, D. (2006), Language and the Internet, 2nd edn, Cambridge: Cambridge University Press. Crystal, D. (2008), Txtng: The Gr8 Db8. Oxford: Oxford University Press. Crystal, D. (2011), Internet Linguistics: A Student Guide, Milton Park and New York: Routledge. Crystal, D. (2012), Spell It Out: The Singular Story of English Spelling, London: Profile Books. Curtiss, S. (1977), Genie. A Psycholinguistic Study of a Modern-day “Wild Child”, New York: Academic Press. Curtiss, S., V. Fromkin, S. Krashen, D. Rigler and M. Rigler (1974), ‘The linguistic development of Genie’, Language, 50: 528–54.

Cutler, A., ed. (1982), Slips of the Tongue and Language Production, Berlin: Mouton. Cutler, A., J. A. Hawkins and G. Gilligan (1985), ‘The suffixing preference: a processing explanation’, Linguistics, 23: 723–58. Dalton, L., S. Edwards, R. Farquarson, S. Oscar and P. McConvell (1995), ‘Gurindji children’s language and language maintenance’, International Journal of the Sociology of Language, 113: 83–98. Daniels, P. T. (2013), ‘The history of writing as a history of linguistics’, in K. Allan (ed.), The Oxford Handbook of the History of Linguistics, 53–70, Oxford: Oxford University Press. Daniels, P. T. (2017), ‘Writing systems’, in M. Aronoff and J. Rees-Miller (eds), The Handbook of Linguistics, 2nd edn, 75–94, Hoboken: Wiley-Blackwell. Daniels, P. T. and W. Bright, eds (1996), The World’s Writing Systems, Oxford: Oxford University Press. Darwin, C. (1898), The Expression of the Emotions in Man and Animals, New York: D. Appleton and Company. Davidoff, J. (1991), Cognition through Color, Cambridge, MA: MIT Press. Davis, J. E. (2010), Hand Talk: Sign Language among American Indian Nations, Cambridge: Cambridge University Press. Davis, J. E. (2015), ‘North American Indian sign language’, in J. Bakken Jepsen, G. De Clerck, S. Lutalo-Kiingi and W. B. McGregor (eds), Sign Languages of the World: A Comparative Handbook, 911–31, Berlin: De Gruyter Mouton. De Fina, A. and B. Johnstone (2015), ‘Discourse analysis and narrative’, in D. Tannen, H. E. Hamilton and D. Schiffrin (eds), The Handbook of Discourse Analysis, 2nd edn, 152–67, Oxford: Wiley-Blackwell. de Vos, C. and R. Pfau (2015), ‘Sign language typology: the contribution of rural sign languages’, Annual Review of Linguistics, 1: 265–88. Denes, G. (2011), Talking Heads: The Neuroscience of Language, translated by Philippa Venturelli Smith, Hove and New York: Psychology Press. Denes, P. B. and E. N. Pinson (1993), The Speech Chain: The Physics and Biology of Spoken Language, 2nd edn, New York: W.H. Freeman. Dik, S. C.
(1989), The Theory of Functional Grammar. Part 1: The Structure of the Clause, Dordrecht: Foris Publications. Donald, M. (1991), Origins of the Modern Mind: Three Stages in the Evolution of Culture and Cognition, Cambridge, MA : Harvard University Press. Donohue, M. (1997), ‘Tone systems in New Guinea’, Linguistic Typology, 1: 347–86. Doyle, A. C. (1894), The Memoirs of Sherlock Holmes, London: George Newnes. Dryer, M. S. (1997), ‘Are grammatical relations universal?’, in J. Bybee, J. Haiman and S. A. Thompson (eds), Essays on Language Function and Language Type: Dedicated to T. Givón, 115–43, Amsterdam and Philadelphia, PA : John Benjamins. Dryer, M. S. (2013), ‘Order of Subject, Object and Verb’, in M. S. Dryer and M. Haspelmath (eds), WALS Online (v2020.3) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7385533 (available online at http://wals.info/chapter/81). Dunbar, R. (1996), Grooming, Gossip and the Evolution of Language, London: Faber & Faber. Dunbar, R. (2010), How Many Friends Does One Person Need? Dunbar’s Number and Other Evolutionary Quirks, London: Faber & Faber. Dunbar, R. (2012), ‘Gossip and the social origins of language,’ in M. Tallerman and K. R. Gibson (eds), The Oxford Handbook of Language Evolution, 343–5, Oxford: Oxford University Press.

Eberhard, D. M., G. F. Simons and C. D. Fennig, eds (2022), Ethnologue: Languages of the World, Dallas: SIL International. Edwards, J. (2013), Sociolinguistics: A Very Short Introduction, Oxford: Oxford University Press. Eggins, S. (1994), An Introduction to Systemic Functional Linguistics. London: Pinter. Ehrlich, S., M. Meyerhoff and J. Holmes, eds (2014), The Handbook of Language, Gender, and Sexuality, 2nd edn, Malden and Oxford: Wiley. Elbourne, P. (2011), Meaning: A Slim Guide to Semantics, Oxford: Oxford University Press. Elman, J., E. A. Bates and M. H. Johnson, eds (1997), Rethinking Innateness: A Connectionist Perspective on Development, Cambridge, MA : MIT Press. Emmorey, K. (2002), Language, Cognition, and the Brain: Insights from Sign Language Research, Mahwah, NJ : Lawrence Erlbaum. Enard, W., M. Przeworski, S. E. Fisher, C. S. Lai, V. Wiebe, T. Kitano, A. P. Monaco and S. Pääbo (2002), ‘Molecular evolution of FOXP2, a gene involved in speech and language’, Nature, 418: 747–57. English, F. and T. Marr (2023), Why Do Linguistics? Reflective Linguistics and the Study of Language, 2nd edn, London and New York: Bloomsbury. Evans, N. (2000), ‘Word classes in the world’s languages’, in G. Booij, C. Lehmann, J. Mugdan, W. Kesselheim and S. Skopeteas (eds), Morphologie: ein internationales Handbuch zur Flexion und Wortbildung. Morphology: An International Handbook on Inflection and Word-Formation, 708–32, Berlin and New York: Mouton de Gruyter. Evans, N. (2022), Words of Wonder: Endangered Languages and What They Tell Us, 2nd edn, The Language Library, Hoboken: Wiley-Blackwell. Evans, W. E. and J. Bastian (1969), ‘Marine mammal communication: social and ecological features’, in H. T. Anderson (ed.), The Biology of Marine Mammals, 425–75, New York: Academic Press. Fernald, A. (1985), ‘Four-month-old infants prefer to listen to motherese’, Infant Behaviour and Development, 8: 181–95. Fernando, C. (1996), Idioms and Idiomaticity, Oxford: Oxford University Press. 
Finch, G. (2003), How to Study Linguistics: A Guide to Understanding Language, Basingstoke: Palgrave Macmillan. Finlayson, R. (1995), ‘Women’s language of respect: isihlonipho sabafazi’, in R. Mesthrie (ed.), Language and Social History: Studies in South African Sociolinguistics, 140–53, Cape Town: David Philip. Firth, J. R. (1935), ‘The technique of semantics’, Transactions of the Philological Society, 34: 36–72. Firth, J. R. (1957), Papers in Linguistics 1934–1951. London: Oxford University Press. Firth, J. R. (1962), ‘A synopsis of linguistic theory, 1930–1955’, in Studies in linguistic analysis. Special volume of the Philological Society, 1–32, Oxford: Blackwell. Fischer, S. R. (2001), A History of Writing, London: Reaktion Books. Fitch, W. T. (2000), ‘The evolution of speech: a comparative review’, Trends in Cognitive Science, 4: 258–67. Fleischman, J. (2002), Phineas Gage: A Gruesome but True Story about Brain Science, Boston: Houghton Mifflin. Fletcher, P. and B. MacWhinney, eds (1996), The Handbook of Child Language, Oxford: Blackwell. Foley, W. A. (1997), Anthropological Linguistics: An Introduction, Oxford: Blackwell. Foley, W. A. (2000), ‘The languages of New Guinea’, Annual Review of Anthropology, 29: 357–404. Frawley, W. ed. (2003), International Encyclopedia of Linguistics, New York: Oxford. Frisch, K. von. (1992/1973), ‘Decoding the language of the bee’, in J. Lindsten (ed), Nobel Lectures, Physiology or Medicine 1971–1980, 76–87, Singapore: World Scientific Publishing Co.

Fromkin, V. A. (1973a), ‘Slips of the tongue’, Scientific American, 229: 110–17. Fromkin, V. A. ed. (1973b), Speech Errors as Linguistic Evidence, The Hague: Mouton. Fromkin, V. A. ed. (1980), Errors in Linguistic Performance: Slips of the Tongue, Ear, Pen, and Hand, San Diego: Academic Press. Fromkin, V. A. (1988), ‘Grammatical aspects of speech errors’, in F. Newmeyer (ed), Linguistics: The Cambridge Survey. Volume 2. Linguistic Theory: Extensions and Implications, 117–38, Cambridge: Cambridge University Press. Fromkin, V. A. and R. Rodman (1974), An Introduction to Language, New York: Holt, Rinehart and Winston. Fromkin, V. A., R. Rodman and N. Hyams (2014), An Introduction to Language, 10th edn, Boston: Wadsworth. Gal, S. (1979), Language Shift: Social Determinants of Linguistic Change in Bilingual Austria, New York, San Francisco and London: Academic Press. Garcia, A. C. (2023), An Introduction to Interaction: Understanding talk in the workplace and everyday life, 2nd edn. London and New York: Bloomsbury. Gardner, B. and A. Gardner (1971), ‘Two-way communication with an infant chimpanzee’, in A. Schrier and F. Stollnitz (eds), Behavior of Non-human Primates. Volume 4, 117–84, New York: Academic Press. Garrett, M. F. (1988), ‘Processes in language production’, in F. J. Newmeyer (ed), Linguistics: the Cambridge Survey, Volume 3. Language: Psychological and Biological Aspects, 69–98, Cambridge: Cambridge University Press. Garry, J. and C. Rubino, eds (2001), Facts about the World’s Languages: An Encyclopedia of the World’s Major Languages, Past and Present, New York and Dublin: The H. W. Wilson Company. Gass, S. M. and A. Mackey (2012), The Routledge Handbook of Second Language Acquisition, London and New York: Routledge. Gass, S., J. Behney and L. Plonsky (2020), Second Language Acquisition: An Introductory Course, 5th edn, New York: Routledge. Gentner, D. and S. 
Goldin-Meadow, eds (2003), Language in Mind: Advances in the Study of Language and Thought, Cambridge, MA : MIT Press. Gleason, J. B. and N. B. Ratner, eds (1998), Psycholinguistics, 2nd edn, Belmont: Wadsworth. Gnanadesikan, A. E. (2009), The Writing Revolution: Cuneiform to the Internet, Malden and Oxford: Wiley-Blackwell. Goddard, C. (1998), Semantic Analysis: A Practical Introduction, Oxford: Oxford University Press. Goldshtein, Y. (2023), ‘Outline of a prosodic theory of stød’. PhD thesis, Aarhus University. Golinkoff, R., C. Mervis and K. Hirsh-Pasek (1994), ‘Early object labels: the case for a developmental lexical principles framework’, Journal of Child Language, 21: 125–55. Goodall, J. (1986), The Chimpanzees of Gombe: Patterns of Behaviour, Cambridge, MA : Harvard University Press. Goody, J. R. (1977), The Domestication of the Savage Mind, Cambridge and New York: Cambridge University Press. Gopnik, A., A. N. Meltzoff and P. K. Kuhl (2001), The Scientist in the Crib: What Early Learning Tells us about the Mind, New York: Perennial. Green, J. and D. Wilkins (2015), ‘Arandic alternate sign language(s)’, in J. Bakken Jepsen, G. De Clerck, S. Lutalo-Kiingi and W. B. McGregor (eds), Sign Languages of the World: A Comparative Handbook, 843–69, Berlin: De Gruyter Mouton.

Greenbaum, S. (1996), The Oxford English Grammar, Oxford: Oxford University Press. Greenbaum, S. and R. Quirk (1990), A Student’s Grammar of the English Language, Harlow: Longman. Greenberg, J. H. (1963), The Languages of Africa, Bloomington and The Hague: Indiana University Press and Mouton. Greenberg, J. H. (1987), Language in the Americas, Stanford: Stanford University Press. Greenberg, J. H., C. A. Ferguson and E. A. Moravcsik (1978), Universals of Human Language, Stanford: Stanford University Press. Greenfield, S. A. (2000), Brain-story, London: BBC . Grenoble, L. A. and L. J. Whaley, eds (1998), Endangered Languages: Language Loss and Community Response, Cambridge: Cambridge University Press. Grenoble, L. A. and L. J. Whaley, eds (2006), Saving Languages: An Introduction to Language Revitalization, Cambridge: Cambridge University Press. Grice, H. P. (1975), ‘Logic and conversation’, in P. Cole and J. L. Morgan (eds), Syntax and Semantics 3: Speech Acts, 41–58, New York: Academic Press. Grice, H. P. (1989), Studies in the Way of Words, Cambridge, MA : Harvard University Press. Güldemann, T. (2004), ‘Reconstruction through de-construction: the making of person, gender and number in the Khoe family and Kwadi’, Diachronica, 21: 251–306. Gumperz, J. J. and S. C. Levinson, eds (1996), Rethinking Linguistic Relativity, Cambridge: Cambridge University Press. Gussenhoven, C. and H. Jacobs (2003/1998), Understanding Phonology, London: Arnold. Haas, W. (1957), ‘Zero in linguistic description’, in Studies in Linguistic Analysis, 33–53, Oxford: Blackwell. Haberland, H. (1994), ‘Danish’, in E. König and J. van der Auwera (eds), The Germanic Languages, 313–48, London and New York: Routledge. Hadamard, J. (1996/1945), The Mathematician’s Mind: The Psychology of Invention in the Mathematical Field, Princeton: Princeton University Press. Hale, K. L., M. Krauss, L. J. Watahomigie, A. Y. Yamamoto, C. Craig, LaV. M. Jeanne and N. C. 
England (1992), ‘Endangered languages’, Language, 68: 1–42. Hall, K. and R. Barrett, eds (in preparation), The Oxford Handbook of Language and Sexuality, Oxford: Oxford University Press. https://doi.org/10.1093/oxfordhb/9780190212926.001.0001 Halliday, M. A. K. (1977/1975), Learning how to Mean: Explorations in the Development of Language and Meaning, London: Edward Arnold. Halliday, M. A. K. (1978), Language as Social Semiotic: The Social Interpretation of Language and Meaning, London: Arnold. Halliday, M. A. K. (1983), Spoken and Written Language, 2nd edn, Oxford: Oxford University Press. Halliday, M. A. K. (1985), An Introduction to Functional Grammar, London: Edward Arnold. Halliday, M. A. K and R. Hasan (1976), Cohesion in English, London: Longman. Halliday, M. A. K. and C. M. I. M. Matthiessen (2014), Halliday’s Introduction to Functional Grammar, 4th edn, London and New York: Routledge. Hammarström, H., R. Forkel, M. Haspelmath, and S. Bank (2022), Glottolog 4.7. Leipzig: Max Planck Institute for Evolutionary Anthropology. http://glottolog.org Hammett, D. (2003/1929), Red Harvest, London: Orion Books. Hardy, G. H. (2006/1940), A Mathematician’s Apology, Cambridge: Cambridge University Press. Harley, T. A. (2013), The Psychology of Language: From Data to Theory, 4th edn, Hove and New York: Psychology Press.

Harley, T. A. (2017), Talking the Talk: Language, Psychology and Science, 2nd edn, Abingdon and New York: Routledge. Haspelmath, M. and A. D. Sims (2010), Understanding Morphology, 2nd edn, London: Hodder Education. Hawkins, J. A. and A. Cutler (1988), ‘Psycholinguistic factors in morphological asymmetry’, in J. A. Hawkins (ed.), Explaining Language Universals, 280–317, Oxford: Basil Blackwell. Hayes, B. (2006), ‘Gauss’s day of reckoning’, American Scientist, 94: 200–5. Hayes, C. (1951), The Ape in our House, New York: Harper & Row. Heath, J. (1978), Linguistic Diffusion in Arnhem Land, Canberra: Australian Institute of Aboriginal Studies. Heider, E. R. (1972), ‘Universals in color naming and memory’, Journal of Experimental Psychology, 93: 10–20. Heine, B. (1997), Possession: Cognitive Sources, Forces, and Grammaticalization, Cambridge: Cambridge University Press. Heine, B. (2003), ‘Grammaticalization’, in B. D. Joseph and R. D. Janda (eds), The Handbook of Historical Linguistics, 575–601, Oxford: Blackwell. Heine, B. and T. Kuteva (2002), World Lexicon of Grammaticalization, Cambridge: Cambridge University Press. Hengeveld, K., J. Rijkhoff, and A. Siewierska (2004), ‘Parts of speech systems and word order’, Journal of Linguistics, 40: 527–70. Herman, L. M., ed. (1980), Cetacean Behavior: Mechanism and Functions, New York: John Wiley & Sons. Herman, L. M., D. G. Richards and J. P. Wolz (1984), ‘Comprehension of sentences by bottlenosed dolphins’, Cognition, 16: 129–219. Hinton, L., J. Nichols and J. J. Ohala, eds (1994), Sound Symbolism, Cambridge: Cambridge University Press. Hock, H. H. (1991), Principles of Historical Linguistics, Berlin, New York and Amsterdam: Mouton de Gruyter. Hockett, C. F. (1954), ‘Two models of grammatical description’, Word, 10: 210–34. Hockett, C. F. (1958), A Course in Modern Linguistics, New York: Macmillan. Hockett, C. F. (1960), ‘The origin of speech’, Scientific American, 203: 88–96. Holm, J. (1988), Pidgins and Creoles. 
Volume 1: Theory and Structure, Cambridge: Cambridge University Press. Holm, J. (1989), Pidgins and Creoles. Volume 2: Reference Survey, Cambridge: Cambridge University Press. Holmes, J. and N. Wilson (2022), An Introduction to Sociolinguistics, 6th edn, London: Routledge. Hopper, P. and E. Traugott (2003), Grammaticalization, 2nd edn, Cambridge: Cambridge University Press. Horgan, J. (1996), The End of Science: Facing the Limits of Knowledge in the Twilight of the Scientific Age, Reading, MA : Helix Books. Huddleston, R. (1984), Introduction to the Grammar of English, Cambridge: Cambridge University Press. Huddleston, R. and G. K. Pullum (2002), The Cambridge Grammar of the English Language, Cambridge: Cambridge University Press. Huddleston, R., G. K. Pullum and B. Reynolds (2022), A Student’s Introduction to English Grammar, 2nd edn, Cambridge: Cambridge University Press.

Hudson, G. (2000), Essential Introductory Linguistics, Malden and Oxford: Blackwell. Hudson, R. (1984), Invitation to Linguistics. Oxford: Blackwell. Hudson, R. and W. Van Langendonck (1991), ‘Word grammar’, in F. G. Droste and J. E. Joseph (eds), Linguistic Theory and Grammatical Description, 307–35, Amsterdam: John Benjamins. Hughes, A., P. Trudgill and D. Watt (2012), English Accents and Dialects: An Introduction to Social and Regional Varieties of English in the British Isles, 5th edn, Abingdon and New York: Routledge. Hurford, J. R., B. Heasley and M. B. Smith (2007), Semantics: A Coursebook, 2nd edn, Cambridge: Cambridge University Press. Hurford, J. R., M. Studdert-Kennedy and C. Knight, eds (1998), Approaches to the Evolution of Language: Social and Cognitive Bases, Cambridge: Cambridge University Press. Hutchby, I. and R. Wooffitt (2008), Conversation Analysis: Principles, Practices and Applications, 2nd edn, Cambridge and Malden: Polity Press. Ingram, J. C. L. (2007), Neurolinguistics: An Introduction to Spoken Language Processing and its Disorders, Cambridge: Cambridge University Press. International Phonetic Association (1999), The Handbook of the International Phonetic Association, Cambridge: Cambridge University Press. Jabr, F. (2013), ‘Why the brain prefers paper’, Scientific American, 309: 34–9. Jackson, H. and P. Stockwell (2011), An Introduction to the Nature and Functions of Language, 2nd edn, London: Continuum. Jakobson, R. (1978), Six Lectures on Sound and Meaning, translated by J. Mepham, Brighton: Harvester Press. Jakobson, R. and L. R. Waugh (1979), The Sound Shape of Language, Bloomington: Indiana University Press. Jefferson, G. (1973), ‘A case for precision timing in ordinary conversation: overlapped tag-positioned address terms in closing sequences’, Semiotica, 9: 47–96. Jespersen, O. (1922), Language: Its Nature, Development and Origin, London: Allen & Unwin. Johansson, S. 
(2005), Origins of Language: Constraints on Hypotheses, Amsterdam and Philadelphia: John Benjamins. Johnston, T. and A. Schembri, eds (2003), The Survival Guide to Auslan: A Beginner’s Pocket Dictionary of Australian Sign Language, Sydney: North Rocks Press. Johnston, T. and A. Schembri (2007), Australian Sign Language (Auslan): An Introduction to Sign Language Linguistics, Cambridge: Cambridge University Press. Jones, C. and D. Waller (2015), Corpus Linguistics for Grammar: A Guide for Research, London and New York: Routledge. Joseph, B. D. (2017), ‘Historical linguistics: language change over time’, in M. Aronoff and J. Rees-Miller, eds, The Handbook of Linguistics, 2nd edn, 299–319, Hoboken: Wiley-Blackwell. Joseph, B. D. and R. D. Janda, eds (2003), The Handbook of Historical Linguistics, Oxford: Blackwell. Kaminski, J., J. Call and J. Fischer (2004), ‘Word learning in a domestic dog: evidence for “fast mapping” ’, Science, 304: 1682–3. Kaplan, R. B., ed. (2002), The Oxford Handbook of Applied Linguistics, Oxford: Oxford University Press. Kay, P. and W. Kempson (1984), ‘What is the Sapir-Whorf hypothesis?’, American Anthropologist, 86: 65–79. Kegl, J., A. Senghas and M. Coppola (1999), ‘Creation through contact: sign language emergence and sign language change in Nicaragua’, in M. DeGraff (ed.), Language Creation and Language Change: Creolization, Diachrony, and Development, 179–237, Cambridge, MA : MIT Press.

Kemmerer, D. (2023), Cognitive Neuroscience of Language, 2nd edn, New York and London: Routledge. Kempson, R. (2017), ‘Pragmatics: language and communication’, in M. Aronoff and J. Rees-Miller (eds), The Handbook of Linguistics, 2nd edn, 417–43, Hoboken: Wiley-Blackwell. Kendon, A. (1988), Sign Languages of Aboriginal Australia: Cultural, Semiotic and Communicative Perspectives, Cambridge: Cambridge University Press. Kendon, A. (2004), Gesture: Visible Action as Utterance, Cambridge: Cambridge University Press. Kendon, A. (2008), ‘A history of the study of Australian Aboriginal sign languages’, in W. B. McGregor (ed.), Encountering Aboriginal Languages: Studies in the History of Australian Linguistics, 383–402, Canberra: Pacific Linguistics. Kendon, A. (2013), ‘History of the study of gesture’, in K. Allan, (ed.), The Oxford Handbook of the History of Linguistics, 71–90, Oxford: Oxford University Press. Kiesling, S. F. (2019), Language, Gender and Sexuality: An Introduction, Abingdon and New York: Routledge. Kirton, J. F. (1988), ‘Men’s and women’s dialects’, Aboriginal Linguistics 1: 111–25. Kita, S. (2003), ‘Pointing: a foundational building block of human communication’, in S. Kita (ed.), Pointing: Where Language, Culture and Cognition Meet, 1–8, Mahwah, NJ : Lawrence Erlbaum. Kirkpatrick, E. M. and C. M. Schwarz, eds (1995), The Wordsworth Dictionary of Idioms, Ware: Wordsworth. Knight, C., M. Studdert-Kennedy and J. R. Hurford, eds (2000), The Evolutionary Emergence of Language: Social Function and the Origin of Linguistic Form, Cambridge: Cambridge University Press. Kolb, B. and I. Q. Whishaw (2003/1980), Fundamentals of Human Neuropsychology, New York: Worth Publishers. Kroodsma, D. E., E. H. Miller and H. Ouellet, eds (1982), Communication in Birds, New York: Academic Press. Kuhl, P. K. and P. Iverson (1995), ‘Linguistic experience and the “perceptual magnet effect” ’, in W. 
Strange (ed.), Speech Perception and Linguistic Experience: Issues in Cross-language Research, 121–54, Baltimore: York Press. Labov, W. (1972), Sociolinguistic Patterns, Philadelphia: University of Pennsylvania Press. Labov, W. and J. Waletzky (1967), ‘Narrative analysis: oral versions of personal experience’, in J. Helm (ed.), Essays on the Verbal and Visual Arts: Proceedings of the 1966 Annual Spring Meeting of the American Ethnological Society, 12–44, Seattle: American Ethnological Society. Ladefoged, P. (1992), ‘Another view of endangered languages’, Language, 68: 809–11. Ladefoged, P. and S. F. Disner (2012), Vowels and Consonants: An Introduction to the Sounds of Languages, 3rd edn, Malden and Oxford: Wiley-Blackwell. Ladefoged, P. and I. Maddieson (1996), The Sounds of the World’s Languages, Oxford: Blackwell. Langacker, R. W. (1990), Concept, Image, and Symbol: The Cognitive Basis of Grammar, Berlin and New York: Mouton de Gruyter. Langacker, R. W. (1991), ‘Cognitive grammar’, in F. G. Droste and J. E. Joseph (eds), Linguistic Theory and Grammatical Description, 275–306, Amsterdam: John Benjamins. Langacker, R. W. (1999), Grammar and Conceptualization, Berlin and New York: Mouton de Gruyter. Laver, J. (2017), ‘Linguistic phonetics: the sounds of languages’, in M. Aronoff and J. Rees-Miller (eds), The Handbook of Linguistics, 2nd edn, 161–84, Hoboken: Wiley-Blackwell. Lee, P. (1996), The Whorf Theory Complex: A Critical Reconstruction, Amsterdam and Philadelphia: John Benjamins.

Leech, G. (1992), ‘Corpora and theories of linguistic performance’, in J. Svartvik (ed.), Directions in Corpus Linguistics: Proceedings of Nobel Symposium 82, Stockholm, 4–8 August 1991, 105–22, Berlin: Mouton de Gruyter. Lenneberg, E. H. (1967), Biological Foundations of Language, New York: John Wiley & Sons. Lesser, R. and L. Milroy (1993), Linguistics and Aphasia: Psycholinguistic and Pragmatic Aspects of Intervention, London: Longman. Levinson, S. C. (1992), Pragmatics, 2nd edn, Cambridge: Cambridge University Press. Levinson, S. C. (1995), ‘Three levels of meaning’, in F. R. Palmer (ed.), Grammar and Meaning: Essays in Honour of Sir John Lyons, 90–115, Cambridge: Cambridge University Press. Levinson, S. C. (1997), ‘Language and cognition: the cognitive consequences of spatial description in Guugu Yimithirr’, Journal of Linguistic Anthropology, 7: 98–131. Levinson, S. C. (1999), ‘H. P. Grice on location on Rossel Island’, Berkeley Linguistic Society, 25: 210–24. Lewis, M. P. (2009), Ethnologue: Languages of the World, 16th edn, Dallas: SIL International. http:// www.ethnologue.com/16. Lieber, R. (2010), Introducing Morphology, Cambridge and New York: Cambridge University Press. Lieberman, P. (2000), Human Language and our Reptilian Brain: The Subcortical Bases of Speech, Syntax, and Thought, Cambridge, MA and London: Harvard University Press. Lieberman, P. (2003), ‘Motor control, speech, and the evolution of human language’, in M. H. Christiansen and S. Kirby (eds), Language Evolution, 255–71, Oxford: Oxford University Press. Lockwood, D. G. (2002), Syntactic Analysis and Description: A Constructional Approach, London and New York: Continuum. Loring, D. W., K. Meador, G. Lee, A. Murro, J. Smith, H. Flanigan and B. Gallagher (1990), ‘Cerebral language lateralization: evidence from intra-carotid amobarbital testing’, Neuropsychologia, 28: 831–8. Losos, J. B. (2023), The Age of Cats: From the Savannah to Your Sofa, London: William Collins. Lucas, C. and R. 
Bayley (2011), ‘Variation in sign languages: recent research on ASL and beyond’, Language and Linguistics Compass, 5: 677–90. Ludden, D. (2016), The Psychology of Language: An Integrated Approach, Los Angeles: Sage. Luraghi, S. and V. Bubenik, eds (2013/2010), The Bloomsbury Companion to Historical Linguistics, London and New York: Bloomsbury. Lust, B. C. and C. Foley, eds (2004), First Language Acquisition: The Essential Readings, Malden and Oxford: Blackwell Publishing. Lyovin, A. V. (1997), An Introduction to the Languages of the World, New York and Oxford: Oxford University Press. Macmillan, M. (2000a), ‘Restoring Phineas Gage: a 150th retrospective’, Journal of the History of the Neurosciences, 9: 46–66. Macmillan, M. (2000b), ‘Commemorating the 150th anniversary of Phineas Gage’s accident’, Journal of the History of the Neurosciences, 9: 90–3. MacWhinney, B. (2017), ‘First language acquisition’, in M. Aronoff and J. Rees-Miller (eds), The Handbook of Linguistics, 2nd edn, 397–413, Hoboken: Wiley-Blackwell. Maddieson, I. (1978), ‘Universals of tone’, in J. Greenberg, C. A. Ferguson and E. A. Moravcsik (eds), Universals of Human Language. Volume 2: Phonology, 335–65, Stanford: Stanford University Press. Maddieson, I. (1999/1997), ‘Phonetic universals’, in W. J. Hardcastle and J. Laver (eds), The Handbook of Phonetic Sciences, 619–39, Oxford: Blackwell. Majid, A., M. Bowerman, S. Kita, D. B. M. Haun and S. C. Levinson (2004), ‘Can language restructure cognition? The case for space’, Trends in Cognitive Sciences 8 (3): 108–14.

Malinowski, B. (1936/1923), ‘The problem of meaning in primitive languages’, in C. K. Ogden and I. A. Richards (eds), The Meaning of Meaning: A Study of the Influence of Language upon Thought and of the Science of Symbolism, 296–336, London: Kegan Paul, Trench, Trubner and Co. Malotki, E. (1983), Hopi Time: A Linguistic Analysis of the Temporal Concepts in the Hopi Language, Berlin and New York: Mouton de Gruyter. Marchand, H. (1969), The Categories and Types of Present-day English Word Formation: A Synchronic-Diachronic Approach, Munich: C. H. Beck’sche Verlagsbuchhandlung. Marshall, C., A. Bel, S. Gulamani and G. Morgan (2021), ‘How are signed languages learned as second languages?’, Language and Linguistics Compass 15 (1): 1–17. https://doi.org/10.1111/lnc3.12403 Martin, J. R. (1985), Factual Writing: Exploring and Challenging Social Reality, Geelong: Deakin University Press. Martin, J. R. and P. Peters (1985), ‘On the analysis of exposition’, in R. Hasan (ed.), Discourse on Discourse: Workshop Reports from The Macquarie Workshop on Discourse Analysis, February 21–25, 1983, 61–92, Wollongong: Applied Linguistics Association of Australia. Mathew, J. (1899), Eaglehawk and Crow: A Study of the Australian Aborigines Including an Inquiry into their Origin and a Survey of Australian Languages, London: David Nutt and Melbourne: Melville, Mullen and Slade. Matras, Y. and P. Bakker, eds (2003), The Mixed Language Debate: Theoretical and Empirical Advances, Berlin and New York: Mouton de Gruyter. Matthews, P. H. (1972), Inflectional Morphology: A Theoretical Study Based on Aspects of Latin Verb Conjugation, Cambridge: Cambridge University Press. Matthews, P. H. (1974), Morphology: An Introduction to the Theory of Word-Structure, Cambridge: Cambridge University Press. Matthews, P. H. (1995), ‘Syntax, semantics, pragmatics’, in F. R. Palmer (ed.), Grammar and Meaning: Essays in Honour of Sir John Lyons, 48–60, Cambridge: Cambridge University Press.
Matthews, P. H. (2003), Linguistics: A Very Short Introduction, Oxford: Oxford University Press. Matthews, P. H. (2007), The Concise Oxford Dictionary of Linguistics, Oxford: Oxford University Press. McConvell, P. (1985), ‘Domains and codeswitching among bilingual Aborigines’, in M. Clyne (ed.), Australia: Meeting Place of Languages, 95–125, Canberra: Pacific Linguistics. McConvell, P. and F. Meakins (2005), ‘Gurindji Kriol: a mixed language emerges from code-switching’, Australian Journal of Linguistics, 25: 9–30. McCulloch, G. (2019), Because Internet: Understanding how Language is Changing, London: Vintage. McEnery, T. and V. Brezina (2022), The Fundamental Principles of Corpus Linguistics, Cambridge: Cambridge University Press. McGregor, W. B. (2003), ‘Language shift among the Nyulnyul of Dampier Land’, Acta Linguistica Hafniensia, 35: 115–59. McGregor, W. B. (2004), The Languages of the Kimberley, Western Australia, London: RoutledgeCurzon. McGregor, W. B. (2013), ‘There are existential constructions and existential constructions: presumption invoking existentials in English’, Folia Linguistica 47 (1): 139–81. McNeill, D. (1966), ‘Developmental psycholinguistics’, in F. Smith and G. Miller (eds), The Genesis of Language: A Psycholinguistic Approach, 15–84, Cambridge, MA: MIT Press. McNeill, D. (1992), Hand and Mind: What Gestures Reveal about Thought, Chicago and London: University of Chicago Press. McNeill, D., ed. (2000), Language and Gesture, Cambridge: Cambridge University Press. McNeill, D. (2005), Gesture & Thought, Chicago and London: University of Chicago Press.

Meier, R. P. (1990), ‘Person deixis in ASL’, in S. D. Fischer and P. Siple (eds), Theoretical Issues in Sign Language Research, Volume 1: Linguistics, 175–190, Chicago: University of Chicago Press. Meier, R. and E. Newport (1990), ‘Out of the hands of babes: on a possible sign advantage in language acquisition’, Language, 66: 1–23. Menn, L. (with contributions by N. F. Dronkers) (2015), Psycholinguistics: Introduction and Applications, 2nd edn, San Diego and Oxford: Plural Publishing. Menzel, C. R. (1999), ‘Unprompted recall and reporting of hidden objects by a chimpanzee (Pan troglodytes) after extended delays’, Journal of Comparative Psychology, 113: 426–34. Mesthrie, R., J. Swann, A. Deumert and W. L. Leap (2009), Introducing Sociolinguistics, 2nd edn, Edinburgh: Edinburgh University Press. Mey, J. L. (2001), Pragmatics: An Introduction, 2nd edn, Oxford: Blackwell. Mitchell, T. F. (1975/1957), ‘The language of buying and selling in Cyrenaica: a situational statement’, in T. F. Mitchell (ed.), Principles of Neo-Firthian Linguistics, 167–200, London: Longman. Mithun, M. (1999), The Languages of Native North America, Cambridge: Cambridge University Press. Mohr, S. and A.-M. Fehn (2013), ‘Phonology of hunting signs in two Kalahari Khoe-speaking groups (Ts’ixa and ||Ani)’. Unpublished manuscript of paper presented to 87th Annual Meeting of the Linguistic Society of America, Boston, MA, 4 January 2013. http://www.linguisticsociety.org/files/3522-6822-1-SM.pdf. Moravcsik, E. A. (2013), Introducing Language Typology, New York: Cambridge University Press. Morford, J. P. and B. Hänel-Faulhaber (2011), ‘Homesigners as late learners: connecting the dots from delayed acquisition in childhood to sign language processing in adulthood’, Language and Linguistics Compass, 5: 525–37. Morton, E. S. and J. Page (1992), Animal Talk: Science and the Voices of Nature, New York: Random House. Moskowitz, B. A. (1978), ‘The acquisition of language’, Scientific American, 239: 92–108.
Mous, M. (1994), ‘Ma’a or Mbugu’, in P. Bakker and M. Mous (eds), Mixed Languages: 15 Case Studies in Language Intertwining, 175–200, Amsterdam: Institute for Functional Research into Language and Language Use (IFOTT). Mufwene, S. S. (2016), ‘The emergence of creoles and language change’, in N. Bonvillain (ed.), The Routledge Handbook of Linguistic Anthropology, 348–65, New York and London: Routledge. Myers, G. (2010), The Discourse of Blogs and Wikis, London and New York: Continuum. Myers-Scotton, C. (1993), Social Motivations for Codeswitching: Evidence from Africa, Oxford: Clarendon Press. Nakanishi, A. (1990/1980), Writing Systems of the World: Alphabets, Syllabaries, Pictograms, Boston and Tokyo: Tuttle Publishing. Narrog, H. and B. Heine, eds (2011), The Oxford Handbook of Grammaticalization, Oxford: Oxford University Press. Newport, E. and R. Meier (1985), ‘The acquisition of American Sign Language’, in D. Slobin (ed.), The Crosslinguistic Study of Language Acquisition. Volume 1: The Data, 881–938, Hillsdale, NJ: Lawrence Erlbaum. Oaks, D. D. (2001), Linguistics at Work: A Reader of Applications, Cambridge, MA: Heinle and Heinle (Thomson Learning). Ochs, E. (1988), Culture and Language Development: Language Acquisition and Language Socialization in a Samoan Village, Cambridge: Cambridge University Press. O’Grady, W., J. Archibald, M. Aronoff and J. Rees-Miller (2017), Contemporary Linguistics: An Introduction, 7th edn, Boston: Macmillan Learning.

Ohara, Y. (1997), ‘Shakaionseigaku no kanten kara mita nihonjin no koe no kotei [High and low pitch of the voice of Japanese from the point of view of sociophonetics]’, in I. Sachiko (ed.), Josei no Sekai [The World of Women], 42–58, Tokyo: Meiji Shoin. Okasha, S. (2016), Philosophy of Science: A Very Short Introduction, 2nd edn, Oxford: Oxford University Press. O’Keeffe, A. and M. McCarthy, eds (2022), The Routledge Handbook of Corpus Linguistics, 2nd edn, London and New York: Routledge. Olsson, J. (2009), Wordcrime: Solving Crime through Forensic Linguistics, London and New York: Continuum. Olsson, J. (2018), More Wordcrime: Solving Crime with Linguistics, London and New York: Bloomsbury. Ong, W. J. (1982), Orality and Literacy: The Technologizing of the Word, New York: Methuen. Onions, C. T. (1966), The Oxford Dictionary of English Etymology, Oxford: Oxford University Press. O’Shea, M. (2005), The Brain: A Very Short Introduction, Oxford: Oxford University Press. Ostler, N. (2005), Empires of the Word: A Language History of the World, London: HarperCollins. Parkvall, M. (2006), Limits of Language, London and Ahungalla: Battlebridge Publications. Pavey, E. L. (2010), The Structure of Language: An Introduction to Grammatical Analysis, Cambridge: Cambridge University Press. Payne, D. L. and I. Barshi, eds (1999), External Possession, Amsterdam: John Benjamins. Peirce, C. S. (1955), Philosophical Writings of Peirce, New York: Dover. Pereltsvaig, A. (2012), Languages of the World: An Introduction, Cambridge: Cambridge University Press. Petersen, S. E. and J. Fiez (1993), ‘The processing of single words studied with positron emission tomography’, Annual Review of Neuroscience, 16: 509–30. Petersen, S. E., P. T. Fox, M. I. Posner, M. Mintun and M. E. Raichle (1989), ‘Positron emission tomographic studies of the processing of single words’, Journal of Cognitive Neuroscience, 1: 153–70.
Pike, K. L. (1948), Tone Languages: A Technique for Determining the Number and Type of Pitch Contrasts in a Language, with Studies in Tonemic Substitution and Fusion, Ann Arbor: University of Michigan Press. Pinker, S. (1994), The Language Instinct: The New Science of Language and Mind, London: Penguin. Pope, M. (2003), The Story of Decipherment: From Egyptian Hieroglyphs to Maya Script, London and New York: Thames & Hudson. Potts, A. and P. Baker (2012), ‘Does semantic tagging identify cultural change in British and American English?’, International Journal of Corpus Linguistics, 17 (3): 295–324. Powell, B. B. (2009), Writing: Theory and History of the Technology of Civilization, Malden and Oxford: Blackwell. Premack, D. and A. J. Premack (1993), The Mind of an Ape, New York: W. W. Norton & Co. Prince, G. (1982), Narratology: The Form and Functioning of Narrative, Berlin: Mouton de Gruyter. Propp, V. (1968), Morphology of the Folktale, translated by L. A. Wagner, Austin: University of Texas Press. Pullum, G. K. (2018), Linguistics: Why it Matters, Cambridge and Medford, MA: Polity Press. Pyles, T. and J. Algeo (1993), The Origins and Development of the English Language, Fort Worth: Harcourt Brace Jovanovich. Quirk, R., S. Greenbaum, G. Leech and J. Svartvik (1972), A Grammar of Contemporary English, London: Longman. Radick, G. (2007), The Simian Tongue: The Long Debate about Animal Language, Chicago and London: University of Chicago Press.

Rankin, R. L. (2003), ‘The comparative method’, in B. D. Joseph and R. D. Janda (eds), The Handbook of Historical Linguistics, 183–212, Oxford: Blackwell. Rasmussen, T. and B. Milner (1977), ‘The role of early left-brain injury in determining lateralization of cerebral speech functions’, Annals of the New York Academy of Sciences, 299: 355–69. Ratiu, P. and I.-F. Talos (2004), ‘The tale of Phineas Gage, digitally remastered’, The New England Journal of Medicine, 351: e21. doi: 10.1056/NEJMicm031024. Reid, N. J. (2003), ‘Phrasal verb to synthetic verb: recorded morphosyntactic change in Ngan’gityemerri’, in N. Evans (ed.), The non-Pama-Nyungan Languages of Northern Australia: Comparative Studies of the Continent’s Most Linguistically Complex Region, 95–123, Canberra: Pacific Linguistics. Reisman, K. (1974), ‘Contrapuntal conversation in an Antiguan village’, in R. Bauman and J. Sherzer (eds), Explorations in the Ethnography of Speaking, 110–24, Cambridge: Cambridge University Press. Renfrew, C. (1987), Archaeology and Language: The Puzzle of Indo-European Origins, London: Jonathan Cape. Renfrew, C. (1989), ‘The origins of the Indo-European languages’, Scientific American, 261: 106–14. Renfrew, C. (1994), ‘World linguistic diversity’, Scientific American, 270: 116–23. Richards, D. G., J. P. Wolz and L. M. Herman (1984), ‘Mimicry of computer-generated sounds and vocal labeling of objects by a bottlenosed dolphin’, Journal of Comparative Psychology, 98: 10–28. Richards, M. and A. L. Malchukov (2008), ‘Preface’, in M. Richards and A. L. Malchukov (eds), Scales, v–xi, Leipzig: Institut für Linguistik, Universität Leipzig. Rickerson, E. M. and B. Hilton, eds (2012), The Five-minute Linguist: Bite-sized Essays on Language and Languages, Sheffield and Bristol, CT: Equinox. Riemer, N. (2010), Introducing Semantics, New York: Cambridge University Press. Rijkhoff, J. (2007), ‘Word classes’, Language and Linguistics Compass, 1: 709–26. Roberson, D., J. Davidoff, I. R. L. 
Davies and L. R. Shapiro (2005), ‘Color categories: evidence for the cultural relativity hypothesis’, Cognitive Psychology, 50: 378–411. Robins, R. H. (1959), ‘In defence of WP ’, Transactions of the Philological Society, 58: 116–44. Robins, R. H. (1984), A Short History of Linguistics, London and New York: Longman. Robinson, A. (2002), The Man who Deciphered Linear B: The Story of Michael Ventris, London: Thames & Hudson. Robinson, A. (2009a), Lost Languages: The Enigma of the World’s Undeciphered Scripts, London: Thames & Hudson. Robinson, A. (2009b), Writing and Script: A Very Short Introduction, Oxford: Oxford University Press. Robinson, A. (2012), Cracking the Egyptian Code: The Revolutionary Life of Jean-François Champollion, London: Thames & Hudson. Rogers, H. (2005), Writing Systems: A Linguistic Approach, Malden and Oxford: Blackwell Publishers. Rogers, L. J. and G. Kaplan (2000), Songs, Roars, and Rituals: Communication in Birds, Mammals, and other Animals, Cambridge, MA : Harvard University Press. Romaine, S. (1995), Bilingualism, 2nd edn, Oxford: Blackwell. Romaine, S. (2017), ‘Multilingualism’, in M. Aronoff and J. Rees-Miller (eds), The Handbook of Linguistics, 2nd edn, 541–56, Hoboken: Wiley-Blackwell. Rosenblum, L. D. (2010), See what I’m Saying: The Extraordinary Powers of our Five Senses, New York and London: W. W. Norton & Co. Rosenblum, L. D. (2013), ‘A confederacy of senses’, Scientific American, 308: 66–9.

Ross, M. D. (2005), ‘Pronouns as a preliminary diagnostic for grouping Papuan languages’, in A. Pawley, R. Attenborough, R. Hide and J. Golson (eds), Papuan Pasts: Cultural, Linguistic and Biological Histories of Papuan-Speaking Peoples, 15–66, Canberra: Pacific Linguistics. Ruhl, C. (1989), On Monosemy: A Study in Linguistic Semantics, Albany: State University of New York Press. Ruhlen, M. (1987), A Guide to the World’s Languages. Volume 1: Classification, London: Edward Arnold. Ruhlen, M. (1991), A Guide to the World’s Languages. Volume 1: Classification (with a Postscript on Recent Developments), rev. edn, London: Edward Arnold. Rymer, R. (1994), Genie: A Scientific Tragedy, London: HarperPerennial. Sacks, H., E. A. Schegloff and G. Jefferson (1974), ‘A simplest systematics for the organization of turn-taking in conversation’, Language, 50: 696–735. Sacks, O. (2012/1989), Seeing Voices: A Journey into the World of the Deaf, London: Picador. Sagart, L. (2005), ‘Sino-Tibetan–Austronesian: an updated and improved argument’, in L. Sagart, R. Blench and A. Sanchez-Mazas (eds), The Peopling of East Asia: Putting together Archaeology, Linguistics and Genetics, 161–76, London: RoutledgeCurzon. Sakel, J. (2015), Study Skills for Linguistics, London and New York: Routledge. Salkie, R. (1995), Text and Discourse Analysis, London: Routledge. Sampson, G. (2005), The ‘Language Instinct’ Debate, London and New York: Continuum. Sandler, W. (2012), ‘The phonological organization of sign languages’, Language and Linguistics Compass, 6: 162–82. Sandler, W. and D. Lillo-Martin (2017), ‘Sign languages’, in M. Aronoff and J. Rees-Miller (eds), The Handbook of Linguistics, 2nd edn, 371–96, Hoboken: Wiley-Blackwell. Sapir, E. (1921), Language: An Introduction to the Study of Speech, New York: Harcourt, Brace and World. Sapir, E. (1929), ‘A study in phonetic symbolism’, Journal of Experimental Psychology, 12: 225–39.
Saussure, F. de (1974/1959), Course in General Linguistics, translated by W. Baskin, Glasgow: William Collins. Savage-Rumbaugh, S. and R. Lewin (1994), Kanzi: The Ape at the Brink of the Human Mind, New York: John Wiley & Sons. Saville-Troike, M. (2002), The Ethnography of Communication: An Introduction, 3rd edn, Oxford: Blackwell. Saville-Troike, M. and K. Barto (2017), Introducing Second Language Acquisition, 3rd edn, New York: Cambridge University Press. Saxton, M. (2010), Child Language: Acquisition & Development, London: Sage. Schachter, P. and T. Shopen (2007), ‘Parts-of-speech systems’, in T. Shopen (ed.), Language Typology and Syntactic Description. Volume I: Clause Structure, 1–60, Cambridge: Cambridge University Press. Schiffrin, D., D. Tannen and H. E. Hamilton, eds (2001), The Handbook of Discourse Analysis, Malden and Oxford: Blackwell. Sebba, M. (2007), Spelling and Society: The Culture and Politics of Orthography around the World, Cambridge: Cambridge University Press. Sebeok, T. A. and R. Rosenthal, eds (1980), The Clever Hans Phenomenon: Communication with Horses, Whales, Apes, and People, New York: The New York Academy of Sciences. Sebeok, T. A. and J. Umiker-Sebeok, eds (1980), Speaking of Apes: A Critical Anthology of Two-Way Communication with Man, New York and London: Plenum Press. Seiler, H. (1983), Possession as an Operational Dimension of Language, Tübingen: Narr.

Senghas, A., S. Kita and A. Özyürek (2004), ‘Children creating core properties of language: evidence from an emerging sign language in Nicaragua’, Science, 305: 1779–82. Sheidlower, J. (1995), The F Word, New York: Random House. Shopen, T., ed. (2007a), Language Typology and Syntactic Description. Volume I: Clause Structure, Cambridge: Cambridge University Press. Shopen, T., ed. (2007b), Language Typology and Syntactic Description. Volume II: Complex Constructions, Cambridge: Cambridge University Press. Shopen, T., ed. (2007c), Language Typology and Syntactic Description. Volume III: Grammatical Categories and the Lexicon, Cambridge: Cambridge University Press. Siegel, J. (2008), The Emergence of Pidgin and Creole Languages, Oxford: Oxford University Press. Silver, S. and W. R. Miller (1997), American Indian Languages: Cultural and Social Contexts, Tucson: University of Arizona Press. Simpson, J. A. and E. S. C. Weiner, eds (1989), The Oxford English Dictionary, Oxford: Clarendon Press. Sinclair, J., ed. (1987), Collins COBUILD English language dictionary, 1st edn, London: HarperCollins. Sinclair, J., ed. (1990), Collins COBUILD English Grammar, London and Glasgow : Harper Collins. Singh, I. (2005), The History of English: A Student’s Guide, London: Hodder Arnold. Skinner, B. F. (1957), Verbal Behavior, Englewood Cliffs, NJ : Prentice-Hall. Slobin, D. I., ed. (1985a), The Crosslinguistic Study of Language Acquisition. Volume 1: The Data, Hillsdale, NJ, and London: Lawrence Erlbaum. Slobin, D. I., ed. (1985b), The Crosslinguistic Study of Language Acquisition. Volume 2: Theoretical Issues, Hillsdale, NJ, and London: Lawrence Erlbaum. Slobin, D. I., ed. (1992), The Crosslinguistic Study of Language Acquisition. Volume 3, Hillsdale, NJ, and London: Lawrence Erlbaum. Slobin, D. I. (1996a), ‘From “thought and language” to “thinking for speaking”’, in J. J. Gumperz and S. C. Levinson (eds), Rethinking Linguistic Relativity, 70–96, Cambridge: Cambridge University Press. 
Slobin, D. I. (1996b), ‘Two ways to travel: verbs of motion in English and Spanish’, in M. Shibatani and S. Thompson (eds), Grammatical Constructions: Their Form and Meaning, 195–219, Oxford: Clarendon Press. Slobin, D. I., ed. (1997a), The Crosslinguistic Study of Language Acquisition. Volume 4, Hillsdale, NJ, and London: Lawrence Erlbaum. Slobin, D. I., ed. (1997b), The Crosslinguistic Study of Language Acquisition. Volume 5: Expanding the Contexts, Hillsdale, NJ, and London: Lawrence Erlbaum. Smith, K. A. (2011), ‘Grammaticalization’, Language and Linguistics Compass 5 (6): 367–80. Snow, C. E. (1996), ‘Issues in the study of input: finetuning, universality, individual and developmental differences, and necessary causes’, in P. Fletcher and B. MacWhinney (eds), The Handbook of Child Language, 179–93, Oxford: Blackwell. Song, J. J., ed. (2011), The Oxford Handbook of Linguistic Typology, Oxford: Oxford University Press. Speake, J. (2002), The Oxford Dictionary of Idioms, Oxford: Oxford University Press. Spears, R. A., ed. (1990), NTC’s American Idioms Dictionary, Lincolnwood, IL : National Textbook Company. Spencer, A. (2017), ‘Morphology’, in M. Aronoff and J. Rees-Miller (eds), The Handbook of Linguistics, 2nd edn, 211–33, Hoboken: Wiley-Blackwell. Spencer, A. and A. M. Zwicky, eds (2001/1998), The Handbook of Morphology, Oxford: Blackwell. Sproat, R. (2010), Language, Technology, and Society, Oxford and New York: Oxford University Press. Stemmer, B. and H. Whitaker, eds. (1998), Handbook of Neurolinguistics, New York: Academic Press.

Stokoe, W. C. (1960), Sign Language Structure: An Outline of the Visual Communication Systems of the American Deaf, Buffalo: Department of Anthropology and Linguistics, University of Buffalo. Strozer, J. R. (1994), Language Acquisition after Puberty, Washington, DC : Georgetown University Press. Stubbs, M. (1983), Discourse Analysis: The Sociolinguistic Analysis of Natural Language, Oxford: Blackwell. Sutton-Spence, R. and B. Woll (1999), The Linguistics of British Sign Language: An Introduction, Cambridge: Cambridge University Press. Tagg, C. (2012), Discourse of Text Messaging: Analysis of SMS Communication. London and New York: Continuum. Talmy, L. (2003), ‘Concept structuring systems in language’, in M. Tomasello (ed.), The New Psychology of Language: Cognitive and Functional Approaches to Language Structure. Volume 2, 15–46, Mahwah, NJ, and London: Lawrence Erlbaum. Talmy, L. (2007), ‘Lexical typologies’, in T. Shopen (ed.), Language Typology and Syntactic Description. Volume III: Grammatical Categories and the Lexicon, 66–168, Cambridge: Cambridge University Press. Tannen, D. (2003), ‘Gender and family interaction’, in J. Holmes and M. Meyerhoff (eds), The Handbook of Language and Gender, 179–201, Oxford: Blackwell. Tannen, D., H. E. Hamilton and D. Schiffrin, eds (2015), The Handbook of Discourse Analysis, 2nd edn, Oxford: Wiley-Blackwell. Terrace, H. S. (1979), Nim: A Chimpanzee who Learned Sign Language, New York: Knopf. Terrace, H. S., L. A. Petitto, R. J. Sanders and T. G. Bever (1979), ‘Can an ape create a sentence’, Science, 206: 891–902. Tervoort, B. T. (1953), Structurele Analyse van Visueel Taalgebruik binnen een Groep Dove Kinderen, Amsterdam: Noord-Hollandsche Uitgevers Maatschappij. Teubert, W. and A. Cermáková (2007), Corpus Linguistics: A Short Introduction, London and New York: Continuum. Thomas, J. (1995), Meaning in Interaction: An Introduction to Pragmatics, London and New York: Longman. Thomas, M. 
(2007), ‘The Evergreen Story of Psammetichus’ Inquiry into the Origin of Language’, Historiographia Linguistica, 34: 37–62. Thomas, M., ed. (2007), Culture in Translation: The Anthropological Legacy of R. H. Mathews, Canberra: ANU E Press and Aboriginal History. Thomason, S. G. and T. Kaufman (1988), Language Contact, Creolization, and Genetic Linguistics, Berkeley and London: University of California Press. Todd, L. (1990), Pidgins and Creoles, London: Routledge. Tomasello, M. (1999), The Cultural Origins of Human Cognition, Cambridge, MA : Harvard University Press. Tomasello, M. (2003a), Constructing a Language: A Usage-Based Theory of Language Acquisition, Cambridge, MA, and London: Harvard University Press. Tomasello, M. (2003b), ‘The key is social cognition’, in D. Gentner and S. Goldin-Meadow (eds), Language in Mind: Advances in the Study of Language and Thought, 47–57, Cambridge, MA : MIT Press. Tomasello, M. (2008), Origins of Human Communication, Cambridge, MA, and London: MIT Press. Tomlin, R. (1986), Basic Word Order: Functional Principles, London: Croom Helm.

Thompson, R. L. (2011), ‘Iconicity in language processing and acquisition: what signed languages reveal’, Language and Linguistics Compass, 5: 603–16. Trask, R. L. (1998), Key Concepts in Language and Linguistics, London: Routledge. Traugott, E. C. (2003), ‘Constructions in grammaticalization’, in B. D. Joseph and R. D. Janda (eds), The Handbook of Historical Linguistics, 624–47, Oxford: Blackwell. Traxler, M. J. (2012), Introduction to Psycholinguistics: Understanding Language Science, Malden and Oxford: Wiley-Blackwell. Trudgill, P. (1986), Dialects in Contact, Oxford: Basil Blackwell. Trudgill, P. (2023), The Long Journey of English: A Geographical History of the Language, Cambridge: Cambridge University Press. Tsunoda, T. (2005), Language Endangerment and Language Revitalisation, Berlin: Mouton de Gruyter. Vakhtin, N. (2002), ‘Language death prognosis: a critique of judgement’, SKY Journal of Linguistics, 15: 239–50. Valli, C., C. Lucas and K. J. Mulrooney (2005/1992), Linguistics of American Sign Language: An Introduction, Washington, DC: Gallaudet University Press. Van Valin, R. D. (2001), An Introduction to Syntax, Cambridge: Cambridge University Press. Van Valin, R. D. (2017), ‘Functional linguistics: communicative functions and language structure’, in M. Aronoff and J. Rees-Miller (eds), The Handbook of Linguistics, 2nd edn, 141–57, Hoboken: Wiley-Blackwell. Velupillai, V. (2012), An Introduction to Linguistic Typology, Amsterdam and Philadelphia: John Benjamins. Velupillai, V. (2015), Pidgins, Creoles and Mixed Languages: An Introduction, Amsterdam and Philadelphia: John Benjamins. Vermeerbergen, M. and L. Leeson (2011), ‘European signed languages – towards a typological snapshot’, in B. Kortmann and J. van der Auwera (eds), The Languages and Linguistics of Europe: A Comprehensive Guide, 269–287, Berlin and Boston: De Gruyter Mouton. Vossen, R., ed. (2013), The Khoesan Languages, London and New York: Routledge.
Walsh, M. (2014), ‘Indigenous language maintenance and revitalisation’, in H. Koch and R. Nordlinger (eds), The Languages and Linguistics of Australia: A Comprehensive Guide, 329–62, Berlin and Boston: De Gruyter Mouton. Warren, P. (2013), Introducing Psycholinguistics, Cambridge: Cambridge University Press. Warren, R. M. and R. P. Warren (1970), ‘Auditory illusions and confusions’, Scientific American, 223: 30–6. Wasow, T. (2017), ‘Generative grammar: rule systems for describing sentence structure’, in M. Aronoff and J. Rees-Miller (eds), The Handbook of Linguistics, 2nd edn, 119–39, Hoboken: Wiley-Blackwell. Weisser, M. (2016), Practical Corpus Linguistics: An Introduction to Corpus-based Language Analysis, Malden and Oxford: Wiley Blackwell. Wescott, R. (1980), Sound and Sense: Linguistic Essays on Phonosemic Subjects, Lake Bluff, IL: Jupiter Press. Whaley, L. J. (1997), Introduction to Typology, Thousand Oaks, CA: Sage. Whitney, P. (1998), The Psychology of Language, Boston: Houghton Mifflin. Whorf, B. L. (1956), Language, Thought and Reality: Selected Writings of Benjamin Lee Whorf, Cambridge, MA: MIT Press. Williams, J. M. (1975), Origins of the English Language: A Social and Linguistic History, New York: Free Press.

Wilton, D. (2009), Word Myths: Debunking Linguistic Urban Legends, New York: Oxford University Press. Winchester, S. (2003), The Meaning of Everything: The Story of the Oxford English Dictionary, Oxford: Oxford University Press. Wolfram, W. and N. Schilling-Estes (2006), American English: Dialects and Variation, Malden and Oxford: Blackwell. Woll, B. (2013), ‘The history of sign language linguistics’, in K. Allan (ed.), The Oxford Handbook of the History of Linguistics, 91–104, Oxford: Oxford University Press. Wray, A. and A. Bloomer (2012), Projects in Linguistics and Language Studies, 3rd edn, Abingdon and New York: Routledge. Wurm, S. A., ed. (1975), New Guinea Area Languages and Language Study. Volume 1: Papuan Languages and the New Guinea Linguistic Scene, Canberra: Pacific Linguistics. Yule, G. (1996), Pragmatics, Oxford: Oxford University Press. Yule, G. (2022), The Study of Language, 8th edn, Cambridge: Cambridge University Press. Zeshan, U. and C. de Vos, eds (2012), Sign Languages in Village Communities: Anthropological and Linguistic Insights, Berlin: Walter de Gruyter. Zeshan, U. and N. Palfreyman (2017), ‘Sign language typology’, in A. Y. Aikhenvald and R. M. W. Dixon (eds), The Cambridge Handbook of Linguistic Typology, 178–216, Cambridge: Cambridge University Press. Zucchi, S. (2012), ‘Formal semantics of sign languages’, Language and Linguistics Compass, 6: 719–34. Zuckermann, G. (2006), ‘A new vision for Israeli Hebrew: theoretical and practical implications of analyzing Israel’s main language as a semi-engineered Semito-European hybrid language’, Journal of Modern Jewish Studies, 5: 57–71.

Language Index

Acehnese 121, 377 Acholi 372–3 Adamorobe Sign Language 313 Afrikaans 421, 433 Afroasiatic 426, 427 alternate sign languages 323–4, 325 American Sign Language, see ASL Amerind 423 Amharic 38, 377, 426 Anglo-Saxon 100 Anguthimri 374 ǁAni Khoe Sign Language 326 Arabic 35, 91, 212, 339, 340, 416, 426 Juba Arabic 434 Sudanese Creole Arabic see Juba Arabic Aramaic 339, 426 Archaic Chinese 397 Archi 129–30 Armenian Romani 435 Arrernte 364 Arrernte Sign Language 324–5, 330 ASL 243–5, 312–23 passim, 330 Asmat family 364 Atsugewi 378 Auslan 312, 314–23 passim, 329 Australian Sign Language, see Auslan Austronesian 396, 425–6, 429 Aymara 9 Babalia 434 Babungo 78 Ban Khor Sign Language 313 Banoni 408 Bantu 40, 279, 428–9, 435 Bardi 96, 383, 418–19, 437 Baric 429 Basque 4 Basque Romani 435

Beembe 36 Bemba 9, 377, 421 Bengali 10, 416 Berber 426 Blackfoot 9 Bodic 429 British Sign Language see BSL Broken 434 BSL 313–22 passim Bulu 10 Bunuba 37, 171, 383 Burmese 36, 37, 429 Burmese-Lolo 429 Cantonese 42, 164, 338, 377, 415 Celtic 406 Chadic 426 Cherokee 339, 377 Chichewa 96, 421 Chukchee 373–4 contact languages 432–5 Cree 434, 435 creoles 366, 433–4 Cushitic 426, 435 Dahalo 177, 426 Dani 260 Danish 9, 31, 35, 36, 37, 39, 92, 95, 164–5, 172, 214, 215, 298, 368, 377, 390, 398, 399, 401, 470 Danish Sign Language (DTS) 313 deaf sign languages see primary sign languages Dhurga 96 Djabugay 96 Dortika 435 Dutch 34, 37, 95, 298, 378, 395 Dravidian 404 Dyirbal 171

Egyptian 336, 426 English varieties Aboriginal 103 African American Vernacular English (AAVE) 168 American 38, 41, 47, 93, 99, 162, 164, 212–14, 218, 344, 395, 415 Australian 37, 41, 47, 50, 89, 92, 93, 97, 101, 109, 161, 164, 402, 415 BBC 39, 41 British 47, 109, 162, 164, 201, 209, 213, 218 Californian 33–4 Estuary 37 Manually Coded 330 Martha’s Vineyard 403 Middle English 213, 343, 389, 394, 398, 400, 404 Modern English 213, 389, 400, 403 New York City 165–6, 181 Northumbrian 34, 37 Old 139, 213, 342, 343, 389, 393, 396, 398, 399, 400, 403 Scottish 36, 37 Signed see Manually Coded New Zealand 38, 93, 164, 175 Ewe 36, 87–8, 376, 377, 398, 404 Fanagalo 432–3 Fang 428 Finnish 300, 366, 368, 377 Finnish Sign Language 320 French 34, 37, 39, 91, 92, 133, 170, 181, 212, 279, 300, 342–3, 361, 374, 390–1, 396, 399, 400, 404, 422, 434–5 Old 400 Parisian 34, 390 French Sign Language (LSF) 313, 319 Frisian 378 Friulian 173 Gaelic Irish 376, 377 Scots see Gàidhlig Gàidhlig 175 Gan 338, 429 Ge’ez 377 Georgian 377 German 34, 36, 91, 95, 123–4, 172, 173, 212, 298, 299, 345, 378, 390, 399, 422

Standard 37, 345 Pennsylvanian 91 Swiss 172 Germanic 378, 390–1 Goemai 53, 67, 361, 367, 369 Gooniyandi 9, 10, 41, 45, 47–8, 53, 64, 65, 93, 142, 171, 172, 367, 368, 369, 384 Greek 10, 92, 337 Ancient 337, 340, 377, 425 Gros Ventre 166 Guaraní 172, 366 Gumbaynggirr 11–12, 76 Gun-djeihmi 367, 368, 369 Gurindji 173–4, 177, 470 Guugu Yimithirr 90–1, 261–2 Hadza 431 Haitian Creole 366 Haiǁom 430 Hakka 338, 377, 429 Hausa 87, 377, 426 Hawaiian 364, 438 Hebrew 178, 339 Biblical 377 Hindi 10, 168, 345, 377, 415, 416, 468; see also Urdu Hiri Motu 433 Hittite 336, 377 Hixkaryana 377 home signs 313, 325 Hopi 259–60 Hungarian 9, 10, 34, 43, 71–2, 117, 173, 291, 298, 366, 367, 368, 373, 377, 393, 418 Old 404 Hungarian Sign Language 313 Icelandic 10, 392 Idoma 40 Ilocano 393 Indo-European 418, 423–5 Indonesian 10, 128, 133 Israeli Sign Language 313, 318 Italian 173, 374, 392 Jahai 133 Japanese 10, 49, 87, 117, 166, 170, 339, 345, 367, 369, 377, 378, 416 Jaru 96, 171

Jiwarli 376 Juǀ’hoan 263, 430 Kannada 404 Kanuri 421 Kata Kolok Sign Language 313 Kaqkchikel 420 Kele 37 Keren 429 Kharia 374 Khoe-Kwadi 430–1 Khoisan 40, 47, 430–2 Khwe 430 Kija 172 Kinyarwanda 377 Kiowa 95 Kisi 170, 367, 369, 377 Korean 170, 300, 340–1, 345, 377, 378 Kriol 172, 173, 174, 439 Kukatja 163 Kuman 402 Kuot 54, 78, 367, 368 Kurmanjî Kurdish 377 Ku Waru 367, 368 Koyukon 367 Kwadi 430, 474 Kwaza 367, 368 Kx’a 430, 431 Lakes Plain family 364 Lao 367, 369, 377 Latin 9, 63–3, 100, 340, 342, 343, 345, 367, 368, 370, 373, 392, 393, 395–6, 399, 400, 408–9, 420, 422 Laven 367, 369 Lingala 377 Luganda 428 Ma’a 435 Malagasy 129, 377 Malay 173 Malayalam 374, 377 Malayo-Polynesian 425 Mandarin Chinese 10, 36, 86, 91, 164, 198–200, 212, 338, 361, 367, 368, 369, 376, 377, 383, 397, 398, 415, 429 Mandinka 404

Māori 115, 178, 438 Marathi 404 Martha’s Vineyard Sign Language 313 Mbugu 435 Melpa 37 Mende 377 Meryam Mir 70 Michif 367, 368, 434–5 Miriwoong 383, 403 mixed languages 434–5 Monastic sign language 327–8 Mongolian 377 Mosetén 95 Motu 406, 433 Nadëb 377 Nama 377, 430 Nama-Damara, see Nama Navajo 378 Near East Qirishmal 435 Ngandi 394 Ngan’gityimerri 389–90 Ngarinyin 98, 117, 383 Nicaraguan Sign Language 313, 329 Niger-Congo 425, 427–9 Nilo-Saharan 426, 427 Nkore-Kiga 375–6 Northern Sotho 77, 115–16 Norwegian 378, 390 Norwegian Romani 435 Nubi 434 Nung 429 Nyangumarta 383 Nyikina 172, 418–20, 437 Nyulnyul 44, 60, 61, 93, 101, 176, 177, 418–19, 437 Old Norse 400 Omotic 426 Paamese 98, 372, 395 Pacific Pidgin English 434 Pali 366–7, 409 Panare 377 Pangasinan 438 Papago 9 Papuan 429–30, 433 Phoenician 339–40

Phrygian 248 Pidgins 432–3 Pidgin Yimas 433 Pig Latin 170, 181 Pipil 377, 397 Pirahã 47 PISL see Plains Indian Sign Language Pitta-Pitta 85 Plains Indian Sign Language 325–6, 330 Police Motu 433 Polish 43, 298 Portuguese 92, 212, 408–9, 416 primary sign languages 312–23 proto-Afroasiatic 426 proto-Austronesian 425 proto-Germanic 139, 422 proto-Indo-European 390–1, 406, 422, 423, 425 proto-Niger-Congo 427 proto-Romance 420 proto-Trans-New Guinea 430 proto-Uralic 404 Qiang 429 Quechua 40, 398 Quiche Mayan 288 Ritharrngu 394 Romance 378, 390, 396, 420 Romani 435 Romanian 95, 98–9 Russian 212, 416 Sabaot 367, 368 SAE see Standard Average European Saliba 76–7, 95–6, 127–8, 361 Salishan 86 Samoan 9, 86, 438 Sandawe 431 Sango 377 Sanskrit 133, 405, 409 Savosavo 133 Semitic 339, 340, 426 Setswana 88, 428 Shan 438 Shona 421, 428 Shua 9, 44, 361, 367, 369, 376, 377, 430 Sidamo 166–7

sign languages see primary sign languages, alternate sign languages Sindhi 40 Sinitic 429 Sino-Tibetan 429, 430 Southern Min 429 Southern Sotho 428 Spanish 35, 88, 92, 172, 212, 261, 291, 298, 374, 377, 378, 393, 397, 398, 400, 416 Castilian 37 Old 397 Spanish Sign Language 317 Standard Average European 259–60 Sumerian 16, 178, 336, 308, 336, 337–8 Swahili 43, 172–3, 367, 368, 421, 428, 429 Asian Swahili 434 Cutchi-Swahili 434 Swedish 37, 212, 279, 378, 390, 392 Taba 53, 162, 367, 368 Tahitian 438 Tagalog 59, 377, 393 Tamazight 377, 426 Tambora 175 Tamil 378, 468 Tarahumara 260 Tay-Nung 438 Telugu 468 Tetum Prasa 434 Tetun 434 Tetun Dili 434 Thai 44, 263, 274, 287, 377 Tibeto-Burman 429 Tok Pisin 433, 434 Tongan 100, 148 Torres Strait Creole see Broken Trans-New Guinea 429–30 Tshàúkák’ùí see Ts’ixa Sign Language Ts’ixa 377, 430 Ts’ixa Sign Language 326–7 Turkish 69–70, 291, 345, 366, 378 Tuu 430–1 Twi 377 Tzotzil 377 Una 377 Unggumi 37

Urdu 168, 345, 404, 415, 416, 468; see also Hindi Urubú 377 Verlan 181 Vietnamese 39, 98, 345 Village sign languages 313 Walmajarri 43, 96, 172, 383, 418 Wanyjirra 173–4 Warao 129 Wari’ 40 Warlpiri 325, 361 Warlpiri Sign Language 324–5 Warrwa 14, 64, 77, 86–7, 93, 310–11, 367, 368, 418–19, 437 Warumungu 69 Welsh 37, 371, 377 West Greenlandic 367, 368 Western Desert language 163, 398 Wunambal 383 Xhosa 179, 263, 361, 428 !Xóõ 40, 47, 430, 431 Yanyuwa 167 Yawijibaya 383 Yawuru 67, 383 Yélî Dnye 40, 152 Yimas 433 Yindjibarndi 34 Yingkarta 58, 59 Yolngu Sign Language 325 Yoruba 377 Yup’ik 58, 107, 361, 367 Zulu 361, 428, 432–3

Name Index

Alice 94, 150–1 American English 2006 corpus (AmE06) 218–22 passim AntConc 217–29 passim Arbib, Michael 250 Ashcraft, Mark 271 Bakker, Peter 435 Bartlett, Tom xvii Beagle Bay Mission 176 Bellwood, Peter 426 Bloomfield, Leonard 107 Boas, Franz 259 Bolinger, Dwight 311 British English 2006 corpus (BE06) 218, 221, 222 British National Corpus (BNC) 209 Broca, Paul 268, 270 Brown Corpus 209 Carroll, Lewis 9, 93–4, 108, 138, 150 Caxton, William 343 Child Language Data Exchange System (CHILDES) 214 Chimpsky, Nim 244–5, 255–6 Chomsky, Noam 17, 108, 126, 251, 259 Chrispin, Lucy 219 Christie, Agatha 18 Churchill, Winston 467 Clever Hans 471 Coates, Jennifer 196, 201 Cognitive Linguistics 134–5 Collins Birmingham University International Language Database (COBUILD) 209, 222, 229–30 Construction Grammar 17, 145 Corpus of Contemporary American English (COCA) 99, 212, 217–27 passim Corpus Resource Database 212 Crowley, Terry 402 Crystal, David 2, 348, 351–2, 355 Darwin, Charles 237 Dik, Simon 17, 122, 123 Dunbar, Robin 250–1 Ethnologue 415, 423, 427, 429–30, 436 Fillmore, Charles 17, 136 Firth, J. R. 140, 226 Fodor, Jerry 259 Fontaine, Lise 219 Freud, Sigmund 272 Frisch, Karl von 238, 239 Functional Grammar 17, 123 Gage, Phineas 267–8, 278 Gal, Susan 173 Gardner, Beatrix and Allen 244 Genie 248 Givón, Talmy 259 Glottolog 182, 312, 417, 427, 430, 436, 474 Goodall, Jane 241 Greenberg, Joseph 423, 426, 430 Grice, Paul 149 Grimm, Jacob 390 Gutenberg, Johannes 346 Haas, William 73 Halliday, Michael A.K. 17, 122, 123, 126, 136, 169, 191, 203 Hamburg Sign Language Notation System 314 Hasan, Ruqaiya 191 Haviland, John 261 Hayes, Keith and Cathy 243 Head, Henry 272 Heath, Jeffrey 394

Heider, Eleanor 260 Hockett, Charles 12, 13, 19, 311, 335 Humboldt, Wilhelm von 259 Humpty Dumpty 9, 94 International Corpus of English (ICE) 212, 213 International Phonetic Alphabet (IPA) 28, 29–30 Jäger, Andreas 423 Jespersen, Otto 248, 394, 401 Kaminski, Juliane 242 Kanzi 246 Kellogg, Winthrop and Luella 243 Kendon, Adam 311, 325, 330 Labov, William 165–6, 181, 403 Lakoff, George 134, 259 Langacker, Ronald 17, 122, 123, 134, 259 Leborgne 270 Leipzig glossing rules 62 Lennard, Maudie 310–12, 282 Levinson, Stephen 152, 261 Malinowski, Bronislaw 152, 198 Martinet, André 17 Matthews, Peter 136 McConvell, Patrick 173–4 McGurk, Harry 264 McNeill, David 250, 310, 311 Mitchell, T.F. 198 Mous, Maarten 435 Müller, Max 248 Murray, James 209 Ohara, Y. 166 Oxford English Dictionary 209 Panbanisha 246 Panzee 13–14, 247 Peirce, Charles S. 237 Pike, Kenneth 377 Pinker, Steven 251, 259 Poirot 18 Premack, David and Ann 246

Psammetichus 248 Quirk, Randolph 209, 210 Rask, Rasmus 390 Recorde, Robert 8 Reid, Nick 389–90 Renfrew, Colin 425 Rico 242–3 Ruhlen, Merritt 415, 423 Sagart, Laurent 426 Sapir, Edward 98, 259–61, 311 Sarah 246 Saussure, Ferdinand de 7, 16, 133 Savage-Rumbaugh, Sue 246 Sejong, King 340 Sequoyah 339 Sheng, Bi 346 SIL International 18 Sketch Engine 217 Skinner, B.F. 294 Slobin, Dan 260 Spooner, Reverend Archibald 266 Stokoe, William 321 Summer Institute of Linguistics see SIL International Survey of English Usage 211 Talmy, Leonard 378, 382 Tan see Leborgne TenTen corpora 215 Terrace, Herbert S. 244–6 Tomasello, Michael 241, 252 Trudgill, Peter 168 Twitter (X) 351–2 Ventris, Michael 337 Wada, Juhn 274 Webster, Noah 344 Wernicke, Carl 269, 270–1 West Coast Functional Grammar 17 Whorf, Benjamin L. 259–60, 278 Wordsmith Tools 217, 218 Yerkes, Robert 243

Subject Index

abbreviations, use of 352 abjads 339–40 ablaut 316 absolutive case 373–4 accent 162 accusative case 63, 373–4 acronyms 89–90 Actor 120–2, 135, 144, 319–20, 375 adjacency pairs see exchanges adjectival phrase 117 adjectives 84, 318 absence of 86 adverbial phrase 117 adverbs 85, 318 affixes 59–60 distinguishing properties of 63–5 in sign languages 316 affixing typology 370 affricates 36, 166, 289, 365 agglutinating languages 366, 418 agrammatic aphasia 270 agreement (grammatical) 63, 316, 318 allographs 340 allomorphs 58–9 lexical conditioning of 67 morphological conditioning of 67 phonological 67 phonological conditioning of 67 suppletive 67 allophones 44–5 alphabets 319, 340–1 alveolars 29, 33, 34, 36, 37, 40, 289, 364 alveo-palatals 29, 34, 289 ambiguity 112, 121, 149 amelioration 400 analogical change 394–5 animacy hierarchy 374, 379

animal communication 236–47 alarm calls 239, 240–1 bee dances 13, 238–9 birds 239–40 bodily gestures and signs 237–8, 241, 250 chimpanzees 241–2 vervet monkeys 240–1 annotation 216 anomic aphasia 271, 272 anticipatory errors 266 antonyms 141 antonymy 141 aphasia 270–2 approximant 37–8 arbitrariness 8, 9, 13, 420 arcuate fasciculus 269, 271, 272 aspiration 35; see also voice onset time assimilation 392–3, 401 auditory-vocal medium 12 auxiliaries 85, 116, 318, 395 avoidance styles 170–1 babbling 286 backformation 96 backness of vowels 38–9 basic mastery of language 287 basic vocabulary 420, 421, 422 beats 311 bifurcation 395 bilabials 33, 34, 35, 37, 40, 44 bilingualism 172–4, 404 binomials 99 blendings 90 blogs 352–3 bodily signs see under animal communication; see also gestures bootstrapping 297

borrowing 90–2, 318, 342, 394, 397, 404, 420, 422, 432 from English 92 into English 91–2 bottom-up processing 264 brain scanning 274–7 Broca’s aphasia 270, 272 Broca’s area 268–9, 270, 272, 275, 276 calques 91 caretaker speech 288 case 63, 69–70, 177, 325, 365, 366–7, 373–4, 383, 384, 396, 420 categorical perception 263 cerebral cortex 267 chain shift 391 child language learning developmental stages in 285–8 of lexicon 290 of morphology 291–2, 295 of phonetics and phonology 288–9 of semantics 290–1, 297 of sign languages 322 strategies in 294–7 of syntax 292–3 classifiers 104, 287, 316–7 clauses 113 relational structure 120–4 as a syntagm of phrases 118–19 transitivity of 376 clay tablets 336 ‘Clever Hans’ effect 242, 471 click languages 263, 361 clicks 29, 40, 263, 361, 430, 431; see also velaric airstream clipping 89 clitics 63 distinctive properties of 64–5, 66 possessive 63, 65, 67, 75, 85, 117, 291, 298, 399 clusters 227–8 coarticulation 40 coda 187, 189 code-switching 173–4 cognates 390–1, 418 coherence 190–1 cohesion 191–5 cohesive devices 191 co-hyponym 141 coinage 93–4

collocations 140, 225–7 colour terminologies 260, 300 comparative method 418–20 complementary distribution 45, 49, 58 complex sentences 113, 293, 321, 398 componential analysis 143–4 compositional semantics 144 compounding 95, 317, 326, 403 concordances 223–5 conditioned-response learning 294 conduction aphasia 271, 272 conjunction 193 conjunctions 85, 117, 293, 318 connotation 133–4, 399, 400 consonant clusters 168, 289, 338, 339, 392 consonants 29, 32–8, 41, 47, 289, 337, 339–40, 341, 364, 392, 431 continued learning of language 287–8 contractibility 111–2 contra-lateral control 267 conventionality 8 Conversation Analysis (CA) 5, 200 converses 141 cooing 286 cooperative principle 148–50 corpora general 212 historical 213 learner 213–14 multilingual 214 multimedia 214 parsed 213 specialized 212 corpus 208–9 corpus linguistics 5, 209–30 limitations 228 creativity see productivity critical period 240, 299, 304, 323 cultural transmission 14, 239, 240 cuneiform 336 deaf sign languages see primary sign languages decipherment 355 deictic expressions 148 deletion of phones 392 demonstratives 115, 148, 192, 399 demotic script 336

dentals 29, 33–4, 36, 40, 341 derivation 95, 316 derivational affix 60–1, 65, 66, 83, 96, 316, 394, 398–9 distinctive properties of 63–4 descriptive orientation of linguistics 2 Devanagari script 168, 345 dialect 162–4, 415 continuum 163 standard 164 dialectal variation 162–4 dichotic listening test 273–4 diglossia 172, 345 digraphia 345 diphthongs 40–1, 403 direct speech act 146–7 Discourse Analysis 5, 200 discourse particles 225 discourse structure 195–200 discourses 161, 186 displacement 13–4, 239, 242, 286 dissimilation 393 domains 160–1 duality 14–5, 241, 246, 313 dyslexia 344 dysphemism 101 ejectives 29, 39–40 electroencephalograms (EEGs) 276–7 electronic media 347–53 ellipsis 194 embedding 117 emojis 335, 349 emoticons 335, 349 enclitic 63; see also clitic epenthesis see insertion (of phones) ergative case 373–4 ethics 216–7 etymology 88 euphemism 100–1 Event 122, 319 evolutionary linguistics 5, 247–52 exchange errors 266 exchanges 197–8, 199 expanded pidgins 433 experiential roles 120–2, 319 expositions 189–90

extension of meaning 96–7, 139, 290–1 of grammatical constructions 397–8 families of languages 417 feedback loop 27 felicity conditions 147 ‘feral’ children 248 fingerspelling 218–19 first language learning see child language learning flaps 37, 45, 87 fluent aphasia see Wernicke’s aphasia foreign influence in language change 404 FOXP2 gene 251 free variation 45, 58 frequency of words 217–19, lists 217–18 fricatives 29, 36, 166, 289, 365, 390–1, 406 functional causes of language change 402–3 functional linguistics 17, 125 functional magnetic resonance imaging (fMRI) 275 fusional language 366–7, 370 garden path sentences 265 -gate suffix 73, 91 gender (of nouns) 164–5, 167, 429, 470 generality (of meaning) see vagueness generative grammar 17, 18, 126, 296–7 genres 189, 190 gestural origins of language 250 gestures 309–12, 318; see also bodily gestures and signs under animal communication imagistic 310 interface with language 311 non-imagistic 311 pointing 311 global aphasia 272 glides 29, 37–8, 289, 419 glottals 29, 31, 35, 36, 341 glottalic airstream 39–40, 361 gossip 251 grammaticality 108–9 grammatical constructions 4, 145, 266, 292–3, 379 changes in 397–8 frequency of use of 164–5, 219

types comparative 141, 397 complex sentence see complex sentences negative 292–3, 299 imperative see imperatives interrogative see interrogatives passive 127, 397–8 possessive 298, 372–3 reflexive 397–8 subordinate clause 321, 402 there’s family and (there’s) family 210, 216, 224, 228 verbal 389–90, 403 grammaticalization 398–9 grammatical relations 120–4 universality of 374–5 Great English Vowel Shift 343 Gricean maxims 148–50 Grimm’s Law 390–1, 393, 411 handshape (feature in sign languages) 314–15, 316, 317, 324, 326 Hangul 340–1, 355–6 hapax legomena 219 height of vowels 38–9 hieratic script 336 hieroglyphs 336, 337 historical comparative linguistics 6, 417–20, 392 holophrastic stage in language acquisition 286 homophony 138 homonymy see homophony honorifics 170 hyperlinks 352 hypernym see superordinate hypernymy 141 hyperonym see superordinate hyperbole 400 hypocorism 89 hyponymy 141–2 hypothesis testing in language learning 296 icon see iconic sign iconic sign 7, 8, 9–10, 13, 323 iconicity 10, 93, 323, 326, 341 identity maintenance as a cause of language change 403 idioms 83, 97–9, 134, 220, 319

illocutionary force 145–7 Immediate Constituent Analysis (IC Analysis) 110 imitation 245, 289, 294–5, 266 imperatives 146, 320 implosives 29, 40 indexes see indexical sign indexical sign 237, 311, 323 indirect speech act 147 Indus Valley script 337 infix 59–60, 370 inflectional affixes 62–3 distinctive properties of 64–5, 66 inflectional language see fusional language innateness 296–7 insertion (of phones) 392 instant messaging 349–51 intension 133, 137, 144 interdentals 34 interjections 85, 113, 221, 318 interrogatives 146–7, 293, 299, 320 intonation 42–3, 265, 286, 288, 293, 299 intonation unit 200 isogloss 164–5 isolating language 366, 367, 370 key word in context see KWIC keyness 221–2 keywords 221–2 kohunga reo 178 KWIC 223 labials 33, 35, 289 labiodentals 33, 36, 38, 46, 433 language learning 5 by children 284–97 by adults 298–301 acquisition device (LAD) 296–7 biological constraints on 379–80 change 388–406 choice in bilingual communities 172–3 death 175 design features 13–16, 19, 246–7, 335 endangerment 175–8 families 417–8, 422–32 isolates 422

maintenance 178 nests see kohunga reo notion of 415–16 obsolescence 175–8 origins 247–52 revival 178 shift 174–7 causes of 175–6 structural changes accompanying 176–7 as a sign system 9–12 stocks 417 substratum 404 superstratum 404 typology 6, 361–2, 365–79 universals 6, 361, 363–5 language-ready brain 250, 251 languages distribution of 416 functional unity 380 genetic relatedness 417 number of 415–17 number of speakers of 416–17 variation in 5, 6, 161–71 larynx 30, 31, 39–40, 238, 243, 471 lateralization 268, 270, 273, 274, 276 laterals 29, 36–7, 93, 289, 393, 402, 406 left hemisphere dominance in language processing 268, 322 lemma 218 lexical cohesion 194–5 lexical density 346–7, 356 lexical lookup 264–5 lexical decision task 265 lexicon 82–101 mental lexicon 83 openness 84 lexicostatistics 422 Linear A 337 Linear B 337, 338 linguistic determinism 259 linguistic relativity 259 linguistic typology 362 linguistics ancient 16 applications of 18 branches of 4–6 formal 17

functional 17 modern 16–18 as a science 2–3 literacy 344–5 loanwords 90–1 loan translations see calques location (feature in sign languages) 315–16, 324, 326, 328 localization 268, 270–3 locative case 69–70 logograms 352 logographic writing systems 377–8, 339 loss of morphemes 364 of phones 392 magnetoencephalograms (MEGs) 277 manner of articulation 35–8 manual signs 314 marked 371, 376 markedness 371–2 markup 216 mass comparison 420–2 Maxim of Manner 149 Maxim of Quality 149 Maxim of Quantity 149 Maxim of Relevance 149 McGurk effect 264 meaning 4, 132–51 contextual 139–40 figurative 134–5 literal 134, 135 sentence 135–6, 144–5 utterance 135–6, 146 meaning mismatch 290–1 meaning narrowing 97 meronymy 142–3 metaphor 134–5, 311, 218 metathesis 393 metonymy 134 minimal pair 48 morphs 59 morphemes 58–9 free 59, 60, 66 bound 59, 60, 62–3, 66 grammatical 61–3, 66 lexical 60–1, 66

morphological analysis 71–4 by speakers 73–4 morphological change 394–5 morphological rules 68 morphological typology of languages 370–4 morphology 56–74 learning of 73, 291–2, 298 compared with syntax 124 item-arrangement 75, 124 of sign languages 316–18 word-paradigm 75, 124 morphophoneme 68 morphophonemic form 68 morphophonemic rule 68 mother-in-law languages 170 motor theory of phonetics 267 movability 111 movement (feature in sign languages) 316, 317, 324 moves 196–7 multi-channel signs 314 multilingualism 172, 404 mutual intelligibility 415 N400 component 277 narratives 187–9 narrative structure 187–9 nasal cavity 31, 32 nasals 29, 32, 36, 39, 40, 286, 289, 298, 362, 364, 392, 402 absence of 364 negation 150, 321, 292–3 neologisms 348 neurolinguistics 5, 267–77 neuron 267 neutralization 371 nominalizations 346–7 nominative case 63, 373–4 non-manual features (in sign languages) 314, 320–1, 324, 326 nouns 84 noun classes see gender (of nouns) noun phrases 114–15 number marking learning of 291–2, 298 typology of 364, 365, 370–2 numeral classifiers 104, 287 numeral incorporation 317

Object 122–3, 375 one-word stage in child language learning 286 onomatopoeic words 9–10, 87, 93, 249 oracle bones 336 oral cavity 31–2 oral phones 32 orientation (feature in sign languages) 316, 326 origins of language ancient ideas 247 modern theories 249–52 genetic predisposition 251 gestural origins 250 grooming hypothesis 250–1 social cognition 252 myths concerning 247 nineteenth century theories 248–9 overgeneralization 292, 295 overlaps 200–1 palatals 29, 33, 34, 36, 40 paradigm 11 paradigmatic relations 11–12 parsing 211, 213 parts-of-speech 84–8 in sign languages 318–19 passives 127, 397–8 pejoration 399–400 perception of speech sounds 262–4 performatives (explicit) 146 personal identity function of language varieties 161–2, 403 pharyngeals 29, 33, 35, 390 pharynx 35 phonaesthesia 93 phones 27–8 phonemes 44–7 phonemic transcription 49–50 phonetic transcription 49–50 phonetics 4 acoustic 27 articulatory 27, 29–43 auditory 27 of sign languages 313–16 phonological reduction 398 phonological rules 45–7 phonology 4, 43–9 of sign languages 313–16

phrases conjoining of 117 embedding of 117 nature of 113–14 types of 114–17 physiological causes of language change 379–80 pictograms 335, 336 pitch 42–3, 238, 288, 377–8 pitch accent 377 place of articulation 32–5 plasticity of brain 269–70, 276 plosives see stops polysemy 138–9 polysynthetic language 367, 370 positron emission tomography scanning (PET scanning) 274–5 possession 372–3 possessive clitics 63, 65, 85, 117, 291, 298, 399 constructions 298, 372–3 prefixes 60, 177 pronouns 115, 177, 319 post-alveolar phone 33, 34 postpositional phrase 117 postpositions 85 poverty of the stimulus 296 pragmatics 4, 136, 145–51 pre-sequences 201–2 proto-languages 417 reconstruction of 418–20 prefixes 59 prefixing languages 370 pre-language stage in language learning 286 prepositional phrase 117 prepositions 85 prescriptive 2 presuppositions 150–1 primary sign languages 312–23 printing press 343, 346 proclitic 63 productivity (as a design feature) 15, 239, 240, 247 pronouns 85, 148, 167, 170, 171, 177, 192, 364, 370–1, 374 in primary sign languages 317, 318, 322 prosody 42–3, 54, 65, 265 prototype theory 133, 144 psycholinguistics 5, 258–67 pulmonic airstream mechanism 31, 361, 363

rapid fade 335 reanalysis 395, 397 rebus principle 336 reduplication 87, 95–6, 317, 325 reference 132–3, 147–8, 192–3 anaphoric 192 cataphoric 192 endophoric 193 exophoric 193 reflexivity 15–16, 247, 324 register 169–71 regular expressions 220 regularization 394, 405 repetition 195, 245–6, 255, 295 respect varieties 170–1 retroflexes 29, 34, 36, 40, 286, 468 reverses 141 rhotics 29, 35, 37, 45, 90, 289, 390, 393 roots 60 rounding of lips 38 rules of realization of morphemes 68 of phonemes 45–7 runic letters 342 Sapir-Whorf hypothesis 259–62 second-language learning 298–301 effects of age 299 interference 298, 300 negative transfer see interference of morphology 298 of phonetics and phonology 298 stages in 298–9 of syntax 299 transfer 300, see also interference secret varieties 170 semantic bleaching 398 semantic change 399–401 semantic features 143–4 semantics 4, 136–45 lexical 136–44 sentence 144–5 semivowels 37–8 sense 133 sentence comprehension 265 notion of 107–8

sentences complex see complex sentences grammatically acceptable 108 simple 113 ungrammatical 108–9 sign 6–12 slang 103–4, 167–8, 170, 403 slips of the tongue 266 SMSing see text messaging social identity function of language varieties or variation 161, 172, 344 social upheaval as cause of language change 405 sociolinguistics 5, 160–78 sound change 390–3 sound correspondences 418–20 space, use of in sign languages 318, 322 spatial cognition 261–2 spatial terms 261, 401 Specific Language Impairment (SLI) 251 speech accommodation 168 comprehension 262–5, 267 errors 266, 278 primacy of 12, 335 production 266–7 speech acts 145–7 speech chain model 26–7 speech community 162 split-brain patients 273 spoonerisms 266 stem 61 stickers 349, 351 stops 29, 31, 35–6, 37, 40, 44, 45, 46, 263, 289, 339, 340, 363, 364, 392, 402, 406 stød (in Danish) 164–5, 470 stress 43 string constituent analysis 110, 116 structuralism 16, 17 structural pressure as cause of language change 405–6 style 169 Subject 122–3, 375 substitution 193–4 subtraction paradigm 274–5 suffix 59 suffixing language 370

superordinate 141 suprasegmentals see prosody suspicious pairs 47, 48, 49 syllabaries 338–9 syllables 41–2, 43, 87, 170, 219, 266, 286, 338, 339, 355, 364, 377, 402 symbol see symbolic sign symbolic sign 8, 9 synecdoche 134 synonymy 140 syntactic change 395–8 syntactic units tests for 111–12 types of 112–17 syntagm 11 syntagmatic relations 10–1 syntax 106–25 hierarchical structure in 109–12 openness of 106–7, 108 compared with morphology 124 in sign languages 319–21 taboo see word taboo tag questions 109, 122, 288 tail-wagging dance 13, 238–9 taps 29, 37, 45, 90 telegraphic speech 285, 287 tense 71, 72, 85, 116, 259–60, 291, 395, 399 text messaging 349 texts 185–6 Theme 123–4, 204–5 tips of the slongue see slips of the tongue tone 42, 274, 293, 377–8, 429, 468 top-down processing 264–5 Topic see Theme transactions 198 transcription 49–50, 212, 216, 314 transition relevance places (TRPs) 200 tree diagrams 110 trills 29, 34, 37, 40, 45, 90, 390 turn taking 200–1 tweets 351–2 two word stage in language learning 287, 322 typology of case systems 373–4 of motion verbs 378–9

of number marking 370–2 of possession 372–3 of word order 375–7 Undergoer 120–2, 319–20, 375 understatement 400 universals of language absolute 363, 364 explanations of 379–80 implicational 363, 364–5 non-absolute 363, 364, 365 non-implicational 331 unmarked 371–2, 373 utterances 10, 135, 145, 196 utterance meaning 135–6, 148 uvulars 29, 34, 37, 40, 390 vagueness 139 value 133 variation in languages 161–2 dialectal 162–4 ethnic 168 gender 166–7 generation 167–8 regional 162–3 register 169–71 religious 168 social 165–6 velars 29, 33, 34, 36, 37, 40, 166–7, 286, 289, 298, 341, 392–3, 402 velaric airstream 40, 361; see also clicks velum 32, 34, 36 verbs 85 of motion 378–9 verb phrases 115–16 visual-gestural medium 12, 308–11, 312, 322 visual-inscribed medium 12, 334–5 vocal folds 31 vocal tract 30–2, 166, 238, 250 voice onset time (VOT) 35, 263 voicing 31

vowels 30, 32, 38–9, 41, 43, 45, 93, 243, 289, 362, 364–5, 402 nasal 39, 45, 47, 330, 362, 402 in writing 337, 339, 340, 342 Wada test 274 Wernicke’s aphasia 270–1, 272, 322 Wernicke’s area 269, 270, 272, 275 Whorfian hypothesis see Sapir-Whorf hypothesis wikis 353 Wikipedia 353 word identification and recognition 264–5 notion of 56–7 types of 57 word classes see parts-of-speech word order 375–7 basic 376 change 395–6 fixed 376 free 376 in sign languages 319 word taboo 99–101, 404 writing compared with speech 346–8 on the internet 348–53 reforms 344–5 secondary to speech see speech, primacy of writing systems Cherokee 339 Chinese 336–7, 338, 340, 345 English 342–3 etymological spellings 343 logographic tendencies 343 standardization of 343 Greek 337, 338, 340 Japanese 339 Olmec 337 Rongorongo 335 zero morpheme 73
