Pronunciation is in the Brain, not in the Mouth: A Cognitive Approach to Teaching it 9781463236533

This book investigates the cognitive roots of pronunciation in children and adults and the emergence of accent with adul


English Pages 276 [274] Year 2014


Table of contents :
TABLE OF CONTENTS
FOREWORD
ACKNOWLEDGMENTS
LISTS OF SYMBOLS AND PHONETIC LABELS
CHAPTER 1: MY STORY WITH LANGUAGES, PRONUNCIATION AND ACCENT
CHAPTER 2: THE COGNITIVE BASE OF LANGUAGE
CHAPTER 3: LANGUAGE IN THE BRAIN OF A CHILD
CHAPTER 4: LINGUISTIC ACCENT: DEFINITION, CLASSIFICATION AND DEMONSTRATION
CHAPTER 5: A BROAD BASE FOR UNDERSTANDING THE PEDAGOGY OF TEACHING PRONUNCIATION
CHAPTER 6: TEN COMMANDMENTS FOR TEACHING EFFECTIVE PRONUNCIATION
CHAPTER 7: EXAMPLES OF CROSS-LANGUAGE ACCENT-CAUSING CONSONANTS
CHAPTER 8: EXAMPLES OF CROSS-LANGUAGE ACCENT-CAUSING VOWELS
CHAPTER 9: EXAMPLES OF CROSS-LANGUAGE ACCENT-CAUSING SUPRASEGMENTALS
CHAPTER 10: THE ROLE OF ARTICULATORY SETTINGS IN PRONUNCIATION AND ACCENT
CHAPTER 11: PRINCIPLES OF A MULTICOGNITIVE APPROACH TO TEACHING PRONUNCIATION
CHAPTER 12: PRINCIPLES OF MULTISENSORY APPROACH TO TEACHING PRONUNCIATION
CHAPTER 13: EXEMPLARY APPLICATIONS OF ACCENT REMEDIATION TECHNIQUES
CHAPTER 14: TIPS FOR ACCENT REDUCTION AND ACCENT DETECTION
REFERENCES

Pronunciation is in the Brain, not in the Mouth

A Cognitive Approach to Teaching it

Edward Y. Odisho


Gorgias Press LLC, 954 River Road, Piscataway, NJ, 08854, USA
www.gorgiaspress.com

Copyright © 2014 by Gorgias Press LLC

All rights reserved under International and Pan-American Copyright Conventions. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise without the prior written permission of Gorgias Press LLC.

ISBN 978-1-4632-0415-0

Library of Congress Cataloging-in-Publication Data

Odisho, Edward Y.
Pronunciation is in the brain, not in the mouth : a cognitive approach to teaching it / By Edward Odisho.
p. cm.
ISBN 978-1-4632-0415-0
1. English language--Pronunciation. 2. Cognitive grammar. 3. Psycholinguistics. I. Title.
PE1137.O423 2014
421’.540071--dc23
2014032984

Printed in the United States of America

TABLE OF CONTENTS

Table of Contents
Foreword
Acknowledgments
Lists of Symbols and Phonetic Labels
Chapter 1: My Story with Languages, Pronunciation and Accent
  1.1. Prelude
  1.2. The Evolution of my Interest in Linguistics and Phonetics
    1.2.1. Natural Language Internalization: Language Acquisition
    1.2.2. A Major in English Language in a non-English Environment
    1.2.3. Full Immersion as an Adult in Two Languages
    1.2.4. Phonetic and Linguistic Orientation in Graduate Education
    1.2.5. Educational and Professional Challenges in the U.S.
  1.3. The Impact of my Linguistic/Professional Background on the Evolution of an Approach
    1.3.1. Impact of my Linguistic Background
    1.3.2. Impact of my Teaching Career
  1.4. Concluding Remarks
    1.4.1. Childhood Trilingualism Triggered Interest in Languages
    1.4.2. Learning Kurdish Triggered Interest in Linguistics
    1.4.3. Graduate Study Immersed me in Phonetics and Linguistics
    1.4.4. Professional Challenges in the U.S.

Chapter 2: The Cognitive Base of Language
  2.1. Language: A Species-Specific Code of Communication
  2.2. Language: A Cognitive-Social System Superimposed on other Systems
    2.2.1. Vocal Tract Modification
    2.2.2. Vocal Folds (Cords) Modes
    2.2.3. Tongue Functions and Maneuverability
    2.2.4. Lip Configurations
    2.2.5. Cavities Resonance
  2.3. Brain ‘Speaking’ via Respiratory and Digestive Systems
  2.4. Economy in Language
  2.5. Conscious and Subconscious Brains
  2.6. Concluding Remarks

Chapter 3: Language in the Brain of a Child
  3.1. Learning vs. Acquisition: Conceptual Differences
  3.2. The Brain of a Child and Language
    3.2.1. Child Brain Formation and Maturation
    3.2.2. Formative Months and Years of Mother Tongue
  3.3. Cognitive Transition in Sound Perception and Production
    3.3.1. Transition from Phonetics to Phonology
    3.3.2. The Brain as the Commander-in-Chief of Language Acquisition: The Cognitive Roots of Linguistic Accent
  3.4. Fossilization or Psycholinguistic Insensitivity
  3.5. There is Room in the Human Brain for more than One Language
  3.6. Narrowing Down the Broad Definition of Accent
  3.7. Implications for Understanding the Cognitive Nature of Accent
  3.8. Concluding Remarks
Chapter 4: Linguistic Accent: Definition, Classification and Demonstration
  4.1. Introductory Remarks
  4.2. Intralanguage and Interlanguage Accents
  4.3. Phonetic and Phonological Accents
  4.4. Accent: A Normal Linguistic Phenomenon
  4.5. What is Meant by Accent Acquisition, Accent Reduction and Accent Impersonation
    4.5.1. Accent Acquisition
    4.5.2. Accent Reduction (Remediation)
    4.5.3. Accent Impersonation or Faking
    4.5.4. Intralanguage Accent Reduction and Impersonation
  4.6. Cultural Accent
  4.7. Transition of Accent into Orthography
  4.8. Concluding Remarks

Chapter 5: A Broad Base for Understanding the Pedagogy of Teaching Pronunciation
  5.1. Introductory Remarks
    5.1.1. Speech: A Cognitive Phenomenon
    5.1.2. Pronunciation: Multisensory Access
    5.1.3. Pronunciation: Multicognitive Access
    5.1.4. Pronunciation: An Integrated and Holistic Process
    5.1.5. Pronunciation: Top-Down & Bottom-Up Dynamics
    5.1.6. Pronunciation: The Complementary Nature of Acquisition and Learning
    5.1.7. Pronunciation: A Natural Gift for Children
    5.1.8. Pronunciation Should be Premised on a Triangular Base of Perception, Recognition and Production
    5.1.9. Pronunciation & Psycholinguistic Insensitivity
    5.1.10. Pronunciation: Understanding its Scientific Premises
    5.1.11. Pronunciation: Its Feedback Mechanisms
    5.1.12. Pronunciation: In Light of Multiple Intelligences Theory
    5.1.13. Pronunciation: A Generative Skill
    5.1.14. Pronunciation: Interactive Involvement of Instructors and Learners
  5.2. Concluding Remarks

Chapter 6: Ten Commandments for Teaching Effective Pronunciation
  6.1. Introductory Remarks
    6.1.1. Thou Shall Teach Pronunciation as a Cognitive Undertaking
    6.1.2. Thou Shall Teach Children and Adults Differently
    6.1.3. Thou Shall be Qualified for Instruction in Pronunciation
    6.1.4. Thou Shall Familiarize Learners with Human Speech Production
    6.1.5. Thou Shall Orient Learners Psychologically
    6.1.6. Thou Shall Use all Sensory Modalities to Prop up Instruction
    6.1.7. Thou Shall Use all Cognitive Modalities to Prop up Instruction
    6.1.8. Thou Shall Transform Learners from Listeners into Performers
    6.1.9. Thou Shall Refrain from Insistence on a Learner
    6.1.10. Thou Shall Make the Classroom a Place for Learning and Fun
  6.2. Concluding Remarks
Chapter 7: Examples of Cross-Language Accent-Causing Consonants
  7.1. Introductory Remarks
  7.2. Outline of the English Consonant System
    7.2.1. Interdental Pair /θ, ð/
    7.2.2. Approximant /r/
    7.2.3. Voiceless and Voiced Alveolar Fricatives /s/ and /z/
    7.2.4. English Plosives: /p b, t d, k g/
    7.2.5. Labio-Dental Fricatives /f, v/
    7.2.6. The Affricates /ʧ ʤ/
  7.3. Concluding Remarks
Chapter 8: Examples of Cross-Language Accent-Causing Vowels
  8.1. Salient Features in General Vowel Description
  8.2. The Vowel System of English
    8.2.1. Simple Vowels of General American English
  8.3. Selections of Cross-Language Accent-Causing Vowels
    8.3.1. Hispanic Learners of English Vowels
    8.3.2. Arab Learners of English Vowels
  8.4. Concluding Remarks

Chapter 9: Examples of Cross-Language Accent-Causing Suprasegmentals
  9.1. A Description of the Most Salient Features of Suprasegmentals
  9.2. Stress and Rhythm
  9.3. Tone and Intonation
  9.4. Basic Pitch Patterns
  9.5. Consonant Clusters
  9.6. Concluding Remarks

Chapter 10: The Role of Articulatory Settings in Pronunciation and Accent
  10.1. Introductory Remarks
  10.2. Salient Features of Articulatory Settings of Selected Languages
    10.2.1. English Articulatory Settings
    10.2.2. Spanish Articulatory Settings
    10.2.3. Arabic Articulatory Settings
  10.3. Concluding Remarks
Chapter 11: Principles of a Multicognitive Approach to Teaching Pronunciation
  11.1. Introductory Remarks
  11.2. Multicognitive Principles for Teaching Pronunciation
    11.2.1. Think about L2 Speech Sounds
    11.2.2. Transition from Hearing to Listening
    11.2.3. Learn Something about Speech Production
    11.2.4. Mechanical Repetition Hardly Works with Adults L2 Learning
    11.2.5. Follow the ‘Perceive, Recognize and Produce’ Procedure
    11.2.6. Instructor’s Academic and Professional Qualifications
    11.2.7. Plan Instructional Connection with Learners
    11.2.8. Explain, Demonstrate and Demonstrate Multisensorily
    11.2.9. Deal with Pronunciation in a Holistic Fashion
    11.2.10. Consider both Top-Down and Bottom-Up Perspectives
    11.2.11. Do not Confuse Memorization with Retention
    11.2.12. Deal with Pronunciation as a Generative Skill
  11.3. Concluding Remarks

Chapter 12: Principles of Multisensory Approach to Teaching Pronunciation
  12.1. Introductory Remarks
  12.2. Multisensory Principles for Teaching Pronunciation
    12.2.1. Auditory Modality
    12.2.2. Visual Modality
    12.2.3. Tactile, Kinesthetic, Proprioceptive Modalities
  12.3. Developing Teaching and Learning Strategies
    12.3.1. Developing Teaching Strategies
    12.3.2. Developing Learning Strategies
  12.4. Concluding Remarks
Chapter 13: Exemplary Applications of Accent Remediation Techniques
  13.1. Introductory Remarks
  13.2. Techniques for Teaching Selected Consonants
  13.3. Techniques for Teaching Labial-Dental Sounds
  13.4. Techniques for Teaching Interdental Fricatives /θ ð/
  13.5. Techniques for Teaching Tense (Long) vs. Lax (Short) Vowels
  13.6. Techniques for Teaching Vowel Reduction
  13.7. Techniques for Teaching Accentuation (Stress)
  13.8. Concluding Remarks

Chapter 14: Tips for Accent Reduction and Accent Detection
  14.1. Introductory Remarks
  14.2. Tips for Accent Reduction
    14.2.1. Tackle the most Salient Phonological Problems
    14.2.2. Tackle the most Salient Phonetic Problems
    14.2.3. Improve other Linguistic Skills
  14.3. Accent Detection
    14.3.1. Accent Detection by Ordinary Individuals
    14.3.2. Accent Detection by Professionals
    14.3.3. Telling the Linguistic Background of a Speaker through Accent
    14.3.4. Hiding an Agent through Hiding an Accent
  14.4. Concluding Remarks
References

FOREWORD

My fascination with human language in general, and pronunciation in particular, seems to be intimately connected to my trilingual upbringing as a child in the city of Kirkuk, Iraq, and my later exposure to three more languages. The immense diversity in the sound systems of those six languages afforded me extensive coverage of a wide range of sounds, sound patterns and sound systems. Gradually, throughout the five decades that I spent teaching English language, linguistics and pronunciation in different countries and to speakers of a wide variety of languages, my passion for the diversity in their pronunciation patterns deepened. Thus, teaching pronunciation did not just become my favorite subject; it also became the focus of my academic research.

Throughout the last decade, I became fascinated with the concept of accent and whether it is a sensory deficiency with some people at a certain age or the side effect of cognitive and linguistic perfection in the internalization of the native language (L1). After lengthy observations and investigations, I came to the conclusion that accent is a cognitive phenomenon and seems to be the outcome of a deficit between what is known in psycholinguistics as language acquisition and language learning. The focus in this book is, above all, on accent and its cognitive roots. Such understanding will determine the framework of an approach to teaching pronunciation.

In my book Techniques of teaching pronunciation in ESL, bilingual and foreign language classes (2003),¹ I introduced several innovative principles which outlined my cognitive approach to teaching pronunciation. One such principle emphatically stated that “pronunciation is in the brain prior to being in the mouth”, which constituted the foundational premise for my cognitive approach to teaching pronunciation in general. Naturally, therefore, several of the 2003 principles are reintroduced in certain chapters with further refinement, elaboration and application to accent in human speech. As I am still pursuing the teaching of pronunciation in more detail in this book, with the linguistic phenomenon of accent being a major constituent, it certainly further highlights the significance of the cognitive perspective, especially in the case of adult learners’ rendition of an L2 pronunciation.

¹ München: Lincom-Europa.

Since the cognitive potential of human beings is fed and nurtured via the five senses, the approach to teaching pronunciation has been identified as ‘multisensory and multicognitive’. The former emphasizes the significance of the joint involvement of as many sensory modalities as possible, especially those that feed the cognitive processing of human language and speech, namely auditory, visual and tactile/kinesthetic, whereas the latter highlights the mobilization of all cognitive processes that human beings practice, such as thinking, comparing, contrasting, analyzing, synthesizing, associating and memorizing, among others.

It is worth underscoring that accent is not a pathological defect and its remediation is not exclusively the responsibility of a speech pathologist, as claimed by some. In an accent modification instruction course announcement, it is stated that “a speech-language pathologist then teaches the client appropriate speech modification techniques”.² Accent remediation falls within the professional domain of any person qualified in the science/art of speech production, foremost of whom is a phonetician.
Nevertheless, each of those who are professionally qualified to deal with the teaching of speech and pronunciation should combine his/her academic expertise with an effective instructional methodology that incorporates the latest linguistic and educational findings.

² http://www.speechenrichmentcenter.com/index.php?option=com_Content&view=article&id=10

Hence, if accent is not a ‘pathology’, then what is it? From the perspective of this book, it is one of the normal symptomatic side effects of the cognitive and physical maturation of human beings and the gradual transformation in delegating the majority of conscious biological and social survival functions, foremost of which is language, to the subconscious brain. Stated differently, accent is a consequence of the cognitive transformation of humans from all-conscious operants to predominantly subconscious ones.

The recent literature on human language internalization is replete with substantial evidence that the perfection in the mastery of the native language, especially its pronunciation, falls primarily within the years of childhood, ranging from birth (in fact, even before birth) until puberty and adolescence. Once a person is fully immersed in the native language (L1) throughout those years, he/she will grow up accentless, but he/she is highly likely, beyond that period, to manifest a certain degree of accent in L2 ranging from light to heavy. It is the inherent cognitive bias to L1 that interferes with the accurate perception and recognition, hence production, of the L2 sound system. The years of childhood are so significant for language internalization that a child can grow up as a balanced bilingual or even a balanced trilingual if there is ample exposure to and immersion in more than one language. My childhood is a typical example. This is so because the brain of any normal child is neurolinguistically wired for such a mission and is powerful enough to handle more than one language.

Accent, therefore, tends to be a symptomatic trait of adulthood resulting from cognitive maturation with regard to L1 rather than a physical sensory deficiency in the perception and production of speech sounds by L2 learners. It is this difference that has convinced linguists to label the natural internalization of L1 by children as acquisition, and adults’ attempts at internalizing L2 as learning. Doubtless, the two processes are not treated here as mutually exclusive, as some inexperienced persons may make them appear; rather, they are complementary in nature, depending on the age of the person among several other factors to be dealt with in due course. Nevertheless, the younger the person is, the greater the role of acquisition in comparison to learning, with the roles reversed in adults.

The emergence of accent in the speech of adult L2 learners should never imply the closing of the window on the likelihood of improving pronunciation and remediating accent. The claim that adults suffer from so-called fossilization in their learning of L2 pronunciation (Selinker, 1972) is rejected in this study, partly due to the lack, at the time, of a thorough understanding of the cognitive nature of L1 acquisition (internalization) and its subsequent impact on L2 learning, and partly due to the inefficient approach and methodologies used to teach pronunciation to adults and to remediate their accent. A multisensory multicognitive approach has proven very effective in learning pronunciation and teaching it to adults of different linguistic backgrounds, and in the remediation of their accent to different degrees depending on several other specific factors such as aptitude, enthusiasm, focus and time allocation. All those factors have been relevant in my personal case as I, at the age of thirty-three (33), improved my pronunciation of English and immensely broadened my inventory of different sounds in perception, recognition and production. Certainly, remediation may not be perfect in most cases, but it is definitely evident.

As hinted earlier, the cornerstone of implementing this approach is primarily the ‘multisensory multicognitive’ principles; however, there are several other functional considerations for implementation, foremost of which is the distinction between phonological accent and phonetic accent. The former represents sound substitutions that directly result in semantic confusion in words as well as in sentences, whereas phonetic accent may not result in semantic confusion directly, but it may generate noise or uncertainty that interferes with proper conveyance of meaning. Although I had actively implemented this functional distinction in my classes in the mid-1990s, it appeared formally in print in 2003.

Field and classroom observations substantiate the paramount functional significance of this dichotomy. In teaching pronunciation, the remediation of phonological accent should take precedence over phonetic accent. To demonstrate, it is far more significant to teach a Hispanic learner of English not to substitute the [j] sound, as in <you> = [ju], with [ʤ] and so change the word to <Jew> = [ʤu], than to teach the same learner not to substitute a tap or trill for the English approximant, as the former example represents a phonological accent, whereas the latter represents a phonetic one.

In light of what preceded, the effective and efficient teaching of pronunciation of an L2 to adults requires highly qualified professionals implementing an approach that seriously takes into consideration the cognitive nature of pronunciation and the implementational approach that is compatible with it. This book is all about such an approach.

The book is in fourteen (14) chapters, with the first chapter being a reflection on the author’s trilingualism as a child and later addition of three more languages, as well as the experiences that led to his majoring in linguistics with a focus on phonetic science and pronunciation. If the reader is eager to go directly to the core materials related to human speech and pronunciation, he/she can skip the first chapter, although the chapter is rich with personal experiences. Chapters two through five are essential to understanding the cognitive nature of human language, pronunciation and accent. The reader is expected to encounter difficulty in understanding certain contents of the subsequent chapters without thoroughly absorbing the contents of those four chapters. Chapter six represents a sketch of the most significant principles to be considered for the application of the multisensory multicognitive approach to teaching pronunciation. As for chapters seven through nine, they cover the most common segmental (consonants and vowels) and suprasegmental (stress, rhythm, tone, intonation) features that cause pronunciation problems and accent in cross-language learning. Intimately related to those chapters is chapter ten, which deals with a very important but less known aspect of pronunciation, the so-called ‘articulatory settings’. Chapters eleven and twelve elaborate on the main applications of the multicognitive modalities and multisensory modalities, respectively.
The book ends with chapters thirteen and fourteen, which focus more narrowly on accent and its acquisition, reduction and remediation.

Finally, a couple of stylistic notes are unavoidable so that the reader does not misinterpret my intention as the writer. First, parts of the book are written in the first person simply because they reflect my personal life and experience; it would, therefore, be awkward to state them in the third person. Second, my use of the pronoun ‘he’ for the learner does not exclude ‘she’. I have tried the combination ‘he/she’, but after a while it becomes stylistically too repetitive. I am a liberal thinker, and my acknowledgments recognize three women who have deeply impacted my life.

Edward Y. Odisho
March 3, 2014

ACKNOWLEDGMENTS

I recently celebrated my 75th birthday, and with it came my decision not to write more books after this, my 11th one, but rather to focus on producing more research papers which reflect some of the thoughts and themes that have been with me for a long time. Certainly, I have not had the opportunity to develop them and bring them all to fruition. However, I seize this opportunity to express my deep gratitude to all institutions and publishers which have assisted me in transforming my writings into books and promoting them locally and internationally.

My first book was published by the Iraqi Ministry of Culture in 1971, followed by the second one, published by Al-Mustansiriya University/Baghdad. After my escape from Iraq during Saddam’s regime in 1980, my third book was published by Otto Harrassowitz Verlag/Wiesbaden, 1988. Between 1988 and 2002, my academic efforts were devoted to publishing research papers, which put me on the fast track for tenure and academic promotion to professorship. In 2003, I decided to produce a series of books to document my experience in classroom teaching. Towards the end of 2003, my book Techniques of teaching pronunciation in ESL, bilingual and foreign language classes was published by Lincom Europa/München, followed by A linguistic and cognitive approach to the teaching of the English alphabet, Edwin Mellen Press/New York, in 2004. Between 2005 and 2011, I published the following books, all with Gorgias Press of New Jersey:

Techniques of teaching comparative pronunciation: English-Arabic, 2005.
Linguistic tips for Latino learners and teachers of English, 2007.
Linguistic and cultural studies in Aramaic and Arabic, 2009.
Modern Assyrian (Aramaic) language between speech and writing: linguistic examination, 2011.

Although this book still deals with pronunciation, it has a somewhat narrower thematic focus as it attempts to deal with accent in pronunciation in greater academic and empirical depth. It certainly incorporates the views of many distinguished scholars and experts in the fields that deal with the intricacies of human sound production at large, and pronunciation in particular. To those views, I added the personal reflections on my childhood as a balanced trilingual alongside my experiences accumulated during five decades of classroom teaching and real-life observations.

In publishing this book, I must express my gratitude to the tens of thousands of students that I have taught and who, in return, taught me through their difficulties as well as their successes in overcoming those difficulties. Finally, there are a few people to whom I am indebted. There are three women who have changed my life: my mother, Shakira, with whom I shared my interest in books; my doctoral supervisor, Celia Scully, who twisted my arm to be more science-oriented; and my wife, Wardia Shamiran, for being my friend and partner in whatever I have accomplished during the last four decades of our marriage.

Edward Odisho, Morton Grove, Illinois, March 3, 2014

LISTS OF SYMBOLS AND PHONETIC LABELS

The conventions and symbols of the International Phonetic Association (IPA) and their acceptable substitutes are listed below; they are used throughout this book. The Arabic alphabet and its diacritics are also listed because they are employed where necessary. The following is a list of symbols and conventions used:

Vowels    Phonetic Description
i         Close front with spread lips
ɪ         Close front to close-mid (somewhat centralized) with spread lips
e         Close-mid front with unrounded lips
ɛ         Open-mid front with unrounded lips
ɜ         Open-mid central with unrounded lips
æ         Near-open front with unrounded lips
a         Open front with unrounded lips
ɑ         Open back with unrounded lips
ɔ         Open-mid back with rounded lips
o         Close-mid back with rounded lips
u         Close back with rounded lips
ʊ         Near-close near-back with rounded lips
ʌ         Open-mid back with unrounded lips
ə         Mid central (neutral) vowel (schwa)
ɚ         R-colored (rhotacized) mid central (schwar)
ɝ         R-colored (rhotacized) open-mid central

Consonants    Phonetic Description
b             Voiced bilabial plosive
p             Voiceless unaspirated bilabial plosive
pʰ            Voiceless aspirated bilabial plosive
d             Voiced alveolar plosive
t             Voiceless unaspirated alveolar plosive
tʰ            Voiceless aspirated alveolar plosive
ɟ             Voiced palatal plosive
g             Voiced velar plosive
k             Voiceless unaspirated velar plosive
kʰ            Voiceless aspirated velar plosive
c             Voiceless unaspirated palatal plosive
cʰ            Voiceless aspirated palatal plosive
q             Voiceless (unaspirated) uvular plosive
ʔ             Glottal stop
ʤ             Voiced postalveolar affricate
ʧ             Voiceless postalveolar affricate
v             Voiced labialdental fricative
f             Voiceless labialdental fricative
ð             Voiced interdental fricative
θ             Voiceless interdental fricative
z             Voiced alveolar fricative
s             Voiceless alveolar fricative
ʒ             Voiced postalveolar fricative
ʃ             Voiceless postalveolar fricative
ʁ             Voiced uvular fricative
χ             Voiceless uvular fricative
ʕ             Voiced pharyngeal fricative
ħ             Voiceless pharyngeal fricative
h             Voiceless glottal fricative
ʋ             Voiced labialdental approximant
ɥ             Voiced labialpalatal approximant
ɹ             Voiced alveolar approximant
ɻ             Voiced retroflex approximant
l             Voiced alveolar lateral approximant
w             Voiced labialvelar approximant
j             Voiced palatal approximant
m             Voiced bilabial nasal (approximant)
n             Voiced alveolar nasal (approximant)
ŋ             Voiced velar nasal (approximant)
ɾ             Voiced dental/alveolar tap
ɽ             Voiced retroflex tap
r             Voiced dental/alveolar trill

Diphthongs in RP English
au
ai
ou
oi
ei
iə
eə
uə

Conventions
/ /       Phonemic transcription
[ ]       Phonetic transcription
ː         Vowel full length
ˑ         Vowel half-length
ʰ         Superscript indicating aspiration
ˈ         Superscript indicating strong stress
 ̣         Subscript dot under /d t ð s/ = [ḍ ṭ ð̣ ṣ] indicates /ض, ط, ظ, ص/, the emphatic sounds of Arabic
C         In syllable structure patterns, ‘C’ stands for a ‘Consonant’
V         stands for a ‘Vowel’

Arabic Symbols

Consonants    IPA     Phonetic Description
ء             [ʔ]     glottal stop
ب             [b]     voiced bilabial plosive
ت             [tʰ]    voiceless aspirated alveolar plosive
ث             [θ]     voiceless interdental fricative
ﭖ             [p]     voiceless bilabial plosive (Farsi)
ج             [ʤ]     voiced postalveolar affricate
ﭺ             [ʧ]     voiceless postalveolar affricate (Farsi)
ح             [ħ]     voiceless pharyngeal fricative
خ             [χ]     voiceless uvular fricative
د             [d]     voiced alveolar plosive
ذ             [ð]     voiced interdental fricative
ر             [r]     voiced alveolar trill
ز             [z]     voiced alveolar fricative
ﮊ             [ʒ]     voiced postalveolar fricative (Farsi)
س             [s]     voiceless alveolar fricative
ش             [ʃ]     voiceless postalveolar fricative
ص             [ṣ]     voiceless emphatic alveolar fricative
ض             [ḍ]     voiced emphatic alveolar plosive
ط             [ṭ]     voiceless (unaspirated) emphatic alveolar plosive
ظ             [ð̣]     voiced emphatic interdental fricative
ع             [ʕ]     voiced pharyngeal fricative
غ             [ʁ]     voiced uvular fricative
ف             [f]     voiceless labialdental fricative
ﭪ             [v]     voiced labialdental fricative (Farsi)
ق             [q]     voiceless unaspirated uvular plosive
ك             [k]     voiceless velar plosive
ﮒ             [g]     voiced velar plosive (Farsi)
ل             [l]     voiced alveolar lateral
م             [m]     bilabial nasal
ن             [n]     alveolar nasal
ه             [h]     voiceless glottal fricative
و             [w]     labialvelar approximant
ي             [j]     palatal approximant
ــّــ                   Superscript on consonant indicating geminated (double) consonant.

Vowels (Letters)
ا             [aː]    long counterpart of [a]
ي             [iː]    long counterpart of [i]
و             [uː]    long counterpart of [u]

Vowels (Diacritics)
ــَــ                   Superscript over consonant indicating short [a] vowel.
ــِــ                   Subscript under consonant indicating short [i] vowel.
ــُــ                   Superscript on consonant indicating short [u] vowel.
ـــْـ                   Superscript on consonant indicating absence of vowel.

CHAPTER 1: MY STORY WITH LANGUAGES, PRONUNCIATION AND ACCENT

1.1. PRELUDE

Very simply, this book focuses on the linguistic nature of human language, in general, and pronunciation and accent, in particular. Two important principles govern the overall approach, namely, the cognitive and the pedagogical principles. With regard to the first, language as a structure and system originates in society but its blueprint is in the brain. It is, therefore, a social product, but a cognitive entity. Whenever social survival needs the services of language, it signals to the brain which, in turn, activates its neurons and synapses to generate communication. Native language (L1) pronunciation is, ceteris paribus, naturally acquired to perfection; learning effective pronunciation of a second language (L2) as an adult, however, requires conscious effort by the learner, assisted by the linguistic and educational know-how of the instructor. This latter statement highlights the pedagogical principles of the approach taken in this book. The instructor should have a high level of professional competence and experience in the sound systems of the languages involved. He/she should also follow an educational philosophy that premises the success of a teaching approach on the extent of interactive connection with the learners, so as to ensure an effective mode of two-way interaction. It is imperative that the instructor diversify his/her cognitive and sensory strategies and techniques of teaching as well as discover the individual learning styles of the learners and encourage them to get actively involved in the process.


1.2. THE EVOLUTION OF MY INTEREST IN LINGUISTICS AND PHONETICS

There are five major linguistic experiences in my life that have contributed to the evolution of my interest in linguistics and phonetics.

The first was the natural acquisition of three languages in my childhood, namely, Assyrian (Modern Aramaic), Turkmeni and Arabic, due to the multilingual environment in which I grew up in the city of Kirkuk/Iraq. My natural trilingualism was the result of full immersion in a context-embedded and situation-embedded linguistic environment. In the framework of this study, the use of these two terms is somewhat different from their circulation in previous literature, especially by Cummins (1979, 1984). Context-embedded implies the use of language in discourse format rather than in isolated words and sentences, while situation-embedded implies the use of a discourse that matches the situation in which it actually takes place. To clarify the latter statement, a discussion of airplanes taking off and landing that is held at an airport will be more authentic than a discussion of the same theme in a café or a classroom. This is simply because in the former scenario the actions are multisensorily and realistically perceived, hence readily comprehended.

Second, for my first degree, I majored in English language at Baghdad University, which implied only partial immersion in the target language (L2) due to the absence of fully context-embedded and situation-embedded environments. During my four-year study for my degree, English was dominant only during the class sessions and occasional conversations with faculty members (the majority of whom were native English speakers). Outside those two environments, the day-to-day language of conversation was Arabic or any of the other native languages of Iraq such as Assyrian, Turkmeni, etc.

Third, in my early adult life, I experienced full immersion in the Kurdish language. This type of full immersion was repeated later in my adult life with English in both England and the United States.
Fourth, in my graduate studies of four years in England, I specialized in linguistics, in general, and phonetic science, in particular. This period thoroughly exposed me to a wide variety of sound materials from different languages. Such an experience afforded me a scientific insight into human language both as an acquisition process and a learning one.

Fifth, the five-decade-long professional experience in teaching that I have had at different levels of education, in different countries and in diversified educational and professional situations added to the depth of my linguistic experience and honed it.

Below is an elaboration on the above five experiences.

1.2.1. Natural Language Internalization: Language Acquisition

Any study of human language internalization should consider the manner in which normal children master their native language or any language they are immersed in as opposed to adults embarking on learning a second language. The two processes are known in psycholinguistic literature as acquisition vs. learning. Acquisition tends to be a subconscious, automatic and effortless process of internalizing a language, whereas learning tends to be more conscious, mechanical and effortful. Inasmuch as pronunciation acquisition is concerned, all that children need to accomplish it is ample exposure to speech in real-life contexts and situations.

My hometown Kirkuk was at the time¹ the most multilingual city in Iraq, with the five languages of Turkmeni, Arabic, Kurdish, Assyrian and Armenian spoken in its different neighborhoods. The overwhelming majority of its population was, at minimum, bilingual, with many being trilingual. Which of the five languages one mastered depended partly on the size of the population that spoke a given language and partly on the neighborhood in which one resided. In my case, Assyrian was my home language while Turkmeni and Arabic were the two languages to which I was most exposed. It should be clarified, however, that Arabic was then confined to a couple of neighborhoods because of the very small population of Arabs in Kirkuk at the time. Fortunately, I lived in one such neighborhood. What is important to highlight with regard to Arabic is the fact that it was the official language of the country, especially in education and governmental business; everyone had to pick it up or, at least, be familiar with it.

¹ Demographics have significantly changed since the 1940s.

In light of the diversified linguistic environment in Kirkuk, I grew up fully orally competent in Assyrian, Turkmeni and Arabic. I always felt that functionally all three were my native tongues, the only difference between them being that Assyrian was my ethnic native tongue and my home language. The process of internalizing Turkmeni or Arabic to the degree of a native tongue can be attributed to the context-embedded and situation-embedded environment in which the two languages were encountered and used. No conscious effort was made to internalize them; they were simply naturally acquired as long as the exposure to them continued almost every day and throughout most of the day. In a nutshell, my native competency in those three languages as well as their cultures was a typical example of child language and culture acquisition.

Turning to another aspect of this childhood linguistic experience and its impact on my later adult linguistic experience, several observations can be made, especially with regard to my later competence in performing the sounds and sound systems of other languages. First, this childhood experience afforded me a much broader inventory of sounds and richer competence in the production of some unfamiliar sounds of other languages. Second, on the flip side, in certain aspects of the sound systems, especially in the domain of accentuation (stress placement), my childhood experience was so dominant that it subconsciously permeated the English that I learned in the early stages of my exposure to the language. In many respects, my stress placement in English was seriously sullied by my childhood languages, especially Assyrian and Arabic.
I lived with this misplacement of stress in English without knowing it until I settled in Britain for four years, receiving systematic education in phonetic sciences and linguistics. There will be extended elaboration on this subject in due course.

1.2.2. A Major in English Language in a non-English Environment

Completing a major in English language at Baghdad University, obviously a non-native environment, implied partial immersion in the target language (L2) due to the absence of a fully context-embedded and situation-embedded environment. During my four-year study for my degree, English was dominant only during class sessions and occasional conversations with faculty members (the majority of whom were native English speakers). Outside those two environments, the day-to-day language of conversation was Arabic or any of the other native languages of Iraq such as Assyrian, Turkmeni, etc. In other words, my exposure to L2 was very limited in terms of time as well as in the variety of contexts and situations in which a language is normally used. Stated differently, we rarely, if ever, had real exposure to and use of English in, for instance, a marketplace situation or a casual family gathering. Throughout four years of education, my fellow students and I were more capable of talking about the three witches in Shakespeare’s Macbeth than of conducting a conversation with a shopkeeper in a fruit and vegetable market. This limited exposure in terms of time, contexts and situations in which a language is naturally used is in striking contrast to the ample time and the multitude of contexts and situations in which my three childhood languages were acquired. Majoring in English in Baghdad was, to a large extent, a typical example of a context-reduced and situation-reduced experience; it was a classic example of an adult language learning model as opposed to a child language acquisition model. Consequently, my learning of English in my early adulthood in a context-reduced and situation-reduced environment had many deficiencies.

1.2.3. Full Immersion as an Adult in Two Languages

The following deals with my full immersion in the Kurdish and English languages as an adult and the impact of such experiences on my linguistic orientation and the later evolution of a passion for pronunciation and the pedagogy of teaching it.

1.2.3.1. Full Immersion as an Adult in an all-Kurdish Environment

After gaining my degree in 1960, I was appointed as a teacher of English language in the Kurdish city of Sulaimaniya where the daily language of communication was predominantly Kurdish. Thus, I was immersed in another language as an adult. Learning Kurdish was a necessity in a city where almost all daily interactions were in this language. Additionally, for reasons unknown to me at the time, except for the fact that I was an open-minded and progressive young man with an interest in other languages and cultures, I embarked on learning the Kurdish language and culture exceptionally seriously, and with a passion. The manner in which I learned, or ‘acquired’, Kurdish turned out later to be innovative. At the time I was not a linguist who could implement the approaches or techniques promoted by applied linguistics to master a second language as an adult; neither was I familiar with the modern concept of immersion in an L2 situation to acquire or learn it. I simply handled the Kurdish language in the following manner, which, decades later, I discovered to be full immersion in the target language:

“The attempt at learning Kurdish went through four stages. The first stage was to focus entirely on oral communication with minimum recourse to the written form except when a certain word or phrase was extremely necessary. For a couple of months or so, I did not attempt to speak; it was simply a period of listening to other people’s communication and carefully watching the facial and body gestures that accompanied the conversation. Occasionally in the classroom, I used to ask my students to translate some short statements in an English language dialogue into Kurdish and I would carefully listen to the translated segments.

The second stage began around the third month when I decided to go one step beyond the listening period. I ventured to speak by getting involved in very short conversations with some intimate Kurdish friends who appreciated my intention to learn their language. They would correct me when necessary and they would also repeat certain statements so that I would be able to internalize them. I was never intimidated by the mistakes or hesitations that would occur in my conversation.
In order to carry my listening and speaking skills one step further and associate them with real-life contexts/situations, I took it upon myself almost daily to go to the marketplace to do my shopping and carefully listen to live interactions between shopkeepers and customers. This experience was the most effective and efficient in helping me predict the meaning of many words and expressions that I did not know previously. The actual context/situation of the conversations aided me in the prediction of meaning. Perhaps more significant than just predicting the meaning was the higher possibility of retaining the meaning, also because of the clues from the context/situation.

At the third stage, which began by the end of the first school year (usually nine months), I was able to sustain simple social conversations. After my marketplace ventures, I made more Kurdish friends with whom I spent my evenings in their homes or in cafés and social clubs for teachers and other civil service employees.

The beginning of the second year was the final stage, when my colleagues at school, my students and my friends in the community began to address me in Kurdish most of the time. I avoided using Arabic as much as possible; however, when I found myself groping for the right word in Kurdish I did not hesitate to double-dip in both Arabic and Kurdish. In other words, I resorted to some familiar linguistic devices that bilinguals use such as ‘code-switching’ or ‘code-mixing’ of two languages.

I should not forget to reveal another strategy I used to teach myself Kurdish. This was a two-pronged strategy of listening to songs and retaining their lyrics, as much as possible, as well as the retention of some popular proverbs and sayings. In both cases, the retention was aided by the music in the first instance and by the uniqueness of meaning and other linguistic niceties that this genre of human language usually has. By the end of the fifth year, which was the last year of my service in Sulaimaniya, my communication with people was predominantly in Kurdish. I was good in overall fluency, but excellent in pronunciation.”

After this brief outline of my experience with Kurdish, it is absolutely essential to point out that the manner in which I learned the language so successfully is still an experience which I cannot fully explain. Did I succeed with Kurdish because I was originally a multilingual, or was it because I accidentally ran into an oral approach that turned out to be a context/situation-based linguistic pedagogy of which I knew nothing then, except for the fact that it seemed a more natural way for human language acquisition or learning? Equally puzzling was the motive that pushed me towards a future in the study of linguistics, in general, and phonetics, in particular, as a profession while still living in a country (Iraq) where linguistics, let alone phonetics, was at the time much less known. After successfully finishing my graduate studies, the question of how I had become so obsessed with pronunciation and phonetic sciences was still nagging me and I craved a satisfactory explanation. As I happened to be sifting through the pages of some of my early college psychology and methodology textbooks, I noticed that I had persistently underlined statements and paragraphs that were related to language learning, in general, and child language acquisition, in particular. In my view, my childhood multilingualism contributed significantly to my pursuit of linguistic studies as an adult.

1.2.3.2. Full Immersion as an Adult in an all-English Environment

My four-year stay in England placed me in a linguistic environment with maximum immersion in English coupled with extensive exposure to diversity in authentic contexts and situations in which the language was used. Those four years were, in many respects, virtually the reversal of my experience with English during the four years in Baghdad majoring in English. The longer I stayed in England and became acquainted with the linguistic and phonetic principles of language learning and teaching, the more I discovered the weaknesses and flaws in my English at all levels: grammar, style, lexicon and, above all, pronunciation, which is the focus of this book. Fortunately, in the consonantal domain of English, there were no serious phonological problems simply because of my broad phonetic and phonological base attributed to my childhood trilingualism. The most noticeable consonantal mispronunciation was my rendition of the approximant English <r> as a tap or a trill, which, fortunately, constituted a mere ‘phonetic accent’² that did not interfere with meaning. Obviously, all consonant clusters involving an <r> such as /pr, tr, kr/, etc., did cause some deviation in pronunciation. Nevertheless, this was a minor phonetic aberration compared to, for instance, stress placement in words and longer stretches of speech. For instance, in almost all patterns of verbs such as: , , , among many other word patterns, I, like the overwhelming majority of adult Assyrians and Arabs, placed stress on the final syllable, which hardly occurs in English. Also, in noun compounds such as , , etc., where the accent is usually on the first word, I would place it on the second, not knowing that it makes a huge difference in reference. A <blackbird> is a certain type of bird that happens to be ‘black’, whereas a <black bird> is any bird that is ‘black’. These mispronunciations were all transfers from my Assyrian and Arabic languages. For example, in Assyrian the compound word for an (literally, a white-bearded man) is ܚܘܪܕܩܢܐ with accent on the second part. I did not know what stress was and how it functioned within a language, and neither did I know what intonation was. Throughout my four years of English language education in Baghdad, which was more or less literature-oriented, I had never heard an instructor draw attention to such important language dynamics.³ In a nutshell, the English we learned in Baghdad was subconsciously colored, to different extents, with the overall pronunciation of Arabic and/or the other native languages of the learners.

² It will be argued later that a ‘phonetic accent’ is a mispronunciation that does not alter meaning as opposed to ‘phonological accent’ which does.

³ Bear in mind, I am referring to the years 1956–1960. In the years beginning with the 1970s there were several faculty members in different universities in Iraq who specialized in linguistics and they would emphasize such prosodic features of both English and Arabic.

In light of such facts, and with my schooling in the Department of Phonetics in the techniques of listening to a wide variety of sounds and producing them, I gradually began to concentrate my attention on the pronunciation of the native speakers of English with emphasis on both the segmental and suprasegmental aspects of their pronunciation. I remember vividly how I rectified my pronunciation of the verbs and which I used to pronounce with stress on the first syllable instead of the second one. In fact, in the case of I had a transaction at the bank and I pronounced the word as and the banker said: “You mean ”. With a very mellow touch of linguistic embarrassment, I said: “Yes.” I also soon discovered that a certain category of words such as: could be rendered verbs or nouns depending on the position of stress. In this latter case, I do remember that one of our British instructors in Baghdad did mention such a rule in passing, but he did not dwell enough on it, nor did he demonstrate the difference so that we could internalize what he was saying. Pedagogically, this example is typical of what happens when the instructor fails to establish a bridge of interaction with his students, assuming that once a rule is mentioned everyone picks it up. Ensuring such a connection between instructors and learners, so often absent, has become one of the cornerstones of my approach to teaching: simply, make sure you and your students are on the same page and they connect with you.

1.2.4. Phonetic and Linguistic Orientation in Graduate Education

The first year of my graduate education was a two-pronged intensive orientation in theoretical and applied principles of phonology and phonetics. The theoretical component covered a wide variety of schools and theories of phonology. The applied side of it involved thorough exposure to human speech from the acoustic, physiological and aerodynamic perspectives, which were reinforced with experimental laboratory work as well as practicals in the perception, recognition and production of a wide variety of human sound specimens. This type of educational orientation continued throughout the four years although it was most intensive during the first year. Such specialized intensive education in human speech had a far-reaching impact on my perception, recognition and production of a broad array of sounds from different languages. For instance, prior to this education I was able to recognize and produce two bilabial sounds [b p]; however, after the orientation the inventory doubled or even tripled into such sounds as [b b p p  ]. This perceptual and productive enhancement in discriminative and articulatory skills afforded me far-reaching insight into other languages; perhaps, more importantly, it afforded me better insight into my own language repertoire. In the latter case, for instance, I was able to phonetically identify different bilabial plosives such as [b, p, p, p, p]⁴ some of which turned out to be phonologically significant, i.e., they triggered semantic differences between words. Also, equally important to broadening my range of sound perception, recognition and production, I nurtured an acute kinesthetic and proprioceptive sense for articulatory maneuvers of sound production. Stated differently, I began to feel the movements, positions and shapes of my tongue in the oral cavity, feel the tightness or relaxation of muscles and proprioceptively detect airflows, frictions and vibrations. Such kinesthetic and proprioceptive skills are extremely helpful for learning and teaching sounds, especially in L2 situations. If the instructor does not bring such skills to the attention of the learners, they will pass unnoticed and learning fails.

⁴ The subscript dot indicates emphatic (مُفَخَّمة) versions.

1.2.5. Educational and Professional Challenges in the U.S.

I entered the U.S. as a refugee at the age of forty-three (43) with a family. I stayed unemployed for a year, desperately looking for employment to help my family survive and help myself pursue my academic profession. There was no room for me then to be choosy about employment. The English sayings “Beggars cannot be choosers” and “Don’t look a gift horse in the mouth” applied to me. Every job that I took in the U.S. was only broadly within the realm of my academic and professional background, but was not at the core of my expertise, where I could immediately begin to be creative. As one will see later on, I went through three major professional and academic assignments, each of which was a daunting challenge that occasionally pushed me to the brink of frustration. Nevertheless, I was a refugee and I was determined to face any challenge head-on. I succeeded in all three challenges and ended up gaining massive professional and academic experience from each one of them. Very humbly, I felt that the challenges I encountered during the first ten years in the U.S. granted me the value of several more doctorates. I felt that I was academically ‘reborn’ and professionally ‘baptized’ as an applied linguist, which added depth to my skill as a teacher and enhanced my research fervor in both theoretical and applied linguistics and phonetics, with pronunciation being at the core.

1.3. THE IMPACT OF MY LINGUISTIC/PROFESSIONAL BACKGROUND ON THE EVOLUTION OF AN APPROACH

No doubt, my linguistic upbringing coupled with my linguistic education provided the hidden dynamics that helped cultivate my passion for teaching pronunciation; in turn, it helped mold my approach to teaching pronunciation that is cognitive in essence, but requires further well-defined pedagogical principles and techniques for its implementation.

1.3.1. Impact of my Linguistic Background

The scenario of my linguistic background detailed in the above sections represents diversity in the types of languages to which I was exposed both as a child and as an adult. Briefly, I had different degrees of linguistic exposure to three major language families, namely Semitic (Assyrian and Arabic), Indo-European (English, Kurdish and German⁵) and Turkic (Turkmeni).

⁵ I had some exposure to German through courses in my first degree coupled with other courses at a German Institute in Baghdad. Most of all, my focus was on pronunciation.

A different perspective on my linguistic exposure is reflected in the nature of the particular experience I have had with each language. Was the experience in the form of acquisition, very much akin to what children go through in their native language as subconscious internalization and full-time immersion in a context/situation-embedded environment? Was it more in the form of adult conscious learning without full-time immersion and with only a minimally context/situation-embedded environment? Or was it a partial combination of the above two experiences in the sense that it was full-time immersion in a context/situation-embedded environment, but as an adult?

The exposition of my cumulative linguistic experience in the previous sections indicates that I have been through all three types of linguistic environments. My early childhood experience with trilingualism in Kirkuk was of the first type: full immersion as a child in a context/situation-embedded environment, which led to typical acquisition of the three languages involved. My exposure to English, beginning with 5th grade and continuing throughout high school on the basis of one hour per day, could, at best, be described as minimal exposure to a language in an almost context/situation-reduced environment using the most traditional approach to teaching, known as ‘grammar-translation’. This approach uses translation into the native language (L1) as a medium to teach L2. It is a crude and utterly mechanical way of learning L2. My four-year college experience with English in Baghdad was relatively much better and more effective than the seven years in elementary and high school. There was more exposure to English, better contact with native speakers and more coordination between oral skills (listening and speaking) and literacy skills (reading and writing). However, the whole experience of school and college was a learning experience compared to my childhood trilingual acquisition.

My experience with Kurdish and the four years of my graduate study in England had the characteristics of both acquisition and learning. Acquisition was facilitated by the linguistic environments that afforded me ample context/situation-embedded experiences in both languages. I also had plenty of opportunity in terms of time to intentionally learn what I wanted to.
I vividly remember that when I was first introduced to ‘tones and intonation’ in human language I was totally lost because I had no idea whatsoever about these two aspects of human language. To familiarize myself with them, I imposed on myself a very strict schedule of attendance at the phonetics lab throughout one week and listened to a diversity of tones and intonation patterns. I also focused on stress and stress patterns in English as well as other languages. For English, I did two things to improve my ability in stress identification and placement. First, I listened very carefully to native speakers of English and carefully watched their body gestures. In one instance, I watched the economic correspondent of BBC television in the 1970s (Dominick Harrod), whose straight black hair moved visibly down and up his forehead, often synchronized with stressed syllables, especially those with primary stress. I rarely missed his appearances. For other languages, I used to listen to the English pronunciation of scores of foreign students attending Leeds University. Second, once I discovered my misplacement of stress, I focused all my attention on the correct pronunciation and repeated it forcefully as many times as needed, at times loudly so that I could hear my performance. I stopped when I felt that I had auditorily, kinesthetically and cognitively developed an acoustic image in my mind of the location of stress and its rendition in the targeted word and pattern of words.

1.3.2. Impact of my Teaching Career

Altogether, forty-nine (49) years of my life were spent in education. I spent the first eleven (11) of them teaching English in different Iraqi high schools. The four years after that were devoted to my graduate studies in England, where I also taught Arabic on a part-time basis. After graduation, I returned to Iraq and taught linguistics and phonetics at both undergraduate and graduate levels for five years. I had to flee Iraq for political reasons and settled in the United States at the beginning of March 1981. In the United States, I first joined Loyola University as an adjunct professor teaching two courses in linguistics. In 1984, I was offered a full-time position of lecturer in English as a Second Language (ESL) with the City Colleges of Chicago (CCC), with which I stayed until 1990. During the years 1987–1990 I assumed the position of instructional advisor for ESL and bilingual teachers within the CCC system. In August 1990, I resigned my position to join Northeastern Illinois University (NEIU) as associate professor in the Department of Teacher Education, where I stayed until my retirement (effective January 2009). In the following sub-sections I will highlight the types of educational and professional challenges that I encountered and how each challenge contributed to building and shaping my cognitive and psychological approach to teaching pronunciation.

1.3.2.1. Teaching Linguistics and Phonetics

The five years of university teaching of linguistics and phonetics in classroom situations in Baghdad helped put my theoretical knowledge into practice. While I was teaching my students, I was gaining real-life experience, which I transformed into research papers that were accredited internationally. Nevertheless, I still did not have a vision of an approach to which I could claim ownership. The experience of twenty (20) years of teaching linguistics at Loyola University was, more or less, a continuation of what I did in Iraq, except that one of the courses I was assigned focused on the ethnic and linguistic communities of Chicago. This course was quite challenging for me since I was still a newcomer to Chicago. I devoted long hours to educating myself about the ethnic and linguistic composition of Chicago, which, in reality, represented almost all large urban areas of the United States. It was this course that introduced me to the ethnic and linguistic nature and composition of the overall society in my adopted country.

1.3.2.2. Teaching and Training ESL Teachers

The first three (3) of the six (6) years I spent with CCC were simply classroom teaching of English to L2 learners; however, the last three years were far more challenging both professionally and pedagogically. I was given the responsibility of training ESL teachers to assume the task of teaching English to a wide variety of speakers of other languages. At this juncture, I was professionally only a linguist who knew something about applied linguistics, of which ESL is a discipline. The reader should be reminded that towards the end of the 1980s ESL had become a primary discipline of applied linguistics with its own pedagogy in the form of theories and methodologies. Sensing that I was not prepared for the challenge, I decided to devote all my time to reviewing as much published literature in ESL as possible. I was so overwhelmed with sifting through the relevant materials, especially those most relevant to the daily teaching of ESL in classroom situations with all the needed teaching and learning strategies, that I felt as if I was preparing for another doctorate. After approximately six months, I had built up some confidence in conducting lectures and running training workshops. I conducted many of them in all of the seven colleges of CCC; in fact, I did more than what was expected of me. I began to receive kudos from the participants in those workshops, the deans of the ESL programs as well as the administrators in the central office of CCC. I was even offered an administrative position in the central office, which I declined because I preferred to be in the classroom or field rather than in the office. After the end of those three years, I sensed that I had transformed myself into an ESL specialist. I began to present workshops and papers at local and state conferences as well as publish articles and papers. Personally, I felt that I had enhanced my academic reputation not just as a phonetician and theoretical linguist,⁶ but also as an applied one. Apparently, because of my performance and professional reputation I was approached by the College of Education at NEIU to apply for a position newly opened in the Bilingual/Bicultural Program (BLBC), which was part of the Department of Curriculum and Instruction, later known as the Department of Teacher Education. After the formalities of interviews, I was offered the position at the rank of associate professor to teach four courses in the regular program of Teacher Education: two in bilingual education and two in language arts (English). I believe I was offered the position not really because of my expertise in teaching bilingual education and language arts, but rather because of my academic credentials, my quality publications and my multilingual background. My academic assignment at NEIU turned out to be far more challenging than the ESL one with CCC simply because I had limited professional experience in both areas. Nevertheless, I welcomed the challenge simply because I still felt I was a refugee ready to confront any challenge in order to survive professionally and socially as a person with a family.

⁶ In 1988 I published my first book outside Iraq titled: The Sound System of Modern Assyrian (Neo-Aramaic), Harrassowitz Verlag, Germany.

1.3.2.3. Teaching Language Arts

In preparing teachers of English for elementary and high schools in the United States, the course that covers the methods of teaching it is usually called ‘language arts’. Once I was assigned two language arts courses in my first semester at NEIU, I had a quick look at the text and was shocked at how remote the contents were from the modern linguistic perspective of teaching a language, in an age when linguistics was permeating every study of every aspect of human language, be it as L1 or L2. Unlike ESL and bilingual language instruction, the teaching of English as a native language (L1), usually under the rubric of ‘language arts’, is so untouched by linguistics that, at times, its analysis is utterly surface-structured and orthography-based, and many inaccurate practices and misconceptions riddle the approach. For instance, English vowels are taught as if they are five in number and occasionally six when <y> is added. Such a statement, which is characteristic of phonics, reflects a letter-based approach as opposed to a sound-based one. Linguistically such a statement is baseless because in all varieties of English there is a minimum range of 15–20 vowel phonemes (units) in the form of both simple vowels and diphthongs. Moreover, in the phonics-based approach consonant clusters (so-called ‘blends’) may still be determined on the basis of letters rather than sounds. The teaching of spelling is so generically and obscurely conducted that there is hardly any distinction made between graphic spelling (based on written letters) and oral spelling (based on letter-names), two totally different language processes that require different methodologies for their implementation. I was so frustrated with this assignment that I was about to ask the chair of my department to replace these courses with others.
After further contemplation I thought it would be unwise on my part to ask for replacement since I was a ‘rookie’ professor who had to build up his professional and academic reputation in order to secure employment, promotion and tenure. In the face of such a professional dilemma, I recalled fondly my successful experience with my ESL professional struggle. Once again I decided to confront the professional and academic challenge. I embarked on a massive reading and researching campaign to familiarize myself with all aspects of teaching language arts including the so-called theories, approaches, techniques and styles. Wherever I thought I could infuse knowledge from linguistics into teaching language arts, I did not hesitate to do so. I was relieved when the first semester came to an end and I had managed to survive satisfactorily. With the next semester, I felt more prepared and at ease. As the years passed, I fell in love with the teaching of language arts because I succeeded in gearing it in the direction of modern linguistics in combination with up-to-date cognitive theories in language teaching pedagogy. At this juncture, I began to feel that I had accumulated enough knowledge and experience to formulate what could be called ‘my personal approach’. Theoretically, the pedagogy of this approach was premised on the works of four intellectual pillars of the cognitive approach to human language acquisition and learning. They were Noam Chomsky with his transformational-generative (TG) approach to the nature of human language and child language acquisition premised on the concept of the language acquisition device (LAD); Howard Gardner with his multiple intelligences theory (MIT); Lev Vygotsky with his zone of proximal development (ZPD); and Jean Piaget with his child cognitive development (CCD). With the knowledge I acquired from these giants, I infused my own academic and professional experience of decades of teaching and researching to formulate what I had called at an early stage the ‘Multisensory Multicognitive Approach’ to teaching pronunciation (Odisho, 2003; 2007/a). Since then I have benefitted from further theoretical insight into child vs. adult acquisition and/or learning of sounds and sound systems. I have equally benefitted from the feedback that I received during the last ten years of the application of the approach in classroom situations with a large number of adult learners from a broad range of linguistic and cultural backgrounds.
Gradually, I began to feel that there were so many weaknesses in the traditional approach to teaching ‘language arts’ that I had to unveil them publicly for other people in the field. My focus was primarily on areas related to my academic orientation and research interests. I began presenting and publishing research works with a focus on the shaky and linguistically indefensible premises of ‘phonics’ as a tool to teach letter-sound correspondence and the ensuing vowel and consonant systems as well as the overall pronunciation and spelling. Phonics, for instance, confuses letters with sounds in both domains of consonants and vowels, but more so in the latter. The alleged dichotomy of long vs. short vowels is the most striking example of confusion in the quality and quantity⁷ of the English vowel system. The manner in which phonics approaches the study of the sound system of English is utterly vulnerable from the perspective of modern linguistics. After several years of teaching language arts, I had so many inaccuracies and misconceptions to reveal that I had to write a book titled: A Linguistic approach to the application and teaching of the English alphabet (2004).

⁷ Because the words quality and quantity will keep recurring and acting jointly in vowels, I have opted to blend them together in the form of ‘qualtity’ to be used where necessary.

1.3.2.4. Teaching Bilingual Education

I have to emphasize the fact that I was hired by NEIU primarily to fill a position in the BLBC program. In other words, teaching bilingual/bicultural courses was my primary responsibility. Certainly, my multilingual background and my professional linguistic orientation coupled with my quality research work were of great help in handling some of the courses in the program. Nevertheless, there were areas in bilingual theories and bilingual education with which I was not familiar. Since the late 1960s bilingual/bicultural education and bilingualism, at large, have taken gigantic strides forward in theory and pedagogy. Very much like ESL, bilingual education has during the last few decades constituted a major domain in applied linguistics. In order to be ready to handle all aspects of bilingual/bicultural education I had to seriously acquaint myself with the latest innovations and applications in the field. I thoroughly studied the works of Jim Cummins, Stephen Krashen and Colin Baker, among many others. As if I had not done enough self-education in the field of bilingual education, the Dean of the College of Education called for a special meeting of the BLBC faculty during which he assigned me the responsibility of developing two master's-level programs in bilingual education. This implied the need to develop from scratch a minimum of six (6) graduate courses supported with the needed bibliography. Fortunately I was relieved of 50% of my teaching load for one semester. Throughout the semester, besides teaching my other two courses, I devoted my time exclusively to this new professional assignment. The first two months of the semester were dedicated to surveying the literature on bilingualism and biculturalism in combination with pertinent insights from applied linguistics, especially ESL. The focus during the remaining two months was on designing the framework of each course, identifying its contents and supplying a core bibliography. The courses were approved by the College of Education and certified by the educational authorities in Springfield, Illinois. Very humbly stated, this achievement in bilingual education enhanced my professional status among my colleagues; besides, personally I felt it significantly added to my professional stature. I also began to feel that I had stepped beyond my theoretical linguistic or phonetic skills into the realm of applied linguistics and the pedagogy of teaching language in a wide variety of classroom and real-life situations. Concurrently with my triumphant academic wrestling with ESL, language arts and bilingual/bicultural teaching I began a massive campaign of national and international academic presentations and publications. I published scores of papers in refereed journals and special volumes in honor of renowned scholars throughout the world. In 2002, I decided to put my long experience in teaching and research into several books prior to my retirement. Thus, I published: Techniques of teaching pronunciation in ESL, bilingual and foreign language classes in 2003; A Linguistic and cognitive approach to the teaching of the English alphabet in 2004; Techniques of teaching comparative pronunciation: English-Arabic, in 2005; and Linguistic tips for Latino learners and teachers of English, in 2007.

1.4. CONCLUDING REMARKS

My early multilingualism seems to have been the motive behind my interest in languages, while learning Kurdish seems to have been the turning-point that guided me in the direction of linguistics. As for my early pre-doctorate writings, they narrowed down my general interest in linguistics focusing it on phonetics and pronunciation. Some light should be shed on this statement.

1.4.1. Childhood Trilingualism Triggered Interest in Languages

There seems to be ample evidence that my trilingual/tricultural upbringing in Kirkuk was the flicker that lit the road for me in the direction of interest in languages and their teaching. Since the formal teaching of my native Assyrian language was very limited, I gradually developed an interest, in the last two years of high school, in both Arabic and English beyond their formal classes. I finally ended up majoring in English for my first degree and graduated with honors.

1.4.2. Learning Kurdish Triggered Interest in Linguistics

My first degree in English in Baghdad does not seem to have triggered my future pursuit of self-education in linguistics in the mid-1960s; instead, I give most credit to the five years of my stay in Sulaimaniya, where I passionately pursued the learning of Kurdish. The first inkling of an infatuation with what I later found to be linguistics was born there. During the six years of teaching after that in Basra and then in Baghdad, most of my readings were in the realm of human language with a focus on the history of the English language, the etymology of words and some pronunciation issues. My pursuit of language studies in the direction of linguistics was utterly self-motivated. I published more than twenty (20) articles, all in Arabic,⁸ in different Iraqi newspapers and magazines, followed by my translation of a book from English into Arabic titled Sounds and signs, which was published by the Ministry of Culture in 1971. By this time, in Iraq, I was known as a ‘linguist’. Remember this was all prior to my formal graduate study in phonetics and linguistics in England.

⁸ Except for one article in English, published in the ‘Baghdad Observer’, dealing with Arabic loanwords in English via Spanish which retained their definite article <أل>. 25 years later I revisited the theme extensively and in depth with a publication in Zeitschrift für arabische Linguistik, Vol. 33, 1997.

1.4.3. Graduate Study Immersed me in Phonetics and Linguistics

In 1971, after eleven years of applying for some sort of scholarship, I was granted a study leave, which was financially a third-class sort of scholarship. The reason why it took this long to have even a third-class privilege was simply political, ethnic and religious discrimination. Nevertheless, after gaining this scholarship I joined Leeds University determined to do my utmost. I joined Leeds because it was the only university in Britain that granted degrees in phonetics as a separate subject from linguistics. In my wildest imagination, I never thought that majoring in phonetics would be so challenging, demanding and frustrating on top of the culture shock of moving from Iraq and settling in Britain. The shock was so powerful in the first weeks that it pushed me to the verge of frustration; however, I put all my thoughts together and remembered all those eleven years of my struggle to continue my higher education as an ethnic Assyrian. I then decided to face the challenge and do everything possible to succeed. My work was the second doctoral project since the founding of the Department of Phonetics in 1948, the first doctorate having been granted in 1970. I had two supervisors: a linguist and a phonetician. The latter was a young, very active and academically ambitious faculty member in charge of the phonetics lab and the courses in acoustics, anatomy and physiology. She wanted me to be in line with her academic interests and ambitions. She pushed me almost beyond my scientific tolerance; fortunately, I managed to respond and began gradually to be more science-oriented in my research. The bulk of my research was lab-based. I had to conduct a wide variety of experiments using different gadgets. The most challenging, and at times scary, experiments were the ones that involved inserting catheters or polyethylene tubes through my nostrils for two purposes.
One catheter, with a photo-transducer at the end of it, was to be positioned just above the vocal folds to detect the opening and closing of the glottis and whether there was any vibration in the vocal folds. The other catheter had two small openings at its end and was to be inserted through the nasal passage and positioned between the larynx and pharynx. The purpose of this experiment was to monitor the intraoral pressure changes in the vocal tract. Such experiments were initially very intimidating and I did not have the support of anyone who had conducted them before. Nevertheless, the dream of a doctorate in phonetics by an Assyrian who suffered from discrimination and struggled for eleven years to have a go at it made the trial less daunting. Inserting the catheters was a truly scary and disgusting ordeal. I suffered from severe coughing and nausea and my nose was almost dripping with mucus. During one year, I conducted fifteen successful attempts and accumulated more data than I needed for my doctoral thesis. Being a fast writer, I finished my work and had my thesis of 420 pages typed and bound 6 months before the first day of my eligibility to submit. My graduate studies radically changed me as an educated person. They gave me academic depth, enhanced my tolerance for research and innovation and infused more scientific orientation into my approach to problem finding and problem solving. I became a passionate researcher with a broader horizontal perspective and deeper vertical vision.

1.4.4. Professional Challenges in the U.S.

The professional challenges in the U.S. in the form of teaching ESL, bilingual education and traditional language arts were not in nature what a phonetician was trained to do. Nevertheless, for a refugee professor, facing those professional challenges was a fact of survival; indeed, I did not just survive, but I also gained a treasure trove of professional expertise in those three fields of teaching. As a researcher and professional, I developed a better understanding of my strengths and weaknesses, promoted a more intimate connection with my students and cultivated richer teaching and learning styles. Perhaps more importantly, my knowledge base in linguistics and phonetics became more applied, especially because of a better understanding of human language and speech as a multilingual person from childhood to adulthood. My greatest discovery was that any study of language, especially human speech, should begin with the evolutionary potential of the brain and its application through interactions in real-life situations. The knowledge and experience I accumulated throughout almost five decades of teaching led me to the pedagogy of teaching pronunciation that is multisensory and multicognitive in theory and in application.

CHAPTER 2: THE COGNITIVE BASE OF LANGUAGE

2.1. LANGUAGE: A SPECIES-SPECIFIC CODE OF COMMUNICATION

Language as a species-specific entity implies that only human beings are genetically born with a potential for language acquisition in the generative sense that Chomsky has intensively promoted. In this study, the generative nature of human language entails the potential to produce and comprehend instinct-free and stimulus-free infinite chains of meaningful structures using a finite number of rules. In its underlying structure, the generative characteristic of human language also implies the potential for producing an infinite number of meaningful structures from a very finite number of meaningless sound units (minimal structures) that are traditionally known as phonemes. This potential is part of the genetic makeup of human beings and it is ingrained in their brain. The code of language with human beings is radically different from that of other beings, including apes and birds, simply because with human beings the code is open-ended (infinite in the generation of meaning) whereas with non-humans it is close-ended (finite in the generation of meaning). Thus linguistically, the word ‘language’ is exclusively used for human beings, while its use for other creatures is, at best, figurative; theirs is simply a finite code of communication to serve finite functions during their life span. This faculty of language is the result of millions of years of natural evolution of Homo sapiens away from chimpanzees, especially with regard to encephalization or “amount of brain mass to body mass”.¹ In more strictly scientific terms, the brain of a human being has the highest number of nerve cells (neurons) compared to all other animals; these neurons number in the tens of billions. A certain percentage of those billions constitutes the blueprint of the most sophisticated code of meaning-generation and meaning-decipherment: language. Although certain locations or areas in the brain such as “Broca’s expressive and Wernicke’s receptive areas” (Joseph, 2011) are more commonly associated with human language, language as it is understood nowadays is the function of the brain as a whole.

¹ http://en.wikipedia.org/wiki/Encephalization.

Many scholars whose scientific works are in line with Darwin’s theory of evolution link the gradual encephalization of the human brain with the gradual emergence of the faculty of language. This reciprocal relationship of evolution between the brain and language has been identified as a co-evolution phenomenon in nature (Pinker, 1994; Deacon, 1997; Christensen, 2001). However, the evolution of modern man goes beyond just the co-evolution of brain and language. Developing an upright gait and freeing the hands (so-called bipedality, see Ackerman, 2006) seems to have preceded this particular co-evolution and, indeed, prepared for it. Bipedality afforded man three extremely significant advantages. First, it freed the front limbs and transformed them into what became hands to be manipulated for more creative survival strategies. Second, it granted him a more efficient panoramic vision. Thus, vision-wise, he did not have to stand on his hind legs, as many animals still do, to see beyond the level of his eyes when resting on four legs. Third, the upright gait gradually helped prepare the vocal tract for both physical survival and a powerful potential for generating speech sounds. These three so-called sub-evolutions, jointly with others, enhanced his power to control his environment and interact with it more intimately and creatively.
It is such combinations of sub-evolutions that have triggered the reciprocal evolution of the human brain in the form of massive multiplication of its neurons. Figuratively speaking, the evolution of human brain capacity is similar to the gigantic increase in the capacity of the hard drive of modern computers from megabytes to gigabytes to terabytes, etc., but in a more creative manner. Obviously, natural survival forced man to be as creative as possible. The most important strategy for survival was to become a social ‘animal’ and form a cooperative community with others. In turn, living in a community necessitated an efficient means of communication to plan and consolidate cooperation. Consequently, this need for communication led to the gradual emergence of a system to facilitate it. It is highly likely that in the beginning, the system was based on primitive hand and body gestures combined with different types of murmurs, grunts and noises which were more suprasegmental (long) in nature. No doubt, this was a very primitive and crude system. With time, and in order to grant the system more efficiency, it eventually evolved into well-defined segmental sounds to be coalesced together in conjunction with melody and rhythm to generate longer meaningful structures. In a sense, it was an evolution and progression from more generic to more specific, similar to the evolution of writing from longer and more general units to shorter and more specific ones (i.e. pictographic, ideographic, syllabic and alphabetic). In order for such meaningful messages to be comprehended by all members of the community, they had to be governed by rules: rules for sound combinations and others for word combination or syntax. It was these minimal sound units, joined together by rules into larger meaningful units, that paved the way to what we identify now as language. This gradual evolution of rule-governed systems and structures of meaning generation and comprehension could not have been possible without the qualitative and quantitative enhancement of the human brain. In turn, this enhancement through the power of selective evolution was genetically internalized. Such a genetic transformation granted newly-born babies the potential for activating, in appropriate social environments, a code of infinite production/comprehension of semantic messages that justified identifying language as species-specific. It is this potential that convinced Lenneberg to name his book Biological foundations of human language (1967).

2.2. LANGUAGE: A COGNITIVE-SOCIAL SYSTEM SUPERIMPOSED ON OTHER SYSTEMS

The gradual evolution of language required the generation of certain types of neurons and neural connections to assume responsibility for the internalization and activation of language. The more language distanced itself from grunts, murmurs and gestures in the direction of more readily generated and easily identified sounds (noises and voices), the greater the need became for manipulating many of the basic organs that serve the other systems so that they could assume additional functions. Along this line of thinking, this newly-evolving sociocognitive system had no specific organs assigned to it to initiate the needed acoustic and aerodynamic conditions for speech generation. Hence, in order for speech to be generated, it had to manipulate the existing organs in the human body, especially organs from the digestive and respiratory systems. From the evolutionary perspective this phenomenon of assigning double functions to certain organs has been recently named ‘exaptation’.² A standard definition of the term is “an evolutionary process in which a given adaptation is first naturally selected for, and subsequently used by the organism for something other than its original, intended purpose” (Croom, 2003). Stated differently, the process stands for assigning an extra function to a system or organ in addition to its original function or purpose. The most commonly cited example of exaptation as an evolutionary process is the case of feathers in birds. Biologically, it has been stated that the original purpose of feathers in birds was the control of body temperature; it was only later that the evolutionary process adapted them for flying. In the case of speech, a typical example of exaptation is the radical modification in the shape of the vocal tract and the additional functions assigned to different digestive and respiratory organs. Below are some of the typical exaptation examples that facilitated the proper production of speech.

2.2.1. Vocal Tract Modification

The most significant outcome of exaptation accompanying language evolution is the modification of the tract between the lips and the vocal folds (cords). Presently, in human beings, instead of the tract having an approximately 45º curve, as it still is with many mammals as well as with newborn babies, in a mature 2

Etymologically, the word was coined from the prefix +

with the deletion of (ad).

CHAPTER 2

29

adult that tract is now almost at 90º curve and it is known as the vocal tract. A newly-born baby “has a mammalian larynx that can rise, enabling concurrent breathing and eating, and not until the age of three months are its speech organs ready for producing vowels” (Pinker 1994:354). It is because of the early 45º shape of the vocal tract and their ability to separate the breathing and eating tracts that babies can drink liquids while lying on their back whereas adults have serious difficulties doing that with a 90º tract. This 90º curve has resulted from the considerable lowering of the larynx, thus pulling the epiglottis away from the velum, and making contact between the epiglottis and velum no longer possible. It is because of this separation that swallowing is not possible while breathing for adults and vice versa. In modern man, the 90º vocal tract has two fairly distinct dimensions: a horizontal dimension beginning with the lips in front and the end of the oral cavity in the back and a vertical dimension beginning with the pharynx down to the vocal folds enclosed in the larynx. This evolutionary modification of the vocal tract has remarkably enhanced the articulatory, aerodynamic and acoustic suitability of the vocal tract for immensely diversified sound generation (in the larynx) and noise generation almost along the entire vocal tract. 2.2.2. Vocal Folds (Cords) Modes

The vocal folds (commonly known as cords) are small lip-like muscular tissues that are jointly and horizontally anchored to the thyroid cartilage (Adam’s Apple) in front and separately connected at the back to the arytenoid cartilages. Their natural function in the human body is to close during swallowing, sealing the lungs against the accidental entry of any object, and to open during breathing. With the evolution of speech, the vocal folds assumed an extremely important function, namely, voice generation through their different modes of vibration. In fact, vocal folds constitute the major source of human sound inventory enrichment. They are involved in the generation of both segmental (consonants and vowels) and suprasegmental (stress, tone and intonation) sounds. It is worth mentioning that the presence or absence of vocal fold vibration constitutes the richest and most economic principle for the generation of contrastive pairs of sounds. The voiced vs. voiceless dichotomy, which is a universal feature in human speech (Aitchison, 1996: 183), is the most economic and convenient distinctive feature to double the basic sounds in any language. Aesthetically, when voice was embellished with harmony and melody, humans crafted the most popular form of entertainment in human history: singing.

2.2.3. Tongue Functions and Maneuverability

The role of the tongue in all mammals is of prime importance for survival. “It has vital functions in feeding: It plays a major role in ingestion, as in licking, lapping, and browsing; and it moves food distally through the oral cavity from the incisors to the post-canines for chewing, and then to the pharynx for bolus formation and swallowing” (Hiiemae and Palmer, 2000). Nevertheless, the role of the tongue in speech is equally important. It is so important that in many languages, such as English, Turkish, Assyrian, Hebrew, etc., the word ‘tongue’ is synonymous with ‘language’. In actual articulation, its configurations, distances from and approximations to the other passive or active articulators inside or outside the vocal tract (such as the lips) give birth to almost all common vowels in human languages. In the production of consonants, its role is no less significant. It is the primary articulator that determines the classification of the majority of unmarked (common) speech sounds in terms of place and manner of articulation. The versatile muscular structure of the tongue grants it so much plasticity that it can even endure antagonistic movements, such as having the tip of the tongue placed at the incisors and alveolar ridge with a simultaneous drastic push of its back/root posteriorly into the pharynx, as is the case with the emphatics ط, ص, ض, ظ in Arabic. In short, evolution has given the tongue, the forefront organ of the digestive system, equally important functions in the human speech production system, especially through its greater maneuverability within the vocal tract.

2.2.4. Lip Configurations

The essential biological function of the lips is to help food begin its journey along the digestive system. In speech production, they are the organs that generate most of what could be called the visible sounds or sound features for both vowels and consonants (Odisho, 2003). In the case of vowels, for instance, lip position (spread, neutral, rounded), which is the only distinctly visible feature in vowel production, constitutes one of the three primary parameters in vowel formation and description. As for consonants, the lips, jointly or severally, are active in the formation of all bilabial, e.g., [b, p], labiodental [v, f] and labial-velar [w] sounds. The labial/bilabial feature is not only felt by the speaker, but is also seen by the listener.

2.2.5. Cavities Resonance

The current 90º shaped vocal tract has augmented the sources of speech resonance, thus contributing considerably to the diversification of speech sounds and allowing different languages throughout the world to select their own sound inventories based on places and manners of articulation. Among such resonance cavities are the laryngeal, buccal (mouth), nasal and pharyngeal cavities. No doubt, all those cavities were biologically designed primarily to serve the respiratory and digestive systems. The laryngeal and buccal cavities have responded to evolutionary pressure leading to clear-cut exaptation; the nasal cavity, for instance, has done so to a lesser extent. The structure and anatomy of the nasal passage, with no movable part (articulator), many side chambers and a heavy coating of mucous membrane, render it a highly efficient air-conditioning tract for respiration, but a very inefficient cavity for sound production and resonance. This explains why, in the human speech production mechanism, it is the oral passage that is predominant. Only a few nasals and nasalized sounds are attested in human languages, in general. Proprioceptively, it is quite difficult to feel nasal airflow, but one can certainly prove its presence when a nasal sound, such as [m] or [n], is sustained and then the nostrils are suddenly shut off with the fingers.

2.3. BRAIN ‘SPEAKING’ VIA RESPIRATORY AND DIGESTIVE SYSTEMS

It is clear from the above brief descriptions that the biological functions of certain parts and organs of the digestive and respiratory systems and the additional functions assigned to them throughout millions of years of evolution to generate language have all led to radical changes in the brain of human beings and other biological systems. The evolutionary growth in brain capacity and the concomitant evolutionary modifications in some organs have granted human beings much better qualifications for physical, cognitive and social survival and creativeness. In the forefront of such qualifications is the emergence of language as a unique sociocognitive privilege that sets human beings apart from other primates. When the brain fires its instructions for a message to be delivered, it is primarily the respiratory system and the upper end of the digestive system that facilitate the transformation of the cognitive message into an audible one through aerodynamics and acoustics. The lungs pump the air (the dynamic power) and send it through the necessary channels for vibration generation (voice) at the vocal folds level, if needed, combined with the appropriate degree of turbulence noise at different junctions along the vocal tract. It is this concomitant combination of systematic and rule-governed acoustic signals that impacts the ear of the listener, whose brain then decodes those signals according to the preconceived code of a given language. Without such a preconceived code, the incoming acoustic signals would be meaningless. It is just like listening to a language that one does not know. Usually, in face-to-face communication, speech is more readily transmitted and decoded between speakers and listeners because it is naturally accompanied by facial, hand and body gestures. This is yet another aspect of human speech where it outperforms the communication code of other primates. According to more up-to-date research, “it seems that gestures have a tight and perhaps special coupling with speech in present-day communication. In this way, gestures are not merely add-ons to language—they may actually be a fundamental part of it” (Kelly et al., 2009).
In other words, the authors conclude: “If you really want to make your point clear and readily understood, let your words and hands do the talking.”

2.4. ECONOMY IN LANGUAGE

Some of the plain definitions of the term ‘economy’ read like the following: “Careful, thrifty management of resources; an orderly, functional arrangement of parts; an organized system”.3 According to such definitions human language turns out to be one of the most economic systems ever developed in nature. It is a system that uses finite minimal meaningless units to generate infinite meaningful multi-length units. It is this genius in human language that Martinet (1964) calls ‘double articulation’. The principle of double articulation is mathematically generative in nature. It first creates minimal units without meaning because if they were with meaning then any single unit or any combination of them would result in meaningful units. With such a system it would be too confusing for the brain to retain so many units with meaning. In the face of such vulnerability to confusion, the brain imposes rules that generate redundancy which, in turn, ‘provides the sufficient stimulus needed to acquire the system of language. Redundancy, therefore, provides the stimulus needed to acquire a complex grammar system’.4 One such powerful rule of redundancy is to deprive the minimal units of the bottom-most layer of language—the so-called phonemes—of meaning and start assigning meaning to units at a higher level generally known as morphemes. Other sets of rules are added to regulate the syntactical relationships of longer stretches of speech. It is this multi-layered reversed-pyramidal structure of language that becomes infinitely generative. It uses a finite set of rules to create finite systems which jointly are capable of generating infinite meaningful structures. These sets of rules serve two primary purposes: first, enable the speaker/listener to guess meaningful units from meaningless ones within a given language; second, enable the brain to internalize the finite rules subconsciously and interfere consciously when needed. 
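The generative arithmetic behind double articulation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the eight-symbol inventory and the consonant+vowel rule are hypothetical stand-ins, not the inventory or the rules of any real language, and the function names are invented for the example. It shows how a small, meaningless inventory yields a rapidly growing stock of candidate forms, and how a combination rule narrows that stock.

```python
from itertools import product

# Hypothetical toy inventory: these symbols are illustrative assumptions,
# not the phoneme inventory of any particular language.
CONSONANTS = ["p", "t", "k", "m", "s"]
VOWELS = ["a", "i", "u"]


def count_unconstrained(inventory_size: int, max_length: int) -> int:
    """Count every string of 1..max_length units drawn from the inventory."""
    return sum(inventory_size ** n for n in range(1, max_length + 1))


def cv_words(syllables: int) -> list[str]:
    """Build all forms made of `syllables` consonant+vowel syllables.

    The CV-only rule is a stand-in for the sound-combination rules
    mentioned in the text; real rules are far richer.
    """
    cv = [c + v for c, v in product(CONSONANTS, VOWELS)]
    return ["".join(parts) for parts in product(cv, repeat=syllables)]


if __name__ == "__main__":
    size = len(CONSONANTS) + len(VOWELS)
    # Forms of up to four units with no rule at all
    print(count_unconstrained(size, 4))
    # Rule-governed forms of one, two and three syllables
    print(len(cv_words(1)), len(cv_words(2)), len(cv_words(3)))
```

Lengthening the forms by one unit multiplies the number of candidates, while the inventory itself stays fixed; meaning would be assigned only to the forms a language actually selects.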
3 http://www.thefreedictionary.com/economy.

4 http://en.wikipedia.org/wiki/Redundancy_linguistics.

Such a transformational production/reception of meaning is made possible by the very nature of the human brain, which has the capacity to function as the perfect encoder (generator of the code) in the case of the speaker and perfect decoder (decipherer of the code) in the case of the listener. In order for the speaker-listener flow of interaction to be triggered and continued, the embedded cognitive code in the brain of the speaker has to be able to reciprocate with the cognitive code in the brain of the listener and vice versa. Stated differently and plainly, they have to be speakers of the same language. Meaningful linguistic communication will be impeded if the underlying cognitive code is not the same. Evolution through natural selection has empowered certain centers in the human brain to function as the decoding or encoding centers. Although the human brain is holistically responsible for human language, more specifically the former is identified as the reception center, commonly known as Wernicke’s area, and the latter as the production center, commonly known as Broca’s area. Nature is simultaneously a creation and a creator. It rests on three pillars, namely biology, physics and mathematics. Biology creates, physics balances and mathematics computes. It is the change in the biology of the brain and other survival systems (respiratory and digestive) brought about by evolution that made language possible. Then, the rules of physics kicked in to balance the ensuing changes through computations that mathematics made available. This is nature at its best. Under the influence of evolution, creations in nature are susceptible to change. At times, with changes problems emerge. Nevertheless, nature is conscientious enough not to disturb its own balance by only initiating problems; its problems are followed by solutions.
For example, when nature enabled the brain to grow larger and more powerful and capable of housing the species-specific language, it (nature) forced the respiratory and the digestive systems to adapt and facilitate the physical, aerodynamic and acoustic prerequisites of language generation. For instance, the 45º mammalian digestive/respiratory tract evolved into a 90º tract to accommodate the speech apparatus. The vocal folds had to acquire far more maneuverability than merely adducting the passage to the lungs when swallowing. They increased in elasticity to develop different and more sophisticated patterns of abduction/adduction and tension/relaxation. The relevant muscular apparatus and innervations had to be able to enhance the raising/lowering capability of the larynx to adjust the aerodynamics of voice and noise generation not just for speech production but also for different forms of singing. Another of the most characteristic features of the human brain is its hemispheric functional asymmetry, which is completed through the so-called lateralization process according to which the left brain controls the right side functions of the body and vice versa. This lateralization process is usually completed by the age of puberty. If one seeks an explanation for this hemispheric specialization, the answer seems to combine the principle of functional economy with increased specialization.5 The ability of human beings to develop their cognitive, physical, aerodynamic and acoustic potentials to accommodate language is the best gift uniquely endowed by nature upon them. The evolution of such a potential to express limitless meaning with minimum physical and mental effort is a superb example of the principle of economy in action when economy is defined as ‘the minimum amount of effort to achieve the maximum result’ (Vicentini, 2003). It is because of this dominating principle of economy that human beings “are able to produce about three words per second or one sound every tenth of a second on average and make only about one sound error per million sounds and one word error per million words” (Caplan, 1995). Within this grand system of economy in language, there are different sub-systems that contribute to building the unique system of speech generation. Without such sub-systems the burden of language on the brain would simply be too much to endure. To illustrate, the human speech apparatus can hypothetically generate an infinite number of sounds; however, two questions immediately arise. First, does a human language need that many sounds to generate speech?
Second, does the brain, which has millions of other biological functions to handle, like to burden itself with stocking up thousands of speech sounds? The answer to both questions is ‘no’. In the first instance, the generative design of language requires only tens of sound units (phonemes) to be recycled again and again in a recurrent and generative manner. This highly economic design is a universal feature of human language without which it would lose its limitless creativeness. In the second instance, there is a salient tendency in language to manipulate sounds whose articulatory maneuvers are predominantly easy to produce and easy to perceive. The above two dynamics minimize ambiguity and enhance clarity. It is, therefore, not accidental that the majority of speech sounds are produced in the anterior half of the vocal tract in the form of labial, bilabial, inter-dental, dental, alveolar, etc., sounds; moreover, although in actual speech each sound unit may have a large number of different phonetic variations (allophones) for the same unit,6 the speaker does not recognize and store all those phonetic variations; rather, he or she cognitively internalizes only one abstraction for all the variations of a given sound unit, known as a ‘phoneme’. It is only in the context of live speech that the phonemes mold themselves to the context in which they occur, thus yielding suitable contextual variants (allophones). Let us consider yet another example to demonstrate the principle of economy in language. When speaking, the brain fires instructions to the phonemes (abstractions) to construct a word, but the dynamic nature of speech, especially the mutual interactions of sounds in the flow of speech, creates different shades (allophonic variations) for each fired phoneme. It is only the listener who hears in phonemes because they go directly to his brain, which stores only a finite inventory of phonemes. To state the same fact in more accurate linguistic terms, the speaker speaks phonetically while the listener recognizes phonemically (phonologically).

5 http://pandora.cii.wu.edu/vajda/ling201/test4materials/language_and_the_brain.html.

6 For instance, the /p/ phoneme may have several different realizations in different contexts; it can occur as aspirated or unaspirated, with lip-rounding or with lip-spreading.
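The many-to-one relation between allophones and phonemes described in this section can be pictured as a simple lookup. The following Python sketch is a toy under explicit assumptions: the table, the symbols chosen for the variants of /p/ and /t/, and the function name `recognize` are all hypothetical and serve only to illustrate the idea that several phonetic realizations collapse into one stored unit.

```python
# Hypothetical toy table: each phonetic variant (allophone) points to the one
# abstract unit (phoneme) the listener is assumed to store. The symbols are
# illustrative and cover only /p/ and /t/.
ALLOPHONE_TO_PHONEME = {
    "pʰ": "p",  # aspirated
    "p": "p",   # unaspirated
    "pʷ": "p",  # lip-rounded
    "tʰ": "t",  # aspirated
    "t": "t",   # unaspirated
}


def recognize(phones: list[str]) -> list[str]:
    """Map a sequence of heard phones onto the stored phonemes."""
    return [ALLOPHONE_TO_PHONEME[phone] for phone in phones]


if __name__ == "__main__":
    heard = ["pʰ", "p", "pʷ"]
    # Three distinct phones are produced, but only one phoneme is recognized.
    print(recognize(heard))
    print(len(set(heard)), len(set(recognize(heard))))
```

The table has more keys than distinct values, which is the economy the text describes: variation is produced in speech, while only the abstraction is stored.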

2.5. CONSCIOUS AND SUBCONSCIOUS BRAINS

In the preceding section, different pieces of evidence were knit together to highlight the economic premise of language as a reflection of the robust tendency towards economy in brain energy. It is the massive data storage capacity of the brain that renders language an open-ended system capable of producing and recognizing endless meaningful stretches of speech. Nevertheless, one of the most salient attributes of the brain that relates to human language and empowers it to be such a rich, creative and sublime medium of communication will be discussed in this section. This attribute is the twin nature of the human brain as conscious and subconscious (or unconscious). Although some nonhuman creatures may manifest a few hints of such a division of labor for the brain, in reality it is only a nominal division compared to that of human beings. It is, therefore, logical and substantiable to say that the human dichotomy of conscious-subconscious brains has been one of the main evolutionary developments that gradually emerged to manage, administer and execute the millions of biological, social and cultural functions that humans have to execute successfully in order to survive healthily and rationally. One such fundamental function of the brain is language; indeed, without a highly sophisticated brain there would be no language. Furthermore, without the dichotomy of conscious and subconscious brains, language would be too much of a mental burden for the conscious brain to handle so smoothly and effortlessly. Here again there is a distinct division of labor between the two brains for the sake of economy in effort through securing maximum coordination and harmony in the execution of the myriad functions of the brain. At best, the conscious brain is responsible for approximately 10% of total brain functions, the rest being the responsibility of the subconscious brain. The conscious brain is responsible for any action that it decides to initiate.
Once it decides on a certain action, most of the requirements for completion are automatically delegated to the subconscious brain. For instance, the decision to participate in a marathon is the responsibility of the conscious brain, but the physical preparation of the body (neuromuscular coordination and respiration) and the implementation of the actual running are executed by the subconscious brain. To cite another example, when engaged in an informal conversation with family members and friends, a sizeable percentage of the conversation is managed by the subconscious brain, simply because one does not seriously engage in planning the contents of the conversation, such as the selection of the needed vocabulary and the monitoring of the morphological and syntactical rules; the lexicon and the rules of grammar are stored in the subconscious. However, unlike this informal conversation, if a person engages in delivering a formal speech orally, the conscious brain assumes a greater role to cater for targeted contents, careful selection of the needed lexicon and greater adherence to formal grammatical rules of morphology and syntax. Because of the conscious role of the brain, the speaker might have more pauses, hesitations and repetitions than in a casual social conversation due to a covert conflict between the two brains. For all biological functions, the subconscious brain never sleeps because it has to monitor the heartbeat, respiration, blood circulation, secretion of the necessary glands, digestion and scores of other survival tasks. Most important of all, the subconscious brain is the sentinel of our normal continued existence, especially inasmuch as language is concerned; it is the seat of long-term memory, unlike the conscious brain, which usually handles sensory memory (for just a split second) and at best short-term memory (for a few seconds). With every impression, experience and event that a person intends to maintain for the long term, it is the duty of the conscious brain to serve as a medium to transfer them to the subconscious brain, followed by some reinforcement strategies.7 It is not unrealistic to say that when we prepare ourselves in the morning to go to work, most of what we do is obey our autopilot (subconscious brain), which helps us dress ourselves, have breakfast, start the car, leave the garage, close the garage door and drive.
If all those habitual actions were to be handled by our conscious brain, by the time we were at work we would already be stressed out.

7 The retention in the long-term memory is the result of anatomical or biochemical changes that occur in the brain (Tortora and Grabowski, 1996).


In an earlier study of the strategies for teaching pronunciation, it was pointed out that language acquisition is a process of mental (cognitive) habit formation (Odisho, 2003). I used the term ‘cognitive’ to mean storing the linguistic habits in the brain and retrieving them instantaneously when needed. It is common knowledge to say that “whenever anything has been repeated a sufficient number of times to have become habitual, it becomes second nature, or rather a subconscious action” (Larson, 1912). We all as babies, toddlers or young children have slowly and gradually learned how to grasp things with our fingers, balance our bodies, walk, run, ride a bicycle and perform countless number of functions which through constant and systematic repetition have been transformed into automatic, effortless and subconscious survival functions. Such transformations are the greatest relief that nature has ever bestowed upon human beings. It is this transfer of the mental load from the conscious brain to the subconscious that helps the former avoid collapse under the pressure of too many requests and commands for action. Without mental habit formations, life would be too stressful and burdensome on the conscious brain. In our day-to-day life, when we say that life is becoming stressful we simply mean that we are engaging the conscious brain in responding to several problems simultaneously, knowing that the brain prefers to handle one problem at a time. Thus, normal survival of human beings without a powerful subconscious brain is virtually impossible. It is estimated that the subconscious brain has 40,000,000 nerve impulses per second, while the self-conscious brain fires 40 nerve impulses per second.8 The ease with which human beings use language as their most efficient social and cultural tool has only been possible through continuous mental transferences from the conscious to the subconscious. 
All the required basic constituent units of sounds, the rules that combine them in different formations and assign meaning to them are gradually transferred from the conscious brain to the subconscious, especially in childhood. “It is estimated that the window for language is from birth to 10 years old. Note how quickly children learn a new language compared to adults. Moreover, unless the children learn the new language at a very early age, they will most likely have an accent in that language for life” (Carpenter, 2004).

8 http://brainforsuccess.com/howyourbrainwork.html.

2.6. CONCLUDING REMARKS

Evolution is the magic with which Mother Nature works its miracles. If nature imposes a new function or creates a new problem for its beings, it must, later on, devise a road map to solve the problem. In this instance, nature forced man to innovate a culture of survival. The foremost tool for such a culture of survival was language. However, language as a sophisticated generative system needs a large brain to store its building blocks, its rules and organize them to generate the semantic strings that speech is made of. So, the human brain gradually grew larger and larger to be the largest relative to body size of all brains of all animals. In order for language to be spoken, it had to have the necessary organs to initiate physical energy, to create the aerodynamic conditions that, in turn, afford the needed acoustic prerequisites that are translated into speech in the brain of the listener. This is how nature instructed the brain to negotiate with the respiratory and digestive systems to collaborate without compromising their biological functions. Both systems responded by imposing some modifications on their basic biological structures and functions. Thus, evolution gave human beings a modified physical tract that became capable of (almost) perfectly operating three systems (respiration, digestion, speech) instead of two. The evolution of human language is a typical example of “compromise between nature and nurture” (Bates, 1999). If the human brain is the most powerful computer ever to have evolved (even compared to our modern computers) it is because the brain is the result of millions of years of evolution whereas the latter are only the product of the last few decades. Stephen Hawking, in The Universe in a nutshell, says, “Present computers remain outstripped in computational power by the brain of a humble earthworm” (cited in Carpenter, 2004).

CHAPTER 3: LANGUAGE IN THE BRAIN OF A CHILD

3.1. LEARNING VS. ACQUISITION: CONCEPTUAL DIFFERENCES

In dealing with human mastery of language, the two terms ‘learning’ and ‘acquisition’ are the most frequently used to describe the process. Obviously, ‘learning’ is more generic in connotation than ‘acquisition’. The latter carries a more technical denotation inasmuch as the phenomenon of human language internalization is concerned. The technical denotation has emerged with the rise in the popularity of modern linguistic studies, especially applied linguistics with specific focus on child language mastery (i.e. acquisition), bilingualism and second language learning in adulthood. In its specific sense, the term ‘acquisition’ has been used to identify the natural process through which a normal child masters its native spoken language perfectly and almost effortlessly compared to second language learning by an adult. There are very significant theoretical and applied distinctions between the two processes that are unfortunately known only to people who have been exposed to neurolinguistics, cognitive psychology and linguistics, per se. The majority of people use the term ‘learning’; some use the two terms interchangeably. The applied side of the distinction will be dealt with in the upcoming chapters. Theoretically, however, there are three rationales for shedding light on the distinction this early. First, the distinction between the two processes is related to the age of the learner and the nature of brain involvement in each case. Second, the implicit association of the two processes with the dichotomy of nature vs. nurture is significant in the context of this book. Third, although learning and acquisition are occasionally treated as mutually exclusive, in the application of the pedagogy promoted here the two processes are handled as complementary in nature.

3.2. THE BRAIN OF A CHILD AND LANGUAGE

Children are the best learners of language on earth simply because nature has endowed them with this gift. They begin internalizing their mother tongue not just in the postnatal period, but also before birth. Language begins burgeoning with the early formation of the fetus. One of the key findings of an ongoing research project by Canadian and Chinese researchers who are studying infant development suggests that while still in the womb, our brains learn speech patterns, laying the groundwork for language acquisition.1 “Although we think of human infants as being behaviorally immature at birth, there are many aspects in which the human brain is unusually mature at birth” (Clancy and Finlay, 2001).

3.2.1. Child Brain Formation and Maturation

In terms of the power and speed of information encoding, generation, transmission, reception and decoding, no match for the human brain has yet been identified in nature or in modern technology. If a comparison is made between the human brain and the latest most powerful computer in the world, research shows that “each neuron is comparable to a single computer. With one hundred billion neurons in the brain, we have the neural capacity of approximately one hundred billion networked computers each of which gets modified and updated on an everyday basis”.2 Since the focus of this book is on language, in general, and pronunciation, in particular, many of the biological and physiological aspects of the human brain fall outside the realm of the book. All attention will be centered on the function of the neurons (brain cells) and their synapses,3 which constitute the neural network of the brain in crafting and operating human language. The neuron transmits information through its axons and receives information through its dendrites (França, 2006). The intricate manner in which this network enables a child to internalize, with so much ease, the code of the language into which it is born is simply the prime miracle of Mother Nature. It is startling to realize how much of fundamental brain morphology and organization is already laid down by the end of the first trimester, even before many mothers realize that they are pregnant (Clancy and Finlay, 2001). At birth, each neuron in the cerebral cortex has approximately 2,500 synapses.4 By the time an infant is two or three years old, the number is approximately 15,000 synapses per neuron (Gopnick, et al., 1999). At its peak, the cerebral cortex of a healthy toddler may create 2 million synapses per second.5 In general, young children may have more synapses than they will ever need. The development of synapses occurs at an astounding rate during children’s early years in response to childhood experiences. In fact, the sudden increase in synapses amounts to more than what an adult brain usually has.

The significant question at this juncture is: why is there such a gigantic surge in synapse formation, or synaptogenesis? The obvious answer lies in the immense environmental pressure on the brain brought about by the physical and mental growth of a child. In fact, the abundance in synapse formation is a safety and precautionary measure on the part of nature to successfully carry a child through the most significant phase in human life. Once the formative years carry the child along the path to maturity and enable it to successfully and creatively engage in normal physical and cognitive survival, a visible sudden drop in the number of synapses, called synaptic pruning, takes place. Human and animal studies show that the mammalian brain undergoes massive synaptic pruning during childhood, removing about half of the synapses before puberty (Chechik, et al 1999). It is, therefore, “fair to say that the infant arrives in the world with a nervous system whose working components are in place and organized. All cells are generated, major incoming sensory pathways are in place and all have already gone through a period of refinement of their total number of cells, connections, and topographic organization” (Clancy and Finlay, 2001).

There is ample scientific evidence that the human brain is primed right from its fetal phase throughout adulthood to be the unique and powerful organ to construct, operate and manage the perfect communication system ever known in history: human language. The relationship between the human brain and human language typifies a perfect balance between nature and nurture: the former creates and demands while the latter reacts and heals. As for building cognitive habits in the brain, it is worth mentioning that the more frequently the neuron connections are used, the more they retain information and the stronger they become.6 It is in this manner that children execute the process of natural acquisition of language.

1 http://abcnews.go.com/Technology/story?id=97635&page=1.

2 Wesson, Neuroscience: http://www.sciencemaster.com/columns/wesson/wesson_part_03.ph

3 A synapse is a junction between two nerve cells, consisting of a minute gap across which impulses pass by diffusion of a neurotransmitter. When a nerve impulse reaches the synapse at the end of a neuron, it cannot pass directly to the next one; instead, it triggers the neuron to release a chemical neurotransmitter. The neurotransmitter drifts across the gap between the two neurons. On reaching the other side, it fits into a tailor-made receptor on the surface of the target neuron, like a key in a lock. This docking process converts the chemical signal back into an electrical nerve impulse. (http://www.sciencemuseum.org.uk/WhoAmI/FindOutMore/Yourbrain/Howdodrugsaffectyourbrain/Whatsasynapse.aspx).

4 http://faculty.washington.edu/chudler/plast.html.

5 https://www.childwelfare.gov/pubs/issue_briefs/brain_development/how.cfm, 2009

3.2.2. Formative Months and Years of Mother Tongue

Research indicates that infants are able to respond to sound 10 weeks before birth through bone conduction (Shiver, 2001). Nevertheless, the actual acquisition7 of language begins with birth, when the infant is exposed not only to its mother’s speech but also to the speech of other members of the family and to all natural sounds in its environment. The period from birth throughout childhood and early adolescence is the prime time for language acquisition. The brain, through billions of neurons and trillions of synapses, is ready to assimilate any structures of language, beginning with its minimal sounds and moving through larger combinations of them in the form of words, phrases, clauses and sentences leading to discourse. This burgeoning of language acquisition neatly coincides with the mushrooming of the synapses in the brain that become the pathways to the neurons for a two-way communication of speech production and perception. Clancy and Finlay (2001) neatly summarize child language evolution during the first thirty (30) months as follows:

“First, the period between 8-10 months is a behavioral watershed, characterized by marked changes and reorganizations in many different domains including speech perception and production, memory and categorization, imitation, joint reference and intentional communication, and of course word comprehension. It seemed plausible that this set of changes (which are correlated within individual children) might be related to patterns of connectivity and brain metabolism. Second, the period between 16 and 30 months encases a series of sharp non-linear increases in expressive language, including exponential increases in both vocabulary and grammar. A link seemed possible between this series of behavioral bursts and a marked increase in synaptic density and brain metabolism that was estimated to take place around the same time.”

6 http://www.brighthubeducation.com/infant-development-learning/35203-sensory-stimulation-for-infant-brain-development/

7 ‘Acquisition’ is used here in its technical denotation as a natural, effortless and subconscious process of language internalization. It is collectively, but not absolutely, opposed to the process of ‘learning’ which tends to be effortful and conscious.

Once the base of language is consolidated in the brain of a child, the massive synaptic activity slows down in volume and speed. It is at this stage that the synaptic pruning phase kicks in. A huge number of neurons perish and even greater numbers of synapses are simply eliminated; it is truly a “use-it-or-lose-it” situation (Shiver, 2001). Other synapses are redirected to handle additional cognitive functions. For instance, not all of the millions or billions of synapses that were engaged in internalizing the sound system (phonology) of the native language are still needed, because the phonology and the articulatory maneuvers required for its actual production/perception are already perfectly internalized to the extent of being fully cognitively habitual or subconscious. It is also quite likely that a sizeable percentage of the neurons and synapses are assigned different linguistic functions such as focusing on enriching the lexicon (vocabulary) and upgrading the morphological and syntactical rules. For example, it is quite common for very young native speakers of English to demonstrate excellent mastery of their sound system, but fail to correctly conjugate past tense and past participle of irregular verbs such as . For instance, instead of conjugating as and , they apply the dominant rule of ending resulting in in both cases. In light of such grammatical flaws, it is likely that some of the phonology-oriented synapses in the brain of those young learners will be redirected to engage in the mastery and refinement of other grammatical rules.

This neural and synaptic scenario in terms of peak activity, slowdown and redundancy is very reminiscent of the working conditions in a factory. When there is high demand for goods the laborers work overtime; when the demand decreases a slowdown follows, forcing the management to lay off some laborers and/or reassign others to different jobs. In this game of brain and language, the brain is the manager, the neurons and synapses are the laborers and the requirement of language acquisition is the demand.

3.3. COGNITIVE TRANSITION IN SOUND PERCEPTION AND PRODUCTION

Once the child is born into the world, it has to transition from the world of hearing to the world of listening as the former is a “sense while the latter is a skill… indeed, listening can be thought of as applying meaning to sound, allowing the brain to organize… listening is where hearing meets brain… listening to language is uniquely human” (Beck, 2011). In the process of language acquisition, the child has to hone its listening skills to be able to discriminate one sound from the other. Discriminative listening is a cognitive process through which sounds are internalized into the long-term memory to be subconsciously and effortlessly identified, selected and produced when needed. Neurolinguistic literature is replete with evidence substantiating the fact that very young infants discriminate not only the segmental contrasts of their native language, but many nonnative contrasts as well (Best, 1991). This implies that the phonetic inventory available to infants and young children is richer than the one they retain as adults. Best’s (2002) findings further suggest that infants begin life with language-universal abilities for discriminating phonetic sounds that do not exist in their language environment. However, these abilities gradually decline in cross-language speech perception between infancy and adulthood; in fact, adult speech perceptual ability is more limited, reflecting discrimination of only those contrasts which are phonemic in the listener’s native language (Werker and Tees, 2002). It might be that the difficulties adults have in discriminating between the pair members of many nonnative contrasts reflect a permanent, absolute loss of sensory-neural sensitivity to the acoustic properties of those contrasts, or to their linguistic properties (Eimas, 1978). True, the loss in sensory-neural sensitivity is a serious one, but it is neither absolute nor permanent; there is always room for neuroplasticity in early adulthood. Nevertheless, the gradual loss in sensory-neural sensitivity is a natural phenomenon and seems to correspond neatly with the synaptic pruning that the infant brain undergoes in the early years. The synaptic pruning is an attempt on the part of the brain to internally reorganize its neural functions and avoid any redundancies in the whole process of phonology acquisition.

Thus, when the baby hears people speak a certain language, the brain strengthens connections for the sounds of that language; consequently, the connections for the sounds of other languages become weaker and may eventually wither.8 When the brain of the infant senses that it has an almost complete mastery over the sound system (phonology) and communication is sustained almost flawlessly with parents and siblings, it reinforces the system and locks it in the subconscious brain and indirectly excludes other sounds that are not experienced and do not recur in his native language. Linguistically, this transition from discriminating a large number of sounds in childhood to recognizing only a limited core in adulthood is known as the transition from phonetics to phonology or a transition from physical entities to cognitive ones. Stated differently, it is a transition from the world of concrete speech sounds in the environment to abstract sounds in the brain. Once this concrete-to-abstract transformation takes place the child gradually becomes psycholinguistically insensitive or, at times, deaf9 to sounds that are absent in his language or occur only allophonically (as contextual variations of a phoneme). This transition is the foremost reason why adults learning a second language (L2) usually fail to accurately pronounce sounds that are alien to the native phonology of their language or dominant dialect; it is this failure that we identify as accent in the pronunciation of a given language or even a given dialect within a language.

8 http://www.fcs.uga.edu/outreach.

3.3.1. Transition from Phonetics to Phonology

In linguistic studies, phonetics is the study of human capabilities for speech sound production: their places and manners of production, their voicing and voicelessness, among others. Human speech organs are capable of generating thousands of sounds, some of which differ from each other in only minuscule and insignificant features (Catford, 1977). However, this profusion in physical sound generation is incompatible with the dominating tendency in the brain towards economy in both physical and mental effort. Stated differently, the rule of the brain is to store finite sound units in memory, but generate infinite stretches of meaning. In order to attain this goal, the brain applies a sound compression technique in the form of the process of abstraction. In other words, the brain, right from infancy, begins the process of compressing many physically and/or perceptually related sounds under one cognitive unit stored in the subconscious brain which is traditionally known as a phoneme or a sound unit.

As pointed out earlier, each phoneme is the abstraction of many physical sounds identified as allophones. To illustrate, the sounds of in the following words: , , , and are different from each other. Generally speaking, they are simply phonetically described as follows: aspirated; unaspirated or deaspirated; aspirated with lip-rounding; and aspirated with lip-spreading. Each of those four represents an allophone of the phoneme /p/. In reality, the human brain is not simply satisfied with this process of abstraction; abstraction is coupled with a process of selection and elimination. The brain of a typical native speaker of a given language selects a number of sounds,10 usually in the tens, from a huge inventory of possible natural sounds to form the inventory of his L1 sound system. It is this limited inventory of units that is known as phonology.

Once this transition is completed in the brain of a child, two major linguistic consequences are distinctly noticed. First, the natural phonetic gift of a child in being able to perceptually discriminate a wide variety of sounds, including many that are not part of the inventory of his native language (L1), gradually begins to subside with age. Accordingly, his skill in perceiving and discriminating, let alone producing, non-native sounds erodes progressively. Naturally, the erosion continues with older age. It is worthwhile emphasizing that the erosion is not the result of loss in the plasticity of the vocal organs as much as it is a consequence of the loss of sensitivity (neural plasticity) to sounds and sound phenomena outside the periphery of the native language phonology. Second, barring any speech deficiency, the mastery over the phonology of L1 becomes perfect, but the perfection will be at the expense of non-L1 sounds perception/production. Such biases facilitate rapid perception of native speech, but seriously impede perception of speech sounds in a language other than the native one (Port, 2007). It is in the above transitions in the skill of sound perception and production that the root cause of the linguistic phenomenon of accent is buried. In sum, it is the phonology of the native language (L1) that imposes phonetic limitations on perceiving and producing L2 sounds.

9 When in England as a graduate student, I had an Arab friend from Syria who used to pronounce = [ʤʤ] as []. When I brought that to his attention, he said: “I said [ I didn’t say [ ”. He absolutely did not realize that he was still pronouncing [ʤʤ] as []. Indeed, he was deaf to this sound. It took me a long time to convince him that he had to pronounce with [ʤ] not [].

10 The majority of languages throughout the world have an inventory ranging between 20 and 50 units.

3.3.2. The Brain as the Commander-in-Chief of Language Acquisition: The Cognitive Roots of Linguistic Accent

The applied implications of accent in language will be discussed in detail in a separate chapter. At this juncture, since the focus of this book is on the cognitive pedagogy of teaching pronunciation, only the cognitive roots of accent will be tackled in this section. In a nutshell, accent is a psycholinguistic problem; it is a cognitive problem whose solution should be primarily cognitive as well. It emerges as a result of the brain trying to immaculately internalize the phonology of the native language, transfer it from the conscious brain to the subconscious and render the process of pronunciation as effortless as possible, with the purpose of alleviating the stress the brain would face if all language interactions were conscious. It seems that the brain of an infant assigns a massive contingent of neurons and synapses to perfect the process of internalizing the phonology. It is as if, during the formative years, the brain keeps communicating mutely with the infant and negotiating the conditions for accomplishing the mission. The mute dialogue between the infant and his brain might go on in the following manner:

Brain: Do you love your mother tongue?
Infant: Yes, I do; indeed, I adore it.
Brain: Do you want to master its pronunciation perfectly?
Infant: Yes, I love to.
Brain: I will help you with it.
Infant: Oh! Thank you! Thank you!
Brain: Are you planning to learn a second language as an adult?
Infant: Maybe.
Brain: It may be difficult for me to help you master its phonology as perfectly as your L1.
Infant: What do you mean?
Brain: You may have an accent.
Infant: Why?
Brain: Either you have to start much earlier, or you have to have a gift as well as a creative instructor to help you reduce your accent.
Infant: O.K. I have no better choice.

The gist of the above assumed and tacit dialogue between the brain and the self is that once a person approves the conditions of the brain for securing perfection in the phonology of L1 at the expense of the phonology of L2, he/she implicitly approves the likelihood of emergence of accent in L2.

The argument that accent is a cognitive problem and its solution should also be cognitive-oriented with the assistance of as many sensory modalities (channels) as possible is premised on three cognitive strategies on the part of the Commander-in-Chief, the brain. First, it applies the economic principle of employing almost negligible cognitive effort to generate the maximum product by internalizing a finite inventory of sound units and storing them in the subconscious for spontaneous and effortless retrieval. Second, it stores the sound system in a safe zone, as if the brain intends to grant it immunity against interferences from non-native sound systems and other outside interferences. Third, it redirects the redundant neurons and synapses after successful internalization of the sound system to reinforce other linguistic skills including the lexical, morphological, syntactical and stylistic. It is the above argument that has inspired the title of this book: ‘Pronunciation is in the Brain, not in the Mouth’.

If one intends to ‘eliminate’ or avoid an acute accent, the only option is to negotiate with the brain for ‘entry visas’ to accommodate additional ‘alien’ sounds. Children are always granted such ‘visas’ whenever they have ample linguistic exposure to an ‘alien’ language. Adults, unfortunately, have to try hard and even wait for a longer time to secure a ‘visa’. Oftentimes, adults are granted the ‘visa’, but with constraints attached to it: you have to have a certain degree of accent. It is almost a universal tradition that a child born and raised in a certain country secures the citizenship of that country. By the same token, a child born and raised in a certain language becomes the citizen of that language. It is a natural privilege of children.

3.4. FOSSILIZATION OR PSYCHOLINGUISTIC INSENSITIVITY

There is no doubt whatsoever that children show more adeptness in language mastery than adults do, especially in the area of pronunciation. As has been discussed earlier, this distinction is attributed to the natural age-bound neuronal and synaptic proliferation and activity with regard to native language internalization in early childhood followed by a gradual slowdown. With age, adults begin to progressively lose their aptitude for the automatic and subconscious internalization of pronunciation. Consequently, the process of mastering the pronunciation of a second language becomes increasingly more conscious, mechanical and effortful. In the available literature, failure of adults to further improve their mastery of L2 pronunciation beyond a certain limit has been identified by some researchers as fossilization (Selinker, 1972).11

According to the cognitive pedagogy adopted here, the use of the term ‘fossilization’ to identify the slowness and imperfection of adults’ mastery of L2 pronunciation is rejected for several reasons. First, the choice of the term ‘fossilization’ is unfortunate since it implies excessive rigidity, which is at odds with the enormous potential plasticity of the human brain. Second, the neuronal and synaptic activities which are behind the internalization of the L1 sound system do not seem to be exhausted, because many adults, young and middle-aged, can still secure a highly satisfactory proficiency in pronunciation. Some investigators (Johnson and Newport, 1989) conclude that there is no single moment when the window of linguistic opportunity slams shut.12 Third, much of the slowness stems from inefficient methods of teaching pronunciation that rely exclusively on the auditory modality (i.e. the repeat-after-me technique) instead of an approach that is multisensory and multicognitive in essence and secures significant results (Odisho, 2003). Fourth, the extent of exposure to L2 and the context of exposure13 are of considerable importance in reducing the acuteness of L2 accent. Finally, if self-motivation on the part of the learner is added to the above four factors, the accent can be considerably reduced and, occasionally, minimized.

The rejection is justified based on the fact that systematic multisensory and multicognitive orientation helps all learners, regardless of age and aptitude for pronunciation, to improve their skills, though to different degrees, in the acquisition/learning of L2 pronunciation. Using a combination of diversified multisensory and multicognitive techniques and exercises, the learning process can continue, albeit slowly; it will hardly cease completely to the extent claimed by fossilization. In an earlier work (Odisho, 2003), this slowness or intransigence of adults in the acquisition/learning of L2 pronunciation was termed psycholinguistic deafness which, unlike fossilization, does not imply total cessation of learning; rather, it keeps the doors of acquisition/learning of L2 pronunciation open depending on the rationales mentioned above. Perhaps ‘deafness’ was too loaded a term; nevertheless, what I implicitly meant was a transitional decline in efficiency or sensitivity to speech sound perception/production, but not a total cessation. Stated differently, with adulthood, speakers of any language gradually develop a certain degree of psycholinguistic insensitivity to the perception, hence production, of speech sounds outside the phonological realm (inventory) of their native language (L1).

It is, however, noteworthy that the distinction between the adeptness of children and adults in language acquisition is primarily confined to the linguistic domain of pronunciation and not necessarily to other domains, such as morphology, syntax and lexicon, in which adults may be equally adept or even more adept than children in many cases. It is, therefore, logical to conclude that for a child there is no easy or difficult language or languages to acquire, especially their pronunciation; the difficulty in mastering the pronunciation of a second language tends to be a trait of adults.

11 The definition of the word ‘fossilization’ is “the process of being turned to stone”.

12 The author significantly improved his English pronunciation at the age of thirty-three (33) by eliminating many phonetic and phonological residues left over in his English from his Aramaic and Arabic languages. He also added a wide variety of new sounds to his phonetic inventory. It was a very demanding and time-consuming task, but it was doable for two reasons: first, it was full immersion in an all-English speaking environment; second, it was an environment of academic specialization that compelled the brain to react to the specific situation.

13 Is it classroom exposure only or is it combined with sufficient social intermingling in the community of the targeted language?

3.5. THERE IS ROOM IN THE HUMAN BRAIN FOR MORE THAN ONE LANGUAGE

Bilingualism is a normal, natural phenomenon in human civilization. There are as many bilinguals around the world as there are monolinguals; indeed, there is hardly any country that does not exhibit a certain degree of bilingual communication. In the long history of human civilization, bilingualism has never been a marginal or accidental linguistic phenomenon that emerges sporadically and intermittently here and there. On the contrary, bilingualism is a constant component of the overall structure of human civilization; it automatically emerges when two or more language communities or speakers come into contact. Bilingualism is an easily justifiable normal and natural sociolinguistic and psycholinguistic phenomenon. The sociolinguistic naturalness of bilingualism is substantiated by its pervasiveness throughout all linguistic communities; likewise, psycholinguistically, bilingualism is a normal and natural phenomenon because human beings, especially the young, internalize it readily, implying that the human brain is endowed with enough cognitive potential to absorb more than one language (Odisho, 2002).

The brain of a child with billions of virgin neurons and synapses is a massive generator of cognition, imagination, creation and innovation. It is, therefore, quite natural for a child to automatically acquire two, or even three languages if it has ample access and exposure to them. If the exposure is balanced then naturally the competency in the two languages will be balanced as well. In fact, each language may be handled as a system in its own right and the competency will greatly resemble that of a monolingual child in either language (de Houwer, 1990). Stated differently, if the child is exposed to two languages from a very early age, he will essentially grow as if there were two monolinguals housed in one brain.14 The linguistic competency of many bilingual children can be so high that it enables them to transition instantaneously and subconsciously from one language to the other. Obviously, some unintended switching and mixing between the two languages is to be expected. The human brain is powerful enough for more than one language, but in the absence of a bilingual or multilingual environment, the child will naturally grow up as an immaculate monolingual.

14 Petitto, quoted in [email protected].

3.6. NARROWING DOWN THE BROAD DEFINITION OF ACCENT

The broadest definition of linguistic accent is that it is a deviation from a given norm of pronunciation acceptable to a group of speakers. With this broad definition, it is irrational and impractical to design one approach to remedy all types of deviations from the given standard or acceptable model of pronunciation. Often, the term ‘accent’ carries a negative connotation since it is usually associated with comments such as: ‘He speaks with an accent’; ‘He has a strange accent’; ‘I can’t understand him because of his very heavy accent’. Even with today’s linguistic refinement in the description and assessment of human languages, the term is still often used generically without much specificity. The only distinction that has been made somewhat clearer is between a ‘dialect’ and an ‘accent’. The former is usually used to refer to a combination of grammatical, lexical and pronunciation differences, whereas the latter is essentially confined to pronunciation differences. In light of the above information, two points should be taken into consideration. First, highlight all identifications and descriptions of the term ‘accent’ and select the one that will be the focus of this study. Second, identify the approach that will be used to teach pronunciation, in general, and accent remediation in L2, in particular. The response to the first point is covered by the following two dichotomies: intralanguage accent vs. interlanguage accent and phonetic accent vs. phonological accent, which will be the focus of Chapter 4. The approach will be touched upon lightly in this chapter, but details will be dwelt upon thoroughly in some of the remaining chapters.

3.7. IMPLICATIONS FOR UNDERSTANDING THE COGNITIVE NATURE OF ACCENT

There are both theoretical implications and practical applications for the cognitive identification of the underlying causes of accent, especially with adult L2 learners in contrast to children internalizing their L1 or even L2.15 The practical applications will be elaborated on elsewhere in this book; thus, the focus here is on the theoretical implications that should outline the pedagogical roadmap for application. Foremost among those theoretical implications are the following.

1) Develop a cognitive approach that reflects the latest understanding of distinctions between the nature of human language acquisition and language learning.
2) Envisage the methodology or methodologies and identify the teaching and learning techniques that take the above distinctions into consideration and implement them.
3) Be aware of the phonetic vs. phonological differences in pronunciation and how the latter should receive the priority as they cause semantic (meaning) confusion.
4) Determine the proficiency level of pronunciation that is targeted; is it native, near-native or just acceptable?
5) Determine the main functional goals and objectives of teaching pronunciation.

The striking disparity between children and adults is that children internalize language through a natural process identified in applied linguistics as acquisition. Cognitively, they are primed to master their L1 or even beyond that. Their brain is ready to masterfully absorb any linguistic materials to which they are amply exposed in authentic social contexts and situations. Collectively, those three conditions of cognitive readiness, ample exposure and authentic context lead to the process of acquisition, which is a teacherless process in the formal meaning of teaching. All that a normal child needs is the habitat of a family and the community. Children acquire the core of the native language, especially pronunciation, from mere exposure to the community around them, which is also known as ‘incidental learning’. “Incidental learning through overhearing occurs when children listen to speech not directly addressed to them, yet they learn from it. Amazingly, very young children learn approximately 90% of the information they acquire incidentally” (Beck, 2011).

15 In fact, being fully exposed to and immersed in three different languages as a child, I grew up trilingual with native oral mastery of all three of them (Assyrian, Arabic and Turkmeni).

3.8. CONCLUDING REMARKS

In Chapter 2, the focus was on the interaction between brain and nature and the power of the latter in the evolution of the former. This chapter had a triangular emphasis on nature, brain and social environment, in that language is a product of all three and children are the most fortunate beneficiaries of their combination. Language acquisition for a child is identical with the acquisition of walking; both evolve, ceteris paribus, naturally and are seamless in their function. This is why children do not have an accent when thoroughly exposed to a language besides their own. Accent is a linguistic phenomenon that is associated with adults. Naturally, no adult wants to speak an L2 with an accent; nevertheless, there will often be a ‘sliver’ of accent. How ‘thick’ or ‘thin’ the sliver is depends on many conditions. In a social-linguistic context, a thinner sliver is better than a thicker one. It is for this reason that the next chapter emphasizes the distinction between a phonological accent and a phonetic one, the former being the thick sliver (implying a thick accent) and the latter the thin one. Functionally and professionally, priority should be given to thinning (i.e., reducing) the phonological accent first and then handling the phonetic one.

CHAPTER 4: LINGUISTIC ACCENT: DEFINITION, CLASSIFICATION AND DEMONSTRATION

4.1. INTRODUCTORY REMARKS

As mentioned earlier with regard to the difference between dialect and accent, the former is usually used to refer to a combination of grammatical, lexical and pronunciation differences, whereas the latter is essentially confined to pronunciation differences. In fact, in ESL classroom situations or real-life interactions, one is able to identify the accent of an L2 speaker without relying exclusively on pronunciation inaccuracies and deviations from the acceptable targeted standard pronunciation. In many instances, there are grammatical, morphological and lexical hints that trigger an accent in pronunciation. Take, for instance, the case of some English past tense and past participle formations of verbs which end with suffixes that are in the form of voiced or voiceless consonant clusters. Often such clusters are broken up or reduced in one way or another by Hispanic learners of English because they are alien to their native phonology. For example, the past tense of is and are pronounced as ]. Typically, Hispanic learners try to overcome the problem in one of two ways, both of which are, unfortunately, wrong: 1) Reduce the cluster by dropping one of its elements, usually the latter. This means that instead of pronouncing [] and [] correctly, they would simply reduce them to [] and []. 2) Insert a vowel between the two elements of the cluster, break up the cluster and cause a reshuffle in the syllabic structure of the word. To demonstrate, instead of pronouncing the word or as [] and [], they are rendered roughly as [] and []. Undoubtedly, this is a grammatical problem but its roots are in pronunciation. In the totality of the performance of English, such instances could be treated as both morphological and grammatical errors besides being pronunciation errors. Errors such as those are different from replacing [v] with [b] or [z] with [s] as in the words and , respectively, which are sheer pronunciation inaccuracies. By the same token, an Arab student may make a statement as follows: ‘The house beautiful’ instead of ‘The house is beautiful.’ Surely, this is a grammatical error attributed to the absence in Arabic of an equivalent to the verb ‘to be’ in English; nevertheless, it is more readily captured by the listener through a gap in the overall pronunciation. Certainly, errors in the overall pronunciation of a statement in L2, such as the ones above, suggest that accent, as part of the speech production and perception process on the part of the speaker and listener, can receive interference from linguistic systems other than pronunciation. In light of such broadening of the concept of accent, one can justify considering it in terms of a surface layer vs. a deep layer, with the former ascribed exclusively to mispronunciations and the latter to other linguistic factors, including grammatical and lexical ones.

4.2. INTRALANGUAGE AND INTERLANGUAGE ACCENTS

The intralanguage accent refers to sound differences that exist among different dialects or varieties of a given language, such as the difference in the pronunciation of /r/ as an alveolar trill or a uvular trill or fricative across different German dialects. Similarly, the pronunciation differences between New York and Chicago dialects or even the differences between Received Pronunciation (RP) of England and the General American English (GAE) of the Midwest constitute intralanguage accent. In the latter case, there are, for instance, some major pronunciation differences between the two varieties. The differences are more distinct in the vowel system than in the consonant system. Within the vowel system, the difference is more in diphthongs than in pure (simple) vowels. In consonants, except for the phonetic nature of /r/ and the pronunciation of the intervocalic /t/ as in the word , there are hardly any significant differences. In Assyrian (Modern Aramaic) dialects, the grapheme ܟ has different renditions such as [k], [k], [c] and [ʧ]. In Arabic, the realizations of the graphemes ج and ق, for example, differ throughout the Arabic-speaking countries. The former has at least four different realizations, namely [ʤ], a voiced postalveolar affricate as in Iraq; [g], a voiced velar plosive as in Egypt; [ʒ], a voiced postalveolar fricative as in Syria and Lebanon; and [ɟ], a voiced palatal plosive as in Sudan. The ق has at least three different realizations. It is a voiceless unaspirated uvular plosive [q] in Standard Arabic and in several other dialects; in Iraq, it may be realized as a voiced velar plosive [g]; and in Egypt as a glottal stop [ʔ]. In sum, intralanguage accent represents pronunciation differences within L1.

Conversely, the interlanguage accent stands for pronunciation differences that emerge when one moves from the native language (L1) to the target language (L2). For instance, when a native speaker of English embarks on learning French or Spanish, a major pronunciation difference one encounters is the radical shift in rhythm. The same is true for Frenchmen and Spaniards learning English. Some typical vowel differences that result in acute accent for native speakers of English learning German and French are the front rounded vowels such as [  ] and also the French nasal vowels such as [   ]. A foremost difficulty in learning the pronunciation of the Semitic languages, especially Arabic, is the dominance of the so-called guttural sounds [, , , , ] and the emphatic (also known as pharyngealized) sounds [   ]. For the specific teaching of the latter category see Odisho, 1981. In teaching pronunciation in cross-linguistic situations, it is important to be aware of the difficulties that arise when one moves from one language to another or even when one moves within the different varieties of one given language. There are other specific situations in which two individuals of the same L1 learning an L2 may have different pronunciation problems due to dialectal differences.
To illustrate, if two German speakers, one with an alveolar trill-r dialect and the other with a uvular trill-r dialect, embark on learning Spanish or Italian, the former is less vulnerable to the r-pronunciation problem than the latter. There are scores of such cross-language pronunciation problems. With native speakers of Arabic learning English, the /ʧ/ as in is not a problem for speakers of Iraqi Arabic as opposed to Egyptian, Lebanese or Syrian speakers because Iraqis have the sound in their local dialect. The focus in this study is essentially on interlanguage phonological and phonetic accent.

4.3. PHONETIC AND PHONOLOGICAL ACCENTS

From the functional perspective, the phonetic vs. phonological distinction in the nature of accent is extremely significant in teaching pronunciation. It is a major distinction that I developed in the mid-1990s and began implementing in my classes. The distinction appeared in print in 2003. Pedagogically, it made a substantial difference in helping learners focus on the more important problems of pronunciation facing them rather than scratching the surface of phonetic accent. One needs to understand the difference prior to any elaboration on the applied side of the dichotomy. Phonetic accent refers to a mispronunciation that does not result in a semantic (meaning) change, though it may negatively interfere with the proper comprehension of meaning due to partial departure from the acceptable standard rendition of a given pronunciation. In other words, it is a mispronunciation that does not directly cause a miscomprehension, but it may hamper or delay comprehension and become a distraction. Let us now elaborate on the key words in the last statement. An example of the non-semantic nature of this accent is the massive replacement of the English approximant /r/ (English [] and American []) by a tap/flap [ɾ], a trill [r] or a retroflex tap/flap [ɽ] by millions of learners of English. This replacement does not cause a change in meaning in English; it simply deviates phonetically from the normal, standard and acceptable rendition of the sound. It is, therefore, a form of phonetic accent. Similarly, if a native speaker of English learning Spanish aspirates the unaspirated Spanish [p, t, k] or a native speaker of Spanish learning English deaspirates the normally aspirated English [p, t, k], no miscomprehension will result, only a phonetic deviation.
Although a phonetic accent may not directly hamper comprehension by replacing one sound with another, it may indirectly interfere with or impede comprehension because a phonetic mispronunciation or a combination of phonetic mispronunciations may serve as an element of noise that can confuse the phonological filter of the listener and, hence, cause miscomprehension or, at least, a delay of comprehension. The latter two symptoms manifest themselves verbally through repeated questions or statements such as “What?”, “What did you say?”, “I beg your pardon!” or “I did not understand”. Nonverbally, the listener may show a facial expression of lack of understanding.

As for the phonological accent, it is a mispronunciation that directly causes semantic confusion and impedes comprehension. Obviously, a phonological accent always implies a phonetic accent. If, for instance, the inaccurate phonetic rendition of aspiration and non-aspiration between the speakers of English and Spanish is a good example of phonetic accent, a similar inaccuracy in phonetic rendition between the speakers of English and Thai will certainly amount to a phonological accent on the part of the English learner of Thai, since in the latter language aspiration vs. non-aspiration amounts to a phonological distinction. To illustrate, if a speaker of English fails to aspirate the name [tʰai] (a person from Thailand), he will be pronouncing it as if it were the word [tai], which means ‘kidney’. A real conversation between an Indian student from Kerala, India and a Thai student went like this. Indian student: “Are you Tai?” She replied: “No.” He went on: “Aren’t you from Thailand?” She replied: “Yes.” He continued: “Why do you say you are not Tai?” She replied: “Tai means ‘kidney’. I’m not kidney. I am Thai.” There was laughter.
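The working test behind this distinction can be put in the form of a small sketch. It is only a toy illustration under stated assumptions, not a procedure taken from this study: the function name, the two miniature word lists and the plain-string transcriptions are all hypothetical, and the sketch covers only the case in which a mispronunciation lands on a different existing word. A mispronunciation that produces no recognizable word at all is not modeled.

```python
# Toy sketch, not the author's procedure: decide whether a substitution
# counts as phonetic or phonological accent by consulting a word list.

def classify_substitution(intended, produced, lexicon):
    """Classify one mispronunciation against a target-language lexicon.

    intended, produced: broad transcriptions as plain strings.
    lexicon: dict mapping transcriptions to glosses (hypothetical data).
    """
    if produced == intended:
        return "no accent"
    if produced in lexicon:
        # The output collides with a different existing word: meaning changes.
        return "phonological"
    # The output is off-target but matches no other word in the list.
    return "phonetic"


# Hypothetical two-word Thai list: aspiration is contrastive.
THAI = {"tʰai": "Thai (person)", "tai": "kidney"}
# Hypothetical one-word Spanish list: aspiration is not contrastive.
SPANISH = {"tan": "so"}

print(classify_substitution("tʰai", "tai", THAI))     # phonological
print(classify_substitution("tan", "tʰan", SPANISH))  # phonetic
```

The same loss or addition of aspiration is thus classified differently depending on whether the target language uses aspiration to keep words apart, which is the point of the English, Spanish and Thai comparison above.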

Similarly, the absence of the [ʤ] sound, as in [ʤʤ], in German is easily felt in the pronunciation of German learners of English. Germans usually replace the [ʤ] with its voiceless counterpart [ʧ] as in the English words and which are rendered [ʧk] and [ʧouk], respectively, thus causing radical semantic change, a typical outcome of phonological accent. Interestingly, when Arnold Schwarzenegger occasionally appeared on Jay Leno’s show, one could hear him say “Jay, are you joking?”, which sounded like “Chay, are you choking?” Therefore, the English pronunciation of the word and its derivations constitute a problem for Germans because it sounds as . However, luckily, they do not have this problem in the native language because they either pronounce those derivations with a [g] sound as in [g] similar to in the English word or use the Germanic counterpart of the word, .

For Spanish learners of English, one of the main sources of phonological accent is the vowel system, both for quantity (short/long or lax/tense) and quality (overall vowel impression). The absence of short/long or lax/tense contrasts in Spanish and their presence in English creates a very serious problem for Hispanic learners of English because virtually thousands of words are semantically confused, some of them resulting in embarrassing situations. Luckily for English learners of Spanish, hardly any phonological accent results from their handling of the five vowels of Spanish; however, English learners of Spanish may have a noticeable phonetic accent, especially because of the imposition of their lax vowels, such as schwa [ə], and other vowel reductions and schwaizations¹ that accompany unstressed syllables. In addition to the semantic difference that ensues from the distinction between phonetic and phonological accent, perceptually the phonological accent is much more readily identified by the native listener than the phonetic one because it either triggers confusion in meaning or propagates meaninglessness.

4.4. ACCENT: A NORMAL LINGUISTIC PHENOMENON

The overall tone of the preceding discussions of the phenomenon of accent should give the impression that accent is a normal linguistic fact. It emerges as a result of the cognitive attitude of the brain of a child acquiring L1 versus that of an adult learning L2. Every adult human being has to demonstrate a certain degree of accent whether trying to learn a dialect within his language or a second language besides his. Thus, to speak a dialect or a language with an accent is not a stigma or a pathology and should not be treated as such; however, when the accent seriously interferes with meaning, it begins to obstruct the normal conveyance of the message between two individuals. Earlier on, this was identified as phonological accent versus phonetic accent, the latter of which does not seriously obstruct meaning directly except when it involves a certain degree of deviation from the targeted rendition of several segmental (consonants and vowels) and suprasegmental components (stress, tone, intonation) of the targeted language. Briefly, the general goal of teaching pronunciation should be an attempt at eliminating phonological accent and reducing the phonetic one. Some learners or speakers of an L2 claim that they do not care whether they manifest an accent in their speech or not. Such a claim is made under the pretext that the speaker wants his accent to reflect his ethnic and linguistic identity. In my view, this is a baseless claim and it is no more than a cover-up for linguistic failure. Logically and aesthetically, any speaker of L2 should portray his ethnic and linguistic identity through perfect demonstration of his/her L1 rather than through the loading of an L2 with mispronunciations infused through L1.

¹ Schwaization, after the neutral vowel schwa [ə]; a vocalic change in the direction of this vowel.

4.5. WHAT IS MEANT BY ACCENT ACQUISITION, ACCENT REDUCTION AND ACCENT IMPERSONATION

To have or not to have an accent is a particularly controversial question. How much ‘accent’ one has is even more controversial simply because it can be a very subjective judgment, especially by non-linguists. In real-life situations, there are different routes to accent minimization, namely: accent acquisition, accent reduction (remediation) and accent impersonation or faking.² In all three cases, the goal of the person is to try to diminish the differences in pronunciation between himself and a typical native speaker of a target language (L2). In what follows, an attempt will be made to afford a descriptive account of each term coupled with some highlights.

² Accent impersonation and faking will be used interchangeably according to the context in which they occur.

4.5.1. Accent Acquisition

Accent acquisition is primarily the gift that nature bestows upon a child who grows up immersed in the authentic environment of a language, be that the language of the country of birth or of the country of resettlement, where he grows up abiding by the rules and conventions of the pronunciation of that given language. The gift could be broadened in scope to include two more conditions that qualify a child or a person for accent acquisition. The first is the total and equal immersion of a child in two languages; the second is the case of a linguistically talented adolescent or even a young adult who is amply exposed to an L2 and has a passion for mastering that language. No one should exclude others from qualifying for accent acquisition, but the instances are very rare for adults. Nevertheless, very many adults can excel and even outperform native speakers in their immaculate and creative competence in L2 lexicon, morphology, syntax, stylistics and overall fluency. Unfortunately, they may not master the pronunciation simply because the latter seems to be bound cognitively with the limitations of age: the younger the learner, the more perfect the pronunciation; the older, the less. The renowned novelist Joseph Conrad, a Pole in origin, is a typical example of a person who achieved the highest level in all aspects of linguistic performance in English except for his Polish accent in pronunciation, a case commonly identified as ‘Joseph Conrad syndrome’. Another example is Henry Kissinger, the former Secretary of State, whose fluency, syntax and lexicon are immaculate, but whose accent is striking to the ear. The contrast in matters of pronunciation between the skill of a child and that of an adult reminds one of the difference between the terms ‘acquisition’ and ‘naturalization’ in becoming a citizen of the United States.
Briefly, obtaining citizenship by acquisition is primarily a right by birth, whereas citizenship by naturalization, especially for adults, is a privilege to be earned by a set of procedures and requirements. Translating this difference in terms of ‘accent-free speech’ and ‘accented speech’ does not amount to an exact analogy, but it hints at the fact that being born and raised in the same language is different from being born and raised in a language, but relocated to a second one later in life as an adult. Simply, accentless speech is a natural gift for a child born and/or raised in what is supposed to be practically an L1 environment.

4.5.2. Accent Reduction (Remediation)

Although accent acquisition is a difficult goal to achieve because it requires perfection or near-perfection in the execution of a language skill, this should not deter anyone from emulating the targeted pronunciation through accent reduction as much as possible. In accent reduction, the objective is to suppress the most salient features of one’s L1 and attempt to replace them with those of L2 when speaking it in order to camouflage any readily detectable indications of an accent. Accent reduction is targeted for different purposes, foremost of which are the following.

4.5.2.1. Linguistic and Aesthetic

The first purpose is to attain the highest proficiency in pronunciation as part of the overall competency in L2 so as to avoid accent as much as possible. This could simply be for linguistic and aesthetic purposes. Needless to say, in any L2 class, learners aim at a better level of proficiency in the target language, especially in its pronunciation, through which they give the native listener a very positive impression, especially because of higher intelligibility.

4.5.2.2. Spontaneous Interpretation

Spontaneous oral interpretation is a daily practice in the United Nations’ General Assembly and Security Council and in any high-level negotiation, where room for misinterpretation and miscomprehension attributable to any linguistic component (syntactical, morphological, lexical or phonological) should be eliminated or reduced to the minimum.

4.5.2.3. Acting and Broadcasting

An actor or actress would certainly be far more impressive and convincing if he/she were to impersonate a foreign-speaking character as accurately as possible. For instance, Ben Kingsley’s performance in the film Gandhi was extremely impressive in acting and demeanor, although his language would have needed a touch of retroflexion, especially with his ‘r’ sound, to impact the native Hindi audience more authentically. Equally important is the role of newscasters in delivering their news bulletins loaded with names of foreign personalities. For instance, now that the name of the Syrian President بشّار الأسد is in the news, I have yet to listen to an American newscaster or anchor pronouncing it correctly; in reality, its pronunciation is seriously distorted. Ironically, the name does not contain any sound that does not exist in English; rather, the distortion in its rendition lies in the misplacement of stress, reduction of Arabic vowels to schwas [ə] and the overall shift in syllabification. The Arabic pronunciation of the name sounds like this: [baar alasad]³ as opposed to an approximately typical mispronunciation by a native English-speaking newscaster: [br lsd]. The following are some of the major differences between the two renditions.
a) Almost all Arabic vowels are Anglicized, especially in the direction of vowel neutralization, i.e. infusion of [] vowels in place of [a]; also the replacement of vowel [a] with [].
b) The gemination (doubling) of [] (= ) is reduced to a single one.
c) The elimination of the gemination results in reshuffling the syllabic structure of the first name; thus, instead of [ba] + [ar] it becomes [b] + [r].
d) Finally, the Arabic rolled = [r] is replaced with an approximant retroflex one [] or [], which is not a major change.

Cumulatively, the change in the overall rendition of the name goes beyond being an accent; rather, it is an overall distortion in pronunciation both phonologically and phonetically. A very similar example of mispronunciation in broadcasting that is retained in my memory from some decades ago is the pronunciation of the name of the former French President Valéry Giscard d’Estaing. The proper pronunciation of the name is typically [valeʁi ʒiskaʁ dɛ ]; however, the Arabic-speaking Baghdadi television anchor pronounced the name with such a heavy Arabic accent that it was stripped of all its French characteristics. Some typical phonetic/phonological features of French pronunciation include: a strong tendency for word stress to fall on the final syllable; nasal vowels; a voiced uvular fricative [ʁ] instead of an alveolar r-sound; and a typical voiced postalveolar fricative [ʒ] in place of a voiced postalveolar affricate [ʤ]. None of those features were found in the rendition of the name by the Iraqi news anchor.

³ The bold syllables indicate the stressed syllable in each case.

4.5.2.4. Spying and Espionage

Spying or espionage is a ‘profession’ in which the person may use language, especially pronunciation, to camouflage his/her personality without raising suspicion. In other words, the person hides his identity behind an ‘adopted’ pronunciation. Because this ‘profession’ is an extremely risky assignment, it requires a high level of accent reduction and/or accent impersonation. This dimension of accent reduction will be revisited from a different perspective.

4.5.3. Accent Impersonation or Faking

In this section the term ‘accent faking’ is used with a specific connotation; hence, it undoubtedly needs some clarification. In a broad sense, ‘accent reduction’ and ‘accent faking’ overlap to some extent because both involve impersonation of a targeted speaker of a language, except for the fact that the purpose can be different. Accent faking can project itself in different forms or strands.

4.5.3.1. Comedians Faking an Accent

A common example of someone impersonating or faking an accent is what comedians do, especially when mimicking speakers of languages other than their own to generate laughter. It is interesting to note that not all comedians are equally skillful in impersonation; some are better than others. Let us just consider the case of Jay Leno impersonating Arnold Schwarzenegger. Leno is a great comedian, but is not a good impersonator; nevertheless, he is very impressive when tackling Schwarzenegger. It is noteworthy that phonetically, his success is not attributed to his imitation of Schwarzenegger’s vowels and consonants, but rather to his overall rhythm, tempo and intonation. What comedians usually do is highlight the most salient features of the targeted speech and mimic them in a caricaturized way to capture attention. This means that a comedian may not necessarily be skillful in meticulous impersonation, but may be good at highlighting some salient pronunciation features, similar to what caricaturists do in drawing: they highlight the most striking facial and bodily features and exaggerate them for comedic effect.

4.5.3.2. Building Intimacy

Accent faking may also take the form of socially impersonating the language or dialect of an interlocutor to sound friendly, intimate and trustworthy; however, in some instances, all three attributes may be used either honestly or with a twist of dishonesty. In the first instance, the impersonator aims sincerely at bonding with the interlocutor with no ill intention. In the latter instance, he may be aiming at enticing the interlocutor to unveil personal information or even secrets. This may sound like spying, but in reality it tends to be more the action of a curious and nosy individual; however, there is always the likelihood that it can cross over to indirect interrogation or information gathering. As an instructor of phonetics and pronunciation, I have used accent faking of languages that I did not speak but whose sound systems I knew, such as Russian, Greek and Hindi, among others. I did the faking for two purposes. First, to demonstrate the power of the knowledge of phonetics, especially articulatory phonetics, as a science. Second, to assess my own skill in faking as a strategy for teaching. In the former instance, faking an accent was used as a tool to attract the attention of the learners to my presentation and build up confidence in their ability to hone their skills through attention and practice. In the latter instance, I did it to test my own skill in faking a given accent. Let me cite some examples of my attempts at accent impersonation. In one instance, I had the following encounter with an adult American lady of Russian ethnic background:


“One day on the campus of my university, I stopped by a colleague of the Department of Chemistry to arrange for a university event. While we were chatting, the lady in charge of the chemistry labs stopped by to ask my colleague a couple of questions. Just as a courtesy gesture, my friend introduced her to me as ‘Ludmila’. Looking at my non-fair complexion, she apparently became curious as to my ethnic background. I also became equally curious because she portrayed a distinct Russian accent in her English. Once I detected the Russian accent I wanted to ‘play’ my so-called accent impersonation ‘game’ as a means to collect data for this book. The following conversation went on between the two of us: She asked me: “Where are you from?” I said, infusing some sort of Russian phonetic features in my English: “I am from ex-Soviet Union.” She looked at me strangely as if in disbelief: “Are you Russian?” I said: “No, my parents were originally from Azerbaijan.” I chose Azerbaijan to justify my facial complexion as non-fair. She went on: “Do you speak Russian?” I replied: “Just a few words.” At this stage, I felt that she was extremely confused because she was suspicious of my story. I also felt guilty about the confusion I caused her. I immediately apologized to her and told her I was simply trying to impersonate a Russian accent. One could readily notice that she was relaxed after my apology. She further looked at me and said: “But you don’t know how good you make it.” Linguistically, I was very happy after this encounter because I felt that my impersonation of the Russian accent seemed to have been good enough to make her believe in my fake story.”

With regard to the Greek language, I know only a few words and phrases including: , , , , ; however, when I use them in a Greek shop or with a Greek person, the immediate question is: “Are you Greek?” When my response is ‘No’, the next utterance is always: “I don’t believe you.” All that I do is apply my articulatory knowledge of the Greek sound system and use the right accentuation. This makes the difference between a phonetician and a ‘lay’ speaker of a language. Let me explain this difference with the help of the following real anecdote: One day, one of my friends asked me whether I knew Greek. I said: “No”. He went on to say: “Do you know the greetings?” I said: “Yes”. He then said: “What is ‘Good Morning’?” I said: “[kali mea]”.

When I asked him why, of all the Greek language, he wanted this phrase, his response was that his mother shared a room in a nursing home with an old Greek lady and he simply wanted to greet her in her native language as a courtesy gesture. A few days later, I saw my friend and asked him about his visit to his mother and her Greek roommate. He said: “Yes, I greeted her, but she did not show any response at all.” I said: “Are you sure you used the right phrase?” He responded: “Yes, I said ‘[kl m]’.”

If one listens carefully, there are at least seven (7) phonetic differences between the authentic Greek pronunciation and his Americanized rendition of it. I then thought to myself that she did not respond to his greeting because she did not realize it was in Greek due to the very heavy English accent.

4.5.4. Intralanguage Accent Reduction and Impersonation

Accent reduction and impersonation can be at both dialect level (intralanguage) and language level (interlanguage). Although the emphasis is on the latter, some comments must be made about the former. All languages have many dialects that evolve for geographic, socio-economic or ethnic reasons. In many instances, a certain stigma may be attached to a given dialect, causing many of its speakers to avoid using it either completely or in certain situations. By contrast, one of the dialects, usually the standard,⁴ tends to be the most prestigious; hence, a large number of speakers of other less prestigious or local dialects tend to adopt it through formal education or through special orientation, as radio and television anchors and announcers do. Many such educated and professional people tend to revert to their own social, regional and local dialects in casual communication; hence, they are typically bidialectal, moving back and forth between the standard and their own dialects as the situation dictates.

4.6. CULTURAL ACCENT

There are solid grounds for justifying the existence of a cultural accent that is as important as the linguistic accent. Thus, following the common pattern of L1 for native language and L2 for a target language, C1, henceforth, will stand for native culture and C2 for target culture. However, the question still remains as to those grounds that justify the recognition of a cultural accent. There are certainly some non-verbal gestures such as hand, eye and body movements as well as some interjectional filler ‘words’ that differ from culture to culture.⁵ Strictly speaking, the filler ‘words’ are not necessarily supposed to be words in the linguistic sense. Some of them can be simply interjectional utterances with no well-defined lexical denotations such as = [:] or [], and = [m]; [m] or [m]. The reason why such interjections are phonetically transcribed somewhat differently is that they do not have an exact pronunciation across individuals, dialects and languages. For instance, in English is more popular than ; besides, even if the latter is shared, for example, between native English and Russian speakers, its pronunciation is somewhat different. Interestingly, the use of such language-specific filler ‘words’ is so deeply ingrained in one’s native language due to acquisition that they tend to be some of the last remaining accent traces of L1 and/or C1 in L2 and/or C2. If one carefully listens to some fairly competent Russian speakers of English, their speech tends to be punctuated with [m]s as fillers. An equally interesting observation about these filler words is that the vowel element in them tends to be consistent with the phonology of the given language. It is not unexpected to have the schwa [ə] as a dominant vowel in the coinage of filler words in English since this vowel has the highest frequency of occurrence in English. In the absence of a schwa in Arabic and the high frequency of [a] or [] (فَتحه), it is this vowel that coins the filler word = [a] or [] in Arabic.

⁴ Linguistically, the so-called ‘standard language’ is also a dialect which happens to be associated with the schooling system and formal education per se.

⁵ http://en.wikipedia.org/wiki/Filler_(linguistics).

With regard to hand and body gestures, as part of the cultural accent, there are many examples of them differing with different peoples and cultures. For example, one of the most typical examples of cultural accent for the Japanese while meeting and greeting natives of other languages and cultures is to bow along with or during the handshake. Equally noticeable is the tradition among Arabs, especially of rural areas or of rural background, who immediately and subconsciously touch their chest with the right hand after shaking hands.

4.7. TRANSITION OF ACCENT INTO ORTHOGRAPHY

Unquestionably, phonological accent is far more detectable than phonetic accent not just in speech, but also occasionally in orthography (written form). In fact, one of the indications of the seriousness of a phonological accent is that the L2 speaker carries it over into the orthographic rendition of the mispronounced words. Let us consider some misspellings of adult Hispanic students learning English. It is quite likely for a Hispanic to spell as or vice versa. This is certainly attributable to mispronunciation. It is an established fact that the Spanish vowel system has a limited inventory of vowels based chiefly on quality differences with no quantitative (length) distinctions. This is a system that was first identified technically as a centrifugal one as opposed to the centripetal system of English, in which vowels allow both qualitative and quantitative distinctions (Odisho, 1992). In Spanish, the vowel in ‘sin’ (without) has a quality that is almost halfway between the English vowels in and . Consequently, a Hispanic student fails to distinguish English words such as vs. , vs. or vs. . It is exactly because of the absence of such quality differences in their language that pairs of words such as the above are misspelled by adult Hispanic learners of English. In light of such examples, it is not uncommon for a Hispanic person to write a sentence such as: ‘This cars are expensive.’

For Arabic, the sound /p/ is phonologically irrelevant, though phonetically the sound may occur in certain contexts such as when followed by an aspirated sound as in ابتداء [] (beginning); nevertheless, without training and practice, /p/ is the most difficult, and at times embarrassing, sound for Arabs because in the absence of such a sound, hundreds, perhaps even thousands, of words are confused. Words such can become . The impact of this mispronunciation can occasionally overflow into orthography in the form of replacing words spelled with ‘p’ by spellings with ‘b’. In fact, at times, the phenomenon known as over-compensation may develop, according to which the fear of mispronouncing a given sound leads the speaker to reverse the rendition of the relevant two sounds. Once this over-compensation kicks in, the situation worsens because the Arab learner of English will not only reverse the pronunciation of /b/ and /p/, but will also reverse their orthographic renditions. I have come across many Arab students who pronounce or write as and as . In one instance, in the Iraqi city of Basra, a traffic officer had ordered the sign to be engraved on a concrete slab; unfortunately, it ended up being spelled as .

In Kurdish, which is an Indo-European language, the English interdental fricative pair [θ, ð] is absent. It is consistently replaced in pronunciation with the alveolar fricative pair [s, z]. This sound substitution is so powerful that it occasionally sneaks into their orthography. The anecdote below is relevant to this phenomenon.

“During the years 1960-1965, I worked as a teacher of English in a high school in the Kurdish city of Sulaimaniya/Iraq. Once, I gave my students a written test. During the test, I moved around the classroom monitoring the performance. On the desk of one of the students, I noticed a small piece of paper with Arabic scribbling on it. At first glance, I did not pay attention to it because the written test was in English; however, when I moved away from the student’s desk, I had a different vision of the Arabic scribbling. There were some strange orthographic features in the scribbling that made it look different from the overall visual impression of standard Arabic orthography. The writing had more than the usual recurrence of the letters س and ز. I went back to the student’s desk and picked up the piece of paper and began reading it. To my utter surprise, it was a text in English transliterated in Arabic and had relevance to the exam. The reason why it had an extraordinary number of س and ز was simply that all English [θ] and [ð] sounds (there are many of them) were transcribed as س and ز because they reflected his Kurdish rendition of these sounds. The student was assigned an ‘F’ in the test.”

All the above examples from Spanish, Arabic and Kurdish languages serve as evidence that when cross-language pronunciation causes a heavy accent, the accent may occasionally be transferred into the orthographic system of the targeted language.

4.8. CONCLUDING REMARKS

It is about time we stopped using the term ‘accent’ in a generic manner, especially when accent has professional implications such as in teaching, acting, broadcasting and information gathering at large. Any instruction in pronunciation, especially when related to accent reduction, should be designed for the targeted purpose. If the purpose is L2 learning, especially for adults, it is of prime importance to distinguish between phonetic accent and phonological accent and place the emphasis on the latter. Besides, any orientation in accent reduction should bear in mind that accent is not confined to segmental features (consonants and vowels); rather, the suprasegmental features are of equal significance in shaping an accent or reducing it. Furthermore, it is important to bear in mind that the acquisition of accentless pronunciation is an ideal achievement for all children and some adolescents immersed in an L2 environment; also, perhaps, for a few adults with distinct linguistic aptitude for pronunciation. Serious and purposeful instruction in accent reduction should be handled exclusively by professionals with general linguistic knowledge and specific phonetic/phonological expertise who implement a multisensory and multicognitive approach.

CHAPTER 5: A BROAD BASE FOR UNDERSTANDING THE PEDAGOGY OF TEACHING PRONUNCIATION

5.1. INTRODUCTORY REMARKS

The pedagogy of teaching pronunciation according to the approach promoted here is premised on cognitive principles, implying that pronunciation at large, and accent in particular, are reflections of some underlying cognitive settings. Consequently, any instruction targeting the improvement of pronunciation, especially in L2 situations for adults, should be designed in light of those cognitive settings. Teaching pronunciation to adults through memorization in the form of mechanical repetition is a highly ineffective practice because sound features and segments are simply meaningless, mono-dimensional acoustic signals that impact the ear and have no semantic mnemonic to assist with their retention. They are very much unlike morphemic units (especially words) and syntactic stretches, where structure, meaning and organization kick in to render them bi-dimensional or even multi-dimensional, which collectively assists with comprehension and retention. This is exactly why many adult L2 learners may excel in morphology, syntax and lexicon, but manifest a distinct phonetic and phonological accent, Joseph Conrad being a typical example. Consequently, teaching sounds through mechanical auditory repetition by the instructor and rote memorization by the learner makes their cognitive retention quite difficult. More channels of learning are needed for better and more permanent retention. This is why the suggested approach calls for the joint involvement of as many sensory and cognitive channels of input as possible.

To understand the difference between memorization and retention, it is enlightening to consider the following analogy. A balloon that is fastened with a single string may easily be lost when the string snaps, whereas a balloon that is fastened with several strings remains firmly in place even if one or more of them snap. Based on this analogy, teaching a sound feature auditorily (by listening to the sound), visually (by observing the accompanying facial and bodily features) and kinesthetically (by feeling the concomitant sensations) will secure much better retention, while mere rote memorization will not. Below are some of the most relevant principles that help with the understanding of the pedagogy.

5.1.1. Speech: A Cognitive Phenomenon

Human speech is a cognitive faculty, a potential in the brain before being in the mouth. Teachers will often see adults experiencing serious difficulty in producing a new sound to which they have never been exposed. This is a good example of the cognitive requirement for sound production, meaning that the brain may need enough exposure time to the new sound to perceive and recognize it before being able to produce it appropriately. Therefore, any instruction in pronunciation should target the cognitive potential for perception and recognition prior to the necessary physical maneuvers of production. If, for instance, an adult native speaker of English is asked to produce a Spanish trilled [r], and he, after continuous modeling by the instructor, fails repeatedly to produce it and instead persists in producing a frictionless continuant (approximant), like the English ‘r’, then the whole situation indicates that the learner is psycholinguistically unable to recognize it and, hence, to produce it. This is a typical condition that is identified in this study as psycholinguistic deafness or insensitivity to sound, a condition that is characteristic of adults learning L2. Psycholinguistic deafness or insensitivity in the teaching of pronunciation cannot be remedied without an approach and sets of techniques that enable the brain to cognitively perceive and recognize the new sound and then fire the commands to the vocal organs to embark on a period of trial and error in executing the articulatory maneuvers needed for the accurate production of the targeted sound.

5.1.2. Pronunciation: Multisensory Access

In real-life situations, sensing is rarely mono-dimensional and is often multi-dimensional. Usually two or more senses function jointly in a situation which reflects the nature of our physical and cognitive existence. Since speech is a cognitive faculty which is fed by a broad base of sensory modalities, especially the auditory, visual and tactile/kinesthetic ones, any pedagogy for teaching speech and pronunciation should be multisensory in nature. In fact, learning occurs more rapidly when more than one sense is involved. This also implies that all the strategies and techniques used for implementation should emanate from all those sensory modalities. For instance, the visual modality should take into consideration all facial and body gestures that are intertwined with the overall dynamics of speech production. All the non-verbal gestures that accompany speech perception, recognition and production are extremely helpful in teaching pronunciation. Learners have to be prepared not just to hear and produce the sounds, but also, and equally importantly, to see and feel the sounds in conjunction with the concomitant sensations and physical gestures in the context of authentic speech. This is why the approach to implement this pedagogy is multisensory in essence. In light of this approach, a certain group of sounds could legitimately be labeled visible sounds because the listener can see the facial features that produce them. For example, consonantal sounds such as the bilabials [b p], labiodentals [v f], interdentals [θ ð] and dentals/alveolars [d t] will squarely fall into this category. Features of sound production such as lip configurations (i.e. lip spreading and rounding), lip protrusion, jaw depression and elevation are also fairly visible features. It is also possible to visually detect some facial and bodily gestures indicating some characteristics of tense and lax sounds, especially vowels.
Such gestures are readily detectable in the pronunciation of some English vowels, e.g. the high front ones / i / = [i], as in vs. [] as in . This is why speakers of languages such as Spanish, Greek, Russian and French have difficulty distinguishing those two English vowels because they have only one variety which is transcribed here as []. To visually detect the difference between the two English vowels, learners have to notice the extra stretching and spreading of the lips and the adjacent cheek musculature with less jaw separation. Even with suprasegmental features, such as stress and rhythm, there are several facial and bodily gestures that are part and parcel of stress execution and they hardly go unnoticed. In fact, a proper and natural placement of stress in actual speech cannot be executed without some facial gestures and occasionally body gestures especially when the stressing is emphatic. A teacher has to bring those features to the attention of learners who, in turn, should watch for those gestures; they are usually synchronized with the syllables or words to which stress is assigned.

5.1.3. Pronunciation: Multicognitive Access

Indeed, pronunciation is a cognitive process and in order to internalize it one needs a multicognitive approach. Learners have to be encouraged to listen attentively to sounds, remember them, and compare and contrast them with sounds that are already part of their psycholinguistic (cognitive) inventory or with versions of sounds produced by other learners. Simply, learners have to practice thinking consciously of the sounds and the process of their production through association, analysis, synthesis, comparison, contrast, memorization, etc. There is also room for metacognition, which is a state of mind in which people are aware of their own cognitive processes (Bourne et al., 1986); in other words, they are highly conscious of their thinking process. Learners should be instructed to feel their tongue movements in their mouth, watch the shape of their lips and sense the contact of the tongue with the teeth and lips as well as other facial muscular gestures. Although these cognitive activities may sound too abstract for some learners or even instructors to know about, in reality they do exist and their presence can be felt in different ways. Often when an instructor models a certain sound and then allows for a break before the reproduction session, many of the learners are already thinking of the reproduction. One can readily infer the thinking process through the facial and bodily gestures they unconsciously manifest. For instance, you can easily see a learner moving his tongue inside the oral cavity to feel the place of articulation or to try to create a rounded configuration for the lips, or even to depress or elevate the jaw to secure the targeted degree of oral opening. Every speaker of every language unconsciously manifests some head, hand or foot movements or facial gestures synchronized with the muscular effort needed for the execution of stress placement. These movements and gestures are all reflections of inner and mute endeavors on the part of the speaker and should be brought to the attention of the learner to master the dynamics of stress placement. There are times when the learner is quite conscious of his inner effort at the processing of the sounds and the outer gestures accompanying them. Quite often, when learners are asked about those cognitive processes and their physical reflexes, they admit to them. It is because of all those mental processes and the physical gestures that are associated with them that the approach in this book is identified as multicognitive. It should be highlighted that the so-called classical technique of ‘ear training’ cannot on its own accomplish the mission of teaching L2 pronunciation effectively and successfully; eye training, neuro-muscular training and, above all, brain training should supplement ear training. In fact, teaching efficient pronunciation requires guiding the learners of L2 through the processes of cognition and metacognition.

5.1.4. Pronunciation: An Integrated and Holistic Process

Pronunciation is an integral part of overall human communication (Morley, 1991). Human speech is portrayed in a wide variety of integrated combinations of segmental and suprasegmental (prosodic) elements. Both categories should be handled inseparably from the overall articulatory, visual, auditory and tactile/kinesthetic features accompanying speech production. The latter sets of features form the basis of what is differently labeled as ‘articulatory settings’ (Honikman, 1964) or ‘phonetic settings’ (Laver, 1980), among others. No cross-language teaching of pronunciation will be authentic and dynamic in nature and a reflection of the native-speaker’s proficiency without a serious consideration of the articulatory settings of the targeted language. For instance, a language like Arabic with a limited vowel system and a heavy dependence on guttural sounds and emphatics has such specific articulatory settings that, without the incorporation of the settings into the overall approach to learning Arabic by non-native speakers and learning of foreign languages by Arabs, the results will be highly unsatisfactory. Similarly, it is important to know that there are languages whose vowel systems are identified in this book as centripetal as opposed to centrifugal systems (Odisho, 1992). The former systems tend to have a schwa vowel, [ə], with the rest of the vowels in the system showing a tendency to be reduced to schwas or schwa-like vowels in unstressed positions. The centrifugal systems tend to avoid contrasts in vowel quantity (length) as well as any noticeable schwa-inclination of other vowels regardless of their stressed or unstressed positions. English is a typical representative of a centripetal system as opposed to Spanish, which is a typical representative of a centrifugal system. Any cross-language teaching of pronunciation between these two types of languages can hardly be effective if the approach handles the vowels individually and as decontextualized segments. An efficient mastery of each other’s vowel system cannot be attained without a holistic approach to the teaching of the systems through their most characteristic features. Those learners with a native centripetal vowel system should be trained to avoid both schwas and the reduction of vowel quality and quantity when learning languages with centrifugal systems. Conversely, learners with centrifugal systems should be trained to produce schwas and to reduce vowel quality and quantity in appropriate contexts and conditions of L2 learning.

5.1.5. Pronunciation: Top-Down & Bottom-Up Dynamics

Pronunciation is a dynamic cognitive and physical process. It should be taught in a dynamic way with both bottom-up and top-down approaches. A bottom-up approach implies teaching it from smaller to larger units (i.e., segments to suprasegmentals), while a top-down approach entails the reversal of the order of units. In other words, teaching pronunciation is like two-way traffic in which both directions of movement are needed in order to complete the cycle of communication. Traditionally, pronunciation has been taught through a bottom-up approach with emphasis on vowels and consonants often lacking proper contextualization and embedding in longer meaningful stretches of speech. Recently, there has been a twofold focus of attention in teaching pronunciation; firstly, an intra-segmental emphasis with attention on distinctive features, i.e. a more microscopic perspective; secondly, an inter-segmental emphasis with attention on prosodic features, i.e. a more macroscopic perspective (Pennington and Richards, 1986).

5.1.6. Pronunciation: The Complementary Nature of Acquisition and Learning

Teaching pronunciation should distinguish between the processes of acquisition and learning. Acquisition tends to be a subconscious, automatic and effortless process of internalizing a sound system, whereas learning tends to be more conscious, mechanical and effortful. The former is primarily characteristic of normal children’s mastery of the pronunciation of their native language or any given language, whereas the latter is primarily associated with the manner in which adults master pronunciation. Despite the difference between the two processes, acquisition and learning are not mutually exclusive in nature and function. Rather, they tend to be complementary depending on many factors such as the age of the learners, the extent and conditions of exposure to the linguistic materials, the level of motivation and the approach to teaching. Generally speaking, research as well as life experience adduce ample evidence in the direction of more acquisition than learning by children as opposed to more learning than acquisition by adults. Hence, in the description of language internalization by children, the appropriate compound action verb would be ‘acquire-learn’ and the reversed order, ‘learn-acquire’, would be more appropriate for adults. However, the above two orientations in language/speech internalization should not, in any way, imply that adults are unable to attain a near-native proficiency. No doubt, those adults who have some degree of linguistic aptitude and a gift for language internalization will tend to handle languages with an acquire-learn strategy similar to that of children. Nevertheless, even some adults who do not possess a linguistic aptitude may improve their chances of better learning regardless of age if the conditions and techniques of learning/teaching are conducive enough to motivate them and activate all their sensory and cognitive processes of knowledge and skills acquisition.


The immediate instructional implication of the above statement is that acquisition and learning are not two processes of language internalization that are mutually exclusive in the absolute sense. The statement also entails that teaching pronunciation to normal children requires a methodology that is drastically different from that for adults. All that children need to acquire-learn the pronunciation of a given language is cognitive and developmental readiness coupled with ample authentic exposure to the reception and production of linguistic materials in the target language (L2). This does not mean that children learn only inductively with no benefit attached to deductive learning. There is always some of both: more of the former in this case and less of the latter. Consequently, mere exposure to language materials as well as repetitive drilling such as the traditional repeat-after-me technique may be far more functional and effective with children than with adults. With the latter, mere exposure is not sufficient and, oftentimes, the above technique turns out to be useless because adults tend to repeat after themselves, i.e. repeat in terms of their own phonology. In other words, adults may reproduce L2 sounds in terms of their L1.

5.1.7. Pronunciation: A Natural Gift for Children

The acquisition of pronunciation by children, which is so natural, efficient and perfect, should not exclusively be attributed to the pre-puberty potential and readiness for language acquisition. There are several other factors which are no less efficient in activating natural potential and nurturing it. The following are some of the foremost factors relevant in this regard:

• tens of thousands of hours of exposure to authentic native language materials in the first few years,
• exposure often to context-embedded and situation-embedded materials,
• holistic exposure to language materials and skills,
• natural feel for language as a tool for normal social survival,
• affectionate risk-free environment afforded by the parents.

The absence of some or all of the above conditions or their excessive deficiency renders L2 learning/acquisition by adults a far more challenging social and cognitive task.

5.1.8. Pronunciation Should be Premised on a Triangular Base of Perception, Recognition and Production

Any teaching of pronunciation should thoroughly follow the three-stage procedure of sound acquisition: perception, recognition and production in the sequence indicated. The above triangular procedure is highly consistent with the three-stage procedure of registration, retention and retrieval in learning and with the three types of sensory, short-term and long-term memories in information storing. In each case, the earlier stage serves as the gateway to the next and final stage. The transition to the final stage cannot be completed without continued rehearsal. A brief clarification of the terminology is invaluable. Perception is used to denote the condition of feeling and sensing the presence of a given sound; recognition includes the condition of perception as well as the condition of being able to distinguish the given sound from others and, perhaps, identify the difference(s) in comparative/contrastive situations. According to Parasuraman and Beatty and in terms of cognitive processing, “the distinction between perception and recognition appears to be the matching of the external sensory pattern with some internal sensory engram¹ and the bringing of this to awareness” (cited in Kissin, 1986). As a further enhancement of the above quotation, Kissin states, “The definition of recognition as the process of matching external perceptions against existing internal correlates, implies a second level of activity” (ibid.). As for production, it satisfies the above two conditions of perception and recognition in addition to the ability to retrieve the sound and reproduce it at will with an acceptable degree of proficiency and accuracy.

¹ What was supposed to be the trace of a sound in the brain.

The sequential triplet of learning is: registration, retention and retrieval. In standard literature on learning, registration refers to the perception, encoding and neural representation of stimuli at the time of an original experience; retention is the neurological representation of an experience to be stored for later use; and retrieval is the ability to access previously registered and retained information (Arnold, 1984; Levitt, 1981).

As for information storing in the brain, there are three different kinds of storing systems. The sensory memory is the initial level of information storing in the form of an impression; information stored here is extremely limited and is retained for only a few seconds. Sensory memory is a sort of photographic memory (Loftus, 1980) that is represented in two forms: auditory sensory memory known as echoic and visual sensory memory known as iconic. It is interesting to note that auditory memory appears to be more durable than visual because a sequence of spoken digits is better remembered than a sequence of digits presented visually (Baddeley, 1993). Short-term memory is not as limited as sensory memory; it can store about seven items plus or minus two items and for no more than half a minute or so. Although short-term memory may be transient and limited in capacity, it may be very useful in ear-training sessions where the temporary retention may allow the learner to better perceive the sound; it may also play a crucial role in conscious thinking. In plain wording, the half-minute or so allows the learner to think about the sound and its production. Long-term memory is the storing system where information is retained for longer periods of time and even permanently. In terms of cognitive knowledge, the process of learning is essentially one of transferring information from the environment into the long-term memory. Long-term memory is a more-or-less permanent repository of general knowledge about the world and past memories (Bourne et al., 1986).
The explanation above suffices to portray the functional and operational parallelism across the processes of sound acquisition, general learning and memory and the sequential stages through which they usually go. For instance, in order to perceive a sound one has to be exposed to it at least in passing through the sensory memory; to have it registered, temporarily, it should be stored in the short-term memory; however, in order to retrieve and produce a sound at will, it has to be retained and consolidated in the long-term memory through rehearsal. Sequencing of stages is significant and bypassing a stage may negatively impact the outcomes. For instance, with insufficient and improper exposure to unfamiliar sounds, one is highly unlikely to succeed in producing them. On the contrary, it is highly likely that the learner will subconsciously relapse into his native inventory and produce a sound that is not the targeted L2 one. A serious flaw in the traditional approach to the teaching of pronunciation is attributed to either insufficient dwelling on the perception and recognition stages or their total neglect. Those two conditions lead to an immediate jump to the production stage, a condition that is typically embodied in the ‘repeat-after-me’ technique of teaching pronunciation, which is usually so incompatible with the learning styles of adults.

5.1.9. Pronunciation & Psycholinguistic Insensitivity

When very young children are able to perceive and even recognize a sound, but fail to produce it, the reason may be attributed to lack of neuromuscular maturation. To express it differently, such cases, which are very common with children, indicate cognitive perception and recognition of sounds, but lack of physical maturation and practice to coordinate, synchronize and set in motion the relevant articulators to assume the targeted articulatory configurations and postures. To substantiate such instances of lack of maturation, I had to conduct the following test with my son when he was four (4) years old: “He had an alveolar lisp because he was unable to pronounce the alveolar fricatives [s] and [z]. So I selected the minimal pair of vs. and repeated them several times and asked him to repeat them after me. He produced for both of them. I wrote the words on a piece of paper in big letters and read them very clearly and emphatically and asked him to repeat them after me. Once again he repeated for both. When I tried a third time, he yelled at me and said: This is [] pointing to and this is [] pointing to . I knew at that juncture that he was able to perceive and recognize the sounds, but was unable to articulate the difference. Six months later, when he went to pre-school, he came to me one day and said: ‘Do you want me to pronounce and ’. He pronounced them perfectly. It was simply a maturational problem.”

Teaching pronunciation to children is not a serious problem. Ample exposure and proper practice will automatically enable the child to overcome the difficulty. If, however, adults find it difficult to produce certain sounds in L2 or if they completely fail to produce them in spite of repeated modeling by the instructor, it is highly likely that this situation has arisen because of a failure to perceive and recognize the targeted sounds. Such a condition was labeled earlier as psycholinguistic insensitivity or deafness. This is a condition which develops as a result of exclusive exposure to one’s native language (L1) in which a certain sound does not exist. It is assumed that the exclusive exposure to the speech of L1 creates an unintentional and inadvertent phonetic and phonological bias to the sounds and the sound system of L1 at the expense of those of any L2. It is in a situation like this that the repeat-after-me procedure is highly ineffective. Nevertheless, a multisensory and multicognitive approach, which provides a broad selection of teaching styles, tends to be very effective in managing the perception/production of unfamiliar L2 sounds. Obviously, different learners will respond differently to various sensory and cognitive techniques or a combination of them.

5.1.10. Pronunciation: Understanding its Scientific Premises

Some knowledge of the articulatory movements is indispensable for teachers and can be very beneficial for learners, especially adults. Familiarity with the function of the vocal folds, the different parts of the tongue, the hard and soft palates and the lips is invaluable in understanding the nature of speech as a dynamic process as well as its teaching. If, for instance, a Hispanic learner of English tends to replace a voiced labialdental fricative [v] with a voiced bilabial plosive/stop [b], it should not be a difficult problem for the instructor, as well as the student, to overcome. The teacher should focus on visual demonstration of the labial articulatory differences between the two sounds: a posture of the two lips contacting each other for [b] as opposed to the lower lip contacting the upper teeth for [v]. Because these two articulations are clearly visible they become readily imitable and learnable.

5.1.11. Pronunciation: Its Feedback Mechanisms

The production of speech requires the simultaneous and coordinated use of respiratory, phonatory and articulatory mechanisms. Speech is such a complex activity that some method of feedback control seems likely (Borden and Harris, 1980). The physical, aerodynamic and acoustic dynamics, movements and perturbations that result from the action of the mechanisms often yield multifarious sets of internal sensations of touch, pressure, movement, position etc., which constitute the feedback control systems. Two important instructional facts emerge as a result of the above emphasis on diversified speech production feedback systems. Firstly, it is the auditory feedback system that still, almost exclusively, dominates our approach to teaching speech and pronunciation in our schools. Unfortunately, in addition to the exclusiveness of the auditory feedback, it is oftentimes applied in the mechanical sense of mere listening or listening and repeating after the instructor. Secondly, all types of feedback mechanisms, especially tactile/kinesthetic/proprioceptive, which are extremely important in sound acquisition, should be brought into play jointly in the form of different pronunciation teaching and learning techniques and activities.

5.1.12. Pronunciation: In Light of Multiple Intelligences Theory

Gardner’s Multiple Intelligences Theory (MIT) is one of the significant pillars of recent cognitive philosophy and orientation in knowledge acquisition. MIT is broad enough to encompass any aspect of our life and any field of knowledge dissemination and acquisition. Due to the relatively recent emergence of MIT in terms of classroom application, it is still at the abstract and philosophical level of an approach to instruction. Any attempt to bring it down to the classroom floor in the form of strategies for daily application requires its transformation from an approach to sets of techniques and strategies.

MIT comprises nine intelligences: linguistic, logical/mathematical/scientific, visual/spatial, musical, bodily/physical/kinesthetic, interpersonal, intrapersonal, naturalist and existential. Several of them are directly or indirectly related to the teaching/learning of language, in general, and more than one of them is directly involved in the teaching/learning of pronunciation. It is the belief here that in the teaching/learning of pronunciation, the linguistic, visual, musical and kinesthetic intelligences are involved to different extents in the development of the cognitive pedagogy for teaching pronunciation promoted in this book. It is, therefore, very important for the instructor to approach the learner via more than one sense and the learner should be prepared and encouraged to learn likewise. It was also made clear that the so-called technique of ‘ear training’ cannot do the job on its own. Eye training, neuro-muscular training and brain training should supplement ear training. It is through a joint set of procedures involving the above modes of training that sound production and its dynamics will be assimilated and accommodated by the brain in the form of traces in long-term memory for immediate retrieval. The view that pronunciation is a complicated cognitive process that taps into more than one intelligence implies the need to activate those intelligences and involve them in the learning process. In order to secure maximum involvement of the learners, it is incumbent on the instructor to diversify the teaching techniques/styles to encompass all relevant sensory and cognitive modalities. It is only through this diversity that more adult learners will be invited to participate and multiple intelligences will be stimulated. The extent of the success of any class depends on the degree to which the teaching styles match the learning styles.
The diversity of the teaching/learning styles will serve the significant purpose of discovering the intelligences of the learners and designing future instruction accordingly.

5.1.13. Pronunciation: A Generative Skill

Obviously, the term ‘generative’ is associated with Chomsky’s theory of linguistics. The term is reused here with a slightly different denotation though still somewhat related to the Chomskyan connotation. The pertinence of the generative nature of the pedagogy promoted here implies that mastering the perception, recognition and production of one sound should facilitate the mastery of more than that one sound. In other words, developing a skill in one aspect/domain of pronunciation should serve as a key to enhance or generate a skill to master other aspects/domains of pronunciation. For instance, in English, mastering the production of the schwa vowel [ə] not only helps with the mastery of the complicated vowel system, but also considerably facilitates the process of stress placement and the overall rhythmic performance. In the domain of consonantal features, mastering the features of aspiration vs. nonaspiration in one pair of consonants (e.g. /pʰ/ vs. /p/) should enable the person to apply that skill to other pairs of consonants (e.g. /tʰ/ vs. /t/, /kʰ/ vs. /k/, /cʰ/ vs. /c/). Also, learning how to kinesthetically and proprioceptively sense a tongue tip contact at the alveolar ridge should develop the skill of sensing other contacts of the tongue in the oral cavity. Even in the dynamics of sound production, mastering accentuation in a given word should pervade other words and the overall rhythm mastery in the targeted language (or any other language for that matter). In sum, the pedagogy espoused here for teaching pronunciation is a holistic one because human speech is more than a combination of sounds; human speech is gestalt in nature.

5.1.14. Pronunciation: Interactive Involvement of Instructors and Learners

The instructor of pronunciation according to the multisensory multicognitive pedagogy suggested here should be knowledgeable about the processes involved in speech production and perception and should have some awareness of the recent cognitive orientation in the theories of language and education. The theoretical knowledge base must also be reinforced by some practical skills of application. Such theoretical and applied know-how is indispensable because teaching pronunciation is no longer a strict and exclusive imposition of mechanical repetition. Pronunciation is no longer a stand-alone physical phenomenon; it is rather contextually ingrained in the brain like the rest of the cognitive processes and activities. More importantly, the instructor should ascertain that there exists an interactive connection between him and the learners. To establish this connection, the instructor should also be knowledgeable in the following respects. Firstly, he should make sure that the learners know what the theme/activity under demonstration is all about. For instance, if the activity is about accentuation (stress placement), he should make sure that learners know what stress and stress placement as phonetic phenomena are. He should not pass instructions that only he comprehends. Secondly, he should take the age of the learners into consideration and plan accordingly. Thirdly, to secure the above two considerations, the instructor should conduct a few testing and assessment activities to discover the learners’ knowledge of and familiarity with speech acquisition. Finally, he should also conduct further direct and indirect assessment and probing activities to identify the type of pronunciation difficulties they may have, especially those emerging as a result of L1 and L2 interference.

5.2. CONCLUDING REMARKS

Language acquisition for a child follows the pattern of Caesar’s phrase of ‘Veni, vidi, vici’ (I came, I saw, I conquered) restated in the following format: ‘I listened, I recognized, I acquired’. Unfortunately, it is not that straightforward for adults embarking on an L2. There is no doubt that different adults have different potential for learning an L2; however, the majority of them do not have the disposition that a child has in mastering an L1. Adults have already exhausted their natural potential while acquiring their L1. It is the exhaustion of their innate L1 bias that stands in the way of duplicating the childhood experience as adults. It is because of this L1 bias that the methodology of teaching adults an L2 requires a broad array of sensory and cognitive modalities and strategies to render their mastery of L2 as successful as possible.

CHAPTER 6: TEN COMMANDMENTS FOR TEACHING EFFECTIVE PRONUNCIATION

6.1. INTRODUCTORY REMARKS

This section will serve as an introduction to the main implementational principles of the pedagogy promoted in this book. It will also summarize the factors that, throughout my long career in teaching, qualified me to manage the teaching of pronunciation in an effective and efficient way. The layout of the chapter is in the format of ‘ten commandments’ stated in a positive tone. They are collectively a reaction to three educational experiences in my life both as a student and as an instructor. First, occasionally learners do not understand the explanations and/or instructions of the instructor pertaining to certain aspects of language, but they fail to ask for clarification for one of three reasons: they are shy, they think they understand the point, or they are simply negligent. So, they remain in limbo. Second, some instructors (including myself at the early stage of my teaching career) fail to double-check the learners’ connection with what they are teaching or explaining. Such a scenario disrupts the interaction between the two, a situation that defeats the purpose of teaching. If such a disconnection prevails, the instructor will often be the only learner together, perhaps, with a minority of students. Third, some instructors (including myself at the early stage of my teaching career) may not be professionally qualified to teach pronunciation. This latter fact makes them prone to irrelevant or inaccurate explanation of linguistic facts about speech, in general, and pronunciation, in particular. Later in my career, those three experiences were a major source of feedback for developing my own approach to teaching pronunciation. I benefited tremendously from my graduate education and my extensive classroom experience. I tried to overcome my own weaknesses and those of my teachers throughout my early education. I gradually made the adjustments needed. Fortunately, my graduate education in speech science extended the horizons of my knowledge and experience. I became more experienced in discovering my pronunciation problems and those of my students. Solutions to those problems had to be part of my lesson plan. Since my students were usually of different language backgrounds, I had to prepare strategies for each linguistic group. I always pressed my students to be straightforward in asking questions and making comments regarding my presentations, explanations of materials and implementational techniques used. I also used different strategies to discover whether they understood the technical jargon I was using in my instruction. A wide variety of techniques, demonstrations, facial, bodily and hand gestures were used to prop up demonstrations and explanations of facts and procedures. I carefully watched their faces for any indications of confusion and uncertainty. In asking a certain student whether he understood what I was saying, I never took a ‘yes’ response for a genuine ‘yes’. I usually probed the students to verify their positive response. The following is a real example pertinent to the latter point. While I was once engaged in teaching my students about stress placement or accentuation in English and how it can render the same category of words as verbs or nouns depending on the position of stress such as in: , I noticed that the facial gestures of one of the students indicated uncertainty and confusion. I asked the student whether she grasped the difference. She said: “Yes”, but with a tone of indecision. I had to demonstrate the examples once again highlighting the difference in accentuation. When I asked her to repeat the demonstration, she managed some of them but stumbled with others.
In light of this situation, I had to use additional strategies until the student grasped the fundamental difference. The strategies were a combination of visual and auditory sensory modalities. I selected the word for demonstration and marked the strong syllable with a large dot and the weak one with a small dot. For a noun = [], I tapped on her desk strongly for the first syllable and lightly for the second one. For the verb = [], I reversed the strength of the taps: light for the first syllable and strong for the last. I also asked her to watch the movement of my hand while tapping. After two days, we had another session in the beginning of which I once again highlighted the difference and asked the same student to give a demonstration. This time she was excellent and with no hesitation whatsoever. The mission was accomplished successfully. Let us now move to the ‘ten commandments’.

6.1.1. Thou Shall Teach Pronunciation as a Cognitive Undertaking

Like thinking, language is a cognitive skill that occupies considerable space in the brain and needs a social environment to nurture and cultivate it. The roots of human language grow with the birth of the brain and its gradual maturation. It is one of the first gifts the fetus receives; they grow and mature together. Since the brain is responsible for the physical, cognitive and social growth of a child, it has to function with superb economy of effort in all three types of growth. The brain gives the child a golden opportunity to perfectly internalize the language of the community into which it is born. This perfection is a one-time privilege for a child; once the child gradually steps out of the realm of its childhood the perfect skill in language acquisition begins to taper off. This explains why adults usually manifest a certain degree of imperfection in pronunciation known as accent. Hence the deficiency in the pronunciation of L2 and the emergence of accent are primarily attributed to the perfect efficiency in the pronunciation of L1. Accent in the rendition of L2 by adults is a normal, natural cognitive phenomenon and the instruction to reduce it should be premised on a cognitive basis and implemented accordingly.

6.1.2. Thou Shall Teach Children and Adults Differently

Since language internalization by children is a process of natural acquisition of L1, while an adult embarking on an L2 undergoes a contrived process of learning, the approach and the teaching/learning strategies have to be different in each case. In actual fact, children are not formally taught the sounds of their language and their collective pronunciation; rather, their brain is simply ready to assimilate the sound system through social exposure and interaction within the family and the community. The school simply reinforces the early linguistic education that home and community had already initiated. Adults are taught their L2 in schools or gradually pick it up in the L2 community in which they reside. The task of learning an L2 by adults is a conscious and effortful one. Oftentimes, they manifest a certain degree of accent.

6.1.3. Thou Shall be Qualified for Instruction in Pronunciation

Teaching pronunciation at a professional level requires general linguistic know-how with qualification and experience in phonetics. Unlike other linguistic skills, such as morphology, syntax and vocabulary, pronunciation is the earliest skill a child is exposed to and it is, thus, naturally acquired in L1. This is why, if the skills of sound production and overall pronunciation of L2 are attempted in adulthood, the process can be very demanding because the brain shows reluctance to reshuffle the L1 system through relearning additional constituents. Any instructor of adults learning L2 pronunciation should be thoroughly aware of the sound systems of both L1 and L2 to highlight the differences and similarities and the areas where problems would be expected. Knowledge of sound description and identification in terms of voicing/voicelessness, place of articulation and manner of articulation is highly beneficial. Perhaps most important of all is the possession of a diversified set of teaching strategies that take the learners beyond the most common, and the least effective, strategy of parroting or ‘repeat-after-me’. Because of adults’ inherent psycholinguistic insensitivity to L2 phonology, they often repeat after themselves under the influence of their L1 phonology. The instructor has to use as many sensory and cognitive strategies as needed and diversify them in order to be able to penetrate the protective shield of L1 phonology. At times, individual learners require individualized attention and instruction. It is also the responsibility of the instructor to introduce adult learners to new sensory and cognitive learning strategies beyond the mechanical repetitions after the instructor.
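
The description of sounds in terms of voicing, place of articulation and manner of articulation can be sketched as a small lookup table. The following Python fragment is only an illustrative sketch and not part of the pedagogy itself: the names `Consonant`, `INVENTORY` and `contrast` are hypothetical, and the four entries are limited to the [b]/[v] and [s]/[z] pairs used as examples in this book.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Consonant:
    """A consonant described by the three features named in the text."""
    symbol: str
    voicing: str   # "voiced" or "voiceless"
    place: str     # place of articulation
    manner: str    # manner of articulation


# Hypothetical mini-inventory: only the consonants used as examples in the book.
INVENTORY = {
    "b": Consonant("b", "voiced", "bilabial", "plosive"),
    "v": Consonant("v", "voiced", "labialdental", "fricative"),
    "s": Consonant("s", "voiceless", "alveolar", "fricative"),
    "z": Consonant("z", "voiced", "alveolar", "fricative"),
}


def contrast(a: str, b: str) -> dict:
    """Return the features on which two consonants differ.

    Each key is a feature name; each value is the pair
    (value for the first sound, value for the second sound).
    """
    first, second = INVENTORY[a], INVENTORY[b]
    return {
        f.name: (getattr(first, f.name), getattr(second, f.name))
        for f in fields(Consonant)
        if f.name != "symbol"
        and getattr(first, f.name) != getattr(second, f.name)
    }


if __name__ == "__main__":
    print(contrast("b", "v"))  # differs in place and manner
    print(contrast("s", "z"))  # differs in voicing only
```

Under these assumptions, `contrast("s", "z")` reports a difference in voicing only, whereas `contrast("b", "v")` reports differences in place and manner. A comparison of this kind may help an instructor decide which single feature to highlight for a given group of learners.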

6.1.4. Thou Shall Familiarize Learners with Human Speech Production

Since pronunciation is the reflection of cognitive activities transmitted via the speech organs it is quite helpful for learners of other languages to familiarize themselves with some of the articulatory activities involved in speech production. As sounds in any language are based on three distinctive features, namely, voicing/voicelessness, place of articulation and manner of articulation for consonants or the shape and location of tongue for vowels, it is advantageous to introduce learners to the basics of those features. For instance, if one is teaching a group of Hispanic learners of English who are having difficulty in distinguishing a /b/ from a /v/, it is absolutely necessary to visually demonstrate the two sounds. The demonstration will readily show them the difference between the two sounds because the /b/ is bilabial (the two lips come together), whereas the /v/ is labialdental (the upper incisors touch the lower lip). Once they realize the physical (articulatory) difference, it is very easy to master the pronunciation of both sounds.

6.1.5. Thou Shall Orient Learners Psychologically

The instructor has to prepare adult learners and guide them step by step during the process of mastering L2 pronunciation. First and foremost, he has to instill in them a positive attitude for learning coupled with confidence. Next comes the need to transform the physical ability to hear into a cognitive skill of listening to hone the potential for distinguishing one sound from the other. The instructor has to be careful not to let learners feel that they are under constant watch by him; rather, the interaction between him and the learners should be very casual yet focused on problem identification and problem solving. Learners should feel absolutely free to comment, criticize and seek further clarification. During all the interaction that goes on between the instructor and the learners, the instructor has to bear in mind that some learners tend to be more outgoing than others. It is his responsibility to make every learner feel utterly comfortable to ask, participate and demand further clarification of points under discussion. The instructor has to understand that in response to his question ‘Did you understand what I said?’ learners nodding their heads does not necessarily mean that everyone has really understood what was said. There are always some learners who are shy, a couple who are slow and do not want to openly admit they did not understand, while some others may think they understood, but in reality they did not. It is entirely the responsibility of the instructor to check all these possibilities and respond accordingly.

6.1.6. Thou Shall Use all Sensory Modalities to Prop up Instruction

Pronunciation is not the exclusive responsibility of the auditory sense; unfortunately, the overwhelming majority of people think that is the case. This is why statements such as ‘Listen to me’, ‘I will say it again’ or ‘repeat after me’ are commonly used by ordinary people as well as by teachers who are not professionally qualified in the methodology of teaching pronunciation. This exclusive focus on the auditory sense is one of the primary reasons for the failure of teaching effective and efficient pronunciation to adults. In addition to being heard and listened to, sounds are oftentimes visually detected through the facial and bodily gestures of the speaker. Besides, the speaker has the potential for kinesthetic and proprioceptive sensing of movements and/or contacts of vocal organs. In other words, the speaker has the potential to sense the position, location and orientation of speech organs inside his body beginning with the air movement in the lungs and throughout the vocal tract as well as the movements of the tongue in the pharyngeal and oral cavities. As for the opening of the oral cavity, the different lip configurations and lips/tongue contact, they are both seen and sensed. Accordingly, one of the foremost instructions to give to learners is to watch the facial and bodily gestures of the instructor during the modeling of new and unfamiliar sounds. Also, one has to try to sense the movements or contacts of the tongue or other parts of the vocal organs while pronouncing or practicing sounds. This is the best technique to monitor one’s own articulation of sounds.

6.1.7. Thou Shall Use all Cognitive Modalities to Prop up Instruction

Foremost among the cognitive modalities in teaching and learning pronunciation to be considered by the speaker or the learner is to think of his production of sounds or to associate a sound that one is learning with a sound that one has already mastered. The best way to demonstrate the cognitive modalities is through examples. Let us go back to the Hispanic learners of English and their difficulty with the [v] sound. All that the learners need to do is to watch the instructor’s face and create a contact between the lower lip and upper incisors. Next, ask them to repeat the contact several times so that the brain registers the kinesthetic and proprioceptive impression between the teeth and the lip. If the brain registers the contact through repetition it will serve as a reminder for the learner to perform the contact when a [v] is the targeted articulation. Proper repetition of articulation will gradually transform the maneuver into a cognitive process. The same Hispanic learners will face difficulty with the production of a [z] sound which is habitually replaced with an [s] sound. Since the difference between the two sounds is the vibration of the vocal folds with [z] and its absence with [s], the best advice an instructor can give his students is to perform the exercise of detecting vibration. This is executed by pressing the palms of one’s hands on the ears while performing a sustained [sssssss] sound followed by a sustained [zzzzzz] sound. With the latter sound, the person should feel some sort of humming-like echo in the ears. If such an echo is missing, it means the person is not able to set the vocal folds into vibration. The instructor should demonstrate the two sounds and ask the learner to do so until he gets the hang of it. Once the learner feels the difference in the form of the vocal folds vibration, it gradually becomes a cognitive process and is registered in the brain.

6.1.8. Thou Shall Transform Learners from Listeners into Performers

Although explanation of some aspects of speech production is of help in the process of learning pronunciation, too much of it becomes distracting. Emphasis should rather be on practical performance and demonstrations. The instructor should monitor the performance of learners and identify those who achieve an early success. Those early achievers should be encouraged by him to guide the other learners who need further demonstration and practice. At times, peer teaching and learning can be more effective than that of the instructor. The aim of the instructor should be the transformation of the learners into performers. This implies that each session of theoretical explanations should be followed by a session of actual performance of the targeted sounds.

6.1.9. Thou Shall Refrain from Insistence on a Learner

If a learner fails to master the performance of a certain sound after repeated demonstrations by the instructor or other successful performers, the instructor should not persist in pressing the learner for continued action. A situation like this confuses the learner even more and makes him conscious of his inability to perform. Once the instructor feels that such a situation has arisen, he should immediately stop, either by moving to another learner or by changing the exercise or stopping it. The instructor should also avoid focusing on one or two learners simply because they are good performers. Such a scenario will make other learners feel inferior or see themselves as under-performers.

6.1.10. Thou Shall Make the Classroom a Place for Learning and Fun

Mastering certain segmental speech sounds (consonants and vowels) or suprasegmental ones such as stress, tone and intonation can result in funny situations that elicit laughter. For example, if one wants to prove that an [m] is a nasal sound (the air runs through the nose not through the mouth), one can begin humming and then suddenly clip the nostrils. Of course, the humming (or the mmmmmm) will suddenly cease because the nasal air is blocked from flowing. The cessation of airflow suddenly builds up high pressure in the vocal tract and the ears. Usually, this unexpected swift buildup of pressure results in a burst of laughter which is perfectly acceptable and part of the hands-on learning about the dynamics of human speech. Also when one teaches a retroflex , typical of the subcontinental Indian languages, the exercise requires teaching learners the tilting of the tongue-tip backwards. This is a very unfamiliar articulatory maneuver and few people can perform it at first attempt. Working on it can raise a lot of laughter which is a normal and integral part of mastering it. Perhaps, teaching pitch movement or the tones is the funniest of all, simply because many of the learners may turn out to be tone-deaf. You may ask a learner to do a rising tone and he gives you a falling one. Indeed, there are a few other sounds or sound phenomena that are rarely acquired without fun and laughter, such as the different types of clicks in some African languages. In light of such funny situations all that the instructor has to do is to control the fun and laughter within acceptable limits. Having fun while practicing unfamiliar sounds is a normal, natural learning situation; all that the instructor has to do is not let some of the learners be carried away with fun and interfere with class management.

6.2. CONCLUDING REMARKS

Effective teaching in classroom situations is a collaborative effort between instructors and learners. To make this collaboration successful, effective teaching and learning strategies should be premised on combinations of sensory and cognitive considerations. This diversity in considerations is indispensable in teaching L2 pronunciation to adults.

CHAPTER 7: EXAMPLES OF CROSS-LANGUAGE ACCENT-CAUSING CONSONANTS

7.1. INTRODUCTORY REMARKS

Since English is the most widely used international language, the focus will now be on English to help restrict the number of comparisons and contrasts. We will identify some of the most noticeable pronunciation problems that native speakers of other languages encounter with English and, in turn, problems that native speakers of English encounter when learning other languages. Another way to contain the breadth of comparisons is to limit the number of sounds and sound phenomena that will be tackled. This chapter is devoted to consonant features, with the next two chapters covering vowels and suprasegmental features.

7.2. OUTLINE OF THE ENGLISH CONSONANT SYSTEM

In terms of unmarked (common) and marked (uncommon) sounds, the consonant system of English leans in the direction of the former more than the latter. In actual fact, there are no consonants in English that can be identified as marked (such as the pharyngeal and emphatic consonants of Arabic, the pervasive retroflexion of Hindi and the three-way plosive distinction of Korean). However, based on the difficulties of learners of English as L2, the most problematic pair of consonants is [θ] and [ð] as in and , respectively. Also, English sounds such as , , , , , can be the source of phonetic and phonological accent for speakers of some languages. Below are further comments on some of the cross-language phonological inconsistencies.


7.2.1. Interdental Pair /θ, ð/

In terms of the markedness of human language sounds, the pair [θ, ð] can be considered marked because it is attested in only a minority of known languages. Due to its rarity, many learners of English encounter serious pronunciation problems resulting in a distinct phonological accent. Most commonly, the pair [θ, ð] is replaced with the alveolar fricative pair [s, z] or the alveolar plosive pair [t, d], depending on the phonology of the specific language. A less common replacement of [θ, ð] is with the labio-dental fricatives [f, v]. This last substitution is known in the literature as ‘th-fronting’, which is common in some dialects of British English as well as in African American English.1 It is also observed in New Zealand English.2 To demonstrate examples of the above-mentioned substitutions, a German or Kurd learner of English, for example, renders the words and as and , while a Pole, Filipino or Assyrian learner of English renders the same two words as and , respectively. In each case, the outcome is a very serious phonological accent as well as a phonetic one. It results in a phonological accent because it changes the meaning of thousands of words, causing crucial semantic confusion.

Even without the semantic change resulting from the failure to properly pronounce the pair [θ, ð], the mere phonetic change can cause a serious phonetic accent. If one listens to former Pope Benedict XVI3 delivering speeches in English, there is a sibilant impression running throughout his English pronunciation. The impression of sibilance is generated by the fact that English is naturally a language rich in sibilant sounds, typically [s, z, ʃ, ʒ]. Now, if a German, French or Kurd learner of English were to convert all the [θ, ð] sounds into [s, z], that would seriously reinforce the dominance of sibilance. It should be pointed out, however, that the high frequency of occurrence of [θ, ð] sounds, especially [ð], is not because they occur in the structure of many English words; rather, it is because the words in which they occur are of high frequency of occurrence such . Thus, when all the sibilance imposed by an L2 learner of English is added to the sibilance that is already naturally in English, it generates pervasive semantic confusion and a disturbing sibilance which reverberates as noise. Because of this pervasiveness of sibilance in the pronunciation of a German or Kurd learner of English, the influence at times may be reflected even in orthography (as was pointed out in 4.6, above). The other form of conversion of [θ, ð] is into [t, d], which causes an equally serious phonological accent as it semantically confuses scores of words such as which are rendered , respectively. Phonetically, it creates an overflow of [t, d] sounds which, in turn, infuses considerable noise into the flow of oral discourse in English.

1 http://en.wikipedia.org/wiki/Th-fronting.
2 http://www.victoria.ac.nz/lals/resources/publications/nzej-backissues/2003-elizabeth-wood.pdf.
3 Listen to his reading a text in English (http://www.youtube.com/watch?v=w3fi93umuc4).

7.2.2. Approximant /r/

The most unmarked (common) types of ‘r’ sounds in the majority of languages throughout the world are the tap ‘r’ = [ɾ] and the rolled ‘r’ = [r]. Although such sounds exist in several dialects of English, the one dominant in standard varieties such as English RP and GAE tends to be an approximant, the so-called frictionless continuant, despite some phonetic difference between the two. The RP ‘r’ is a voiced postalveolar approximant [ɹ], whereas the American one is a voiced postalveolar retroflex approximant [ɻ]. Perhaps the most interesting and atypical fact about the phonetic behavior of this sound in English is its suprasegmental impact on the context of the words in which it occurs. The contextual rules of ‘r’ pronunciation in the two varieties are different. In the general classification of English dialects and varieties, GAE is categorized as an ‘r-dialect’ (or rhotic dialect), meaning that the ‘r’ is pronounced in all linguistic contexts, whereas RP is perhaps the most typical of all ‘r-less dialects’ (non-rhotic dialects), meaning that the ‘r’ is not pronounced except in certain linguistic contexts. The rule for the positional pronunciation of ‘r’ in RP is very simple: ‘r’ is pronounced when it is followed by a vowel sound, either within a word or across a word boundary. For instance, in the following pre-consonant, pre-silent vowel and word-final positions, the ‘r’ is not pronounced:

Pre-consonant: , and ,
Pre-silent vowel: , and ,
Word-final: , , , , , and

whereas in the following pre-vowel positions, within a word or across a word boundary, it is pronounced:

Pre-vowel: , , and ,
Cross-word pre-vowel: , .
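The positional rule stated above can be sketched as a short function. This is only a minimal illustration under stated assumptions: each word is represented as a list of sound labels in which "V" stands for any pronounced vowel, "C" for any consonant other than ‘r’, and "r" for the ‘r’ under examination; silent vowel letters are simply left out of the list because they are not sounds. The function name `rp_r_is_pronounced` and this labeling scheme are hypothetical conveniences, not a standard notation.

```python
def rp_r_is_pronounced(words, word_index, sound_index):
    """Apply the RP rule: 'r' is pronounced only before a vowel sound.

    words is a list of words; each word is a list of sound labels
    ("V" = vowel sound, "C" = other consonant, "r" = the sound tested).
    This labeling is an assumption made for the sketch.
    """
    word = words[word_index]
    if word[sound_index] != "r":
        raise ValueError("the selected sound is not an 'r'")

    if sound_index + 1 < len(word):
        # The next sound is inside the same word.
        next_sound = word[sound_index + 1]
    elif word_index + 1 < len(words):
        # The 'r' is word-final: look across the word boundary.
        next_sound = words[word_index + 1][0]
    else:
        # Nothing follows the 'r' at all.
        next_sound = None

    return next_sound == "V"


# Hypothetical sound patterns, not taken from the lists above.
pre_consonant = [["C", "V", "r", "C"]]        # 'r' before a consonant
word_final = [["C", "V", "r"]]                # 'r' at the end of an utterance
pre_vowel = [["C", "V", "r", "V"]]            # 'r' before a vowel in the word
cross_word = [["C", "V", "r"], ["V", "C"]]    # word-final 'r' before a vowel

print(rp_r_is_pronounced(pre_consonant, 0, 2))  # False
print(rp_r_is_pronounced(word_final, 0, 2))     # False
print(rp_r_is_pronounced(pre_vowel, 0, 2))      # True
print(rp_r_is_pronounced(cross_word, 0, 2))     # True
```

For GAE, which is classified here as a rhotic dialect, the corresponding function would simply return True in every context.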

It should be reiterated that the difference in RP or GAE pronunciation of ‘r’ by learners of English as L2 is essentially of a phonetic nature, implying that it rarely amounts to phonological accent;4 nevertheless, the phonetic difference is extremely noticeable. The retroflexion in GAE colors the adjacent vowels, whereas the positional constraints on RP ‘r’ result in two phonetic variations in the overall pronunciation. First, dropping the ‘r’ creates a consonantal vacuum for an L2 learner. Second, dropping it in word-final positions or before silent vowels results in more diphthongal renditions of the vocalic elements as in [] and [] versus their renditions in GAE as less diphthongized vowels [] and []. Additionally, the substitution of the approximant English ‘r’ with a tap, rolled or retroflex flap /r/, as with speakers of Spanish, Italian, Arabic, Russian, Turkish and Indian languages, among many others, tends to result in so much phonetic accent that it may snowball into some sort of acoustic noise that, in turn, interferes with comprehension on the part of the native English listener.

A special note is in order with regard to the typical retroflex flap /r/ of the sub-continental Indian languages. In such languages, retroflexion is pervasive and phonetically covers almost all consonants and vowels. Actually, from the perspective of Firthian linguistics,5 phonetic retroflexion in such languages functions as a prosody that runs throughout complete words and even discourse. In languages such as Hindi and Urdu, retroflexion seems to run throughout the entire oral discourse. This explains why many Hindi or Urdu learners of English, or of any other language, color their L2 with a heavy trace of retroflexion. Most strikingly, retroflexion constitutes the most salient feature in the articulatory settings of Indian learners of other languages in general.

It is not a thematic diversion in this section to bring in the /l/ and /r/ confusion by Korean, Japanese and Chinese learners of English. Let us consider the case of Japanese learners. Phonetically, in Japanese the ‘r’, which is generally identified as a tap or flap, and the ‘l’, which is identified as a lateral, are two phonetic variants (allophones) of the same phoneme. In terms of the neurolinguistic storage of sounds in the brain, the two sounds of Japanese are stored in one slot. Unlike Japanese, in English the ‘r’ and ‘l’ are two autonomous phonemes independently stored in two separate slots. Besides the primary difference in the phonological function of the ‘r’s and ‘l’s, their phonetic realizations are also different. It is, therefore, quite natural for Korean, Japanese and Chinese students to encounter serious phonological difficulty in mastering the independent contextual production of the two phonemes of English. In fact, these two liquid phonemes remain some of the last sounds of English that these East Asian learners manage to pronounce successfully.

4 Except, of course, when the ‘r’ is confused with another sound, such as in Japanese when it is replaced with ‘l’.
5 After the late Professor J. R. Firth of London University.

7.2.3. Voiceless and Voiced Alveolar Fricatives /s/ and /z/

This pair in English represents the unmarked variety of fricatives. Their mispronunciation is not as pervasive as that of the interdental pair /θ, ð/. However, cross-language differences arise that lead to mispronunciations in the form of accent. For instance, the voiced alveolar fricative phoneme /z/ is missing in Spanish, which results in its replacement with /s/ in Latin American Spanish and with /θ/ in Continental Spanish, leading to both phonological and phonetic accent. For speakers of Latin American Spanish, English words such as , , and are confused with , , and resulting in semantic confusion. Furthermore, what is linguistically significant about [z] is that even when its replacement does not cause phonological confusion, it certainly causes a very noticeable phonetic accent because of the high frequency of occurrence of /z/ sounds in English. The failure to pronounce the [z] is extremely noticeable even among otherwise fluent Hispanic speakers of English. Just recently, CNN television broadcast a program on the techniques of interrogation of foreign terrorists in which a high-ranking CIA officer of Hispanic background pronounced most of his [z] sounds as [s] in spite of the fact that his English was fluent and native-like. I vividly recall that the word = [p] was rendered [ps]. English words such as: , , , etc. in which the ‘s’ is pronounced [z], are typically realized with [s] by Hispanics.

It is also noticeable that for Greek learners of English, the alveolar sounds [s, z] are often replaced by the alveolo-palatal sounds [ɕ, ʑ], which the native listener auditorily and impressionistically confuses with the [ʃ, ʒ] pair. The confusion is not unexpected because [ɕ, ʑ] are located exactly half-way between [s, z] and [ʃ, ʒ] in terms of their place of articulation. It is worth noting that because Greek lacks the postalveolar sibilant pair [ʃ, ʒ], “their sounds like Sean Connery’s ”.6 Traditionally, classical Greek historians have confused the alveolar pair [s, z] or the postalveolar one [ʃ, ʒ] of other languages with their alveolo-palatal sounds [ɕ, ʑ]. A piece of historical evidence that supports this claim is embedded in the chronicles of ancient Greek historians documenting Alexander the Great’s invasion of the Middle East. For example, notice that the English word ‘Assyria’ is the anglicized rendition of the Greek name ‘Aσσυρία’ in which the geminated sigma originally represents the geminated [ʃʃ] in the ancient Assyrian name [aʃʃur]. This indicates that the Greek historians identified the ancient Assyrian [ʃ] as sigma, which English translators, in turn, rendered as ‘ss’, thus creating the word ‘Assyria’.

6 http://greek.kanlis.com/phonology.html.

7.2.4. English Plosives: /p b, t d, k g/

These three pairs of English plosives are quite unmarked (common), as in many languages, but learners of English as L2 with different linguistic backgrounds may encounter some problems in pronouncing them. Before citing some examples of pronunciation problems, it is necessary to point out that the voiceless plosives /p, t, k/ of English are aspirated, i.e., [pʰ, tʰ, kʰ], which means they are followed by a puff of air when released. English does have voiceless unaspirated variants of the aspirated ones, but only in initial clusters after [s] such as [sp__, st__, sk__]. Hence, the aspirated plosives in = [pn], = [tl] and = [kn] become unaspirated (or rather deaspirated) as in = [spn], = [stl] and = [skn]. Unfortunately, sounds that have only a phonetic occurrence in a language are not recognized phonologically by the brain of the native speaker. With some learners of English the presence or absence of aspiration may lead to phonological and/or phonetic accent. For instance, the Thai language has a phonological three-way plosive contrast, namely voiced, voiceless unaspirated and voiceless aspirated, as in the examples in table 7.1, below.7

7 Handbook of the International Phonetic Association, Cambridge University Press, 1999.

Plosive   Example    Meaning

Voiced Plosives
[b]       [bā:n]     to bloom
[d]       [dâ:n]     calloused

Voiceless Unaspirated Plosives
[p]       [pā:n]     birthmark
[t]       [tā:n]     sugar palm
[k]       [kā:n]     act

Voiceless Aspirated Plosives
[pʰ]      [pʰā:n]    belligerent
[tʰ]      [tʰā:n]    alms
[kʰ]      [kʰā:n]    shaft

Table 7.1. Three-way plosive distinction in Thai language

Thai learners of English are not expected to encounter pronunciation difficulties with the English aspirated plosives [pʰ, tʰ, kʰ], except perhaps in final position, where Thai speakers pronounce them with no audible release whereas in English there is an option to release them (Kanokpermpoon, 2007). They may also have a perceptual problem in distinguishing the voiced plosives from the voiceless unaspirated ones. Conversely, English learners of Thai will encounter serious problems in mastering the production of the voiceless unaspirated plosives and distinguishing them from the voiced ones. Hispanic learners of English do not have a phonological problem with English plosives; nevertheless, they might demonstrate a certain degree of phonetic accent in rendering the aspirated plosives, especially in initial position, as unaspirated since their own voiceless plosives are unaspirated in nature. Because of the high frequency of occurrence of initial plosives in English, a rendition of English by someone who fails to produce the aspiration gives the overall pronunciation a distinct phonetic accent. This has been typical of some of my Hispanic students.
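The allophonic pattern described in this section can be sketched in a few lines. This is a minimal illustration that assumes a word is given as a list of broad sound symbols and that only the word-initial position matters; the function names `mark_initial_aspiration` and `learner_without_aspiration` and the sample sound sequences are hypothetical, chosen only for demonstration.

```python
VOICELESS_PLOSIVES = {"p", "t", "k"}


def mark_initial_aspiration(sounds):
    """Return a copy of the word with a word-initial /p, t, k/ aspirated.

    A plosive that follows an initial [s] is not word-initial, so it is
    left unaspirated, as in the [sp__, st__, sk__] clusters. Other
    positions are deliberately ignored in this simplified sketch.
    """
    marked = list(sounds)
    if marked and marked[0] in VOICELESS_PLOSIVES:
        marked[0] = marked[0] + "ʰ"
    return marked


def learner_without_aspiration(sounds):
    """Model a learner whose voiceless plosives are always unaspirated."""
    return list(sounds)


native = mark_initial_aspiration(["p", "ɪ", "n"])
cluster = mark_initial_aspiration(["s", "p", "ɪ", "n"])
learner = learner_without_aspiration(["p", "ɪ", "n"])

print(native)              # ['pʰ', 'ɪ', 'n']
print(cluster)             # ['s', 'p', 'ɪ', 'n']
print(native != learner)   # True: a phonetic, not a phonological, difference
```

The learner's rendition differs from the native one only in the aspiration mark, which is why the result is heard as a phonetic accent rather than as a different word.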


Arabic has all the plosives of English8 except the voiceless bilabial plosive /p/. Its absence is by far the most significant phonological problem for Arab learners of English insofar as segmental phonology is concerned. The failure to produce /p/, and its almost exclusive replacement with /b/, riddles the Arabic rendition of English with /b/s. It creates an unmistakable phonological and phonetic accent. Also noticeable in Turkish, Farsi and Greek learners’ rendition of English is the replacement of the velar plosives /k/ and /g/ with palatal plosives: // or // for the voiceless and /ɟ/ for the voiced. Although such a shift does not add up to a phonological accent, it does color the rendition of English discourse with a distinct palatalized impression which, subsequently, manifests as a readily distinguishable phonetic accent that runs throughout their discourse in English. Such palatal plosives are most distinct among Iranian speakers of English because of the palatal plosives in Farsi. Just to demonstrate the Farsi substitution of English velar plosives with palatal plosives, an acquaintance of mine used to pronounce the title of a cultural festival in the village of Skokie, Illinois, ‘Coming Together in Skokie’, as [cmn tdr n scoci] for [km tgr n skoki]. Although there is no phonological accent, the phonetic one is more than a mouthful.

7.2.5. Labio-Dental Fricatives /f, v/

This pair of sounds can be a source of difficulty for learners of English of different linguistic backgrounds. For instance, in the Tagalog language of the Philippines this pair is missing; therefore, it is replaced with /p/ and /b/.9 This substitution, in combination with the substitution of the English pair /θ, ð/ with /t, d/, results in a very serious phonological accent in the rendition of English by Tagalog speakers. Typically, when the people of the Philippines are asked about their identity as a people, they reply ‘Pilipino’, not ‘Filipino’. In Farsi and Turkish, the English /v/ is realized differently, most likely as the labial-velar approximant [w] or the labio-dental approximant [ʋ]. When replaced with the former it may lead to a phonological accent in pronouncing as or as ; otherwise, it merely leads to a phonetic accent. In Assyrian (Modern Aramaic), /v/ has a wide range of phonetic realizations (phonetic variants) depending on different regional and tribal dialects, as demonstrated in table 7.2, below.

8 It does not have /g/, but /g/ is widely found in local vernacular Arabic and many dialects.
9 For an interesting impersonation of English by a native speaker of Tagalog watch the video at: (http://www.highpoint-ieltsblog.com/2011/03/filipinopronunciation.html).

Realizations of /v/   Phonetic Description
[w]                   Labial-velar approximant10
[ɥ]                   Labial-palatal approximant
[ʋ]                   Labial-dental approximant

Table 7.2. Different phonetic realizations of the English phoneme /v/ in Assyrian.

7.2.6. The Affricates /ʧ ʤ/

Although these two sounds in English do not have specific letters to signal them, they are in fact quite common sounds and can be a cause of mispronunciation for L2 learners of English. It has already been pointed out that Germans have difficulty with /ʤ/, which they replace with its voiceless counterpart /ʧ/. French does not have this pair; therefore, French learners substitute the sounds with their fricative counterparts /ʃ, ʒ/. For Arab learners of English, the difficulty of such sounds may be relative depending on which Arabic dialect is in the background. For instance, speakers of eastern Arabic dialects such as those of Iraq, Saudi Arabia and the Gulf have much less difficulty in pronouncing /ʧ/ and /ʤ/ than speakers of western dialects, especially those of Lebanon, Syria and Egypt. In the first two countries, they are usually replaced with /ʃ/ and /ʒ/, respectively, whereas in Egypt the /ʤ/ tends to become a voiced velar plosive /g/ and the /ʧ/ is non-existent; if attempted, it will become /ʃ/. Interestingly, the pair /ʧ ʤ/ does not occur in Greek; therefore, Greek learners of English tend to replace the pair with the alveolar affricates /ʦ ʣ/. For instance, a Greek learner of English is expected to pronounce as [tʦ and as [ʣʣ . It is a substitution that generates a distinct phonetic accent.

10 The feature “labial” with [w] and with [ɥ] should really be “bilabial” because both lips are involved in conjunction with the tongue configuration in the velar region for the former and the palatal region for the latter.

7.3. CONCLUDING REMARKS

Because English is the most widely used language throughout the world, it has been used as the basis for reflecting the broad variety of phonological and phonetic accents that learners of English as L2 demonstrate in their rendition of its consonants. This, however, should not conceal the fact that English learners of other languages also display a broad array of phonological and phonetic accents. Another aspect of the English consonantal system that should be highlighted is that most consonants of English are phonetically unmarked (common). This fact suggests that English learners may encounter their most salient difficulties when tackling the marked (uncommon) sounds of other languages such as the gutturals and emphatics of Arabic, the retroflexes of Hindi, the palatal plosives of Farsi and, above all, the clicks in some African languages.

CHAPTER 8: EXAMPLES OF CROSS-LANGUAGE ACCENT-CAUSING VOWELS

8.1. SALIENT FEATURES IN GENERAL VOWEL DESCRIPTION

Foremost among the phonetic features used for vowel description are quality and quantity. Quality is defined in terms of tongue-height, tongue-position, lip-shape, etc. and their combined acoustic impact on the ears of the listener. Quantity is defined in terms of shortness vs. length and/or laxness vs. tenseness. A more comprehensive option for identifying vowel systems has been introduced in the form of the centripetal vs. centrifugal dichotomy, which affords a more general model (Odisho, 1992). Because of some differences between the GAE and RP vowel systems and the manner in which those differences are transcribed, the RP system will be used to demonstrate the nature of the centripetal system. Generally speaking, this system tends to have a schwa vowel, [ə], with the rest of the vowels displaying different degrees of quantitative and/or qualitative values in unstressed positions. In figure 8.1, below, the vowel symbols indicate only differences in the quality of RP vowels, while in figure 8.2 the long vowels are impressionistically marked for full length and the short ones are left unmarked, with schwa being the shortest and most reduced in quality. In the description and transcription of RP vowels, Gimson’s (1967) model is followed because his transcription of individual vowels indicates both quality and quantity features, unlike Daniel Jones’ model (1956) where the emphasis is primarily on quantity (i.e., either short or long). Gimson’s, in my opinion, is the most innovative and practical system, especially in teaching advanced and efficient cross-language comparative pronunciation with emphasis on accent reduction.

In this study, the only divergence from Gimson is in the identification of the English vowel [æ]. Almost all those who have dealt with this vowel have identified it as short. However, as a phonetician and native speaker of Arabic, I beg to differ with this identification because the Arabic vowel الفُمد, which is the long counterpart of فَتحة, sounds phonetically, especially in non-emphatic contexts,1 almost identical with English [æ] as in the English and Arabic words in table 8.1, below.

Arabic words with Fatha   Meaning    Arabic words with Alif   Meaning      English words with [æ]
هَم                        worry      هام                      important
بَت                        decision   بات                      stayed
سَم                        poison     سام                      poisonous
فَت                        weaken     فات                      elapsed
سَد                        dam        ساد                      prevailed

Table 8.1. Words matching Arabic الفُمد with English vowel [æ].

Impressionistically, the Arabic words in column #3 and the English ones in column #5 are pronounced almost identically. Thus, in contexts other than with Arabic emphatics, the gutturals and , الفُمد has a phonetic variant that is identical in vowel quality with English [æ], especially when the pronunciation adheres to that of Modern Standard Arabic and Classical Arabic. It is true that فَتحة is not of exactly the same vowel quality as الفُمد, but the latter is identical with English [æ] in both quality and quantity. This is the rationale for assigning the length mark to the English vowel [æ]. Marking such quantity and quality differences is extremely important, especially for learners of English whose native languages tend to have centrifugal systems. Essentially, the centrifugal system tends to avoid two things. First, it avoids any quantity (length) contrasts. Second, it leans in the direction of avoiding vowel reduction, especially in the form of a schwa. Doubtless, there will be some difference in the quantity of the unstressed vowels versus the stressed ones, but this is hardly a striking difference. According to such templates for vowel identification, English is a typical representative of a centripetal system as opposed to Spanish, which is a typical representative of a centrifugal system (figure 8.3), with Arabic (figure 8.4) falling half-way between the two. Notice that the vowels for Arabic indicate major quantitative contrasts coupled with a limited degree of quality difference which is not readily perceptible by people with no professional phonetic experience.

1 Not adjacent to the emphatic letters ط, ص, ض, ظ.

Figure 8.1. The English RP simple vowel system—a typical centripetal one.


Figure 8.2. The English RP simple vowel system—a typical centripetal one with relative length marks.



 

 

Figure 8.3. The Spanish vowel system—a typical centrifugal one.


Figure 8.4. The Arabic vowel system—halfway between centripetal and centrifugal.

Any cross-language teaching of pronunciation that aims to avoid phonological and phonetic accent can hardly be effective if the approach handles the vowels individually and as decontextualized segments. An efficient mastery of a different vowel system cannot be attained without a holistic approach to the teaching of the systems through their most characteristic features. Even minor phonetic differences in a given language may have phonological weight in the target language. Thus, in cross-language teaching of pronunciation any phonetic differences matter, if not to avoid being caught in the phonological trap then at least to dodge a phonetic accent. To demonstrate, learners with a native centripetal vowel system should be trained to avoid both schwas and the reduction of vowel quality and quantity when learning languages with centrifugal systems. Conversely, learners with centrifugal systems should be trained in the production of schwas and the reduction of vowel quality and quantity in appropriate contexts and conditions.

8.2. THE VOWEL SYSTEM OF ENGLISH

Obviously, with English being the native language of several countries and also the most widely used international language, one expects some differences, or perhaps even some major differences, within its different standard varieties. To avoid drifting sideways into minutiae, the focus will be on GAE simply because it is gradually becoming more commonly used internationally. Occasional mention of RP will be made when and where necessary.

8.2.1. Simple Vowels of General American English

There are some noteworthy differences between the vowel systems of GAE and RP. Generally speaking, the two systems are more different in the domain of diphthongs than in simple vowels. With regard to simple vowels, there are some differences in the qualitative and quantitative values assigned to the same phonetic symbols. For instance, in GAE the vowel phoneme // as in and // as in are used to indicate simple vowels, whereas for RP the quality of vowels in the preceding two words is diphthongal and is marked by the symbols /ei/ and /ou/, respectively. However, a more striking difference is the absence in GAE of the vowel quality [o] = [ɒ] as in the RP rendition of the words , etc. Another divergence of GAE from the RP system is the emergence of what are known as ‘r-colored vowels’ such as [ɚ] and [ɝ], also named ‘unstressed schwar’ and ‘stressed schwar’, respectively.2 Table 8.2, below, represents GAE simple vowels with examples. The relative phonetic and/or phonological quantitative differences of vowels are marked with the length mark [ː] and its absence indicates shortness. Highlighting such phonetic/phonological differences is absolutely essential in cross-language teaching of pronunciation; without it, learners can hardly avoid manifesting accent.

2 MacKay, 1978.

Vowel Quality Symbol   Vowel Quantity Indicator   Example
                                                  beat
                                                  bit
                                                  bait
                                                  bet
                                                  bat
                                                  father; shot
                                                  boot
                                                  book
                                                  but
                                                  boat
                                                  bought
                                                  about
ɚ                                                 writer
ɝ                                                 bird

Table 8.2. Simple vowels of General American

Several points are worthy of consideration. First, the English vowel system, whether GAE or RP, is rich in quality compared to many vowel systems, the most widespread of which is the five-vowel system.3 Second, marking the quality and quantity differences sheds better light on the nature of the vowel systems, a fact that is extremely significant in comparative phonetic and phonological studies, especially when accent reduction and remediation are targeted. To demonstrate, if one has to phonetically transcribe the Spanish vowel in the word ‘sin’ (without) compared to and of English, one has no choice but to mark the vowel in Spanish with the half-length mark [iˑ] because it is slightly different in both quality and length compared to English . English has to be transcribed as [] to set it apart from both English [], with a very short vowel, and Spanish [] with a half-long vowel. Third, in teaching comparative vowel systems such as English and Spanish, for instance, it is very difficult to enable the learners on both sides to master each other’s vowels without highlighting the tiny phonetic differences of quality and quantity across the vowel systems. Typically, Russian, Italian, French, Japanese and Spanish learners of English seriously confuse English vowels such as in vs. and vs. . They reduce each pair of words to one in the form of half-long vowels as [dim] and [pul].

3 Ladefoged and Maddieson, 1996.

As surveyed above, the differences in the vocalic systems of the two main varieties of English are many but, fortunately, they usually result only in phonetic accent between its native speakers; at times, however, they may result in phonological accent as in the examples in table 8.3, below:

GAE   Pronunciation   RP   Pronunciation
      [kd]                 [kd]
      [ht]                 [ht]
      [pt]                 [pt]
      [p]                  [p]

Table 8.3. Phonetic differences in vowels between GAE and RP can be phonological.

Thus, a combination of phonetic and, in rare cases, phonological differences makes the study of vowels in RP and GAE an area worthy of attention in teaching pronunciation, especially when making a choice between the teaching of one or the other variety.

8.3. SELECTIONS OF CROSS-LANGUAGE ACCENT-CAUSING VOWELS

In this section, attempts will be made to identify some of the most salient and readily perceptible features of both phonological and phonetic accent of learners of English as L2 with different linguistic backgrounds. Evidently, there will be limitations on what this section will include partly because of my limited linguistic knowledge and experience with languages and partly to avoid repetition of similar information. Additionally, at the end of each sub-section, the experience of learners will be reversed, i.e. native English speakers learning other languages.

8.3.1. Hispanic Learners of English Vowels

It has already been demonstrated that English and Spanish vowel systems are almost maximally contrastive, the former being a centripetal system and the latter a centrifugal one. Due to this radical difference, mastering the vowel system of English becomes the foremost phonological and phonetic difficulty for Hispanic learners of English. The difficulty is not simply attributed to differences in quality and quantity of vowels; rather, the dynamics that govern the qualitative and quantitative features of the vowels involved further complicate the problem. It is the combinations of both factors that jointly lead to serious phonological and phonetic accent for Hispanic learners of English. A simple analogy in this regard may be beneficial in helping the learner envisage the difference between the two vowel systems. If each vowel in English and Spanish is likened to a car and the slot of the vowel to a garage, then Spanish will have five garages with five cars in them, while English will have twelve garages with 12 cars in them. Thus, when a Hispanic intends to learn English, he, subconsciously, has no choice but to allow the parking of more than one car in each of his five garages as it is schematically represented in figure 8.5 below, where the arrows show which vowels in English may mistakenly be identified as the ‘same’ vowel in Spanish. In linguistic terms, Hispanic learners will take two or three vowels in English as one vowel. And this is the most prominent pronunciation problem for Hispanics embarked on learning English. To demonstrate, each pair of the English vowel contrasts such as in the pairs = [] vs. = [] or = [] vs. = [] will go into one slot of [] for the former pair and [] for the latter one. Most important of all in terms of both quality and quantity, neither [] and [] are [] nor [] and [] are []. 
Phonetically, as well as phonologically in this case, these are vowels that should be precisely and carefully taught in any English/Spanish cross-language teaching of pronunciation. These examples serve to highlight one of the most challenging, if not the most challenging, problems for L2 learners of English whose vowel systems are very limited in quality and quantity. At times, the confusion can be extremely embarrassing when obscene or taboo words are involved such as vs. , vs. , vs. and 4 vs. .

Figure 8.5. Misidentification of English vowels by Hispanic learners.
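The garage analogy can also be restated in counting terms. The short Python sketch below is only an illustration under stated assumptions: the labels E1 to E12 and S1 to S5 are hypothetical placeholders for the twelve English and five Spanish vowels, and the round-robin assignment is not the mapping shown in figure 8.5. It merely shows that however twelve cars are parked in five garages, at least one garage must hold three or more of them.

```python
from collections import defaultdict
from math import ceil


def park(cars, garages):
    """Assign each car (L2 vowel) to a garage (L1 vowel slot), round-robin.

    The round-robin rule is an arbitrary assumption for illustration;
    it does not model which English vowel a learner actually merges
    with which Spanish vowel.
    """
    slots = defaultdict(list)
    for index, car in enumerate(cars):
        slots[garages[index % len(garages)]].append(car)
    return dict(slots)


# Hypothetical placeholder labels, not phonetic symbols.
english = [f"E{i}" for i in range(1, 13)]   # twelve cars
spanish = [f"S{i}" for i in range(1, 6)]    # five garages

slots = park(english, spanish)
most_crowded = max(len(cars) for cars in slots.values())

# Pigeonhole bound: some garage must hold at least this many cars.
lower_bound = ceil(len(english) / len(spanish))

for garage, cars in slots.items():
    print(garage, cars)
print("most crowded garage holds", most_crowded, "cars")
```

Whatever assignment rule replaces the round-robin one, the lower bound stays the same, which is the arithmetic behind the statement that Hispanic learners will take two or three English vowels as one.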

The above schematic diagram indicates that the English vowel system is a complex one, whereas the Spanish system is a relatively simple one as demonstrated with the words in table 8.4, below.

4 In its vulgar sense. These are some of the embarrassing pairs that one may hear in classroom situations.

Vowel Grapheme    Vowel Phoneme    Example
<a>               /a/
<e>               /e/
<i> or <y>        /i/
<o>               /o/
<u>               /u/

Table 8.4. Simple Spanish five-vowel system.

In fact, some describe it as “the essence of simplicity and elegance” (Stockwell and Bowen, 1965). What drives the two systems further apart is the substantial difference in the dynamics of vowel reduction in English, especially with regard to word syllable structure and the location of primary stress within a word or stretch of words. Spanish has hardly any noticeable variation of vowel quality and quantity in different contexts and across its different dialects (Stockwell and Bowen, 1965). As for its diphthongs, it is quite natural for a simple vowel system to have a simple and basic combination of diphthongs. The most common diphthongs in Spanish are the following:

Diphthong    Example
/ei/         ley, reina
/ai/         hay, taita
/oi/         soy, coy
/au/         auto, chao

Table 8.5. Simple Spanish diphthong system.

The limited variety of diphthongs in GAE compared to RP, especially with diphthongs in words ending with <r> where the <r> is deleted and a schwa [ə] is inserted as in = [hi] and = [], makes the transition of Hispanics into GAE somewhat easier. However, this observation about the easier transition into GAE compared to RP should be considered with caution because one can unequivocally state that, phonetically, every single simple vowel or diphthong in the two languages is virtually different from what may phonologically be considered its counterpart. For instance, phonologically one tends to say that both languages have the /au/ diphthong; nevertheless, phonetically, this is not exactly accurate because the two constituents that make up the diphthong in each language are phonetically different in the first place. At least, one can state that the English diphthong /au/ is originally composed of an [] vowel gliding into an [] vowel, whereas the Spanish diphthong is the coalescence of the [a] and [u] vowels. This is a strictly phonetic assessment that many teachers of pronunciation may not have been trained to make. From this perspective, the intention should be to secure as near a native-like pronunciation as possible, one that will dispel any confusion in meaning on the part of both listeners and speakers.

8.3.1.1. English Learners of Spanish Vowels

Generally speaking, English learners of Spanish are expected to reverse what Hispanic learners of English do. They have to learn to shrink the domain of vowel quality diversity and eliminate the influence of schwa completely. Fortunately for such learners, because the vowel inventory of English is much richer than that of Spanish, the problems will be more of a phonetic nature than a phonological one. Nevertheless, it is worth mentioning that the dominant tendency of vowel reduction in English is expected to be the culprit for many unwanted qualitative and quantitative changes in the proper production of Spanish vowels. This tendency, if left unchecked, can seriously distort the overall rendition of Spanish pronunciation, primarily by imposing a stress-timed rhythm on the syllable-timed rhythm for which Spanish is well known. A native speaker of English should never allow himself to be misled into pronouncing the Spanish word <color> = [kolor], which is orthographically identical with the English one, as [klɚ] or [kl]. Unlike English, which has remarkable inconsistency between sound and orthography, Spanish has one of the highest consistencies among known languages.

8.3.2. Arab Learners of English Vowels

Arabic and English are two languages that are drastically different in language family, sound systems and orthographic systems. English belongs to the Indo-European family, while Arabic is a typical Semitic language. The sound systems of the two languages differ extensively in consonants, vowels, stress placement and the dynamics that govern them. Arabic, like English, is the native language of a large population inhabiting a very large area. Consequently, it has a wide range of different regional, social and ethnic dialects. Some familiarity with a few of the most salient linguistic characteristics of the dialects is important for any learner of Arabic because even the so-called Modern Standard Arabic (MSA) is regionally influenced by those dialects. In fact, one can easily distinguish among different standard varieties of Arabic such as Iraqi Standard Arabic, Egyptian Standard Arabic and Lebanese Standard Arabic. These standard variants differ not only in segmental (consonant and vowel) pronunciation, but also in lexicon and, at times, even in the overall rhythm and melody, especially if North African Arabic varieties are considered. One typical deviation of the dialects away from MSA is the enhancement of the basic three vowel-quality system into a five vowel-quality one by adding the mid vowels [e] and [o]. The three simple vowels of Arabic usually combine in producing the diphthongs [ai] and [au] as in table 8.6, below. Thus, the above enhancement of vowel quality, through the influence of the Arabic dialects, does somewhat help Arab learners of English in handling more English vowels. Nevertheless, the system remains restricted in quality compared to English. Essentially, it is a simple triangular system of the maximally differentiated vowels /i, a, u/. As a corollary to the restricted vowel quality in Arabic, its diphthongs are also limited in number. This is why some linguists are reluctant to accept the existence of diphthongs in Arabic. They opt to identify them as a vowel plus a semi-vowel, ي [j] or و [w].

Word    Meaning     SA in IPA    DA in IPA
بَيت     house       [bajt]       [bet]
بَين     between     [bajn]       [ben]
لَيل     night       [lajl]       [lel]
كَيف     how         [kajf]       [kef]
سَيف     sword       [sajf]       [sef]
كَون     universe    [kawn]       [kon]
يَوم     day         [jawm]       [jom]
بَول     urine       [bawl]       [bol]
لَون     color       [lawn]       [lon]

Table 8.6. Vowel contraction in Arabic and creation of the mid vowels [e] and [o].

The nature of the problems facing Arab learners of the English vowel system is, overall, typical of the transition of speakers of a non-centripetal vowel system to a centripetal one. Such learners of English are usually pressed to enhance and diversify their vowel quality range alongside the mastery of vowel reduction and schwa production. The most typical feature of the centripetal system is the existence of a schwa vowel [ə]. The predominance of a schwa [ə] in English and its absence in Arabic is the main culprit behind the tendency of the so-called ‘word-deflation’ in English vs. ‘word-inflation’ in Arabic (Odisho, 2009), which is one of the primary causes of accent among Arab speakers of English (Odisho, 2013). To demonstrate, the reader is referred to the manner in which the full name of the former U.S. President William Jefferson Clinton is transliterated in Arabic. Its expected traditional Arabic transliteration would be وليام جيفرسون كلينتون5 instead of a more accurate rendition of وِليَم. It is clear that in the English pronunciation of the President’s name there are no long vowels, whereas in its Arabic rendition there emerge five long vowels which, in turn, bring about a major shift in the rhythmic structure of the name and its pronunciation as visually demonstrated by the following stress pattern in English rendition vs. the stress pattern in its Arabic rendition with larger dots standing for the stressed syllable in each case. Such a shift in stress pattern is the most powerful source of accent generation by Arab speakers of English including those who are highly educated (Odisho, 2013).

5 http://ar.wikipedia.org/wiki.

8.3.2.1. English Learners of Arabic

English learners of Arabic bring with them a centripetal vowel system with a broad range of vowel qualities dominated by a schwa. Consequently, the first thing they have to do is to learn how to restrain their strong inclination toward vowel quality diversity. For instance, they have to eliminate the use of a schwa as well as any tendency in the direction of schwaization and vowel reduction. Equally important, they have to take care to maintain the vowel quality of Arabic vowels. They should not attempt to render the Arabic past tense forms, which are predominantly formed with triliteral consonantal roots and vocalized with three فَتحة vowels such as َكت //, َد َر // and = //, as // and //, respectively. Additionally, even the stressed vowel [a] may be replaced with English []. Such a substitution of vowels results not only in a shift in vowel quality and relative quantity but, most importantly, in a shift in the overall rhythm. In the case of learners of Arabic with an RP English background, there is yet one more shift in vowel quality in the form of replacing some long vowels of Arabic by English diphthongs, as in words with a long vowel in pre-‘r’ position of a syllable such as <سفير> = /safir/, <سمير> = /samir/ and <كبير> = /kabir/, which are pronounced as /sfi/, /smi/ and /kbi/, respectively.

8.4. CONCLUDING REMARKS

Vowels play a very significant role in generating accent in cross-language situations. There are four main reasons for this role. First, vowel systems across languages can be drastically different. Second, vowels do not have a well-defined contact area during their articulation; they are the result of a configuration rather than actual contact. Usually, it is more difficult to form a configuration than to execute definite contact. Third, except for the configuration of the lips, the other two constituents of vowel production, namely location and size of the narrowing, do not yield themselves easily to precise visual, kinesthetic and proprioceptive assessment. Fourth, vowels carry the weight of stress within words and determine the nature of the overall rhythm. Unfortunately, in the traditional teaching of pronunciation, in general, and accent, in particular, there has been more emphasis on consonants than on vowels, and this is partially the reason behind the ineffective approach of many instructors, especially those who depend on the so-called ‘phonics approach’ to teaching pronunciation.

CHAPTER 9: EXAMPLES OF CROSS-LANGUAGE ACCENT-CAUSING SUPRASEGMENTALS

9.1. A DESCRIPTION OF THE MOST SALIENT FEATURES OF SUPRASEGMENTALS

For many people, especially those without any linguistic orientation, the most natural categorization of sounds is into consonants and vowels. However, the design of human speech is too structurally and systematically complex, intricate and diversified to be straitjacketed in the dichotomy of consonants and vowels, which usually represent short segments of speech. Obviously, the use of the attribute ‘short’ for individual segments implies the presence of units that represent ‘long’ segments (relevant to more than one segment or a stretch of segments), which are often known as suprasegmentals. Without suprasegmentals, segmental features alone would not suffice to carry the complex and open-ended communicative message of human speech. A combination of segmental features and multilength suprasegmentals generates tremendous structural and systemic diversity of sound units, which, in turn, accounts for the multi-layered and multidimensional construct of human speech.

A relevant aspect of the study of suprasegmentals is to decide what a ‘long segment’ is. The response to such a question depends on the linguistic school and perspective one follows and the targeted linguistic refinement. In the early works of structural linguistics, the primary emphasis was on stress and intonation; duration (length) and juncture were also occasionally treated. In the tradition of prosodic analysis of the London school of linguistics, the longer segments, called prosodies, are not necessarily confined to the traditional stress, intonation, duration and juncture. A prosody, according to prosodic analysis, may represent any feature that extends over more than one segment or pervades a stretch of segments. For instance, the ‘ized’-suffixed sound phenomena such as labialized (lip-rounded), nasalized, velarized, palatalized and pharyngealized may all be handled as prosodies. For a simple and straightforward illustration of the nature of prosody, let us examine the differences between the sounds of the following minimal pair: = [] vs. = []. A standard phonemic approach will identify three phonemes in each word with the difference being confined to the vowel element. Unlike the phonemic approach, prosodic analysis will identify three segmental units in each, called phonematic units, and one prosody represented by lip-spreading in the former and lip-rounding in the latter, each of which pervades the whole word; hence, both lip-spreading and lip-rounding are prosodies or suprasegmentals.

With this broadening of the domain of suprasegmentals, the phonetic and/or phonological relevance of suprasegmentals will not only be associated with syllables, as the shortest units in speech, and the sentence, as the longest unit, but will certainly also include any stretch of speech or discourse. This trend is highly consistent with the recent emphasis on the study and teaching of language at discourse level. Consequently, the teaching of pronunciation with the inclusion of suprasegmentals, in general, and the inclusion of articulatory settings (Honikman, 1964) or phonetic settings (Laver, 1980; 1994), in particular, creates a primary shift in emphasis and direction. The significance becomes even greater at higher levels of proficiency acquisition. In special cases when accent acquisition and accent reduction are targeted, orientation in suprasegmentals and articulatory settings is extremely vital; in fact, it becomes indispensable for refined pronunciation. Generally speaking, the formal study of suprasegmentals receives attention only after the study of the segmental features has been exhausted.
In the beginning, the segmental elements (consonants and vowels) receive more attention because they are more tangible by virtue of their easily identifiable nature. They have become even more tangible and identifiable in languages that have been reduced to writing. In writing, especially an alphabetic system, the major target has been the assignment of symbols to segmental sounds only. Few languages have cared to incorporate prosodic features into their orthographies. Today no linguistic description and study of a given language is complete and coherent if its prosodic aspects are left untouched. This advancement in the description of languages has led to the application of the findings in different L1 and L2 language instruction situations.

It is interesting to note about language education in the United States that second and foreign language education and instruction are far more linguistically geared than native language instruction (i.e., English language arts). Perhaps two reasons may account for all or part of the discrepancy. Firstly, native language acquisition is completed in a subconscious, effortless and automatic manner due to ample exposure to authentic context-embedded and situation-embedded language materials. Secondly, under normal native language acquisition by normal individuals, there does not seem to be much need for formal linguistic intervention and support. Unlike the native language acquisition situation, second and foreign language learning situations, especially with adult learners, may require more formal and linguistics-based intervention strategies to better systematize and enhance the learning process. This latter fact may account for the use of more linguistics-oriented teaching materials and textbooks in ESL and bilingual education instruction than in English language arts instruction (at least in the U.S.). For instance, in English language arts instruction, English vowels are taught as if they are five in number and occasionally six when <y> is added. Consonant clusters (the so-called ‘blends’) may still be determined on the basis of letters rather than sounds. Letter (grapheme), sound (phoneme) and letter-name (nomeneme)1 identities are easily mistaken for each other. Phonics is a mere letter-based technique that fails to handle proper teaching of pronunciation. Many of those misconceptions are less frequently encountered in ESL and bilingual language materials, perhaps because those two disciplines have developed in close connection with applied linguistics.

1 This term was coined after the patterns of phoneme and grapheme based on the Latin root ‘nomen’ (name) to designate ‘letter-name’ (Odisho, 2004).

9.2. STRESS AND RHYTHM

Stress may have different interpretations from the speaker’s or the listener’s standpoints. When the speaker’s activity in producing stressed syllables is in focus, stress may be defined in terms of greater effort that is exerted in the production of a stressed syllable as opposed to an unstressed one (Lehiste, 1970). When stress is defined from the listener’s standpoint, the claim is often made that stressed syllables are louder than the unstressed syllables (ibid). This is why Ladefoged tends to think that stress can always be defined in terms of something a speaker does while it is difficult to define it from the listener’s point of view (1982). To avoid those complications, it suffices to deal with stress in terms of greater or lesser physiological effort by the speaker and greater or lesser prominence by the listener who assesses prominence as the overall index of greater loudness, length and higher pitch. Stress is, hence, primarily the result of greater physiological effort exerted by the speaker at a certain point within a polysyllabic word and at repeated points within the flow of speech. A greater respiratory effort makes a given syllable more prominent and with the decrease in this effort, syllable prominence diminishes. A realistic division of the prominence continuum is to identify three degrees of prominence to be associated with the trichotomy of weakly stressed, medium stressed and strongly stressed syllables. However, the dichotomy of unstressed and stressed syllables has customarily been more dominant. The term ‘unstressed’ is only figuratively employed to subsume the first two degrees of stress because the term is literally meaningless—no portion of speech is produced without physiological effort; consequently, every portion should have some prominence. In other words, the unstressed syllables stand for the portions with minimum stress. 
It is possible to distinguish between stress assignment within a word and within a sentence because within the latter it is likely for words to undergo a shift in the location of stress or to emerge with partial stress only. Within a word a certain syllable sounds more prominent in relation to others, while in a sentence a certain word or words sound more prominent in relation to the rest. The former is called word stress and the latter sentence stress.

Languages differ in the manner they use word stress and sentence stress to signal linguistic/nonlinguistic variations. Some languages show a strong tendency to retain the stress on a certain syllable within the word regardless of the syllabic structure and the number of syllables. Obviously, in such cases stress becomes highly predictable. Czech words tend to have stress predominantly on the first syllable irrespective of the number of syllables (Ladefoged, 1982), whereas Turkish tends to place stress on the last syllable. In other languages, stress changes its place according to several factors, foremost of which are the number of syllables, their internal structure and arrangement within a word, the grammatical category of words and their status as native or loan words. To facilitate the rules of stress assignment, linguists use the classificatory terms of ultimate, penultimate, and antepenultimate to identify the structural location of stress. If no rules can be formulated, or if the rules can only capture certain instances and leave the rest unaccounted for without some ad hoc rules, then the predictability of stress becomes less likely and its role as a distinctive feature between lexical items more striking. If stress is highly predictable, its function is primarily that of determining the rhythm and the overall pronunciation, though it still can have a demarcative function, i.e. it helps to signal the word boundary (Hyman, 1975). In languages whose stress placement resists straightforward predictability, the function of stress is no longer confined to pronunciation and demarcation; it can assume a wide and diversified range of lexical and grammatical functions. The distribution of stressed and unstressed syllables within a language determines its rhythm. The traditional view is that rhythm in languages follows the dichotomy of stress-timed or syllable-timed (Adams, 1979; Dauer, 1983).
However, there are some linguists who tend to think that the concept of a dichotomy is too rigid a characterization to realistically portray the nature of rhythm in human language. Ladefoged (1982) states: “Perhaps a better typology of rhythmic differences among languages would be to divide languages into those that have variable word-stress (such as English and German), those that have fixed word-stress (such as Czech, Polish and Swahili) and those that have fixed phrase-stress (such as French)”. But since there is more than one factor that determines the nature of rhythm in a given language, there is no compulsion to have one or the other of the stress-timed or the syllable-timed rhythmical bases (O'Connor, 1973). However, one needs to understand what is meant by a stress-timed or syllable-timed rhythm. A stress-timed rhythm is one in which stressed syllables tend to recur at regular intervals of time and the syllables vary considerably in length depending on whether they are stressed or unstressed. On the other hand, a syllable-timed rhythm is one in which each syllable tends to retain more or less the same duration regardless of stress (Adams, 1979; Ladefoged, 1982; Roach, 1983). In the stress-timed rhythm, only syllables receiving the primary stress stand out prominently, while the unstressed syllables are reduced and compressed in time to become far less prominent. Unlike such uneven distribution of prominence, in the syllable-timed rhythm, all syllables, stressed or unstressed, receive a relatively even prominence; syllables take approximately the same time, and the overall length of an utterance depends on the number of syllables involved. In other words, in this latter type of rhythm there is hardly any noticeable reduction in the prominence of the unstressed syllables. It is in light of the above-mentioned characteristics that English is said to have a typically stress-timed rhythm, whereas Spanish is said to have a typically syllable-timed rhythm.

At this stage, it is interesting to consider the possibility of an underlying connection between stress and rhythm type and the type of the vowel system in a given language. In English, vowel quality and quantity fall heavily under the influence of stress and this interaction is part of the dynamics of the vowel system. The location of stress and its strength within the word or sentence greatly influence the vowels both qualitatively and quantitatively.
In syllables with a primary stress, vowel quantity (length) reaches its maximum and its quality is very distinct. In syllables with a secondary stress or a weak stress, both quality and quantity of vowels are reduced drastically. In unstressed syllables, almost all the English vowels can be reduced in both quality and quantity to the shortest vowels, namely [] or [] (Dalbor, 1969; Ladefoged, 1982; Dale and Poms, 1985). Such a qualitative and quantitative process of vowel reduction is a typically characteristic feature of English, but very uncharacteristic of a language such as Spanish. The above exposition seems to point in the direction of the plausibility of a connection between the centripetal vowel system and the stress-timed rhythm type, on the one hand, and a centrifugal vowel system and a syllable-timed rhythm type, on the other hand. If, in the case of Spanish, vowels tend to retain their relative quality and quantity regardless of stress, and if they never undergo any schwaization or even vowel reduction, how can one expect the syllables to be manifestly different in length and prominence? A univalent system of vowels should undoubtedly yield a temporally uniform and univalent type of syllables, which is typical of a syllable-timed rhythm as in Spanish. By contrast, with a multivalent system of vowels combined with a very pervasive tendency toward schwaization, one should expect multivalent types of syllables, as is typically the case with the English stress-timed rhythm. To put it differently, if the vowel system of a given language does not maintain long/short or tense/lax phonological contrasts, if it does not have a schwa as part of its phonological system, and if it does not tolerate schwaization or a tangible degree of vowel reduction, it implies the presence of a synchronic constraint on the extent to which stress can alter the quality and/or quantity of its vowels. It is true that stress in Spanish can somewhat change vowel quality and quantity, but the change will still be confined to the phonetic domain. Thus, in Spanish, vowels may phonetically be slightly longer/shorter or tenser/laxer, but the absence of phonological contrasts based on those features will deny the language the potential for creating syllables that are significantly different in length and prominence.
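The two idealized rhythm types defined above can be contrasted in a small timing sketch. It is a deliberately simplified model, not a measurement: the figures of 200 ms per syllable and 500 ms per stress interval are hypothetical, the stress patterns are invented, and each pattern is assumed to begin with a stressed syllable. Under these assumptions, the syllable-timed duration grows with the number of syllables, while the stress-timed duration grows only with the number of stressed syllables.

```python
def syllable_timed_duration(stresses, syllable_ms=200):
    """Every syllable takes about the same time, stressed or not."""
    return len(stresses) * syllable_ms


def stress_timed_duration(stresses, interval_ms=500):
    """Each stressed syllable opens an interval of about equal length.

    Unstressed syllables are compressed into the interval opened by the
    preceding stressed syllable, so they add no time of their own.
    Assumes the pattern starts with a stressed syllable.
    """
    return sum(stresses) * interval_ms


# 1 = stressed syllable, 0 = unstressed syllable (hypothetical patterns).
short_pattern = [1, 0, 1, 0]            # 4 syllables, 2 stresses
long_pattern = [1, 0, 0, 1, 0, 0, 0]    # 7 syllables, 2 stresses

for name, pattern in (("short", short_pattern), ("long", long_pattern)):
    print(
        name,
        "syllable-timed:", syllable_timed_duration(pattern), "ms;",
        "stress-timed:", stress_timed_duration(pattern), "ms",
    )
```

In this toy model the two patterns differ by three syllables, which lengthens the syllable-timed rendition but leaves the stress-timed one unchanged; the extra unstressed syllables must therefore be shortened, which is where vowel reduction comes in.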
This argument in favor of binding the rhythm type to the vowel system does not mean that the vowel system is the only factor that determines the rhythm type in languages; undoubtedly, other factors such as syllable structure (Dauer, 1983), fixed/variable word stress and word/phrase stress (Ladefoged, 1982) are relevant in this regard. The most significant conclusion drawn from the preceding discussion is that the nature of the vowel system should be a factor to be seriously reckoned with in the typological classification of speech rhythm, its study and teaching. Those modifications to the traditional syllable-timed and stress-timed rhythm types amount to a major change that should be seriously considered in the development of the approach and the techniques of teaching pronunciation and the remediation of accent.

9.3. TONE AND INTONATION

In speech, there is always a continuous change in the fundamental frequency, which is auditorily realized as pitch, also known as the melody of speech. Languages use pitch in two essentially different ways. If pitch signals semantic differences between words, the language is called a tone language. In Chinese, for instance, the basic unit < ma > may have more than one meaning depending on the rising, falling, falling-rising or level pitch it carries. It is the pitch difference (or toneme, to be consistent with other ‘eme’-suffixed linguistic labels such as phoneme and grapheme) that triggers the semantic difference. Many of the African and Asian languages fall into this category. By contrast, a language whose pitch pattern has no specific role in the semantic shaping of words, but is rather used to signal a combination of syntactic, semantic and attitudinal features of the utterance, is called an intonation language.

9.4. BASIC PITCH PATTERNS

Pitch patterns are very vividly explained and schematically represented in terms of pitch height and pitch direction (Laver, 1994). The labels used to refer to different pitch levels are high, mid, low, mid-high, mid-low etc. Pitch contour refers to the shape and direction of pitch, yielding different shapes such as rise, fall, level, rise-fall, fall-rise etc. A combination of attributes from both pitch height and pitch direction produces the basic pitch patterns, whose recognition and production should be an essential goal in any program for training in phonetics, in general, and pronunciation, in particular. No teaching of tone and intonation will be effective without the mastery of the basic pitch patterns. An interesting aspect of the basic pitch patterns is the cross-linguistic commonality in their nuances. The falling pitch patterns, both low and high, have the general purpose of expressing an utterance with a sense of completeness so that the attention of the listener is no longer required as far as that particular utterance is concerned. A high fall usually indicates a more vigorous and determinate notion of completeness and finality than the low fall does. The rising pitch patterns, unlike the falling patterns, imply a sense of the incompleteness of the utterance, as if further information is expected from the speaker or a response is necessary on the part of the listener. The basic pitch patterns are:2

Figure 9.1. Basic pitch patterns in human speech in general.
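The idea that a pitch pattern is a combination of a pitch height and a pitch direction can be sketched as a simple pairing of labels. The Python snippet below is only an illustration: it uses three of the height labels and the five contour labels named in the text, and the resulting list is the full set of possible pairings, which is an assumption of this sketch and not the inventory shown in figure 9.1.

```python
from itertools import product

# Labels taken from the text; the selection of three heights is an
# assumption made to keep the sketch small.
HEIGHTS = ["high", "mid", "low"]
CONTOURS = ["rise", "fall", "level", "rise-fall", "fall-rise"]


def pitch_labels(heights=HEIGHTS, contours=CONTOURS):
    """Pair every pitch height with every pitch contour."""
    return [f"{height} {contour}" for height, contour in product(heights, contours)]


labels = pitch_labels()
print(len(labels), "combined labels")
print([label for label in labels if label.endswith(" fall")])
```

Among the generated labels are the ‘high fall’ and ‘low fall’ patterns discussed above; which of the remaining combinations are actually used is a matter for each language.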

9.5. CONSONANT CLUSTERS

Consonant clusters, or so-called consonant blends, should be distinguished from consonants that occur juxtaposed to each other. The former is a combination that should structurally belong to one syllable and is pronounced as one intact piece. The latter is a combination that is spread over two syllables. Take the word which has a combination of two consonants, but it is not a cluster because the belongs to the first syllable and belongs to the second. Compare the of with the of in which the is one intact combination and belongs to one syllable. The of is linguistically termed ‘abutting consonants’ as opposed to of which is a consonant cluster proper. This phonetic differentiation is quite important in training students in areas pertinent to pronunciation because the difference will stress the point that clusters, not abutting consonants, are the real source of trouble (Odisho 1979a; 2003).

2 For a demonstration of tone patterns in Chinese go to section 14.2.1/f.

Based on two linguistic facts, there is a strong rationale to include consonant clusters among the suprasegmentals. First, they are at a minimum longer than one segmental sound. Second, they can be a major source of mispronunciation and distinctive accent, especially for those learners whose native languages do not contain clusters and who are planning to embark on a language loaded with them. Japanese, for example, is a typical language that is almost consonant cluster-free. Arabic is also a language that has relatively few clusters compared to English. Consequently, Japanese and Arab learners of English do exhibit serious problems with consonant clusters and impose phonetic changes that represent their L1 phonotactic rules. To avoid complex phonetic transcription, the changes in pronunciation will be kept as simple as possible.

Also interesting is the fact that speakers of different languages handle consonant clusters or abutting consonants differently. The prevailing rule is the breaking up of the cluster, usually by inserting a vowel to rearrange the syllabic structure of the word containing the cluster or abutting consonants. The following are some of the most common ways that learners employ to avoid producing a cluster. First, if the cluster is initial, some languages add a vocalic element to the beginning of the cluster called a prosthetic vowel. This is attested in various languages including Arabic, Hindi, Sinhalese (Odisho, 1978; Fleischhacker, 2000; Jabbari, et al, 2012) among others, as in table 9.1, below.

CHAPTER 9


Language | English Word | Rendition
Arabic | |
Sinhalese | |
Hindi | |

Table 9.1. Examples of initial consonant cluster breaking with a prosthetic vowel.

Second, some languages break up the cluster or abutting formation by inserting a vowel element called an epenthetic vowel, as in table 9.2 below for Korean and Farsi, among others.

Language | English Word | Rendition
Farsi | |
Korean | |

Table 9.2. Examples of initial consonant cluster breaking with an epenthetic vowel.
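The two repair strategies illustrated in tables 9.1 and 9.2 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: words are written as plain letter strings rather than phonetic transcriptions, the generic vowel 'i' stands in for whatever vowel a given language actually inserts, only the first two consonants of a word-initial cluster are treated, and the function names and the sample word 'skul' are hypothetical rather than taken from the tables.

```python
# Toy sketch of word-initial cluster repair (all names are hypothetical).
VOWELS = set("aeiou")

def starts_with_cluster(word):
    """True when the first two letters of the word are both consonants."""
    return len(word) >= 2 and word[0] not in VOWELS and word[1] not in VOWELS

def prothesis(word, vowel="i"):
    """Add a vowel in front of a word-initial cluster (the table 9.1 strategy)."""
    return vowel + word if starts_with_cluster(word) else word

def epenthesis(word, vowel="i"):
    """Insert a vowel between the first two consonants (the table 9.2 strategy)."""
    return word[0] + vowel + word[1:] if starts_with_cluster(word) else word
```

Under these assumptions, prothesis("skul") gives 'iskul' and epenthesis("skul") gives 'sikul', while a word without an initial cluster, such as 'sun', is returned unchanged. Which strategy and which vowel a real learner uses depends on the phonotactic rules of the L1.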

The epenthetic vowel can also occur in syllable-final position as well as across word boundaries. It is quite common for Arabic, Assyrian and Spanish speakers to avoid final clusters in one of two ways. The first is to split the cluster with a vowel, creating an additional syllable, as in pronouncing ‘barked’ as ‘barkid’ instead of the normal [bɑrkt]. The second, found with some Hispanics, is to drop the suffix completely. In fact, in the latter case, the tendency is so strong that it is reflected in their spelling of the past and past participle forms of suffixed verbs (Odisho, 2007). Speakers of cluster-deficient languages also usually break up abutting consonants that form across word boundaries. For example, speakers of Egyptian Arabic pronounce a phrase such as ‘White House’ as ‘whiti house’.3 Korean ESL students tend to pronounce ‘English language’ as ‘Englishi languagi’ because Korean does not even allow words to end in most consonants. A similar tendency applies to Japanese. In an ESL instructional film, a Japanese student pronounced ‘south gate’ and ‘north gate’ as ‘sausi geti’ and ‘norsi geti’, respectively.4

Three questions are still relevant in the context of this discussion of the role of consonant clusters in accent generation. First, how serious a source of accent can this linguistic aspect be? Second, what vowel quality is inserted to break up the cluster? Third, where is the vowel inserted to eliminate the cluster?

There is a straightforward answer to the first question. It all depends on the linguistic gap between the two languages in terms of cluster-deficiency and cluster-richness: the wider the gap, the greater the difficulty. It is because of this gap that some languages do not tolerate even combinations of abutting consonants. As for the second question, the answer lies in the phonology of each language and the phonotactic rules that govern it. For example, if a language, such as Spanish, has a centrifugal vowel system without a schwa vowel [ə] or any short lax vowel such as [], [] or [], the inserted vowel tends to be [i], a tense one. Finally, the answer to the third question also depends on the phonotactic rules of the two languages involved, especially those rules that govern the syllable structure of words. Any combination of consonants, whether in the form of clusters or abutting consonants, can be a source of serious accent for learners whose native languages are consonant-cluster deficient.

A humorous but authentic anecdote pertinent to the breaking up of word-initial clusters and the subsequent semantic confusion is associated with the deposed dictator Saddam Hussein after he invaded Kuwait in 1990. Kuwait in Iraqi Arabic is pronounced either [] or []. After the invasion, a foreign journalist interviewed Saddam Hussein in the presence of his interpreter because his English was well known to be of very low proficiency due to his poor education. During the interview, the journalist made a statement which I do not recall exactly, but it was lexically and grammatically as follows: “We should not equate this situation with that of the West Bank.” Saddam jumped ahead of his interpreter and said: “Tell him (the journalist), I did not mention Kuwait.” Obviously, there was no mention of Kuwait in the statement of the journalist, but Saddam mistook the word ‘equate’ = [] for his own pronunciation of the name of Kuwait = []. The interpreter had no choice but to translate his master’s extraneous interjection because he did not want to lose his life after the interview. The journalist was bewildered at the translation and Saddam did not understand what had happened.

3 The generic vowel is used for simplicity. The exact phonetic “quality of an epenthetic vowel in a particular language may vary depending on segmental and prosodic factors, such as the quality of the surrounding consonants, the quality of other vowels in the word, and the position of the epenthetic vowel within the word” (Repetti, 2012).

4 Obviously, Japanese natives replace the ‘th’ = [θ] and [ð] sounds with [s] and [z], respectively.

9.6. CONCLUDING REMARKS

No study of pronunciation or the teaching of it is complete and comprehensive without thorough coverage of the suprasegmental features in the speech of any language. Unfortunately, the traditional and the non-linguistic approaches to teaching pronunciation focus almost exclusively on the segmental features (consonants and vowels), with hardly any attention paid to the long features that run through those segmental constituents and bind them together into more semantically expressive stretches of speech. Without the study of stress placement one fails to recognize which syllable is prominent; without recognizing the rhythm, the organization of beats within the stretch of speech is lost; and without being aware of intonation one fails to appreciate the melodic difference across languages. In sum, if the suprasegmentals are not a primary part of a study of pronunciation, the resulting accent in L2 is expected to be distinct and telling.

CHAPTER 10: THE ROLE OF ARTICULATORY SETTINGS IN PRONUNCIATION AND ACCENT

10.1. INTRODUCTORY REMARKS

After the pioneering work of Honikman on articulatory settings (Honikman, 1964),1 many researchers elaborated on this concept, its characteristics and its linguistic and non-linguistic relevance (Laver, 1980, 1994; Esling & Wong, 1983; Lowie and Bultena, 2007). Unfortunately, different names such as phonetic settings, voice quality, voice quality settings and paralinguistic features (Pennington and Richards, 1986) were assigned to it. These features, which Catford (1994) calls the initiatory, articulatory and phonatory prosodies, pervade speech in a continuous and/or recurrent manner and give the speech of a group of people a distinctive overall impression. In light of this, Laver’s coinage of phonetic settings seems to be more comprehensive in denoting the phenomenon, since it includes initiatory, articulatory and phonatory features. Laver defines a setting “as a featural property of a stretch of speech which can be as long as a whole utterance; but it can also be shorter, characterizing only part of an utterance, down to a minimum stretch of anything greater than a single segment” (1994: 115). This definition of phonetic settings, with its use of the words ‘stretch’ and ‘utterance’, brings it under the rubric of suprasegmentals, the only difference being that some features of phonetic settings may last as long as the speech act (or discourse) is maintained. In other words, some features of phonetic settings may readily qualify as phonetic features of discourse; typical in this regard is the retroflexion in some sub-continental Indian languages.

1 ‘Articulatory Settings’ is treated as singular. Honikman did not publish much, but this paper is one of the most brilliant pieces of phonetic literature.

As discussed above, ‘articulatory settings’ represents a coherent combination of some of the most salient features in a language that may persist throughout stretches of speech of different lengths, including discourse. In the study and application of the concept of articulatory settings, the focus will not be on the idiosyncratic manifestations of the characteristics of the settings. Rather, the focus will be on the collective manifestation of a habitual phonetic orientation by all or most of the speakers of a given language within the native language environment or its extension into the target language environment. To illustrate, if an individual nasalizes his speech because of a physical deficiency or improper articulatory habit, the articulatory settings is regarded as an idiosyncratic one. By contrast, if the speakers of a given language manifest nasalization as a very consistent feature in their vowel system, as in French, and carry the feature with them into their learning of other languages, then nasalization becomes a primary feature of French articulatory settings. It should, thus, be the focus of attention in the teaching of L2 pronunciation to French learners to avoid imposing unneeded nasalization. Conversely, learners of French should be instructed to acquire its nasal vowels. By the same token, vowel harmony in Turkish and Hungarian will certainly be a component of the articulatory settings of those two languages. The articulatory settings jointly represents the most characteristic consonantal, vocalic and prosodic features that are ingrained in the overall speech production in a given language. They help generate speech in its most authentic form, color it throughout with those features and give it its most genuine native impression.
In the overall approach to teaching pronunciation, different levels of proficiency, ranging from the most elementary to the near-native or native level, are targeted depending on the overall objectives of a program or an individual learner. In targeting the highest level, which is native proficiency, the learner should attempt not only to perfect the phonological distinctions, but also to master any type of phonetic distinctions and characteristics of the targeted language. Any learner whose objective is native language proficiency in L2 should gradually progress from a conscious and belabored impersonation of L2 pronunciation to a subconscious and automatic production of it, first at the phonological level and then at the phonetic level, with both segmental and prosodic features. Native language proficiency is not confined to the mastery of phonological contrasts alone. All or most of the refined phonetic features should be perfected to the extent that the learner is indistinguishable, or at least hardly distinguishable, from a native speaker.

To reach this level of proficiency both the instructor and the learner must be qualified to play their roles, each in his own way. The instructor must be highly knowledgeable both theoretically and practically; must be aware of the most distinctive and exclusive features of L1 and L2; and must possess a set of strategies to implement his approach. As for the learner, he must be highly motivated; must work hard; and must have ample exposure hours through classroom practice and real-life practice in the authentic environments of L2. It is noteworthy that not every learner may succeed in attaining native language pronunciation proficiency in L2. Certainly, every learner has the potential to improve proficiency in L2 pronunciation, but most of those who excel tend to have some sort of linguistic aptitude and high motivation. However, regardless of the level of achievement in L2 pronunciation, the learner must have enough opportunity to practice L2 in both perception and production. It is the ample exposure time to authentic L1 speech that makes a native speaker a native speaker. Consequently, if a learner, especially an adult, aims at nearing or matching the proficiency of a native L2 speaker, he should go through the same linguistic experience the native went through, or a similar one.

In the following subsections a survey is made of the most salient features of the articulatory settings of some languages which are drastically different from each other. If a learner aims at acquiring high-level proficiency in the pronunciation of any one of those languages, he should seriously consider mastering the most salient features of its articulatory settings.


10.2. SALIENT FEATURES OF ARTICULATORY SETTINGS OF SELECTED LANGUAGES

The selection of languages has been strictly limited for obvious reasons, foremost of which are the limited number of languages with which I am familiar and the limited space allocated for each chapter. For each language, there will be a selected combination of the most salient segmental and suprasegmental features that actively mold the articulatory settings of the given language. Stated differently, the articulatory settings functions as the distinctive mark that all speakers of a given language share and manifest when they speak it. With regard to the latter statement, it is imperative that any teaching of L2 to adults should take into consideration not only the articulatory settings of their L1, but, equally importantly, the articulatory settings of L2. In the first instance, the instructor should try to block the features of the L1 settings from seeping through into the L2, while in the second instance, the instructor should enable learners to absorb and assimilate the features of L2 settings.

10.2.1. English Articulatory Settings

In identifying the most salient features of the English articulatory settings, the focus will be on three general components: consonants, vowels and the impact of vowel dynamics on rhythm. All three domains mold the articulatory settings, but highlighting the characteristics of each domain helps clarify the overall configuration of the settings.

10.2.1.1. Salient Consonantal Features

In highlighting the salient consonantal features of English, it is advisable to consider them in terms of natural classes as much as possible. For example, English /p t k/ are voiceless aspirated; therefore, speakers of languages in which these plosives tend to be predominantly unaspirated (Spanish, French, Italian, Greek) should consider the difference and learn how to aspirate the voiceless plosives. The /b d g/ plosives tend to be predominantly voiced, fully or partially, in all positions. Consequently, learners of English such as Germans should be instructed not to devoice them in final position as they do in their native language. English /k g/ are velar plosives; it is, therefore, incumbent on speakers of languages such as Persian, Hungarian, Greek, Turkish, Modern Assyrian (Aramaic), etc., whose plosives tend to be palatal /c ɟ/, not to replace the English velars with their own palatals. Unfortunately, they often do so and, hence, they enhance their phonetic accent.

English affricates /ʧ ʤ/ may be the cause of phonological accent since many languages have only one of the two affricates. For instance, Germans have /ʧ/ only, while speakers of Western Arabic dialects (Egyptian, Syrian, Lebanese) have neither of them; they tend to replace them with the fricatives /ʃ ʒ/; in the case of Egyptian Arabic, the /ʤ/ is replaced by /g/.

As for fricatives, English has many of them, most of which are not specifically problematic for a wide variety of foreign learners of English except for the pair /θ, ð/. However, natives of some languages may have problems with specific English fricatives, in which case the problems should receive selective attention. Typically, Hispanic learners of English should be seriously instructed not to replace a voiced labiodental fricative /v/ with a /b/ or even a /β/. The substitution results in a serious phonological problem since the /v/ is a relatively common sound in English. An equally serious phonological problem for Hispanics is the failure to recognize and produce the English /z/ as opposed to /s/. Filipino learners of English should be cautioned against replacing an English /f/ with an unaspirated /p/. Equally, Hindi, Urdu, Farsi and Assyrian speakers, among others, should guard against replacing English /w/ with a voiced labiodental fricative /v/, a labiodental approximant /ʋ/ or a labial-palatal approximant /ɥ/. Typically, most Greek learners of English may have a conspicuous phonetic or even phonological accent when they replace the English pair of alveolar fricatives /s/ and /z/ with their alveolo-palatal fricatives [ɕ] and [ʑ], respectively, since Greek does not have the more common pairs of fricatives [s, z] and [ʃ, ʒ]. This type of replacement amounts to one of the most characteristic features of Greek pronunciation of other languages with the sibilant sounds /s, z, ʃ, ʒ/. For Russians, the voiceless glottal fricative /h/ can be a source of phonetic accent as they tend to replace it with a voiceless velar fricative /x/ or voiceless uvular fricative /χ/. In learning some languages, such as Arabic, the failure of Russians to pronounce /h/ may amount to a phonological accent as Arabic has a distinct and quite common contrast between /h/ and //.


As for the approximants /l/ and /r/, they can be seriously problematic, especially for Far Eastern Asians such as Japanese, Chinese and Koreans. The confusion between these two approximants amounts to a major phonological and phonetic problem. The confusion is too ‘ear-catching’ to be ignored; it is, indeed, one of the most classic sources of their interlanguage accent. Speaking of the English /r/ in particular, in both RP British and GAE, its approximant nature makes it a very rare type of ‘r’ sound compared to taps, flaps and trills. Although the failure to master the typical approximant English /r/ and its replacement with a tap, flap or trill rarely results in a phonological problem, it does, indeed, amount to a significant source of noticeable phonetic accent.

Another important aspect of the consonantal system of English is its complex array of consonant clusters in all three structural positions in a word: initial, medial and final. If, for instance, Japanese learners of English fail to master the production of English clusters, they will be speaking it with a heavy phonological and phonetic accent. They will not only be distorting the pronunciation of individual words, but will also be heavily impacting its stress-timed rhythm.

10.2.1.2. Salient Vowel Quality/Quantity Features

In this study, it is believed that the vocalic system constitutes the most salient feature of English articulatory settings. This saliency is attributed to two major factors: a) the wide range of vowel qualtity (combined quality and quantity) contrasts in English; b) the prominent role of the rules of vowel dynamics, especially in vowel qualtity reduction and heavy schwaization. Any learner of English coming from a language background in which the vowel qualtity range is narrow (such as the three- through five-vowel systems) will face a major phonetic/phonological problem because of the resulting vowel qualtity discrepancy. With a five-vowel qualtity system as in Spanish, some of the most frequently used words of English are semantically confused due to mispronunciation. Examples of such confusions have been repeatedly cited throughout this book. The English vowel // is so characteristic of English that any mispronunciation of it may result not only in a phonological accent, but also in an easily distinguishable phonetic accent; for instance, Israeli or Polish speakers usually tend to replace the // vowel with an //-like vowel, making the pronunciation of English so un-English.

What makes the English vowel system even more difficult and challenging is the dynamics of vowel qualtity change through vowel reduction, especially in the form of a schwa [ə]. Consequently, an L2 learner of English faces not only the problem of mastering the wide range of vowel qualtity, but also the strong trend towards vowel reduction. Perhaps the group of English words that most characteristically demonstrates the dynamics of vowel qualtity change in English is a set of approximately fifty words (see table 10.1 below for examples) that appear in two or more forms known as the strong form and the weak forms. For instance, the strong form of ‘and’ is //, but it has at least three weak forms such as //, // and //.2 The strong form is habitually of minimum circulation since it occurs only under emphasis. It is the weak forms of ‘and’ that recur more frequently. The often schwa-based weak forms of this group of words and the weak syllables of other words are the linguistic units that collectively govern the overall rendition of the English vowel system and its general rhythm type. The weak/strong forms of this group of words amount to a major area which should receive serious attention in the teaching of English pronunciation. No adult learner of English can bring his pronunciation near that of a native speaker without the proper mastery of the weak forms of those words. The mastery of the weak forms of such words seriously helps with the correct rendition of the overall rhythm of English.

2

The so-called syllabic ‘n’.


Word | Stressed (Strong) Form | Unstressed (Weak) Form
A | |
An | | n; n
Been | bin ; bn | bn
Can | kn | kn ; kn
For | f ; f | f ; fɚ
Had | hd | hd ; d ; d
Shall | l | l; l
Some | sm | sm ; sm
The | | (V.) ; (C.)
Would | wd | wd ; d ; d

Table 10.1. Weak and strong forms of some of the common words in English.

10.2.1.3. Salient Vowel Dynamics and Rhythm

English is a language that is governed radically by excessive vowel reduction in the absence of primary stress. Most of the unstressed syllables are dominated by the short vowels //, of which the schwa [ə] is the most frequent. It is these short vowels that reduce syllables to minimum length and distinction, thus allowing them to be glossed over rapidly and with minimum prominence. The speed with which the unstressed syllables are uttered, coupled with their minimal prominence, helps the stressed syllables stand out. It is the enhanced prominence of the stressed syllables that justifies labeling English rhythm as a typical stress-timed rhythm. The enhanced prominence of stressed syllables is attributable not just to their receiving the primary stress, but also to the fact that the other syllable or syllables within a word tend to become short and reduced in their vowels. To demonstrate, the first syllable in the word = [fɚtbl] is prominent not exclusively because it is stressed, but equally because the other syllables are significantly reduced. There is no way for any learner of English as L2 to master its rhythm without possessing a skill for vowel reduction and schwaization. All instructors of English pronunciation to natives of other languages should make the teaching of vowel qualtity and vowel reduction at syllable and sentence levels a foremost consideration. Without an intensive joint effort in this regard, it is almost impossible to arrive at a satisfactory native-like impersonation of the articulatory settings of English.

10.2.2. Spanish Articulatory Settings

In numerous Spanish-English bilingual situations, especially in the United States, features of Spanish such as the strict qualtity limitations on its vowel system, the absence of tense-lax vowel contrasts, the non-schwa nature of its centrifugal vowel system, the absence of certain consonantal sounds such as [v, z, ] coupled with the phonological contrast between its tap ‘r’ = [ɾ] and its trilled ‘rr’ = [r], and its syllable-based rhythm jointly constitute the most salient and commanding characteristics of the articulatory settings of Spanish. They not only color the whole pronunciation of Spanish, but also influence its speakers’ pronunciation of other languages. To begin with, the Hispanic learner of English, or any learner of English whose language uses the Latin alphabet, should not be fooled by the familiarity of the alphabet because in actual application the same symbols may be used to signal drastically different sounds. Let us consider the comparative example below of the word ‘color’ in English and Spanish as a means of displaying a synopsis of the striking differences between the two languages that are expected to be the source of serious phonological and phonetic accent at both segmental and suprasegmental levels.

a) The grapheme ‘c’ in English is pronounced as a voiceless aspirated velar plosive [kʰ], whereas in Spanish it is a voiceless unaspirated velar plosive [k];

b) The two vowels in Spanish retain the ‘traditional vowel quality’ of [o], whereas in English the ‘traditional vowel quality’ completely drifts away in both instances into [ʌ] and [ə] vowels, respectively;

c) The ‘l’ in English under the influence of the [ʌ] tends to be somewhat velarized or what is traditionally identified as ‘dark L’, whereas in Spanish it has no velarization (i.e., it remains a ‘clear L’); and,

d) The ‘r’ in English is a retroflex approximant that coalesces with the preceding schwa vowel [ə] to produce an ‘r-colored vowel’ [ɚ].

At the suprasegmental level, the retention of the quality of the [o] vowels in Spanish grants the whole word a lip-rounding feature or a ‘prosody of lip-rounding’, according to Firthian linguistics, which contrasts strikingly with the lip-neutralization prosody of the English pronunciation. Besides, the stress in English falls on the initial syllable, whereas in Spanish it falls on the final syllable. The cumulative phonetic and phonological differences (at both segmental and suprasegmental levels) afford a very good example of the considerable differences between the two languages, which are even further complicated by the orthographic traditions.

Figure 10.1. General tendency of stress position in English and Spanish words.

Thus, if a simple comparison of two words that are identical in orthography yields considerable phonetic and phonological differences both segmentally and suprasegmentally, then learners of either language should expect some serious difficulties in the domain of pronunciation. Let us consider some such difficulties.

10.2.2.1. Vowel System

In a previous chapter the vowel system of English was identified as centripetal, in sharp contrast with the centrifugal system of Spanish in both quality and quantity. The difference in quality is a straightforward, significant problem which is summarized in the presence of twelve (12) vowel qualities in English vs. five (5) in Spanish. It is a serious source of both phonological and phonetic accent. Any instructor can readily identify the general qualitative differences; what is more challenging to identify and teach is the case in which vowel quality is intertwined with vowel quantity (length). ‘Vowel length’ is a term that is somewhat controversial in that some phoneticians prefer to portray those differences in terms of laxness and tenseness. To maintain a level of simplicity in handling the feature ‘quantity’, the dichotomy of short vs. long is preferred; nevertheless, this preference should not exclude the use of lax vs. tense whenever the need arises. In light of these simplifying propositions, English has certain pairs of vowels which are distinctive (i.e., contrastive) on the basis of length, and they cause a major problem for Hispanics since length is phonologically irrelevant (not distinctive) in their language. Another dimension that further aggravates this problem is the high frequency of occurrence of those pairs of vowels in English. Accordingly, there are virtually thousands of pairs of words which in English are distinguished by vowel length, but when pronounced by Hispanics, the length differential is eliminated, thus reducing the pair to a single word pronounced according to the specifications of the single Spanish vowel. Let us consider some of these pairs in table 10.2, below, and demonstrate the Hispanic rendition of those pairs.

Contrastive Pair | Phonetic Transcription | Spanish Rendition
vs. | [] vs. [] | []
vs. | [] vs. [] | []
vs. | [] vs. [] | []
vs. | [] vs. [] | []

Table 10.2. Comparison of English and Spanish vowels in qualtity.
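The merger described above can be sketched as a simple symbol mapping. The sketch below is illustrative only: the mapping covers just two lax vowels, the sample transcriptions [bɪt]/[bit] and [fʊl]/[ful] are hypothetical examples chosen for the sketch and are not the pairs of table 10.2, and the names LAX_TO_SPANISH, spanish_rendition and merges are assumptions.

```python
# Assumed mapping: each English lax vowel is rendered with the nearest
# of the five Spanish vowels (only two lax vowels are covered here).
LAX_TO_SPANISH = {"ɪ": "i", "ʊ": "u"}

def spanish_rendition(transcription):
    """Replace each lax vowel symbol with its assumed Spanish counterpart."""
    return "".join(LAX_TO_SPANISH.get(symbol, symbol) for symbol in transcription)

def merges(first, second):
    """True when two distinct English words collapse into one rendition."""
    return first != second and spanish_rendition(first) == spanish_rendition(second)
```

Under these assumptions, merges("bɪt", "bit") and merges("fʊl", "ful") are both true, because each pair is reduced to a single form, whereas a pair that differs in some other way, such as "bit" and "but", stays distinct.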

These examples serve to highlight one of the most challenging, if not the most challenging, problems for Hispanic learners of English. At times, the confusion can be extremely embarrassing when some obscene or taboo words are involved, such as those mentioned in section 8.4.1. The confusion is not confined to phonology; rather, it extends to serious phonetic mispronunciation. One can easily identify Hispanics by their accent in the rendition of words such as and as [kding] and [bilding] instead of [kd] and [bld], respectively.

For a thorough phonetic study of sounds and sound systems, a person has to go beyond phonology because the latter functions at the level of abstractions which conceal many phonetic details. From the pedagogical and educational perspective, a phonetic yardstick for assessing the acceptable targeted proficiency in pronunciation may be too strict, and perhaps too ideal, to be used. Instructionally, however, the intention in teaching any L2 pronunciation should be to minimize semantic confusion as much as possible. Practically, in L2 teaching the intention should be to attain a pronunciation as near-native as possible, one that dispels any confusion in meaning and reduces to a minimum the demand for semantic clarification on the part of the listener.

10.2.2.2. Rhythm Types

When one specifically compares the rhythm types of Spanish and English, they typically represent a syllable-timed type vs. a stress-timed one, each of which requires some explanation. A stress-timed rhythm is one in which stressed syllables tend to recur at regular intervals of time relative to each other. The syllables vary considerably in prominence relative to each other depending on whether they are stressed or unstressed or whether their vowels are reduced or enhanced. In the stress-timed rhythm, only the syllables receiving the primary stress stand out prominently, while the unstressed syllables are reduced and compressed in time to become far less prominent. Therefore, the length of an utterance in terms of time in stress-timed rhythm depends on the number of stressed syllables, not the overall number of syllables within the utterance. The manner in which time is distributed over the syllables is uneven in that the speaker dwells longer on the stressed syllables, whereas he glosses over the unstressed syllables with minimum time. The unstressed syllables serve as time savers for the speaker through the facility of vowel reduction and even, at times, consonant reduction or dropping. Vowel reduction and/or consonant dropping are most vividly displayed in the so-called function words in English, typically represented by articles, prepositions, conjunctions, etc. For instance, the strong form of ‘and’ is [], which may be reduced to only a syllabic [n̩], as in ‘bake and take’, usually pronounced “bake ‘n’ take”.

On the other hand, a syllable-timed rhythm is one in which each syllable tends to retain more or less the same duration regardless of stress (Ladefoged, 1982; Roach, 1983). Unlike the uneven distribution of prominence in stress-timed rhythm, in the syllable-timed rhythm all syllables, stressed or unstressed, receive a relatively even prominence; syllables take approximately the same time, and the overall length of an utterance depends on the number of syllables involved. In other words, in the Spanish rhythm type, there is hardly any noticeable reduction in the prominence of the unstressed syllables, nor is there any significantly noticeable enhancement of the stressed syllables. It is in light of the above-mentioned characteristics that English is said to have a typically stress-timed rhythm, whereas Spanish is said to have a typically syllable-timed rhythm. It is, indeed, the restriction on vowel reduction in unstressed syllables, coupled with a restriction on vowel enhancement in stressed syllables, mentioned earlier for Spanish, that was identified as the main rationale for determining the nature of rhythm in Spanish. All those basic differences at the level of words as well as sentences between the two rhythm types may be demonstrated schematically in figure 10.2, below.

Figure 10.2. Comparative differences in syllable prominence in syllable-timed (Spanish) vs. stress-timed (English) rhythm types.

In comparing the pronunciation of the word in English and in Spanish, each oval shape in the above diagram indicates a syllable and the size of the oval shape indicates the prominence that the syllable receives in pronunciation. Based on these clues, three syllables of the Spanish pronunciation are of the same prominence, with only a slight increase in prominence in the stressed syllable. With the English pronunciation, two of the syllables are of minimum prominence because they contain the reduced vowel schwa [ə]. The last syllable has slightly greater prominence because its vowel is not reduced³ and the syllable has a CVC structure. Obviously, the greatest prominence is associated with the syllable that receives the primary stress. This type of pictorial representation of rhythm types helps many vision-oriented learners to better grasp the differences in rhythm types. The pictorial representation may also assist learners in noticing that in pronouncing the Spanish word, the person takes four equidistant steps, whereas in the pronunciation of its English cognate, he takes one small step followed by a large step followed by another small one and ends with a modest (medium) step. For a demonstration of the syllabic and rhythmic differences in sentences, notice the schematic representations in figure 10.3 attached to the following two sentences from Hadlich et al. (1968).

Figure 10.3. Patterns of syllable arrangement in syllabletimed rhythm (Spanish) vs. stress-timed rhythm (English).

³ At least in GAE as opposed to [] in RP.

It is quite evident from the schematic representations in figure 10.3 above that there is only minimal difference in the prominence of the seven syllables in the Spanish sentence, except for the second and the sixth syllables, which are slightly more prominent because of the placement of stress. Conversely, in the English sentence, there are four reduced syllables with minimum prominence conjoined with two syllables with distinctly enhanced prominence. It is the even arrangement of syllables in Spanish that determines its syllable-timed rhythm type, which is popularly known as the ‘machine gun’ rhythm or the so-called ‘staccato’ rhythm. This impressionistic feeling of a ‘machine gun’ rhythm of Spanish is also, at times, interpreted as a rhythm with faster tempo or speed. In more popular terms, people think that Spanish is spoken faster than English or other languages. Once again, this is not exactly true; it is yet another impressionistic feeling resulting from the nature of the evenly structured syllables, each taking the same or similar time, thus making the transition time across syllables equally even. It is this unvaried mode of cross-syllable and cross-word transitions that gives the impression of faster tempo as opposed to the varied mode of cross-syllable and cross-word transitions in English. In sum, if English speakers speak in words, Hispanics speak in syllables, and when Hispanics speak in words, Englishmen speak in phrases and clauses. There is yet another false impression, common among all beginner learners of L2, in that they think the targeted speech is much faster than their L1 speech. This is a false impression; L2 speech sounds faster to beginners because the decoding skill is slower. The above discussion leads to a major conclusion in that rhythm types, especially inasmuch as the dichotomy of syllable-timed vs. stress-timed is concerned, depend largely on the nature of the vowel system in each language. A language whose vowel system imposes restriction on the qualitative and quantitative diversity of its vowels is not expected to breed qualitative and quantitative diversity in its syllable structures. These conditions are valid for Spanish because each of its five vowels has defined quality and quantity that only slightly change under most linguistic contexts.
It is exactly the opposite of what vowel quality and quantity undergo in different linguistic contexts in English. One then asks: how could a vowel system that does not allow quality and quantity reduction or enhancement in its vowels have syllables that are reduced or enhanced in quality and quantity? The natural and axiomatic conclusion is that a vowel system whose units are uniform in quality and quantity should only yield syllable structures that are, more or less, uniform. This explains the syllable-timed rhythm type of Spanish vs. the stress-timed rhythm type of English.
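The duration claims made above for the two rhythm types can be illustrated with a toy calculation. The Python sketch below is an idealization under stated assumptions, not a measurement: the constants `FOOT_SECONDS` and `SYLLABLE_SECONDS` and both function names are hypothetical and chosen only for illustration. In a strictly stress-timed model the length of an utterance grows with the number of stressed syllables only, whereas in a strictly syllable-timed model it grows with the total number of syllables.

```python
# Toy model of the two idealized rhythm types described above.
# The timing constants are illustrative assumptions, not measured values.
FOOT_SECONDS = 0.5       # assumed interval between successive stressed syllables
SYLLABLE_SECONDS = 0.2   # assumed duration of every syllable in syllable-timing


def stress_timed_duration(stress_pattern):
    """Idealized stress-timing: only the stressed syllables count."""
    return FOOT_SECONDS * sum(stress_pattern)


def syllable_timed_duration(stress_pattern):
    """Idealized syllable-timing: every syllable counts equally."""
    return SYLLABLE_SECONDS * len(stress_pattern)


# True = stressed syllable, False = unstressed syllable.
english_like = [False, True, False, False, True, False]          # 6 syllables, 2 stressed
spanish_like = [False, True, False, False, False, True, False]   # 7 syllables, 2 stressed

# Adding unstressed syllables leaves the stress-timed total unchanged ...
longer_english = english_like + [False, False]
print(stress_timed_duration(english_like) == stress_timed_duration(longer_english))

# ... but lengthens the syllable-timed total.
longer_spanish = spanish_like + [False, False]
print(syllable_timed_duration(spanish_like) < syllable_timed_duration(longer_spanish))
```

Both comparisons print `True`. Real speech only approximates either extreme, so the sketch should be read as a caricature of the two tendencies rather than as a description of actual timing.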


10.2.3. Arabic Articulatory Settings

Naturally, the differences between the articulatory settings of English and Arabic are expected to be quite striking since the two languages belong to two language families that are drastically apart. Except for the general claim that both languages have their rhythmic systems generally categorized as stress-timed, their vowel and consonantal systems are hugely different.

10.2.3.1. Consonantal System

In the area of consonants, it is the English learners of Arabic that will have far more pronunciation problems than the Arab learners of English. The consonantal system of Arabic is particularly complicated, especially because of nine back consonants: the uvulars [q χ ʁ], the pharyngeals [ħ ʕ] and the emphatics [tˤ dˤ sˤ ðˤ], which are some of the toughest sounds to handle in any type of phonetic orientation for L2 learners of Arabic. The uvulars [, ] are attested in other languages, but the rest are almost exclusively typical of the Semitic languages. Besides, the four emphatics [tˤ dˤ sˤ ðˤ] have direct phonological contrasts with their plain counterparts [t d s ð]; therefore, any failure to produce the emphatics will leave the foreign learner with no option other than replacing them with their plain counterparts, subsequently causing a serious phonological accent coupled with semantic confusion. The situation may be worse when any of the plain counterparts does not exist in the learner’s L1 inventory, as is the case in Farsi and Urdu, where the Arabic [ذ] and [ظ] are rendered [z]. Unlike the emphatics, the uvulars and pharyngeals do not have plain counterparts, but the failure to articulate them properly will fill their slots with other sounds with which those uvulars and pharyngeals will be in contrast. For instance, the word for ‘eye’ in Arabic is عين = [ʕajn], and if the voiced pharyngeal fricative [ʕ] is mispronounced, it is usually replaced with a glottal stop [ʔ], which changes the word into أيْن [ʔajn], meaning ‘where’. Similarly, the voiceless unaspirated uvular plosive [q] is predominantly rendered as /k/, which will also cause phonological accent and transform words such as قلب [qalb], meaning ‘heart’, into كلب [kalb], meaning ‘dog’.

10.2.3.2. Vowel System

It is with the vowel system of English that Arab learners face some serious difficulties, not so much because of quantity (length) difference, but rather because of a wide range of quality difference. MSA has the simplest vowel system of three vowel qualities /i a u/ that are doubled by lengthening. Obviously, there is some quality difference between the short and the long version, but this is not a major issue. However, unlike Spanish, the existence of short vs. long contrasts in Arabic is helpful in distinguishing English minimal pairs that differ in vowel length. For example, in Arabic, the short vowels /a, i, u/ as in سَم, سِن and فُل have their long counterparts as in سام, سين and فول, respectively. Conversely, the restricted vowel quality in Arabic is detrimental in distinguishing an [] from an [ei] as in [bt] and [beit], for instance. What potentially enhances the vowel system of MSA is its enrichment under the influence of the local dialects, most of which have developed the mid vowels of [e or ] and [o]. This addition somewhat minimizes the drastic qualitative difference. Vowels in Arabic do not have much qualitative diversity; therefore, the system does not pose itself as a difficult one. However, foreign learners of Arabic coming from a language background with a qualitatively rich vowel system, such as English with its strong tendency toward vowel reduction, should avoid imposing their rich vowel system on the pronunciation of Arabic. They should shrink the broad domain of their vowel diversity to essentially three basic qualities /i a u/ with long and short versions and refrain from forcing unwanted vowel reductions. Even though the pronunciation of the short versions of Arabic vowels may be somewhat qualitatively different from their long counterparts [  a], the difference, [  ], is too marginal to be detected by phonetically unsophisticated people.
What the learners should do is to maintain the quality of Arabic vowels, whether short or long, and avoid any insertion of schwas. One of the most characteristic sources of accent for English learners of Arabic is the heavy imposition of vowel reduction, especially the schwaization of the short Arabic [a] or the so-called ‘فتحة’. This vowel has a high frequency of occurrence in Arabic; in fact, it dominates the overwhelming majority of past tense forms of verbs. It is highly characteristic of the English learners of Arabic to pronounce verbs such as دَرَسَ [darasa] (studied), كَسَبَ [kasaba] (gained), كَتَبَ [kataba] (wrote), etc. as [rs], [kasb], [katb], which causes a major distortion of the overall pronunciation of Arabic and the rendition of its rhythm. In a nutshell, it almost imposes the English rhythm on Arabic.

10.2.3.3. Suprasegmentals

Because Arabic does not have a vowel system with extensive vowel qualities or allow considerable vowel reduction (as is the case in English), it is quite natural to expect noticeable differences in rhythm even though both English and Arabic are associated with the stress-timed rhythm type. In Arabic, the rule of assigning stress to the long syllable is a very powerful prosodic feature that causes pervasive interference with the correct placement of stress in English, hence generating perceptible accent. For instance, word patterns with ending words such as that have more than two syllables never receive stress on the final syllable. Regardless of where the position of stress in words with those suffixes is, Arab learners have a very strong inclination to shift the stress to the last syllable. Notice the following examples in table 10.3:

English Rendition | Stress Pattern | Arabic Rendition | Stress Pattern

Table 10.3. Systematic misplacement of stress by Arab speakers of English.

There is yet another cause for misplacement of stress in English by Arabs that seems to be a self-inflicted phonetic ‘sin’. The short vowels of Arabic are marked with diacritics over or under the consonants in the following manner: فَ, ضُ and كِ, known as فتحة, ضمة and كسرة, which roughly designate the quality of the short vowels [a], [u] and [i], respectively. In most Arabic orthographic texts, these vowels are not marked. When foreign languages are transliterated in Arabic by native Arabs, there is a very robust tendency to replace the فتحة, ضمة and كسرة with the long vowel counterparts ا, و and ي. Once this vocalic transformation takes place in its visual form through publications and/or its perceptible form on radio and television, the Arab reader of those transliterated words pronounces them with long vowels. The result is not just a significant change in the rendition of vowels, but also a major shift in the assignment of the primary stress, resulting in distinct accent generation identified elsewhere as word inflation vs. word deflation, as demonstrated in section 8.3.2, above. This type of asymmetrical vowel length change between the two languages in their rendition of the same unit, and the subsequent rhythm change, is one of the primary sources of mispronunciation and accent by Arab speakers of English (Odisho, 2013).

10.3. CONCLUDING REMARKS

The concept of the articulatory settings is extremely helpful in teaching pronunciation. Any learner of an L2 who is able to identify and master the most salient phonetic and phonological characteristics of the targeted language is undoubtedly on his way to securing quality pronunciation. The success of such a linguistic mission may often require expertise and guidance on the part of the instructor. Instructors who do not have the professional knowledge and expertise in the ‘science’ of pronunciation may not be aware of the concept of articulatory settings, and hence may fail to highlight its pedagogical significance.

CHAPTER 11: PRINCIPLES OF A MULTICOGNITIVE APPROACH TO TEACHING PRONUNCIATION

11.1. INTRODUCTORY REMARKS

Right at the outset, the reader has to bear in mind that the cognitive and sensory principles imply multicognitive and multisensory guidelines and considerations. The cognitive perspective essentially implies that speech, at large, and pronunciation, in particular, are the function of the brain, whether in childhood or in adulthood, prior to being the responsibility of the speech organs. At birth, the human brain is already biologically endowed with incredibly efficient systems that have immense power to store information in the most economical manner. These systems, of which language is only one, are responsible for the smooth biological and social survival of human beings. In light of the above explanation, any normal child has a golden opportunity to naturally, automatically and effortlessly internalize the system of the language of the community in which he grows up, but not necessarily in which he is born. The latter statement implies that a child might be born in one language community, but then be raised in a different language community; it is the latter language that counts as his L1 if the exposure to the former community ceases. This natural process is linguistically known as ‘acquisition’. Under normal conditions, once the process of acquisition of a language is completed, it cannot be undone and can hardly be repeated by an adult with a second language (L2). Only children can naturally achieve double-acquisition (grow up as balanced bilinguals with perfect pronunciation) if the following conditions are valid: a) ample and balanced exposure in childhood and early adulthood to language materials in authentic socio-cultural and discourse contexts (i.e., materials that are context-embedded and situation-embedded); b) ample multisensory and multicognitive immersion and rehearsal of language materials in perception, recognition and production; and c) an integrative approach in which all language skills and subskills are practiced and internalized jointly. The older one grows, the more remote the opportunity of double-acquisition becomes, specifically in pronunciation. For a better mastery of L2 pronunciation, more intentional and conscious effort is required. This process is here known as ‘learning’, which is broadly in contrast with acquisition. Consequently, adults are expected to manifest different degrees of accent in L2 pronunciation, including the three main domains of consonants, vowels and other suprasegmental features (stress, tone, rhythm, intonation, etc.).

Since the focus of this study is on pronunciation, any linguistic statements made here relate to pronunciation unless stated otherwise. Pronunciation of L1 is here identified as sets of physical phenomena that have to be transformed into cognitive codes stored in the long-term memory, or the so-called subconscious brain, and retransformed into acoustic signals. If the physical phenomena are precisely internalized and encoded in the brain, their reproduction in the form of speech later will manifest no traces of accent. In other words, if the person stores the right sound input in early years, he should produce the right sound output, entailing ‘no accent’. This is usually what children do in their L1 or their L1s.¹ With age, adults begin to slowly lose their adeptness in the perception of sounds that are not part of their L1 inventory (phonology). What usually happens with adults learning L2 is that they become vulnerable to internalizing a sound from L2 that is not precisely the sound they hear. Even if they attempt to transition from hearing to listening, their listening might be biased by their own L1 inventory of sounds. This means that adults will produce what they thought they heard or listened to, not what the actual sound input was.
Consequently, their production of an L2 sound will be inaccurate because the perception and recognition of that sound were inaccurate due to the bias of their L1. Stated differently, they will have an accent.

¹ If they are amply exposed in childhood to more than one language.


It should be made clear that the distinction between children and adults in their adeptness at language acquisition is in this book confined to the skill of pronunciation and not necessarily to other skills, such as lexicon, morphology and syntax. It is the belief here that the evidence in favor of children’s skill in acquiring and mastering L2 pronunciation, as opposed to adults, is overwhelming in real-life situations as well as in published literature. It is, therefore, reasonable to consider children as gifted in the acquisition of language, in general, and pronunciation, in particular. The rapidity and apparent ease with which children learn language is a phenomenon of childhood and can hardly be repeated with such ease by most adults (Borden & Harris, 1980); however, such a statement should not imply that adults lose their ability to learn in the absolute sense of the word. The claim that adults enter a stage of linguistic fossilization (Selinker, 1972) is rejected in this study. The rejection is premised on cognitive principles in conjunction with classroom experience. Systematic multisensory and multicognitive orientation, regardless of age and aptitude for pronunciation, helps all learners to improve their skills to different degrees in the proper learning of L2 pronunciation. This last statement constitutes the basic conceptual premise on which the approach stands.

11.2. MULTICOGNITIVE PRINCIPLES FOR TEACHING PRONUNCIATION

Since human language has been defined previously as a code of communication that is a genetically determined cognitive potential before being a set of physical maneuvers, it is imperative for any instructor or learner to try to comprehend such a definition in its applied sense. The instructor is likely to see that adults may experience serious difficulty in producing a new sound² to which they have never been exposed. This is a good example of the cognitive requirement for sound production, meaning that the brain may need enough exposure time to the new sound to perceive and recognize it before being able to produce it appropriately. Therefore, any instruction in pronunciation should target the cognitive potential for perception and recognition prior to the necessary physical maneuvers of production. If, for instance, an adult native speaker of Spanish is asked to pronounce the English pair ‘pill’ = [pʰɪl] vs. ‘peel’ = [pʰiːl] and fails after repeated modeling to distinguish the two vowels, instead replacing both with an [iˑ] vowel and replacing the aspirated [pʰ] with the unaspirated [p], then the whole situation indicates that the learner is psycholinguistically (cognitively) unable to perceive and recognize the difference between the two English words. This is a typical condition that is identified in this study as psycholinguistic insensitivity, a condition that is characteristic of adults learning L2. The sub-sections below are meant to shed more light on the multicognitive approach to teaching pronunciation.

² For the sake of stylistic brevity the term ‘sound’ will, henceforth, stand for sound segment, sound feature and sound dynamics collectively, unless specified otherwise.

11.2.1. Think about L2 Speech Sounds

Upon planning to learn the pronunciation of an L2, an adult should make sure not to fall victim to his own L1 phonology bias, i.e., hearing L2 sounds through his L1 filter, which is one of the most pervasive threats to ideal L2 pronunciation internalization. If the learner’s L1 phonology filter determines the hearing, or even the listening, process without proper focus and guidance, all unfamiliar sounds coming from an L2 phonology source are likely to be filtered out and replaced with the nearest sounds to them in L1. If, however, the learner discovers an unfamiliar sound or is alerted to it by the instructor, the first step is to try to perceive the sound as accurately as possible. The learner should try to think about it in the following terms. Is it similar to one of his native sounds? If it is similar, but not identical, what is the sound feature that renders the L2 sound different from the L1 sound? Personally, I remember that when I began my so-called ‘ear-training’ course in my graduate studies, I encountered serious difficulty in the perception and production of the less familiar alveolo-palatal pair [ɕ, ʑ] as opposed to the more familiar postalveolar pair [ʃ, ʒ]. It was persistent thinking about and focusing on the perceptual and the articulatory/kinesthetic differences that helped me set them apart.
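The L1 filter described in this sub-section, in which an unfamiliar L2 sound is replaced with the nearest L1 sound, can be caricatured as a nearest-neighbor lookup. The Python sketch below is only an illustration under loud assumptions: the two-dimensional coordinates (height, backness) are made-up values on a 0 to 1 scale rather than acoustic measurements, and the names `SPANISH_L1`, `ENGLISH_L2` and `l1_filter` are hypothetical. It is not offered as a model of actual perception.

```python
# Toy illustration of an L1 phonology filter: each incoming L2 vowel is
# replaced by the nearest vowel in the listener's L1 inventory.
# All coordinates are made-up illustrative values (height, backness) on a
# 0-1 scale; they are assumptions for this sketch, not measurements.

SPANISH_L1 = {
    "i": (1.0, 0.0),
    "e": (0.5, 0.0),
    "a": (0.0, 0.5),
    "o": (0.5, 1.0),
    "u": (1.0, 1.0),
}

ENGLISH_L2 = {
    "iː": (1.0, 0.0),    # long high front vowel
    "ɪ": (0.85, 0.15),   # short, slightly lower and more central vowel
}


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def l1_filter(l2_point, l1_inventory):
    """Return the L1 vowel whose coordinates lie nearest to the L2 point."""
    return min(
        l1_inventory,
        key=lambda vowel: squared_distance(l2_point, l1_inventory[vowel]),
    )


for symbol, point in ENGLISH_L2.items():
    print(symbol, "->", l1_filter(point, SPANISH_L1))
```

With these assumed coordinates, both English vowels are mapped onto the single Spanish vowel `i`, which mirrors the loss of a contrast that the learner fails to perceive. The point of the sketch is merely that an unguided filter can only return items already in the L1 inventory.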


11.2.2. Transition from Hearing to Listening

As was pointed out earlier on in Chapter 2, hearing is a sense while listening is a skill. Simply, learners have to be encouraged to try to listen attentively to sounds and remember their acoustic images to facilitate their correct reproduction. The learner can conduct internal comparing and contrasting of the target sound or sounds with what is already part of his psycholinguistic inventory. With time and practice, the learner can hone his listening skill further to make it more discriminative in the accurate perception of the targeted L2 sounds. The best example to cite for this type of advice is my long struggle with the identification of the sound [ɥ], a voiced labial-palatal approximant in the phonology of Neo-Aramaic (Modern Assyrian). When I was preparing my doctoral thesis, I felt that the phoneme /w/ had a variant, but I did not realize that it was a [ɥ] because I had no conscious exposure to it. Simply hearing the sound and listening to it were not enough; it was only further discriminative listening and investigation that enabled me to pin down that variant as [ɥ].

11.2.3. Learn Something about Speech Production

An instructor or learner need not become a speech specialist or phonetician; however, it is tremendously helpful to acquaint oneself with the basics of sound production. For example, it is helpful to know that the overwhelming majority of speech sounds are produced either with vocal fold vibration or without it. Also, speech sounds are identified with their places of articulation, such as bilabial sounds (with two lips), labio-dentals (lower lip and upper incisors), inter-dentals (tongue tip at the biting edge of the incisors), etc. The example of [ɥ] above is also relevant in this regard. The sound was related to [w], a labial-velar approximant, but [ɥ] ended up being a labial-palatal approximant, i.e. a labial sound but somewhat in front of the velar sounds. Additionally, some acquaintance with the way in which sounds are generated along the vocal tract is of significant importance. Once again I cite my experience with the phonetic phenomenon of retroflexion. As a beginner in phonetic studies, not only had I not heard the term, but I also did not know how to consciously do a retroflex /r/. Ironically enough, I later discovered that a retroflex /r/ exists in a few Neo-Aramaic (Assyrian) dialects such as Tiari and Alqosh. Once I mastered the production of a retroflex /r/, it was an easy transition to the mastery of other retroflex sounds. This last statement is of extreme significance in phonetic training and education because excellence in phonetic education is not built up sound by sound, but rather by bundles of sounds, more accurately termed ‘natural phonetic categories’. Once I learned how to tilt the tip of my tongue for the retroflex /r/, I gradually moved into the production of the retroflex sounds /l, n, t, d/, etc.

11.2.4. Mechanical Repetition Hardly Works with Adult L2 Learning

The classical and old-fashioned style of teaching pronunciation in the tradition of the Audiolingual Method³, known as ‘repeat-after-me’, may be beneficial with children and adolescents, but it is often practically ineffective, in fact pedagogically damaging, with adults because adult learners of L2 usually and subconsciously repeat after themselves. Stated differently, due to L1 phonology bias, they repeat the sound that they think exists in the L1 inventory, not what they actually hear demonstrated by the instructor. Such a practice is far more damaging when the class as a whole is asked to repeat after the instructor (i.e., the so-called chorus practice). In a chorus practice, the bad practitioners are usually hidden among the good practitioners because there is more noise in the background to mask the targeted sound. Under the repeat-after-me procedure, with its immediate demand by the instructor to ‘produce’ after him, the learner bypasses two important stages, namely, perception and recognition of the unfamiliar L2 sound. Thus, mispronunciation is not the fault of the learners, because the avoidance of those two stages implies that the learner will not be well primed for the targeted performance.

³ Popular in the 1950s and 1960s.

11.2.5. Follow the ‘Perceive, Recognize and Produce’ Procedure

In the previous sub-section, the instructor was cautioned against the ‘repeat-after-me’ procedure; instead, it is suggested that he should abide by the three-stage procedure of ‘perceive, recognize and produce’ to give the learner a better chance of securing the targeted performance. In reality, any teaching of pronunciation should thoroughly follow the natural three-stage procedure of sound acquisition: perception, recognition and production, in the sequence indicated. The above triangular procedure is highly consistent with the three-stage procedure of registration, retention and retrieval in learning and with the three types of memory (sensory, short-term and long-term) in information storing. In each case, the earlier stage serves as the gateway to the next and final stage. The transition to the final stage cannot be completed without continued rehearsal. Because the above triplets will be repeatedly used, a brief clarification of the terminology is invaluable. Perception is used to denote the condition of feeling and sensing the presence of a given sound; recognition includes the condition of perception as well as the condition of being able to distinguish the given sound from others and, perhaps, identify the difference(s) in comparative/contrastive situations. According to Parasuraman and Beatty and in terms of cognitive processing, “the distinction between perception and recognition appears to be the matching of the external sensory pattern with some internal sensory engram⁴ and the bringing of this to awareness” (cited in Kissin, 1986). As a further enhancement of the above quotation, Kissin states, “The definition of recognition as the process of matching external perceptions against existing internal correlates, implies a second level of activity” (Ibid).
As for production, it satisfies the above two conditions of perception and recognition in addition to the ability to retrieve the sound and reproduce it at will with an acceptable degree of proficiency and accuracy.

⁴ An assumed unit of sound imprinting in the memory.


The sequential triplet of learning is: registration, retention and retrieval. In standard literature on learning, registration refers to the perception, encoding and neural representation of stimuli at the time of an original experience; retention is the neurological representation of an experience to be stored for later use; and retrieval is the access to previously registered and retained information (Arnold, 1984; Levitt, 1981). As for information storing in the brain, there are three different kinds of storing systems. The sensory memory is the initial level of information storing; information stored here is extremely limited in volume and is retained for only a few seconds. Sensory memory is a sort of photographic memory (Loftus, 1980). Short-term memory is not as limited as sensory memory; it can store about seven items, plus or minus two, for no more than half a minute or so. Although short-term memory may be transient and limited in capacity, it may be very useful in ear-training (auditory orientation) sessions where the temporary retention may allow the learner to better perceive the sound; it may also play a crucial role in conscious thought. In plain wording, the half-minute or so allows the learner to think about the sound and its production. Long-term memory is the storing system where information is retained for a longer time and even permanently. In terms of cognitive knowledge, the process of learning is essentially one of transferring information from the environment into the long-term memory. Long-term memory is, more or less, a permanent repository of general knowledge about the world and past memories (Bourne et al., 1986).
The explanation above suffices to portray that in order to perceive a sound, one has to be exposed to it at least in passing through the sensory memory; to have it registered, at least temporarily, it should be stored in the short-term memory; however, in order to retrieve and produce a sound accurately at will, it has to be retained and consolidated in the long-term memory through rehearsal. Sequencing of stages is significant and bypassing a stage may negatively impact the outcomes. For instance, with insufficient and improper exposure to unfamiliar sounds, it is highly unlikely that one will succeed in producing them. A serious flaw in the traditional approach to the teaching of pronunciation is attributed to either insufficient dwelling on the perception and recognition stages or their total neglect.
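The stage sequence just described can be sketched as a small state model. The Python sketch below is a deliberately simplified illustration, not a cognitive model: the class `SoundMemory`, its method names and the threshold `REHEARSALS_NEEDED` are hypothetical, the capacity of seven items merely echoes the ‘about seven items’ mentioned above, and the time limits of sensory and short-term memory are not modeled at all.

```python
# Toy model of the stage sequence described above. The class, its method
# names and the rehearsal threshold are hypothetical, for illustration only.
SHORT_TERM_CAPACITY = 7   # "about seven items" (plus or minus two)
REHEARSALS_NEEDED = 3     # assumed number of rehearsals for consolidation


class SoundMemory:
    def __init__(self):
        self.sensory = set()     # sounds the learner has been exposed to
        self.short_term = []     # temporarily registered sounds
        self.rehearsals = {}     # rehearsal count per sound
        self.long_term = set()   # consolidated sounds

    def perceive(self, sound):
        """Exposure: the sound passes through sensory memory."""
        self.sensory.add(sound)

    def recognize(self, sound):
        """Register a perceived sound in short-term memory."""
        if sound not in self.sensory:
            return False         # recognition cannot bypass perception
        if sound not in self.short_term:
            self.short_term.append(sound)
            # The oldest item is displaced once capacity is exceeded.
            if len(self.short_term) > SHORT_TERM_CAPACITY:
                self.short_term.pop(0)
        return True

    def rehearse(self, sound):
        """Rehearsal moves a registered sound toward long-term memory."""
        if sound not in self.short_term:
            return
        self.rehearsals[sound] = self.rehearsals.get(sound, 0) + 1
        if self.rehearsals[sound] >= REHEARSALS_NEEDED:
            self.long_term.add(sound)

    def can_produce(self, sound):
        """Production at will requires long-term consolidation."""
        return sound in self.long_term


learner = SoundMemory()
print(learner.recognize("new sound"))     # False: perception was bypassed
learner.perceive("new sound")
learner.recognize("new sound")
print(learner.can_produce("new sound"))   # False: not yet rehearsed
for _ in range(REHEARSALS_NEEDED):
    learner.rehearse("new sound")
print(learner.can_produce("new sound"))   # True: consolidated by rehearsal
```

The design choice worth noting is that each method refuses to act unless the preceding stage has been completed, which is the sketch’s way of expressing that bypassing a stage blocks the outcome.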


Insufficient dwelling on the perception and recognition stages, or their total neglect, leads to an immediate jump to the production stage, a condition that is typically embodied in the ‘repeat-after-me’ technique of teaching pronunciation, which may be quite incompatible with the learning styles of adults. The suggested triangular procedure is cognitive in nature. It assumes that the learner may not cognitively perceive the alien sound because it is not part of his sound inventory, and hence may fail to recognize it in preparation for the correct rendition. With the failure to perceive the incoming sound, he is liable to substitute one of his L1 sounds for it or, at best, produce a sound that is not the intended one; in both cases, he misses the targeted sound. Only children and a few gifted people or highly trained phoneticians can perceive, recognize and produce an alien sound at the first attempt or after a few attempts. I remember that in 1979 I had prepared for the presentation of a paper at the 9th International Congress of Phonetic Sciences in Copenhagen, titled ‘A Voiceless Unaspirated Emphatic Alveolar Affricate’.⁵ This is a very rare sound, if not unique to one of the dialects of Neo-Aramaic (Modern Assyrian). A few months prior to the presentation, I met the late Peter MacCarthy, an accomplished phonetician and Chair of the Department of Phonetics, Leeds University, who asked me whether I was presenting at the Copenhagen Congress. I said: “Yes.” He said: “What is the topic?” I said: “I guess a new sound.” He asked for the description of the sound. I gave him a description of it in terms of phonetic labels as ‘a voiceless unaspirated emphatic alveolar affricate’. With a few attempts, he was able to produce a quite satisfactory sample of the sound. This is what accomplished phoneticians are able to do; however, ordinary adult learners of L2 often encounter difficulties with unfamiliar sounds.
According to this three-step teaching strategy, the first two steps help the learner to become cognitively familiar with the targeted alien sound. In other words, the brain recognizes the identity of the sound as new and different from what it has in its L1 sound inventory (phonology). Once the brain assigns the alien sound a different identity, it means the brain is ready to store it as a cognitive entity. Stated differently, the brain ‘neuronizes’ the sound or registers it with the neurons. If the learner expresses further interest in dealing with the sound, the next step for the brain is to begin sending the appropriate articulatory instructions to execute the sound. A few trial-and-error attempts may be needed prior to successfully mastering the production of the sound. It is quite likely for the learner to forget the proper production of the sound the next hour or the next day simply because the specifications are still in the short-term memory en route to full registration in the long-term memory. Further practice should, however, create an autonomous slot for the new sound.

5 The paper was published in Journal of the International Phonetic Association, Vol. 9, 1979.

11.2.6. Instructor’s Academic and Professional Qualifications

Are you, as instructor, academically and professionally qualified to teach pronunciation, especially when it involves a second language and adult learners? Have you had enough linguistic and educational orientation to tackle a technical subject that requires ample professional preparation? Do not be surprised at these questions that I am raising. Once upon a time, prior to my professional orientation in phonetic science at graduate level, I thought I was well prepared for teaching pronunciation when in reality, as I discovered later, I was not. I have also known instructors who taught topics in pronunciation that were described and demonstrated wrongly. The best example I recall was in Baghdad, when one of the instructors, holding an M.A. in methodology from Britain, was targeting the teaching of the Standard Arabic voiced emphatic alveolar plosive /ض/ while pronouncing the voiced emphatic interdental fricative /ظ/ instead. I knew that he was confused because in the Iraqi variety of Standard Arabic, the two sounds are pronounced identically as /ظ/. Nevertheless, an experienced phonetician should know the difference between the two sounds. The worst large-scale example of the wrong approach to teaching pronunciation, especially of English, is demonstrated by thousands of teachers of English in the United States who still believe in the so-called phonics approach to teaching pronunciation. In many instances, phonics fails all the basic principles of modern linguistics. Foremost of all, phonics fails to distinguish between letters and sounds or between what linguistics identifies as graphemes and phonemes. One of the worst examples of matching graphemes with phonemes in phonics is in the phoneme identity of the letter (grapheme) in the words vs. (Fox and Hull, 2002) where the former is cited as the short counterpart of the latter. The two s have hardly any qualitative base for comparison because the former grapheme is the vowel proper [] whereas in it stands for the approximant [] + []. Besides, phonics cites the vowels in words such as as a short vowel vs. its long counterpart in . Phonetically, the matching lacks a basis in both quality and quantity. The vowel in is [], a simple vowel, while in is a diphthong [ei].

11.2.7. Plan Instructional Connection with Learners

As an educator, the instructor has to ascertain that he has established a connection with the learners and, in turn, that the learners are equally in connection with him. This is one of the most important pedagogical factors to consider. Learning is at its best when learners and their instructors are on the same page. The instructor should be careful not to ‘teach to himself’, so to say, and learners should pay attention when the instructor emphasizes a certain point and highlights a certain explanation or demonstration. Every now and then the instructor should double-check that learners are connected with him. Oftentimes, instructors ask their students the classical question: “Do you understand?” Some of the students confidently reply “Yes”, while several others remain silent. It is usually among the silent students that those who did not understand hide, some of whom are too shy to ask for further explanation.

11.2.8. Explain, Demonstrate and Demonstrate Multisensorily

The instructor has to use the simplest possible language that is not loaded with technical jargon. In many instances, the jargon distracts the attention of the learner and doubles the complexity of a certain problem. If, for example, you want to give details about the phonetic nature of the sounds /p, t, k/ in English and state that they are aspirated, your statement involves a technical term; instead, it is better to replace ‘aspirated’ with ‘a puff of air’ that follows the articulation of /p, t, k/. To make the explanation even easier to grasp, the instructor has to use as many demonstrations as possible, including visual, auditory, tactile-kinesthetic, etc. These will be further elaborated on in the next chapter. Such multiple demonstrations render the technical jargon and the complicated language more comprehensible.

11.2.9. Deal with Pronunciation in a Holistic Fashion

Pronunciation is an integral part of overall human communication (Morley, 1991). There should be more consideration for integration than isolation among the components of speech and the dynamics of pronunciation. In other words, pronunciation is seen not only as part of the system for expressing referential meaning, but also as an important part of the interactional dynamics of the communication process (Pennington and Richards, 1986). This implies that the teaching of pronunciation should be considered in the context of a much broader base of human communication. At a lower level of integration, human speech consists of a combination of segmental and suprasegmental (prosodic) features. Both of them should be handled inseparably from the overall articulatory, visual, auditory and tactile/kinesthetic features accompanying speech production. The latter sets of features form the basis of what is variously labeled as ‘articulatory settings’ (Honikman, 1964) or ‘phonetic settings’ (Laver, 1980; 1994), among others. No cross-language teaching of pronunciation will be authentic and dynamic in nature, or a reflection of the native speaker’s proficiency, without serious consideration of the articulatory settings of the targeted language. For instance, a language like Arabic, with a limited vowel system and a heavy dependence on guttural sounds and emphatics, has such specific articulatory settings that, without their incorporation in the overall approach to the learning of Arabic by non-native speakers and the learning of foreign languages by Arabs, the results will be highly unsatisfactory.

11.2.10. Consider both Top-Down and Bottom-Up Perspectives

Pronunciation is a dynamic cognitive and physical process. It should be taught in a dynamic way with a bottom-up approach (i.e. from smaller to larger units: distinctive features to segments to the prosodics of the syllable, word, sentence and discourse) in conjunction with a top-down approach that reverses the order of processing from discourse down to distinctive features. Stated differently, teaching pronunciation is like two-way traffic in which both directions of movement are needed in order to complete the cycle of communication. Traditionally, pronunciation has been taught in a bottom-up approach with emphasis on vowels and consonants, often lacking proper contextualization and embedding in longer meaningful stretches of speech. Recently, there has been a twofold emphasis in teaching pronunciation: firstly, an intra-segmental emphasis with attention on distinctive features (i.e. a more microscopic perspective); secondly, an inter-segmental emphasis with attention on prosodic features and the overall features of the articulatory setting (i.e. a more macroscopic perspective) (Pennington and Richards, 1986).

11.2.11. Do not Confuse Memorization with Retention

In teaching pronunciation, a distinction should be drawn between memorization and retention. Although meaningful memorization is more effective than rote memorization, memorization per se is only one of the many cognitive processes that help with knowledge acquisition, retention and retrieval. Retention of knowledge and perfection of skills can be achieved by means other than sheer memorization. Association, categorization, analysis, synthesis, etc., can be highly effective means of knowledge retention. To cite an example of retention through association, let us take the subskill of spelling. On several occasions, I have taken my college-bound students by surprise and asked them to spell (graphic/written spelling, not oral) a word such as ‘accommodation’. Frequently, the misspelling rate has been over 70% and occasionally as high as 90%, not because ‘accommodation’ is a very difficult word to spell, but because our approach to teaching spelling has often been, and still is, based on rote memorization with minimum cognitive processing.

There is certainly ample room for a cognitive approach to the teaching of all language skills and subskills. In this instance, the teaching of the spelling of ‘accommodation’ should be approached from a different perspective. In the first place, the word is not difficult to spell in its entirety. The misspelling is almost always associated with one or two segments only, not the word as a whole. With ‘accommodation’, the focus should be on the ‘doubled letters’, i.e. ‘cc’ and ‘mm’. Thus, if the nature of the association is clarified and cognitively highlighted, there will be no misspelling. My students have always had instant perfect success with the spelling of this word when they were instructed to think of the need for ‘two doubles’ or to associate the spelling with the phrase ‘accommodation for two’. Teaching pronunciation to adults through memorization in the form of mechanical repetition becomes a highly challenging task because sound features and segments are meaningless in themselves. They are simply acoustic signals that impact the ear. Consequently, their retention is difficult if memorization is the sole channel of input. Definitely, more channels of input are needed for better and more permanent retention. This is why the proposed approach calls for the joint involvement of as many sensory and cognitive channels of input as possible. Let us cite the following teaching scenario. Suppose the teacher intends to teach Hispanic learners of English the difference between the English and Spanish plosives /p, t, k/. They are aspirated in English [pʰ, tʰ, kʰ], but unaspirated in Spanish [p, t, k]. If you are the instructor, after a very brief explanation, take a flimsy piece of paper and place it just under your nose, pressing on it with your left index finger. Pronounce a typical aspirated sound e.g., [] and do the same with an unaspirated sound [].
Direct the attention of the learners to the fluttering movement of the paper with the aspirated articulation and the absence of the movement with the unaspirated articulation. This fluttering movement is the direct result of a tangible puff of air that accompanies the production of aspiration; the puff is absent with nonaspiration. The puff of air is the ideal demonstration to reinforce the cognitive retention of the difference between the two types of plosives. Obviously, the auditory difference between the two sounds is additional reinforcement of the phonetic difference.

11.2.12. Deal with Pronunciation as a Generative Skill

The generative nature of this approach implies that mastering the perception, recognition and production of one sound should facilitate the mastery of more than that one sound. In other words, developing a skill in one aspect/domain of pronunciation should serve as a key to enhance or generate the skill to master other aspects/domains of pronunciation. For instance, in English, mastering the production of the schwa [ə] vowel not only helps with the mastery of the complicated vowel system, but also considerably facilitates the process of vowel reduction and the overall rhythmic performance. Also, learning how to kinesthetically and proprioceptively sense the upper incisors contacting the lower lip for a [v] or [f], or a tongue-tip contact at the alveolar ridge for the articulation of a [d] or [t], should develop the skill of sensing any other contact of the tongue in the oral cavity. Even in the dynamics of sound production, mastering accentuation in a given word should carry over to other words and to the overall mastery of rhythm in the targeted language or any other language for that matter.

11.3. CONCLUDING REMARKS

The above principles will be elaborated on through further explanation or application in the course of developing the implementational techniques; however, it is worth noting that the most important theoretical aspect of this approach is that the brain is the seat of all language skills prior to their becoming physical manifestations in the form of speech. This approach is described as multicognitive and multisensory because its implementational techniques are directly designed on the basis of a wide variety of cognitive and sensory processes. Based on the approach presented here, teaching pronunciation becomes more of a multi-faceted educational process than a mere repeat-after-me mechanical reproduction of speech sounds. Such an approach requires more effort on the part of the instructor and learner and a stronger collaboration between them through the diversification of teaching and learning styles, respectively. It is certainly a time-consuming effort, but the time spent is worth the effort. This approach is no longer a single technique or drill that tackles one sound at a time; instead, it is a joint selection of cognitive and sensory techniques that are applied concurrently to facilitate the L2 mastery in a creative and generative manner similar to the process of child language acquisition.

CHAPTER 12: PRINCIPLES OF MULTISENSORY APPROACH TO TEACHING PRONUNCIATION

12.1. INTRODUCTORY REMARKS

The previous chapter emphasized the significance of the cognitive foundation of pronunciation and the multicognitive principles of teaching and learning it. In this chapter, the emphasis shifts to the sensory support of the cognitive foundation of pronunciation via the multisensory channels. In other words, it is the sensory modalities that jointly reinforce the foundation for the cognitive modalities to kick in. Once internalized, speech is used as a tool for social and physical existence and survival. The relationship of the brain and the five senses is one of mutual dependency and co-existence; one cannot function and survive without the other. The power and creativity of the brain are nullified without the feeding of information for processing through the five senses: visual, auditory, olfactory, tactile, and gustatory. To state it more dramatically, the senses are the five windows of the brain to the outside world; if they are closed, the brain will wither in darkness. In turn, the senses will be redundant organs if the brain does not process the information they feed and transmit it back for execution. Inasmuch as the acquisition of L1 by a child is concerned, it is a natural process for which the brain of the child is tuned. The acquisition progresses very smoothly, effortlessly and subconsciously as long as the child is immersed in the language. The brilliant success of the process of language acquisition is premised on two main pillars: first, total immersion in the language and exposure to a variety of multisensory stimuli; second, transmission of such stimuli to the brain for processing and decision making and then dispatching the decisions back for implementation.

12.2. MULTISENSORY PRINCIPLES FOR TEACHING PRONUNCIATION

When the human brain is in the process of decision-making, it does not just depend on one sensory source; rather, it manipulates all the senses to gather as much information as possible. The analogy of ‘all roads lead to Rome’ applies here because all the senses meet in the brain; besides, receiving data from different sensory channels provides the brain with a far more accurate, diversified and comprehensive assessment of the problem to be solved. In solving a linguistic problem, the brain takes in the needed information, sifts through it and then processes it to arrive at a gestalt solution. According to the approach suggested here, of all the five senses only three are indispensable, namely, the auditory, visual and tactile senses. They are the focus of the next sections.

12.2.1. Auditory Modality

Traditionally, the auditory modality has received priority in teaching pronunciation. No doubt, this is understandable because it is the primary sense of acoustic intake. Nevertheless, in real-life situations, especially in child language acquisition, speech is not the exclusive function of the auditory sense; rather, it is the collective function of the input from the auditory, visual and tactile senses. It should, however, be pointed out briefly that the tactile sense covers all the kinesthetic and proprioceptive sensations that are transmitted to the brain via muscular innervations. In adult L2 teaching classes, the instructor should not take for granted that the auditory modality will do the job of teaching sounds that are alien to L1. This was discussed earlier on in that the phonology of L1 will mask many of the L2 sounds and hence filter them out. In order to ascertain that the L2 targeted sound is really perceived and recognized by the learners, the instructor should lead the learners step by step towards the targeted sounds using different exercises. At times, the auditory modality alone fails to do the job; therefore, it should be supplemented by other modalities. Let us take the case of /b/ vs. /v/ for Hispanic learners of English. The substitution of a /b/ for /v/ is one of the most salient accent indicators among Hispanics. It has been the experience that when handling this phonological confusion with dependence on only the auditory modality, the results have been dismal. For much better results, the auditory modality has to be supplemented with other sensory modalities, especially the visual one. The following step-by-step procedure, which abides by the ‘perceive, recognize, produce’ strategy, is recommended to overcome the difficulty.
1) Model the two sounds as many times as you deem necessary, asking learners to pay attention. Without attention to the targeted sound, as they say in some cultures, ‘it enters this ear and goes out of the other one’.
2) Ask if any learners are willing to demonstrate them for the class. This will serve two purposes. First, discover the learners who are more gifted for sound perception, recognition and production and use them as models. Second, peer demonstrations may encourage other learners to pitch in.
3) Prepare a couple of exercises to encourage learners to distinguish and recognize the two sounds in random demonstrations. Make sure that the number of learners who manage to perceive the two sounds and distinguish them is increasing gradually. If the instructor feels that the number of those who failed the initial test of perception and recognition is sizeable, he should repeat the experience of perception and recognition with the help of the visual and tactile modalities.
4) Model the pair a few times, asking learners to watch your facial gestures during the demonstration of /b/ and then /v/. Typically, emphasize the two lips closing together tightly for /b/, and the upper incisors touching the lower lip for /v/. Ask the class to perform the articulatory gesture for both /b/ and /v/ separately and repeat the gesture several times for each one.
5) Ask for volunteers to come before the class and demonstrate the articulation of the bilabial (two lips) for /b/ and the labial-dental (lower lip and upper incisors) for /v/. Then ask all learners to perform the articulatory posture while watching them.
6) Place the learners in pairs facing each other and taking turns in performing a /b/ articulation (bring the two lips together) followed by a /v/ articulation (bring the upper incisors in touch with the lower lip). Move around the class to observe the performance.

All the above demonstrations and exercises will collectively send auditory, visual and tactile-kinesthetic messages to the brain for consideration and registration in the memory. At the end of the above multisensory input, the brain will be more prepared to cognitively recognize the two sounds and produce them successfully. Obviously, the first stage of cognitive retention will be in the short-term memory; hence, it is not uncommon for some learners to lose the cognitive impression of the two sounds. This means that some of the exercises have to be repeated in the next sessions until the brain transfers the /b/ and /v/ articulatory impressions from the short-term memory to the long-term memory en route to the subconscious. Once the /v/ sound is perceived, recognized and produced, the brain begins to make all the preparations to register the sound in a slot that is cognitively separate from /b/. This is how the /v/ phoneme becomes part of their enriched phonology.

12.2.2. Visual Modality

The common impression among people is that speech sounds are heard. They rarely envisage ‘seeing’ the sounds; however, there are several speech sounds, including consonants and vowels as well as other sound phenomena, such as stress and pitch, whose articulatory organs or generating maneuvers lend themselves to vision. As highlighted earlier on, pronunciation is not exclusively an audio-lingual activity because it depends on a much broader base of sensory and physical activities. Hence, serious consideration should be given to non-verbal gestures, including facial and body gestures, that are intertwined with the overall dynamics of speech production. In other words, in teaching pronunciation it is not enough to hear the sounds; it is equally important to see and feel them. In light of such a broad definition of the sensory features involved in human speech production, a certain category of sounds has been identified as visible, including consonantal sounds such as the bilabials, labiodentals and interdentals. Sounds produced inside the oral cavity beginning with the alveolar ridge will decrease in visibility with further movement backwards in the direction of the larynx. Features of sound production such as lip configurations (lip spreading, protrusion and rounding) and jaw depression and elevation are also fairly visible features. It is also possible to visually detect some facial and bodily gestures indicating some features of tense and lax sounds, especially with vowels. It is worthwhile reiterating that the visual modality does not work separately; in many instances, the visual sensations are accompanied by tactile/kinesthetic/proprioceptive sensations. Let us cite some examples to demonstrate the assistance that the visual modality affords in teaching certain vowel and consonant sounds. One way to distinguish the German vowel [u] as in (to hurry) as opposed to [y] as in (to guard) is by the degree of lip-rounding and lip-protrusion; they are more visible with the latter. Another good example for demonstrating the effectiveness of the visual modality to distinguish between sounds and master their pronunciation is through the teaching of aspirated vs. unaspirated sounds. Take a flimsy paper and place it just under your nose, pressing on it with your index finger so that it covers your lips. Pronounce a typical aspirated sound e.g., [] and do the same with an unaspirated sound []. Direct the attention of the learners to the fluttering movement of the paper with the aspirated articulation and the absence of the movement with the unaspirated articulation. This fluttering movement is the direct result of a tangible puff of air that accompanies the production of aspiration; the puff is absent with nonaspiration. One can also see the difference by placing some talcum powder on the palm of the hand in front of the mouth while pronouncing the aspirated and the unaspirated sounds consecutively. There will be a spread of powder with the aspirated consonant and none with the unaspirated. One can equally distinctly see the flame of a burning candle flutter with aspiration and remain steady without it. Even kinesthetically, one can feel the aspiration if the hand or fingers are placed in front of the lips while producing aspirated sounds such as [pʰ, tʰ, kʰ].

12.2.3. Tactile, Kinesthetic, Proprioceptive Modalities

Although the word ‘touch’ is a common word and the ‘sense of touch’ is an equally common label, ‘tactile sense’ is a somewhat more technical term. For the purpose of teaching a refined level of pronunciation, especially for adults embarking on an L2, the more technical terms that arise from different types of touching are ‘kinesthetic’ and ‘proprioceptive’. Obviously, the purpose is not to teach those adults such technical jargon, but rather to enable them to feel the sensations that the two terms invoke. Kinesthesia is the sense that detects bodily position, weight, or movement of the muscles, tendons, and joints, whereas proprioception is the ability to sense the position, location, orientation and movement of the body and its parts. Let us consider the detection of the difference in kinesthetic sensation between the articulations of a voiceless aspirated velar plosive [k] vs. a voiceless unaspirated uvular plosive [q]. Place your index and middle finger jointly on the thyroid cartilage (the projection in front of your neck known as Adam’s apple) and do the [k] sound a couple of times followed by [q]. Certainly, you will feel (as well as see) greater upward movement of the thyroid cartilage with [q] than with [k] simply because [q] is a uvular sound that requires the whole laryngeal structure to rise and execute a complete contact with the uvula. The movements of different parts of the speech production mechanism and the vibrations that may accompany them can all be picked up by proprioception. The vibrations created by the vocal folds travel along the bones, cartilages, and muscles of the neck, head, and upper chest, causing them to vibrate (McKinney, 2005; Daniloff, 1973).
A brilliant historical example, from thirteen centuries ago, of telling the voiceless consonants from the voiced ones, even prior to any awareness of the existence and role of the vocal folds in this regard, is Sībawayhi’s1 choice of the attributes of voiceless (mahmūsa) and voiced (majhūra) simply based on his impressionistic sensations detected by proprioceptive feedback channels (Odisho, 1988; 2010).

1 Renowned grammarian of the Arabic language during the 8th century A.D.

12.3. DEVELOPING TEACHING AND LEARNING STRATEGIES

After a lengthy discussion of the modalities of teaching and learning, the natural progression is into the domain of teaching and learning strategies to implement the modalities. Two points are important to highlight in this regard. First, as long as the general approach to teaching pronunciation is premised on multiple cognitive and sensory modalities, it is quite natural for the teaching and learning strategies to follow the same pattern: multiple teaching strategies and multiple learning strategies. Second, there should be some degree of matching or reciprocity between instructors’ teaching strategies and students’ learning strategies. Stated differently, diverse teaching strategies should simultaneously promote diverse learning strategies that serve the same goals. The ‘multiple intelligence theory’ with its, thus far, nine (9) intelligences affords both instructors and learners a different perspective for teaching and learning. For the instructors to know their students by their names is a very commendable skill for class management, but knowing their strengths and weaknesses and their learning styles is even more commendable for their academic success. The knowledge of this personal aspect of learners’ strengths and weaknesses allows the instructor to individualize his instruction when necessary, which will be beneficial to both sides. At least the instructor will know what learning style appeals to a particular intelligence that a given learner demonstrates.

12.3.1. Developing Teaching Strategies

A successful instructor, regardless of the subject he is teaching, is the one who prepares the entire class to listen and think: to listen when there is a need for listening and to think critically when attempting to solve a problem. Some learners, by nature and home and/or family culture, are good listeners and thinkers, whereas some others require some guidance and orientation. Part of the educational responsibility of the instructor is to improve the thinking and listening habits of all. The thinking habits should focus on the assessment of the complexity of the problem, identification of the most relevant sensory/cognitive modalities and the assignment of roles in implementation between himself and the learners. As for teaching discriminative listening, the instructor should thoroughly demonstrate the effectiveness of moving the learners from hearing to listening and finally to discriminative listening.

12.3.1.1. Assess the Degree of the Complexity of the Posed Problem

It is true that any non-native sound or sound phenomenon may pose a certain degree of difficulty for the learners; it is, however, equally true that some sounds or sound phenomena are relatively more difficult to master than others. In general, teaching consonants may be more readily manageable than teaching vowels simply because most consonants tend to have a more well-defined articulatory posture than vowels. However, even among the consonants, the teaching of some seems to be more straightforward than that of others. In fact, the so-called visible consonants (e.g., bilabials, labiodentals and interdentals) are the easiest to teach of all sounds.

12.3.1.2. Identify the Most Convenient Sensory/Cognitive Modality

After deciding on the pronunciation problem to be tackled, one has to identify the most convenient sensory/cognitive modalities to be used. To demonstrate this, let us remind ourselves of the teaching of [v] for Hispanics, which constitutes one of the most characteristic examples of their pronunciation difficulties. In spite of this fact, teaching [v] vs. [b], in my case as an instructor, has been the easiest task because of their visible articulatory postures and their readily detectable kinesthetic and proprioceptive sensations. Cognitively, in teaching the visible sounds, and this pair in particular, the instructions are more likely to be comprehended and implemented because of the multiple sensory feedbacks that the brain receives.

12.3.1.3. Assign Roles for Instructor and Learner

Once the problem is posed and the sensory and cognitive modalities and strategies are selected, the instructor has to strategize his role and that of the learners. Foremost of the things to be decided is that the instructions given should be as easy to comprehend and implement as possible. For example, if the instructor aims at teaching Hispanic learners of English that a [z] has vocal fold vibration whereas [s] does not, he should give learners the instructions to detect the difference as demonstrated in section 6.1.7, above.

12.3.1.4. Highlight Discriminative Listening

The progression toward discriminative listening should be premised on teaching listening, which is a skill, as opposed to simply hearing, which is a sense. Let us consider the following example for teaching discriminative listening. In English, the plosives /p, t, k/ are usually aspirated, particularly in initial position. Thus, for the native speaker of English, the aspirated production of such plosives is cognitively the model. This implies that the unaspirated production of such plosives is cognitively unrecognized, hence difficult to perceive, recognize and produce. In teaching unaspirated plosives to native speakers of English learning Spanish, for example, it is recommended not to give them Spanish words such as <pero> (but) or <perro> (dog) directly; rather, stay with native English words in which /p, t, k/ follow an /s/ sound such as vs. . The phonetic rule of pronunciation in English is that when /p, t, k/ follow the /s/ sound in a consonant cluster, the three plosives are deaspirated (they lose their aspirated feature and become typically unaspirated sounds). Therefore, the instructor can use the pair and to familiarize learners with the two sounds in perception and recognition following the steps below:
a) Demonstrate the pair vs. several times and highlight the difference to be picked up auditorily. It is quite acceptable to exaggerate the difference somewhat for the sake of clarity.
b) Ask for volunteers from among the learners to do what you did.
c) If you notice that learners are having difficulty or even doubt in telling the difference, move to the next step.
d) Move to the visual sensory modality. Place a flimsy piece of paper in front of your mouth and repeat the demonstration. The paper should flutter visibly with , but will hardly flutter in the case of .
e) Ask for volunteers to do what you did.
f) Arrange the class in pairs whose members face each other while conducting the flimsy paper experiment.

After such joint demonstrations, some of the learners should be able to perceive and recognize the difference because of the combined auditory and visual modalities of instruction. It is after this phase of orientation that the instructor can begin the production phase with phonetic materials from the targeted language such as [p:r] (odd numbers) vs. [p:r] (lambs) in the Assyrian language. One can take the exercises one step further by involving the other two plosive pairs, [tʰ] vs. [t] and [kʰ] vs. [k]. In summary, the above orientation began with listening to the targeted phonetic phenomenon. The auditory modality was supported by the visual modality to help in two ways: first, to elevate natural listening to discriminative listening; second, to use both listening and discriminative listening to activate the cognitive involvement of the brain in the process of internalizing the phonetic difference.

12.3.2. Developing Learning Strategies

Learning strategies are as important as teaching strategies. When the instructor dominates the classroom situation and reduces the learners to mere listeners with minimum interaction, this is an old-fashioned teaching strategy. There is always a wide variety of learning styles that should be considered and encouraged, some of which will be considered below.

12.3.2.1. Discover Learning Strategies

When it comes to promoting learning strategies, the instructor has to discover the learning styles of as many learners as possible. To achieve this, he has to be very alert and observant. The discovery procedure usually takes place when the instructor presents one targeted problem with diverse auditory, visual and tactile-kinesthetic modalities and styles and monitors the reaction of students to each modality of the presentation. Once the instructor senses that a given learner responds to a given sensory modality, he should reinforce that in future attempts.

12.3.2.2. Consider Cultural Diversity of Learners

Foremost among the requirements for teaching pronunciation is the audible oral exercise, which usually requires the presence of one or more persons. This face-to-face scenario can become intimidating, especially for learners whose background culture is conservative and sensitive to errors and failures in public and in the presence of other learners, especially strangers. In fact, this is even more relevant to female learners from those societies. The instructor should carefully consider the psychological and cultural differences between learners and encourage all of them to participate in class activities. This participation cannot be achieved without promoting mutual trust and respect for each other’s cultural traditions as well as tolerance of errors in early performance trials.

12.3.2.3. Encourage a Relaxed Attitude among Learners

Some aspects of teaching pronunciation can be intimidating as well as humorous. This is especially true with some very unfamiliar sounds. Having fun while experimenting with sounds should be part of the classroom culture of teaching pronunciation, provided the fun is controlled lest it overflow and interfere with class management. Learners should be told that failures and mistakes are part of the learning process. All learners should be encouraged to build up mutual trust among themselves. The more interaction grows among the learners, the more they discover of each other’s learning styles.

12.3.2.4. Move between Individual and Group Learning Styles

It was pointed out previously that all learners have their own personal strategies of learning. In a classroom environment, however, personal strategies should be complemented with group strategies. It is natural to have group strategies because teaching and learning take place collectively in a classroom setting. One way of promoting alternative strategies to one’s individual style is through group and cooperative learning. This requires forming groups or teams led by facilitators from amongst the learners. All of this should be done under the guidance and supervision of the instructor.

12.4. CONCLUDING REMARKS

In any successful classroom environment, there should be a balance between teaching and learning strategies because they are complementary in the mission of educating learners. Furthermore, a successful classroom environment should afford a three-way learning process: a) Learner from instructor; b) Learner from learner; and c) Instructor from learner. The first two are self-explanatory, whereas the third needs some further elaboration. With regard to the third, I am reflecting on my personal classroom experience, which has been my richest source of hands-on experience in instruction. There have been repeated instances when I corrected myself or improved my instruction because of direct or indirect feedback from learners. Let me cite the following example, which is not related to pronunciation but has a genuine linguistic connection. In one English language class, I was teaching spelling through a multisensory and multicognitive approach. I selected the word ‘grammar’ as an example of a commonly misspelled word, often written as ‘grammer’ or ‘gramar’, whereas its correct spelling is <grammar>. I suggested the following mnemonics to the learners:

Remember 7 letters = grammar;
Remember 1 single letter plus 3 pairs = g + 2r + 2a + 2m = grammar;
Remember mirror image: ram + mar;
Remember ‘ram’
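The mnemonics above are simple enough to be checked mechanically. What follows is only a minimal sketch in Python, offered as an illustration; the function name passes_mnemonics is a hypothetical label introduced here, and the mirror image is interpreted as the letters 'ram' followed by their reversal 'mar' after the initial 'g'.

```python
from collections import Counter


def passes_mnemonics(spelling: str) -> bool:
    """Return True if a spelling satisfies all the mnemonics for 'grammar'."""
    counts = Counter(spelling)
    # Mnemonic 1: the word has seven letters.
    has_seven_letters = len(spelling) == 7
    # Mnemonic 2: one single letter plus three pairs (g + 2r + 2a + 2m).
    one_single_three_pairs = sorted(counts.values()) == [1, 2, 2, 2]
    # Mnemonics 3 and 4: after the first letter, 'ram' meets its mirror image 'mar'.
    mirror_image = spelling[1:4] == "ram" and spelling[4:] == "ram"[::-1]
    return has_seven_letters and one_single_three_pairs and mirror_image


# The correct spelling and the two common misspellings cited above.
for candidate in ("grammar", "grammer", "gramar"):
    print(candidate, passes_mnemonics(candidate))
```

Only the correct spelling passes all the checks; 'grammer' fails the pair count and the mirror image, and 'gramar' fails the letter count.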

At this stage I had exhausted my sensory and cognitive mnemonic hints for retaining the correct spelling of the word <grammar>. All of a sudden, one of the learners, an artist, asked to go to the board to demonstrate an addition. He quickly sketched two rams facing each other as if preparing for a head-butt. His sketch was meant to reflect the mirror image of ‘ram’ and ‘mar’ in <grammar>. It was an excellent and creative idea coming from an artist. I learned from it and added the head-butting rams to my sketches.

CHAPTER 13: EXEMPLARY APPLICATIONS OF ACCENT REMEDIATION TECHNIQUES

13.1. INTRODUCTORY REMARKS

In this chapter, we will try to apply the approach detailed in the previous chapters with as many cognitive, sensory and educational teaching and learning strategies as are relevant to the sound or sound phenomenon selected for elaboration. To be sure, every sound or sound phenomenon can be a source of difficulty for some learners of a given L2. Consequently, there has to be a limited selection of cross-language problems that large numbers of L2 learners encounter. Due to the international popularity of English as an L2, the application of the strategies tends to lean strongly in that direction with a reversal of roles, i.e., with native speakers of English tackling other languages. In many instances, teaching a given sound, whether a vowel or a consonant, implies teaching a natural class of sounds. For example, teaching an aspirated/unaspirated consonant pair implies teaching an aspirated vs. unaspirated class of consonants, usually /pʰ tʰ kʰ/ vs. /p t k/; nevertheless, the two classes can involve other plosives such as palatals as well as affricates, etc., as is the case in Modern Assyrian (Neo-Aramaic). It is relevant to point out that some selected sounds, such as /θ, ð/, may be difficult for many L2 learners with a wide variety of linguistic backgrounds, whereas other sounds may be difficult for learners with more specific linguistic backgrounds, such as the distinction of /b/ vs. /v/ for Hispanic and Filipino learners of English.

13.2. TECHNIQUES FOR TEACHING SELECTED CONSONANTS

The consonants or consonantal phenomena selected for demonstration include the labiodentals, especially /v, ʋ/; the interdental pair /θ, ð/; retroflex sounds for learners of sub-continental Indian languages and the reversal of retroflexion for most sub-continental Indians learning L2s. It is important to bring to the attention of the instructor and learner that this section will contain sets of strategies that are procedurally, though not specifically, typical of teaching most sounds according to a multicognitive and multisensory approach. This entails that not all sounds lend themselves to the same sensory modalities and teaching techniques; however, all difficult sounds require some sort of cognitive orientation. Consequently, most of the details expounded in this section will not be reproduced for all sounds taught.

13.3. TECHNIQUES FOR TEACHING LABIAL-DENTAL SOUNDS

The voiced labialdental fricative [v] and the voiced labialdental approximant [ʋ] are more marked (less common) sounds than the voiced bilabial plosive [b] and the labialvelar approximant [w]. However, because of the generic labial nature of all those sounds as well as the absence or presence of some of them in given languages, any instructor of English as L2 will come across scores of learners who demonstrate serious difficulties in pronunciation amounting, at times, to phonological accent, while others will show at least a certain degree of phonetic accent. For instance, Hispanics typically replace [v] with [b], whereas Persians, Turks and Assyrians, among others, replace [v] with either a [w] or a [ʋ]. I have an Assyrian friend from Iran who pronounces <vote> and <Harvard> as ‘Wote’ and ‘Harward’. In some other Assyrian dialects the [w] may be replaced with a labialpalatal approximant [ɥ]. If one works with Hispanic students, he will readily notice that the mispronunciation of [v] is pervasive even among some individuals whose proficiency in English is otherwise excellent. Obviously, the primary reason is the absence of a /v/ sound in the phonology of Spanish. Pedagogically, the persistence of the problem even with well-educated Hispanics with high competency in English may be attributed to two factors: a) the mispronunciation did not receive much attention from instructors at an early stage of learning; b) the instructors did not follow effective techniques in teaching it. To put it more bluntly, the instructors did not have the know-how for effective remediation of learners’ mispronunciations. Most probably, if they did anything at all, they followed the mechanical procedure of ‘repeat-after-me’, which is often less effective with adults due to the psycholinguistic insensitivity or deafness explained in earlier chapters. In what follows, some strategies are put forth to develop an effective set of procedures to overcome the problem of /b/ vs. /v/. The strategies typically reflect different cognitive and sensory modalities.
a) Cognitive Orientation: Prepare the learners mentally (cognitively) to recognize the existence of the problem and its seriousness because it leads to serious phonetic and phonological accent. The preparation requires the following steps:
1) Instruct them to be ready to accept the problem and be willing to pay utmost attention.
2) Tell them they will certainly manage the pronunciation.
3) Tell them to watch your facial gestures, especially those of the mouth, and recognize the difference in the pronunciation of [b] vs. [v]. In fact, to dramatize the postural difference in the articulation of the two sounds, you may call the [v] posture a ‘dogface’ because when one assumes the posture, one looks like an angry dog ready to bite. In contrast, you may call the [b] posture a ‘tight-lip face’ since the lips have to come together tightly for the sound. The dramatization of the articulatory facial postures for the sounds oftentimes functions as a humorous, albeit robust and concrete, mnemonic to remind the learners of the required articulatory differences.
4) Demonstrate the pronunciation of the sounds in selected minimal pairs of words for which the difference in meaning is easily noticeable and, perhaps, even funny or embarrassing, such as vs. ; vs. or vs. (Figure 13.1).

Figure 13.1. Notice the difference when /v/ is replaced with /b/.

5) Use colors and pictures or any other audio-visuals to highlight the difference that results from substituting one sound for the other.
6) Ask learners to watch carefully your facial gestures, especially your mouth and lips, while you slowly and distinctly demonstrate the production of the two sounds. To put it differently, ask them to watch the dogface posture for [v] and the tight-lip-face posture for [b].
7) While you do all the above, carefully watch the facial gestures of the learners. If you notice that learners’ faces seem attentive and serious, then you can be sure that the learners are in a thinking mode. In other words, they are trying to cognitively grasp the difference between the two sounds.

b) Auditory Orientation: Go back to the minimal pairs, number the members of each pair as #1 and #2, then produce each member of the pair and ask learners to identify the word as #1 or #2. Do this demonstration with your mouth covered with a piece of cardboard to prevent lip reading and easy guessing. Another major difference between the two sounds is that [v], being a fricative sound, is sustainable (can be prolonged), while [b], being a stop, is unsustainable (cannot be prolonged). If some learners still experience some difficulty in perceiving and recognizing the difference between the sounds, then go to the next step.
c) Visual Orientation: Remove the cardboard and pronounce the two sounds quite consciously while exaggerating the bilabial (upper and lower lips, or the so-called tight-lip-face) posture for [b] and the labial-dental (lower lip and upper teeth, or the so-called dogface) posture for [v]. Put the learners in pairs facing each other and ask each member of the pair to perform the articulatory postures for the two sounds while the other learner observes. Allow them to take turns in this performance.
d) Kinesthetic/Proprioceptive Orientation: Ask the learners to carefully watch your demonstration of the two sounds with distinct performance of their articulatory postures. Stick with one of the sounds and repeat its articulatory posture while saying the name of the sound (i.e., its letter-name), then do the same with the other one. In other words, pronounce the sound [v] or [b] several times followed by or . Ask them to imitate what you have been doing with emphasis on the need to develop a kinesthetic (tactile) and proprioceptive (inner) sensing of the articulatory contacts made for [v] and [b]. To rephrase the latter statement, learners must be asked to sense the contact of the two lips for [b] and the contact of the upper teeth and the lower lip for [v].
e) Cognitive Reinforcement and Internalization: The initial cognitive orientation is considerably reinforced by the three sensory modalities that follow it: auditory, visual and kinesthetic/proprioceptive. The activity and performance conducted via each sensory modality play a certain role in the joint reinforcement of the articulatory difference between the two sounds. Once the brain receives the input through each sensory modality, it begins to process it and develop the impressions required for the neuronization (i.e., imprinting in neurons) of the two sounds as two different entities.
f) Follow-up Procedures:
1) Obviously, the manner in which human memory functions should be taken into consideration. To put it differently, human internalization of an impression may last for just a short time or for a long time. These different durations were referred to earlier as sensory memory, short-term memory and long-term memory.
2) It is quite natural and normal for the learner to be able to correctly articulate the targeted sound, but then forget it in a split second. This situation may be highly indicative of the behavior of a sensory memory mode (i.e., perceive, produce and forget).
3) The above situation certainly tells you that more rehearsal is needed, which is often the case.
4) Return to more auditory, visual and kinesthetic/proprioceptive rehearsal to at least transfer the impression of the sound into short-term memory, during which it undergoes a ‘remember-forget-remember’ process. This is an important stage in the learning/acquisition of the targeted sound because it most likely indicates the initial stages in the cognitive internalization of the sound. In other words, the learner is actively engaged in a conscious and cognitive processing of the sound.
5) If, after an active day of remember-forget-remember, the learner comes back the next day and has forgotten the targeted pronunciation, do not panic. All that the learner needs is a refreshing of memory.
6) Once you notice that the learner produces the sound incorrectly, but then instantaneously realizes the mispronunciation and rectifies it immediately, you should relax because the learner is most likely at the final stage of the correct internalization of the sound in long-term memory.
7) After this stage, what the learner needs is more occasional rehearsal and practice to finally subconsciously internalize the sound for immediate and automatic retrieval.

13.4. TECHNIQUES FOR TEACHING INTERDENTAL FRICATIVES /θ, ð/

These two sounds are extremely rare in languages throughout the world; consequently, they are typically classified as marked (unfamiliar; uncommon) sounds. Their rarity may be attributed to their association with the clinical speech problem known as alveolar lisp, in which individuals, especially young children, replace [s, z] with [θ, ð], respectively. What matters here is the fact that in spite of the rarity of this pair, it is, nonetheless, of high frequency of occurrence in English and this renders the pair a major pronunciation problem for a large number of L2 learners of English from a wide variety of linguistic backgrounds. It has already been pointed out, in 7.2.1 above, that one of the most interesting aspects of the mispronunciation of this pair is that it is realized differently depending on the native language of the learner and its phonological system. Its replacement is usually with far more unmarked (common) pairs of sounds such as /t, d/ or /s, z/. Surprisingly, when teaching this pair of sounds, one can come across individual learners who are capable of pronouncing the pair, but are psychologically reluctant to do so in conversation because they feel they are lisping. This is a psychological observation that should be taken into consideration. Since the orientation for those who replace /θ, ð/ with the plosives /t, d/ will be different from the orientation for those who replace them with the fricatives /s, z/, there will be two sets of orientation procedures. Below are, first, the orientation procedures and techniques for /t, d/ substitution.
a) Cognitive Orientation: The instructor has to ask learners to carefully watch the articulatory postures and facial gestures concomitant with the production of [θ] and [ð]. Carefully and dramatically emphasize how the tip of the tongue slightly sticks out at the biting edge of the upper incisors in the production of both of them. It is helpful to remind learners that in many cultures, children tend to stick their tongue out as a gesture of mocking. The articulatory postures for [θ] and [ð] are similar to the mocking gesture, but performed in a very moderate and gentle manner. To exaggerate the visual, auditory and kinesthetic differences between [θ, ð] and [t, d], carefully and slowly demonstrate the production of the latter pair to highlight the several sensory differences.
Cognitively, the purpose of this comparison is to encourage learners to consciously think about the physical and articulatory production of the two targeted sounds and their unwanted substitutions.
b) Auditory Orientation: You need to prepare suitable lists of minimal pairs for both [θ] vs. [t] and [ð] vs. [d] to orient the learners on the perception, recognition and production of the two pairs of sounds. The initial group of minimal pairs should be carefully selected not only to highlight the pronunciation differences, but also to highlight the semantic differences that result from the mispronunciation. One way to better highlight the pronunciation differences is to select monosyllabic words. Table 13.1, below, presents some such minimal pairs:

Word [θ]    Mispronunciation [t]
Word [ð]    Mispronunciation [d]

Table 13.1. The [t, d] rendition of the English interdentals [θ, ð] by different learners of English.

To further highlight the semantic difference ensuing from the substitution, select a couple of minimal pairs for pictorial (visual, figure 13.2) difference such as vs. . Bring to the attention of learners that, unfortunately, the production of [θ] and [ð] may sound like a speech defect (alveolar lisp), but that is how the two sounds are in English.

Figure 13.2. Visual and semantic difference when substituting /t/ for /θ/.

c) Visual Orientation: In order to warm up the learners for the visual orientation, go back to the actual production of the pair /θ, ð/ vs. /t, d/. This time use a sheet of paper to block the visual channel so that learners cannot guess the sounds through lip-reading. Next, remove the paper and pronounce [θ] and [ð] with a clear facial posture showing the tip of the tongue at the biting edge of the incisors. Then do the articulatory postures for [t] and [d] while drawing the attention of the learners to the disappearance of the tip of the tongue.
d) Tactile-Kinesthetic Orientation: To help learners with this type of sensory orientation, all that the instructor has to do is to direct learners to put the tip of the tongue at the biting edge of the incisors and repeat the contact for [θ, ð] several times until the brain makes an impression of the contact. The brain may forget the sensation minutes, hours or days later, but it is easy to recall the impression with a couple of maneuvers.

Word [θ]    Mispronunciation [s]
Word [ð]    Mispronunciation [z]

Table 13.2. The [s, z] rendition of the English interdentals [θ, ð] by different learners of English.

After completing all the cognitive and sensory orientations for /θ, ð/ vs. /t, d/, move to contrasting /θ, ð/ with /s, z/. This contrast may be more challenging than that with /t, d/ simply because in this case only one distinctive feature, place of articulation (interdental vs. alveolar), separates them, instead of two in the case of /t, d/ (place of articulation: interdental vs. alveolar, coupled with manner of articulation: fricative vs. plosive). The instructor should use the general procedures in a, b, c and d above with emphasis on the place of articulation using the minimal pairs in table 13.2, above. In addition to the above guidelines and examples, the instructor should attempt to demonstrate the considerable difference in meaning when /θ, ð/ are replaced with /s, z/ similar to the example of vs. . A good example would be the contrast between 1 and .

Figure 13.3. Visual and semantic difference when substituting /s/ for /θ/.
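The feature count behind the preceding comparison (one differing feature for /θ, ð/ vs. /s, z/, but two for /θ, ð/ vs. /t, d/) can be made explicit with a small sketch. The feature table below is a deliberately simplified assumption (place, manner and voicing only, with English /t, d, s, z/ treated as alveolar), and the names Consonant, CONSONANTS and differing_features are hypothetical labels introduced here for illustration only.

```python
from typing import List, NamedTuple


class Consonant(NamedTuple):
    place: str
    manner: str
    voiced: bool


# Simplified, assumed feature values for the six consonants under discussion.
CONSONANTS = {
    "θ": Consonant("interdental", "fricative", False),
    "ð": Consonant("interdental", "fricative", True),
    "s": Consonant("alveolar", "fricative", False),
    "z": Consonant("alveolar", "fricative", True),
    "t": Consonant("alveolar", "plosive", False),
    "d": Consonant("alveolar", "plosive", True),
}


def differing_features(target: str, substitute: str) -> List[str]:
    """List the feature names on which a target sound and its substitute differ."""
    first, second = CONSONANTS[target], CONSONANTS[substitute]
    return [
        name
        for name in Consonant._fields
        if getattr(first, name) != getattr(second, name)
    ]


# Compare each interdental with its two typical substitutions.
for target, substitute in [("θ", "s"), ("ð", "z"), ("θ", "t"), ("ð", "d")]:
    print(target, "vs.", substitute, differing_features(target, substitute))
```

Under these assumptions, the fricative substitutions differ from the targets in place only, whereas the plosive substitutions differ in both place and manner, which is consistent with the suggestion that the /s, z/ contrast leaves the learner fewer cues to rely on.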

After the different cognitive and sensory orientations, the instructor has to put his work to the test by designing and implementing some perception and recognition exercises leading finally to production in the following manner:
a) Perception: Demonstrate the /θ, ð/ pair several times for perception by learners with /t, d/ substitution.
b) Recognition: Move to the recognition assessment by numbering the sounds /t, d, θ, ð/ as 1, 2, 3 and 4; produce a list of ten sets of the four sounds arranged in a random order; pronounce the ten sets slowly and methodically while asking the learners to identify the sounds according to their assigned numbers on the worksheet; for example, if the sounds were pronounced in this order: /t, ð, d, θ/, then the set of numbers should be: 1, 4, 2, 3; collect the worksheets to assess the general accuracy. If the matching is satisfactory, proceed to the production phase; if not, repeat the exercise, asking learners to print their names on the worksheets. The addition of names will make the learners more alert and attentive; besides, it will also identify the learners who need more orientation.
c) Production: Ask for several volunteers to demonstrate the pair /θ, ð/ while in their seats; if all are successful, ask the same volunteers to appear before the class and demonstrate the sounds while the rest of the learners listen to the demonstrations and watch the facial gestures of the performers; ask the volunteers to produce minimal pairs that you provide such as vs. and vs. and so on with contrasts of /θ, ð/ vs. /s, z/.

1 Point to the head while pronouncing it.

13.5. TECHNIQUES FOR TEACHING TENSE (LONG) VS. LAX (SHORT) VOWELS

Obviously, the features of quantity and quality are in many instances too intertwined to be separated and autonomously evaluated and described. The case of the English vowels in <sit> vs. <seat> is typical in this regard. Even though many authors, in many instances, handle the relationship between those two vowels as short vs. long, the relationship is too complex to be glossed over in this way; it involves a feature of lax vs. tense accompanied by a difference in quality. This complex feature combination becomes an instructional reality with adult L2 learners of English in whose languages there are no short vs. long or lax vs. tense vowel distinctions. Typical in this regard are speakers of Spanish, Italian, Russian, Greek and Filipino, among many other languages. The first time I came across such a major pronunciation difficulty for Hispanic learners of English was some three decades ago when I was teaching an ESL class at Loyola University Chicago. In order to demonstrate to my students the so-called short vs. long vowel contrast in English, I wrote the words <ship> and <sheep> on the board and asked the best student in class to pronounce them. To my utter surprise, I heard him produce the same pronunciation for both in the form of [ʧip]. I asked him to repeat the pronunciation and the result was no more than [ʧip] for both. As a linguist, I realized for the first time that apparently there was only one version of an <i> vowel in Spanish and that the language does not have a [ʃ] consonant sound. As an instructor, I paid utmost attention to such problems. In this case, the incident became a focus of part of my future research.

Let us consider more examples of teaching Hispanics the vowel system of English.
a) Cognitive Orientation: In an experiment conducted with adult Hispanic learners of English at a beginning proficiency level, the learners were asked to pronounce the following minimal pairs:













The overwhelming characteristic of most of the tokens obtained was the failure to distinguish the vowel difference within each minimal pair. The vowel that dominated those words sounded shorter than the English long one and longer than the short one, although it shared a degree of tenseness with the English long one. The failure to distinguish minimal pairs based on vowel quantity is certainly a major pronunciation problem for Hispanic learners of English for two reasons. First, they have no cognitive perception and recognition of the quality and quantity differences between the two sets of English vowels because such differences do not exist in Spanish. Second, without the realization of such a vocalic feature, thousands of words in English may be semantically confused, some of which may be socially very embarrassing because of obscene or vulgar connotations. This is why some Spanish, French or Russian learners of English avoid words such as <beach, sheet, winner, keys>, etc. Right at the outset, the instructor should bring this major vocalic difference to the attention of the learners, highlighting the considerable phonetic and semantic differences. He should also prime them for the cognitive realization of the difference through the following steps:
1) Cite some monosyllabic minimal pairs and highlight the difference in both pronunciation and meaning such as: vs. vs. vs.
2) Demonstrate the above minimal pairs again and highlight the difference, especially in the shape of the lips, which tend to be more stretched sideways in the latter group of words than in the former.
3) Try to insert the Spanish word <sin> (without) in between the English minimal pair <sin> and <seen> to further underscore the phonetic difference.
4) Transcribe the difference between the three words phonetically in the following format: English: <sin> = [sɪn]; Spanish: <sin> = [si·n]; English: <seen> = [siːn].

The Spanish vowel should be transcribed as [i·], with [i] indicating the quality of tenseness and the single dot indicating the medium or half-length somewhere between the two English vowels [ɪ] and [iː]. The transcription is meant to signal both quality and quantity (length) differences. Schematically, the instructor can express the quantity (length) and quality (tense vs. lax) by the thickness of the dark band as demonstrated below:

English Vowels    Schematic Representation
= [p]
= [p]

The expected Spanish rendition of <sin> (without) will look like the following, which is shorter than the long English vowel, but slightly less tense. On the flip side, it is longer than the short English one though somewhat more tense. The following strategies are suggested to handle such vocalic multiple-feature differences.
b) Perception:
1) Produce a simulated Spanish vowel in the context of, for example, and transcribe it as [n]. Model this simulated pronunciation several times while learners are listening.
2) Embed the third item (i.e., the simulated Spanish examples) in between the English minimal pair of <sin> = /sɪn/ and <seen> = /siːn/. It is very convenient to select monosyllabic words to avoid any perceptual interference with bisyllabic or multisyllabic words. Number the three items as 1, 2 and 3 as in table 13.3.

English Word   Transcription #1   Spanish Rendition #2   English Word   Transcription #3
<it>           [ɪt]               [iˑt]                  <eat>          [iːt]
<bit>          [bɪt]              [biˑt]                 <beat>         [biːt]
<sit>          [sɪt]              [siˑt]                 <seat>         [siːt]
<pill>         [pɪl]              [piˑl]                 <peel>         [piːl]
<rich>         [rɪʧ]              [riˑʧ]                 <reach>        [riːʧ]

Table 13.3. Simulated Spanish rendition of English minimal pairs with [ɪ] and [iː] vowels.

Model each of the above triplets very carefully and ask students to listen, sense and reflect on the process.
3) Cite more English minimal pairs such as:

[ɪ]      [iː]
<bid>    <bead>
<tin>    <teen>
<lip>    <leap>

and simulate the expected mispronunciations by Hispanic learners, i.e., [biˑd], [tiˑn] and [liˑp].
4) Encourage learners to practice what Catford calls ‘silent introspection’, with emphasis on sensing the vowels rather than hearing them. According to Catford, whenever one makes sounds aloud, the auditory impression tends to mask or override the sensations of muscle movements and other proprioceptive sensations. Certainly, student awareness of these proprioceptive sensations can be a useful adjunct to making the articulatory adjustments necessary for learning the articulation of new sounds (Catford, 1994).

c) Recognition:
1) Select the triplet [bɪt], [biˑt], [biːt] from table 13.3 in Perception and number the items 1, 2 and 3. Record them in random order, each repeated twice, in at least ten to fifteen attempts. Play the recordings back one attempt at a time, with a few seconds of pause between attempts, and ask the learners to mark the items as 1, 2 or 3 on a specially prepared worksheet.
2) Give the learners the key to the correct answers, ask them to identify the errors and note the tokens with the highest percentage of inaccuracy. The results may be very significant for the further design of exercises and drills. If learners misidentify the first token, such a result is to be expected because the first token may be uttered more emphatically by the reader before settling into a normal mode of pronunciation. Besides, with the first token, learners are, so to speak, taken by surprise because they have not yet developed a mental or psycholinguistic yardstick for evaluating the samples.
3) Ask learners to return all the worksheets of the first trial, then ask them to prepare for a repetition of the exercise. Usually, a second and a third trial are much better than the first one; more exposure creates more familiarity, and both lead to more confidence and better focus.
4) Select as many minimal pairs as in Perception and mark the columns as #1 and #2, then pronounce the items in arbitrary order and ask learners to identify them as #1 or #2.
5) Create carrier sentences and ask learners to place the appropriate member of a minimal pair, as in Perception/3, in the blank, then read the sentences once or twice. Example: select either mill or meal and place it in the sentences below:

A ______ is the place where we turn wheat into flour.
A breakfast is a ______ in the morning.
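The worksheet routine in Recognition steps 1 and 2 (play the numbered items in random order, compare each learner's answers with the key, and see which item is misidentified most often) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names `make_playlist`, `score_worksheet` and `error_rate_by_item` are hypothetical, and each "attempt" is simplified to a single item number.

```python
import random
from collections import Counter


def make_playlist(items=(1, 2, 3), attempts=12, seed=None):
    """Return a random sequence of item numbers, one per attempt (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.choice(items) for _ in range(attempts)]


def score_worksheet(key, answers):
    """Return the positions (0-based) where a learner's answer differs from the key."""
    return [i for i, (k, a) in enumerate(zip(key, answers)) if k != a]


def error_rate_by_item(key, worksheets):
    """Percentage of wrong identifications for each item number, over all worksheets."""
    presented = Counter()
    missed = Counter()
    for answers in worksheets:
        for k, a in zip(key, answers):
            presented[k] += 1
            if k != a:
                missed[k] += 1
    return {item: 100.0 * missed[item] / presented[item] for item in presented}


if __name__ == "__main__":
    key = make_playlist(attempts=12, seed=1)
    # Two made-up worksheets, for illustration only: one perfect, one with every answer marked 2.
    worksheets = [list(key), [2] * len(key)]
    print(error_rate_by_item(key, worksheets))
```

The per-item percentages are what step 2 calls the tokens "with the highest percentage of inaccuracy"; they can guide which contrasts need further drills.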

d) Production:
1) Ask volunteers to repeat the modeling of the items in Perception/2 above after you as many times as necessary. The repetition should be instantaneous.
2) When some learners excel in the impersonation or production of the targeted sounds, allow them to replace you, as the instructor, in modeling. All learners should be given the opportunity to participate. Preferably, learners should model while seated in their places among the other students; this setting creates a more learner-friendly situation.

13.6. TECHNIQUES FOR TEACHING VOWEL REDUCTION

What is vowel reduction, anyway? A reduced vowel becomes shorter in length (quantity) and/or moves in the direction of a neutral vowel, typically a schwa [ə]. Vowel reduction is used here to refer to the overall diminishing in the length (longer to shorter) and quality (from markedly distinct features to less distinct ones) of the vowel. For instance, the change of [æ] in the strong form of <than> = [ðæn] into [ə] in its weak form is a change from a longish and more distinct vowel into a very short and qualitatively less distinct one. Earlier on, the English vowel system was identified as a typical centripetal system as opposed to the Spanish centrifugal one. Since vowel reduction in English leads predominantly to schwa-type vowels, English is best described as a schwa vowel system, whereas Spanish, without any vowel reduction, is best described as a schwaless vowel system. At this juncture, the question of what the characteristics of a schwa are becomes unavoidable. Auditorily, the schwa is the most obscure vowel. Visually, the lips are in the most neutral position: not spread, not rounded, with medium opening. Kinesthetically, the whole vocal musculature is relaxed, with no sensation of an extreme forward, backward or upward movement of the body of the tongue. Such features make the schwa a very elusive sound and a difficult one to teach. With the elusive articulatory maneuvers involved in the production of a schwa, the primary dependence will be on the auditory channel in the form of: (perceive) listen, recognize and produce. Nevertheless, in order to maximize the effectiveness of the instruction, all the other sensory channels have to be brought into play. This should be reinforced by considerable non-linguistic and non-verbal activities and by contextualizing the schwa in words and longer pieces of speech. The typical action word to describe the instructor’s productive action will be ‘demonstrate’; the learner’s action word will be ‘impersonate’. The word ‘imitate’ will be avoided because it usually connotes mechanical action, whereas ‘impersonate’ implies watching, contemplating and purposeful production. Indeed, vowel systems that tend to be tense (centrifugal) are usually without a schwa unit; therefore, their speakers face great difficulty in neutralizing the vowels. This simply implies that teaching the production of a schwa requires concentrated sensory efforts until the sound is internalized cognitively. Below are some perceptual orientations learners have to go through.

a) Perception:
1) Demonstrate to learners a posture of relaxing the muscles as opposed to tensing them. Muscle tension is most visible in the form of stretched neck muscles, which, if detected, mean that the whole musculature is inappropriate for the production of lax sounds, of which the schwa is the most typical.
2) Once the relaxed position is maintained, proceed to demonstrate short intermittent moaning sounds, as if feeling pain, which tend to be typical of schwa vowel postures.
3) If you, as an instructor, are a native or near-native speaker of English who has mastered the articulatory process of vowel reduction or schwaization in English, try to demonstrate the so-called hesitation vocal gesture (Delattre, 1965), which happens to be very much a schwa-like sound.
4) Conduct a demonstration of CV-type nonsense syllables with full vowel qualities, such as [a] as in Arabic لا (No): [la la la]. Then pronounce the three syllables with each vowel in two formats: firstly, move the stress from the first syllable through the third, maintaining the vowel quality and with no temptation, whatsoever, of reducing it to a schwa; secondly, do the same, but this time reducing the unstressed vowels to schwas. Repeat as demonstrated:

[ˈla la la]   [la ˈla la]   [la la ˈla]

[ˈla lə lə]   [lə ˈla lə]   [lə lə ˈla]

5) Shift from nonsense syllables to real words that are typical of schwa articulation, such as banana and America.
6) Model the pronunciation of banana, emphasizing the reduction of the vowels of the first and third syllables and accentuating, and somewhat exaggerating, the length and quality of the vowel in the second syllable.
7) Model the pronunciation of banana in English, e.g., [bənænə] (or [bənɑːnə] as in RP), and compare it to a simulated pronunciation of it in Spanish, e.g., [banana]. It is perfectly acceptable to somewhat over-accentuate (exaggerate) the differences between the two pronunciations in order to attract attention to them. This over-accentuation is similar to caricature drawing, in which the distinctive features are exaggerated to attract attention. However, once the objective has been attained, the instructor should stop the exaggerated versions and end his demonstration with normal and natural modeling of the pronunciation.
8) To reinforce the auditory channel with the visual one, transcribe the English pronunciation of banana phonetically and portray the syllabic structure of the word with various schematic diagrams that visualize the difference.

[Figure: schematic diagram of the English pronunciation of <banana>, with reduced shapes for the first and third syllables and an augmented shape for the second]

The schematic representation can be in any form or shape as long as it visually demonstrates the reduced shapes versus the augmented one, the latter standing for the accentuated syllable. Model the syllabic pronunciation repeatedly.
9) For further visualization, demonstrate vowel reduction and accentuation through nonverbal gestures. To demonstrate the English pronunciation [bə næ nə] nonverbally, take one short step, followed by a long step and then another short one; the size of the steps indicates the strength of the stress and the degree of prominence of the vowel. For the demonstration of the Spanish pronunciation, take three steps which are, more or less, of the same medium size, although the stressed syllable will slightly affect the quantity (length) and prominence of the vowel. Visually, the Spanish pronunciation of <banana> will schematically appear as follows.

[Figure: schematic diagram of the Spanish pronunciation of <banana>, with three shapes of roughly equal size]

b) Recognition:
1) Number [bə næ nə] and [ba na na] as #1 and #2, model them several times and ask learners to identify them as 1 or 2.
2) Repeat Recognition/1 above, then ask learners to identify the word as its reduced (English) form or its unreduced (Spanish) version.

c) Production:
Return to the activities in the perception stage and redo them. Then ask learners to reproduce everything seen, heard or perceived in the following manner:
1) Initially, any attempt at production by learners should come immediately after the instructor’s modeling, with no background noise or speech separating the modeling from the production. This procedure secures the recency effect of the input in the learner’s short-term memory circuitry for instant retrieval and reproduction.
2) If the instructor feels the learners still experience difficulty in recognizing and producing a schwa, or a schwa-like vowel which is “acceptable for some native speakers of English” (Whitley, 1986:58), the instructor should sustain the modeling. If the failure continues, the instructor should not persist. A break should be taken to let the learners relax. The physical and psycholinguistic distancing of self from the practice and drilling helps both the instructor and the learner to start afresh: the instructor with a more energetic and enthusiastic attitude and the learner with a more determined, thoughtful and hopeful attitude.
3) The instructor should encourage learners to think of the targeted sounds and the overall practice needed for their mastery. This reflective thinking can go on at any time during the practice and drilling or during the distancing-of-self breaks. Techniques 1 (immediate repetition) and 2 (a break), above, should apply in the orientation of any aspect of pronunciation.
4) If the instructor feels learners need their memory refreshed, he should afford them more opportunity to internalize the production of a schwa through comparison with other non-reduced vowels in syllables similar to those in Perception/4:

[ˈla la la]   [la ˈla la]   [la la ˈla]

[ˈla lə lə]   [lə ˈla lə]   [lə lə ˈla]

5) Place the word “banana” in short appropriate sentences, portray the English rendition in broad phonetic transcription, model the pronunciation and ask learners to produce it after you. Example:
Barbara eats a banana a day.
[bɑrbərə its ə bənænə ə de]
6) Select other appropriate English sentences, such as the one below, model them and ask learners to reproduce them. Example:
I can see a banana in the tree.
[ai kən si ə bənænə ɪn ðə tri]

13.7. TECHNIQUES FOR TEACHING ACCENTUATION (STRESS)

Before understanding the nature of rhythm and its teaching, one has to be aware of the premise on which rhythm is based: stress. The traditional approach to teaching stress has been through the auditory modality, and its predominant technique has long been, and unfortunately still is, repeat-after-me. This exclusively auditory modality in teaching stress may not be effective with all learners. Besides, in some very traditional methods of teaching, stress and rhythm are hardly ever tackled. To help more learners master the process of stress perception, recognition and production effectively and efficiently, the instructor has to resort to a cognitive approach that is implemented through a wide variety of activities and exercises based on diverse sensory modalities: auditory, visual and kinesthetic/proprioceptive. Occasionally, multisensory modalities are jointly invoked. Once again, one has to be reminded of the triangular base of teaching pronunciation in the form of perception, recognition and production. These three sub-processes are interrelated and one leads to the other. In the following sections, the techniques of teaching stress perception and recognition will be clustered together, while its production will be handled somewhat separately, since the techniques applicable to teaching the first two processes may not be applicable to the teaching of the third and vice versa.

a) Perception:
1) Take a two-syllable word such as <produce> and demonstrate it with the accent on the first syllable and then with the accent on the second syllable, as exhibited below. Also, to highlight the grammatical and/or semantic role of stress, inform the learners that the first rendition of <produce> indicates a ‘noun’, whereas the latter indicates a ‘verb’.

PRO duce

pro DUCE

2) Ask whether any of the learners wants to demonstrate the difference. If so, let them demonstrate it. Such exercises are the initial exposure to the phenomenon of stress. You should expect to have learners in your class who do not know what is going on and what the difference is between the above two demonstrations of stress. Personally, I am confident that such learners will always be there. I was once such a learner myself, and I have always encountered such learners in my classes as an instructor.
3) Go one step further to impress the difference on the learners and try again by tapping on a desk or table: one strong tap followed by a weak tap for the noun <produce>, and the reverse order for the verb <produce>.
4) To dramatize the difference and capture the attention of the learners, grab an empty carton box or can and beat the rhythm of the two words on it. The visualization that goes with the latter demonstration is of great help in reinforcing the retention of the beats and the overall stress location.

b) Recognition:
1) Assign colors to the stressed and unstressed syllables. Produce a prepared poster with the two targeted words. Tell learners that the red-colored syllable is the stressed one in each case and demonstrate the difference.
2) To do it differently but still visually, you may write/print the stressed syllable in larger size, as in

PRO duce

pro DUCE

then take a big step followed by a small step for PRO duce, followed by a small step and a big one for pro DUCE.
3) One important set of features for the learners to observe and impersonate is the facial and body gestures that accompany the stressing of a syllable. Stressing a syllable immediately implies the exertion of greater physiological effort. This additional effort reveals itself physically in the overall body gestures of the speaker. Such features may be in the form of a downward head movement, the raising of an eyebrow or both, or simply a sudden movement of the hand or arm.
4) For an auditory check on stress recognition, take another noun-verb contrast of English, number its two members #1 and #2, consecutively, and read them randomly, asking the learners to identify them as #1 or #2.
5) The learners will be taken one step further in the recognition challenge if they are given three nonsense syllables to be demonstrated three times, each time with stress on a different syllable. Mark the triplets as #1, #2 and #3 and then read them randomly, asking the learners to identify them as #1, #2 or #3. One can also tap the three stress triplets on a desk, an empty can or anything that resounds and ask the learners to identify the location of the strongest stress within each triplet. If you have a small wooden hammer or a gavel, tap or beat the same stress triplets, then ask learners to recognize the strongest beat in each case. In both of the above demonstrations (reading or tapping) there are certainly visual signs to indicate the stressed syllable. Bring the latter fact to the attention of the learners.
6) Repeat the beating (tapping) of the stress triplets mentioned above, putting more emphasis on the movement of the hand or the gavel so that the movement will visually attract the attention of learners, then ask them to identify the stressed syllable. One can also use the overhead projector or PowerPoint slides of the stress triplets, telling the learners that red-colored syllables are the stressed ones.
7) Finally, you can ask for volunteers to demonstrate the stress triplets.

c) Production: Obviously, production is the last phase in the training of learners to master the dynamics of stress. It is in the production phase that learners will be far more able to sense stress dynamically and proprioceptively because they will actually be physically producing the stress either through nonlinguistic activities or through authentic linguistic ones as follows: 1) Foremost in this regard is for the instructor to demonstrate any stress performance that he intends to ask learners to execute through tapping or beating of syllables in doublets, triplets or even quadruplets. It is preferable for both instructor and learner to use the hand or a small hammer or gavel to perform the demonstration. It is easier for the learner to begin with two-syllable structures and proceed further with multi-syllable structures. Ask for a volunteer to perform the beating of two syllables once with the first syllable receiving the strong stress and then the second syllable receiving it as in:

[Figure: two-beat patterns, first with the strong beat on the first syllable and then with the strong beat on the second]

Gradually, the performance should include more learners with more syllables and beats.

[Figure: beat patterns with more syllables and beats]

While all this tapping or beating of syllables goes on, you may ask the learners to notice the difference in the physical effort exerted by the hand of the performer. The physical difference is visually quite noticeable. In fact, the performer may be asked to intentionally exaggerate or intensify the physical effort or bodily gesture that accompanies the production of a stressed syllable.
2) After one is done with the non-linguistic exercises, it is time to move to the linguistic ones, especially counting numbers, beginning with ‘one two’ and moving to ‘one two three’, etc., as follows:

/ONE two/
Reverse the stress:
/one TWO/

Move to three units, advancing the stress every time:
/ONE two three/   /one TWO three/   /one two THREE/

Move to four units, advancing the stress every time:
/ONE two three four/   /one TWO three four/   /one two THREE four/   /one two three FOUR/

3) In English, the most appropriate materials to teach stress placement and its linguistic significance at word level are the noun-verb doublets, such as those in table 13.4 below.

Noun        Verb
SUBject     subJECT
REcord      reCORD
CONtract    conTRACT
PERfect     perFECT
PREsent     preSENT
INsult      inSULT

Table 13.4. Sets of noun-verb contrasts signaled by stress placement

Pronounce pairs such as the above slowly and somewhat emphatically (i.e., with somewhat more determined articulation) and ask the whole class to repeat after you as a chorus. Later on, ask for individual volunteers to come to the front and repeat the above list after you. At this stage, the instructor may carry the performance a step further by asking individual learners to produce the pairs without being modeled.

If there are learners who still experience some difficulty, you have to be patient with them and be ready to repeat the performance and the demonstrations. In such instances, you should proceed in the following manner. First, demonstrate the examples with somewhat more emphatic articulation in the hope of highlighting the difference between them even further. In fact, you may select pairs of English words which contrast not only grammatically, such as the above noun-verb pairs, but also semantically and orthographically. Second, demonstrate them in conjunction with somewhat more visible facial and body gestures. Third, as pointed out in the early stages of teaching production, learners should produce the targeted modeling immediately after you, with no background noise or speech separating the modeling from the production, to trigger the recency effect and its impact on memory. Fourth, ask the learner to produce a series of repetitions for a given item of demonstration. These repetitions may sound mechanical in nature, but they may leave a positive impression on the brain as far as long-term retention is concerned. After all, learning by human beings is a process of transforming mechanical habits into cognitive ones.

13.8. CONCLUDING REMARKS

There are no solid and comprehensive rules that predict and capture stress placement in English words, for two main reasons. First, English is a stress-timed language with mostly unpredictable stress placement (MacKay, 1978, 150). Second, there is a high percentage of foreign loanwords in English that maintain their own stress patterns and create exceptions to the native rules. This makes mastering stress placement a major difficulty and a source of accent for learners of English as L2. It is, therefore, an area worthy of the attention of any instructor teaching English or any other language as L2. The instructor has to be professionally qualified to teach this very essential linguistic aspect in cross-language teaching. As has been discussed earlier on, instruction can begin as phonetic practice which should gradually lead to authentic linguistic materials. The most appropriate linguistic materials would be the so-called strong vs. weak forms, the majority of which fall under the categories of prepositions, conjunctions, auxiliary verbs, etc.

[Table: words listed with their strong form and their most frequent weak form(s)]

Equally appropriate is the noun-verb category of words which are distinguished grammatically only by the placement of stress, such as the ones cited in section 13.7. Additional excellent materials that serve the purpose of teaching stress placement in English words are grammatical derivations. One of the most typical examples cited by teachers of English to demonstrate the variability of stress and rhythm in English is a single word and its derived forms. Notice the changes in primary stress, indicated by the bold syllables in conjunction with the large dots.

[Figure: stress patterns of a word and its derived forms, with bold syllables and large dots marking the primary stress]

Because of the highly unpredictable nature of stress placement in English, some phoneticians and instructors of pronunciation advise learners of English to listen carefully to native speakers or to materials recorded by them, or to use a pronunciation dictionary. This is not bad advice, but it is a privilege that is not attainable by all learners of English. Consequently, at least a few of the most common and effective rules of stress assignment in English should be taught systematically. However, foremost in importance is that stress and stress placement should be taught, because many learners in many languages are not really aware of the existence of stress as a linguistic dynamic or may not have the auditory sensitivity to pick up the stress on their own.

CHAPTER 14: TIPS FOR ACCENT REDUCTION AND ACCENT DETECTION

14.1. INTRODUCTORY REMARKS

Accent reduction has already been defined and elaborated on (4.5.2) as an attempt to bridge the pronunciation gap between the native speaker of a given language and someone who is attempting to learn that language as L2. The only further addition to the theme of ‘accent reduction’ in this chapter is to provide some key tips to help learners focus on the points which are considered significant in its reduction. With regard to accent detection in the context of this book, two points are worthy of consideration. First, how does one discover that he has an accent? Second, how can accent lead one to the identification of the linguistic affiliation of the speaker? In response to the first question, one has to bear in mind that perhaps one of the most controversial aspects of pronunciation at large is that many people do not realize they have an accent, or do not admit to having it, because they hear themselves through their own native linguistic filter or prism. There are two ways to convince one to admit to having an accent. First, the instructor or coach brings it to one’s attention and advises him to work on it. Second, it is self-discovered after ample exposure to the target language, especially in cases of full immersion coupled with experience in speech production and pronunciation. The latter case applies to my personal attempt at accent reduction. Nobody brought my accent in English to my attention; I was fortunate to be able to discover it and work on it diligently. I pointed out in chapter 1 that my so-called ‘Iraqi English’ was distinctly accented with the linguistic substrata of my three native languages. I did not have any problem with consonants except for replacing the English approximant <r> with my tap and trill <r>s. I had some problems with the vowels, especially with the neutralization of unstressed ones. My most serious problem involved accentuation and rhythm at large. I also have to admit that I did not discover my accent right away; it took about six months to gradually realize that I had a few fundamental problems. I had to spend long days, weeks and months paying intensive attention to native speakers of English and practicing what I listened to. I certainly reduced my accent considerably in segmental as well as suprasegmental constituents. Specifically, I have to emphasize that I improved my stress placement and accentuation in a very marked way. In spite of all that improvement, coupled with decades of full immersion in English as an adult in the United States, I still feel I have an accent; however, it hardly ever interferes with communication. Briefly, I have overcome almost all sources of phonological accent, but there are still some residues of phonetic accent that pop up here and there.

14.2. TIPS FOR ACCENT REDUCTION

There are three general routes that lead to accent reduction. First, try to improve pronunciation as much as possible by first mastering the basic phonological differences between L1 and L2, including those pertaining to segmental elements (consonants, vowels) and suprasegmental ones (stress, rhythm and intonation and/or tones), to avoid semantic confusion. This is one way to prove the functionality of the phonological vs. phonetic accent dichotomy and why, educationally, the former should receive more attention. Second, work on the most striking phonetic differences that do not result in semantic change, but do generate different degrees of noise, ranging from slight to substantial, which indirectly interfere with comprehension. Third, raise the competency level in other aspects of language, including morphology, syntax, lexicon and idiomatic expressions, to facilitate comprehension.

14.2.1. Tackle the most Salient Phonological Problems

Because the phonological deficiencies in one’s L2 proficiency are more perceptually outstanding and semantically distracting, they should be identified first and given priority. Naturally, if the individual is not able to identify such deficiencies, a competent instructor or a seasoned supervisor has to do that. In any case, the following are some of the most significant phonological accent-causing problems that should receive priority.

a) There is a significant number of consonantal elements generating phonological accent across languages. Just to cite some examples, Arabic poses a set of several really challenging sounds for its learners. They include the uvulars, the pharyngeals and the emphatics. For instance, /q/ = [q] is overwhelmingly confused with /k/ = [k] by non-Arabs. For Germans, Arabs, Filipinos and Greeks the sounds of /ʤ/, /p/, /f/ and // are, respectively, quite demanding. The pair /θ ð/ is absent in many languages; consequently, it is a fundamental source of both phonological and phonetic accent; frequently, the pair is replaced with either /s, z/ or /t, d/. What is of utmost importance is the ability to pronounce such sounds accurately not just in isolation (out of linguistic context), but also in proper conversational context and in lengthy discourse. In my teaching career, I have come across many learners who master the execution of the phonologically alien sounds in isolation, but occasionally fail when a lengthy and in-depth conversation is sustained. This happens when the alien sound has not yet become a cognitive entity stored in the subconscious; consequently, the speaker may become vulnerable to a phonological lapse or pull in the direction of L1.

b) Care must be taken of the most significant vocalic elements generating phonological accent, such as the absence of the English short (lax) vs. long (tense) vowel distinction in many languages including Italian, French, Russian, Tagalog and Spanish, among others. For instance, a Mexican must distinguish between an English short/lax vowel [ɪ] and a long/tense vowel [iː] in order not to confuse a <bid> = [bɪd] (for a contract) with a <bead> = [biːd] (on a rosary). There is no place, whatsoever, for a typical Mexican [bid] rendition in English. Equally, a native Italian should be able to distinguish between <pull> = [pʊl] and <pool> = [puːl], as there is no place for his rendition of those two English words as [pul]. As for individual vowels, one can readily notice the absence of the close front rounded vowel /y/ of French and German in English and many other languages. The very English vowel // has no counterpart in many languages; it is, therefore, replaced with /a/ or //, etc.

c) Seriously consider the most significant consonant clusters generating phonological problems for speakers of languages without clusters or with very limited word-initial, medial or word-final clusters. Typically, such problems are observed with learners of English of Japanese, Korean, Chinese, Hispanic, Italian and Arabic linguistic backgrounds, among others. Just for the sake of example, any Japanese learner should work extremely hard on moving his pronunciation of <football> far away from the traditional Japanese rendition of [fut-ta-bo-].

d) Carefully abide by the stress-placement differences between L1 and L2. Differences in stress placement can be a major source of both phonological and phonetic accent. For instance, one should be aware of the fact that the difference in stress location between some English words determines whether they are verbs or nouns/adjectives. Also, the learner should be aware of the fact that, in some pairs of words, it is the stress location, regardless of the difference in spelling, that triggers the semantic difference between them.

e) Observe the rhythm differences between L1 and the targeted L2. The shift from stress-timed rhythm to syllable-timed rhythm, or vice versa, is one of the most significant markers of accent. Although the accent tends to be more phonetic than phonological, the phonetic noise that ensues causes serious pronunciation distortion that obscures meaning. If, for instance, a Spanish speaker imposes his syllable-timed rhythm on the stress-timed rhythm of English, the overall rhythmical chunking and tempo of the target language change. In many instances, a Hispanic speaker whose overall linguistic competency in English is quite proficient, except for the retention of his syllable-timed rhythm, may still sound somewhat unintelligible to a native speaker of English. It is like speaking a ‘morse-code’ rhythm with a ‘machine-gun’ rhythm.

f) Tone and intonation are as important as stress and rhythm. They are extremely significant when the individual involved is moving from a tone language, such as Chinese, to an intonation language, such as English, or vice versa. For instance, Mandarin Chinese has four tones in the form of: high level, rise, fall-rise and fall. To demonstrate the lexical function of such tones, one has to use the most common monosyllabic word in Mandarin, which conveys the following meanings with the four tones: /mā/ (mother); /má/ (hemp); /mǎ/ (horse) and /mà/ (scold) (http://mandarin.about.com/od/pronunciation/a/tones.htm). Although both tone and intonation are based on pitch, the difference is in the use of pitch. In intonation languages, pitch tends to be the feature of a phrase or sentence and signals attitudinal, emotional and, at times, grammatical differences, while tone tends to be the feature of a word or syllable and signals lexical and/or grammatical differences. Thus, the difference between tonal and intonational languages constitutes a major difference in the overall melody of a given language. This is why adult speakers of intonational languages, such as English, experience serious difficulty when learning tonal languages. By the same token, speakers of tonal languages encounter a similar level of difficulty when embarking on learning intonational languages. Each group struggles to restrain the strong drive in the direction of the pitch orientation of their own languages when working on their L2s. Failing to restrain that drive results in serious overall melody change coupled with lexical, grammatical and attitudinal alterations.

14.2.2. Tackle the most Salient Phonetic Problems

Obviously, the instructor has to point out to learners which problems rise to the level of a phonological accent and which ones remain at the phonetic level between the two languages involved. Between any two languages, there are innumerable phonetic differences, some more prominent than others. Let us take English and Greek. In Greek, the English postalveolar affricates [ʧ ʤ], as in and , are absent and are usually replaced with their alveolar counterparts [ʦ ʣ]. The substitution does not confuse meaning; it simply renders unfamiliar the words in which the sounds occur. Although for English learners of Spanish the confusion between the two <r>s, as in <caro> = [kao] (dear) and <carro> = [karo] (cart/car), constitutes a phonological problem, for Spanish learners of English the problem remains phonetic. For speakers of the languages of the Indian subcontinent, retroflexion is typically the primary source of phonetic accent; however, it is so pervasive that it can seriously interfere with their overall comprehension by the natives of the language they are targeting. Comprehension is further hampered by the syllable-timed rhythm that most Indian languages have. On the flip side, it is worth pointing out that for English speakers learning other languages, it makes a substantial difference in their phonetic rendition of those languages if they can replace their approximant <r> with a tap, flap, or trill according to the specific language they are targeting.

14.2.3. Improve other Linguistic Skills

Developing high competency in all linguistic systems, including syntax, morphology, lexicon and idiomatic expressions, alleviates the negative impact of both phonological and phonetic accent in two ways. First, it affords the person using L2 greater sentential and discourse fluency. Second, the resulting fluency increases the likelihood that the native listener will predict the meaning of mispronounced words from the context of the sentence or discourse produced. This is exactly what most educated adults attempting communication in L2 have recourse to. It was mentioned earlier that retroflexion permeates the native languages of the Indian subcontinent. Consequently, when speakers of those languages speak or learn other languages, the strong retroflexion colors their rendition of any L2 they attempt. It thus generates serious phonetic noise to the ear of the natives of the attempted L2. One strategy to mitigate the pervasive interference of retroflexion noise in their L2s is to enhance their competency in syntax, morphology, lexicon, idiomatic expressions, etc. These will all contribute toward a better conveyance of meaning to their listeners regardless of the phonetic interference.

14.3. ACCENT DETECTION

Research in the domain of automatic speech and accent detection and recognition systems using advanced software is making considerable progress (Zheng et al., 2005). However, machine accent detection is not the focus here; rather, the focus is on live accent detection in real-life situations. Stated differently, the attention is focused on man-to-man accent detection, not machine-to-man. There are different scenarios for the detection of the presence of accent by different individuals, ranging from an ordinary unsophisticated person to a professional one. Let us shed some light on these scenarios.

14.3.1. Accent Detection by Ordinary Individuals

This is a scenario which involves ordinary native speakers of a given language listening to others who are speaking their language as L2. The listeners in this case may range from illiterate to educated and highly educated individuals, but none of them has professional orientation in linguistics, in general, or in phonetics and pronunciation, in particular. All that they have is an ear sensitive to their language that helps them to intuitively assess the accuracy of the pronunciation of their L1 by an L2 speaker. Often, they perceptually, as well as cognitively, sense the accent, but cannot describe it or pinpoint the details. The assessment of accent by some of them might be very general and as simple as saying: “He has an accent.” Others might say: “The vowels sound somewhat off the normal.” A few might be fairly detailed in their commentary by identifying certain vowels, consonants or even stress and rhythmic performances that did not sound accurate and/or native-like.

14.3.2. Accent Detection by Professionals

In this context, professionals represent those individuals who possess knowledge and experience in the nature and structure of human language with reasonable exposure to linguistics and phonetics. They understand human language as sets of structures and systems and the manner in which they collaborate to generate meaning. With specific focus on pronunciation, they possess enough expertise to make very refined judgments as to the source of mispronunciation and accent, as well as to recommend techniques to eliminate the source of the problem. However, to be able to achieve all that, the individual has to have knowledge about the structures and systems of the two or more languages involved. This knowledge does not imply knowing the two languages fluently or even fairly well; however, he must have a certain degree of experience of, and exposure to, the two languages in action and in real-life situations.


Personally, I do not consider Spanish a language that I know, but I have had ample exposure to it in the community as well as in my classes. Over a period of three decades, I accumulated rich knowledge and experience about Hispanic learners of English. I identified almost all the major pronunciation errors they made in English. This authentic interaction with Hispanic learners of English encouraged me to document my experience in a book¹ which covered a wide range of the pronunciation problems they encounter in learning English and tips to help them overcome those problems. It was through such educational exchange with my students that I enriched my own knowledge and field experience in teaching pronunciation. Their paramount difficulty in distinguishing English from or from helped me identify the vowel system of Spanish as a centrifugal one in which the vowels tend to be almost ‘frozen’ in their quality and quantity. This forces them cognitively and articulatorily to cluster the two vowels of each pair into one vowel of their own that is neither of the two English vowels. It took me a long time to discover the problem, analyze it and design effective strategies and exercises to help them perceive the difference, recognize it and execute it comfortably.

¹ Linguistic tips for Latino learners and teachers of English, 2007.

14.3.3. Telling the Linguistic Background of a Speaker through Accent

Some people, especially linguists or those who have a passion for languages, make a hobby of, or show skill in, identifying the linguistic background of a speaker of a second language (L2) by simply listening to the person without any prior knowledge of the speaker's native language. This is not an easy task because it needs talent, linguistic skill and experience; most likely, a combination of all three. Personally, I have tried this hobby and I succeeded at times and failed at others. Such a linguistic skill can work as a two-edged sword: it can serve the professional orientation of under-cover agents as well as those who hunt for them. More will be said on this in due course.

The task is difficult because it requires thorough knowledge of the overall pronunciation of the L2 that is spoken and of the L1 of the speaker; additionally, it requires knowledge of the two phonologies and, at times, of even more refined phonetics. Let us take English as the L2 attempted by a speaker of language X. One of the foremost prerequisites for a correct prediction is to be aware of some of the most salient phonological features of language X and their expected reflexes in the rendition of English. Even with this apparently thorough knowledge, one can go wrong for two primary reasons. First, some typical phonological and/or phonetic features may be shared by more than one language. Thus, one tends to think that the language of the speaker is X, but it turns out to be Y or Z. Let me cite one example of how I went wrong in my linguistic identification of one of my students. It was the first day of a class and I wanted to get as much background information about my students as possible. Everyone began to give a brief summary of his/her background. In the short presentation of one of the students, I heard a couple of retroflex <r>s, so I said:

“Are you from India or Pakistan?”

She responded: “I am from Guatemala.”

Of course, I was speechless and I apologized. Deep in my heart, as a linguist, I was embarrassed by the glaring misidentification. She identified herself as a native speaker of Spanish, but where did she get that retroflex <r>? I still do not know to this day. My only explanation at the time was that the retroflex might have been part of a linguistic substratum of a local Native American language or an idiosyncratic feature of her overall speech. To add more specifics to this linguistic encounter, I have to admit that, besides my hearing the ‘retroflex’ <r>, the student was of a dark complexion very much like many people from the Indian subcontinent. The complexion enticed me to make the judgment that turned out to be wrong. Since then I have taught myself a lesson: “Don’t be fooled by the complexion.”

Obviously, my attempts at accent detection have not always been disappointing. On several occasions I was right on target and then the question came: “How did you know I was X?” Let me cite some such accent detection anecdotes. Years back, in one instance, I was involved in the registration for the new semester courses. One course was canceled, but was still on the list. Some students came complaining and yelling: “Why is it still on the list?” One of the students, who desperately needed the course, was the most vocal. I heard typical repeated alveolo-palatal [] and [] sounds in place of the English [s] and [z]. I wanted to calm her down, so I said to her:

“Are you from Greece or Cyprus?”

She abruptly said: “How did you know?”

I said: “Register for my course and I will tell you later.”

No doubt, my course was relevant to what she was planning to study. In another instance, I was in the hospital for surgery. A day after the surgery, I was allowed to take some clear soup. When it was brought to my room, the nurse said: “You hab to pinish all of it.” Naturally, her sentence contained several pronunciation hints that pointed in the direction of her Tagalog language background. Three hints were outstanding, namely, [b] for [v] and [p] for [f], besides her unaspirated [t] and [p]. When she came back to check on me, I greeted her in Tagalog. She was surprised and said:

“Are you ‘Pilipino’?”²

I said: “No, but I thought you were. I read your name tag.”³

² Remember, not ‘Filipino’.
³ Of course, I gave a fake answer to avoid commenting on her accent.

Just recently, I greeted one of my new neighbors simply to strike up a friendly relationship. Work was being done on his garage door. I said:

“You must be busy.”

He replied: “Yes, I am changing my garage door.”

Once I heard [ʦ ʣ] instead of [ʧ ʤ] and [gʣ] instead of [gʤ], I guessed he was Greek. I said: “Tikanis?” (How are you?).

He responded: “Kala” (Good).

And he was Greek. It took one little indicator of accent to identify him linguistically.

14.3.4. Hiding an Agent through Hiding an Accent

In this section, the aim is not to detect an accent, but rather to hide it lest it should be exposed with unknown consequences. It was made clear earlier that accent reduction is professionally very beneficial for instructors of languages, actors, newscasters, spontaneous interpreters and under-cover agents. In the real world, however, there is one difference between under-cover agents and the rest of these professionals: the others can tolerate making mistakes or even blunders, whereas under-cover agents cannot. Any linguistic gaffe can cost them dearly. No doubt, linguistic talent, language competency, in general, and pronunciation, in particular, are indispensable prerequisites for someone who is willing to serve as an under-cover agent or in the more crudely named profession of ‘spying’. Grosjean (2010-a) wrote a compact, but linguistically and culturally very rich, article concerning the linguistic and cultural prerequisites for this very risky and mentally demanding profession of spying, especially as a sleeper agent (or deep cover agent). A sleeper agent can be either a native of the country that is spied on or a native of the spying country who has been embedded in the target country for a long time. In the case of a native citizen, there is no problem of native language and culture competency or of native unaccented pronunciation because he belongs to that linguistic/cultural community. The lack of native or even native-like linguistic and cultural qualifications may be the source of the problem for the non-native agent. This is so because, in spite of the very disciplined linguistic and cultural orientation of the person, he may inadvertently and subconsciously reveal a linguistic or cultural hint, however minor, that is inconsistent with the language and culture of the targeted country. As the intruder country does its best to groom its spies, so does the targeted country in priming its counterintelligence cadre. Spy trackers should supposedly be coached to watch for the tiniest linguistic, cultural and behavioral indications to be able to detect ‘deeply covered’ agents.

The above discussion brings forth the very delicate responsibility of preparing an under-cover agent. Naturally, if the agent is a native speaker, then accent becomes a non-issue. For the non-native agent, achieving native proficiency in L2 is a rarity unless he is embedded as a very young person so that he has ample opportunity for full immersion in the target language and culture. In the absence of the latter condition, the candidate will come from amongst adults. Even with adults, the younger the better for mastering an L2 or C2. As far as pronunciation is concerned, the target will be near-native proficiency, as native proficiency is a rarity. Even near-native proficiency is not easy to achieve in pronunciation, as early exposure to the sound system seems to be a prerequisite, unlike the rest of the linguistic systems of lexicon, morphology and syntax, in which many educated non-natives can outperform many natives. Once again, international figures such as Joseph Conrad, Krishna Menon and Henry Kissinger achieved the highest levels of native competency in all linguistic systems of English except for pronunciation. In the case of the latter two, whom I have heard speaking, it has been a phonetic accent (no interference with meaning) rather than a phonological one (interference with meaning). It is assumed that individuals recruited for spying must demonstrate prowess in many skills, foremost of which is competency in L2. This implies extensive orientation in pronunciation with particular focus on areas that trigger the most revealing phonological accent, followed by areas of less revealing phonetic accent.
To achieve this goal, the individual should go through a very stringent preparation and orientation under the supervision of highly gifted linguists and phoneticians whose focus would be all the steps highlighted in section 14.2, perhaps with greater depth and rigor. This deeper and more rigorous orientation of spies is needed because of the risky nature of the assignment. It implies preparation not only in the linguistic and phonetic nature of the targeted language, but also in C2 orientation. “If a person is claiming to be a native speaker, there may be occasional language oddities or incongruous phrasing that can alert you to this person not being a native speaker but trying to hide this fact” (http://www.wikihow.com/Spot-a-Spy). This implies that any L1 and C1 residues seeping accidentally through the adopted L2 and C2 can become potential traps.

At the highest level of orientation, the person may need a certain degree of theoretical familiarization with linguistics and the dynamics of speech production and pronunciation. Considerable difference is expected in the interpretation of speech acts and linguistic facts between an ordinary person, a person with some exposure to linguistics and a specialist. Consider this example that I recently had to comment on. One of my friends, who is a geologist, thought that another friend of ours was a speaker of a certain Neo-Aramaic (Modern Assyrian) dialect because all his <r>s sounded fully retroflex instead of the regular tap or trill <r>s of the language. I said: “No, he is not.” Then I explained to him that our friend has a speech defect that makes his <r>s sound as if they were retroflex. This indicates the huge gap between the impression of a lay person and the assessment of a specialist. There is no doubt whatsoever that theoretical knowledge and expertise in any field facilitate the mastery of applied skills and creativity in them.

One last piece of advice from a linguist to whoever is willing to consent to the mission of an under-cover agent: beware of your subconscious when handling the target language (L2) and target culture (C2). It has been emphatically stressed throughout this book that both native language (L1) and native culture (C1) are the creation of the brain and are stored in the subconscious. This is why Grosjean highlights the fact that “Sleeper agents have to ‘speak monolingually’ at all times, even when the situation is conducive to code-switching” (Grosjean, 2010-b). Certainly, Grosjean inspires me to stress the fact that sleeper agents should also behave monoculturally lest they should be snared by falling victim to their cognitively deep-seated C1. Let me cite the following anecdote from years back, when two undercover Iraqi policemen tricked two Iranian smugglers into revealing themselves and arrested them by manipulating the linguistic/cultural subconscious of the smugglers:

“Half a century ago, when we used to travel from Kirkuk in the north of Iraq to Baghdad in the center, we had the option of taking a very slow train that took twelve (12) hours or a bus that took six hours but was vulnerable to getting lost due to the lack of modern roads with proper signs. Once I took the bus; it pulled out in the morning and continued for two (2) hours, after which it stopped at a small town, picked up two more passengers and continued the journey. The two passengers were dressed in folk attire and were carrying midsize bags on their backs. After an hour or so, and to the surprise of all passengers, the driver announced that he had lost his way and had been circling around during the last hour. The driver was embarrassed to find himself back at the same town where he had first stopped and picked up the two passengers. We were all frustrated, but we had no better choice. Just by coincidence, two more passengers were picked up, who were dressed in pants and jackets. They proceeded to the last seats on the bus, which were empty. The bus drove for about ten (10) minutes and all of a sudden I heard someone from the back seats loudly uttering a few words which I knew were not Arabic, Turkmeni or Assyrian because I spoke those languages. They did not sound Kurdish, as I had some impression of how the Kurdish language sounds. When the few words were spoken out, I saw only two people from the middle seats turn their heads to look back at the source of the words. It was one of the last two passengers who was the speaker. In a split second, they both dashed forward in the direction of the first two passengers who had looked back, yelled out some obscenities and handcuffed them, saying that they were Iranian smugglers whom they had been tracking for hours. They then ordered the driver to return once again to the same town, where the police station was.”

I discovered later that the last two passengers were undercover policemen who had lost track of those two Iranian smugglers, but had a feeling that they had taken the bus. Ironically, the bus lost its way and by a stroke of chance was back where it had been two hours earlier. Apparently, the two undercover policemen intentionally went to the back of the bus and sounded out loudly a phrase in the Farsi language. Of all the passengers, only the two smugglers turned their heads and looked back because they thought they were being greeted by an acquaintance. Certainly, in turning their heads to the source of the Farsi greeting, they did not act consciously; rather, the two experienced policemen played a linguistic trick on them by manipulating the unintentional, subconscious and reflexive response of the two native speakers of Farsi. It was their linguistic and/or cultural subconscious that betrayed them. In sum, those two smugglers were caught because they failed to behave as Arabic-speaking monolinguals, as Grosjean phrased it, and so failed to hide their identity; hence, under-cover agents should be more careful than those two smugglers in order not to take the bait and uncover themselves under the influence of the subconscious.

14.4. CONCLUDING REMARKS

No one likes to have an accent; however, accent is usually part and parcel of attempting an L2 when one is no longer a child. Some adults stay with their accent throughout their life, but others try to reduce it as much as possible. The degree of success in accent reduction is contingent on many factors including age (the younger the more successful), linguistic talent, enthusiasm, professional orientation, etc. When all the other factors are in place, it is the professional instruction and orientation that eventually make the difference in the degree of success. Doubtless, not everyone is qualified to conduct the professional orientation in matters of language mastery and accent reduction. Any person entrusted with this responsibility must have the qualifications of a linguist and phonetician or be a ‘linguistically’ gifted individual. From the linguistic perspective, attempts at accent reduction or accent faking have no negative intent except in spying, which is a double-edged sword: good for the spying agent, but bad for the spied on. If one accepts this type of professional service and wants to avoid the risky consequences, he has to learn how to restrain the subconscious linguistic and cultural dominance of L1 and C1 in situations where L2 and C2 are needed.

REFERENCES

Abercrombie, D. (1967). Elements of general phonetics. Edinburgh: Edinburgh University Press.
Adams, C. (1979). English speech rhythm and the foreign learner. The Hague: Mouton.
Aitchison, Jean (1996). Seeds of language: language origin and evolution. New York: Blackwell.
Anderson, John R. (1980). Cognitive psychology and its implications. San Francisco: W.H. Freeman & Company.
Arnold, Magda B. (1984). Memory and the brain. Hillsdale, New Jersey: LEA Publishers.
Baddeley, Alan D. (1976). The psychology of memory. New York: Basic Books, Inc. Publishers.
——— (1993). Your memory: a user’s guide. London: Multimedia Books Limited.
Bates, Elizabeth (1999). On the nature and nurture of language. Frontiere della Biologia. The Brain of Homo Sapiens. Rome: Giovanni Trecanni.
Beck, Douglas L., and Flexer, Carol (2011). Listening is where hearing meets brain. http://www.hearingreview.com/issues/articles/2011-02_02.asp.
Best, Catherine T. (1991). The emergence of native-language phonological influences in infants: a perceptual assimilation model. Haskins Laboratories Status Report on Speech Research, SR-107/108, 1–30.
Best, Catherine T., McRoberts, Gerald W. and Goodell, Elizabeth (2001). Discrimination of non-native consonant contrasts varying in perceptual assimilation to the listener’s native phonological system. Journal of the Acoustical Society of America, Vol. 109, 2, 775–794.


Borden, Gloria J. and Harris, Katherine S. (1980). Speech science primer: physiology, acoustics and perception of speech. Baltimore: Williams & Wilkins.
Bourne, L.E., Dominowski, R.L., Loftus, E.F. and Healy, A.F. (1986). Cognitive processes. Englewood Cliffs, New Jersey: Prentice-Hall.
Caplan, David (1995). Language and the Brain, Vol. 4, No. 4, 1–4.
Carpenter, Harry (2004). The Genie within: your subconscious mind. San Diego: Anaphase II Publishing.
Catford, J.C. (1977). Fundamental problems in phonetics. Edinburgh: Edinburgh University Press.
Catford, J.C. (1994). A practical introduction to phonetics. Oxford: Clarendon Press.
Chechik, Gal, Meilijson, Isaac and Ruppin, Eytan (1999). Neuronal regulation: a mechanism for synaptic pruning during brain maturation. Journal of Neuronal Computation, Vol. 11, 2061–2080.
Chomsky, N. and Halle, M. (1968). The sound pattern of English. New York: Harper & Row.
Christensen, Ken Ramshøj (2001). The Co-evolution of language and the brain: a review of two contrastive views (Pinker & Deacon). Grazer Linguistische Studien, Vol. 55, 1–20.
Croom, Christopher (2003). Language origins: did language evolve like the vertebrate eye, or was it more like bird feathers? http://www.csa.com/discoveryguides/lang/gloss_f.php
Clancy, Barbara and Finlay, Barbara (2001). Neural correlates of early language learning. In Language development: the essential readings (Mike Tomasello & Elizabeth Bates, eds.), 307–330, Wiley-Blackwell.
Cummins, Jim (1979). Cognitive/academic language proficiency, linguistic interdependence, the optimal age question and some other matters. Working Papers in Bilingualism, 19, 121–129.
——— (1984). Bilingualism and special education: issues in assessment and pedagogy. Clevedon: Multilingual Matters.
Dalbor, J. (1969). Spanish pronunciation: theory & practice. New York: Holt, Rinehart & Winston.
Dale, P. & Poms, L. (1985). English pronunciation for Spanish speakers. Englewood Cliffs, New Jersey: Prentice-Hall.


Daniloff, Raymond G. (1973). Normal articulation processes. (Fred D. Minifie, Thomas J. Hixon, Frederick Williams, David J. Broad, eds.). Normal aspects of speech, hearing and language, 169–209. Englewood Cliffs, New Jersey: Prentice-Hall.
Dauer, R. (1983). Stress-timing and syllable-timing reanalyzed. Journal of Phonetics, 11, 51–62.
Deacon, Terrence (1997). The Symbolic species: the co-evolution of language and the human brain. London: Penguin Books.
de Houwer, Annick (1990). The Acquisition of two languages from birth: a case study. Cambridge Studies in Linguistics.
Delattre, P. (1965). Comparing the phonetic features of English, German, Spanish and French. Heidelberg: Julius Groos Verlag.
Eimas, P.D. (1978). Developmental aspects of speech perception. (In R. Held, H. W. Leibowitz, & H. L. Teuber, eds.), Handbook of sensory physiology, Vol. 8. Berlin: Springer.
Esling, John H. and Wong, R. F. (1983). Voice quality settings and the teaching of pronunciation. TESOL Quarterly, Vol. 17, 89–95.
Fleischhacker, Heidi (2000). Cluster-Dependent Epenthesis Asymmetries (http://www.linguistics.ucla.edu/people/grads/fleischhacker/uclawpl.pdf).
Fox, Barbara J. and Hull, Marion A. (2002). Phonics for the teacher of reading. Upper Saddle River/New Jersey: Merrill.
França, Aniela Improta (2006). Introduction to neurolinguistics. In: Textos em Psicolingüística, Ingrid Finger; Carmen Matzenauer (Org.). Pelotas: Editora da Universidade Católica de Pelotas, Vol. 1, 1–52.
Gardner, Howard (1983). Frames of mind: the theory of multiple intelligences. New York: Basic Books.
——— (1993). Creating minds: an anatomy of creativity seen through the lives of Freud, Einstein, Picasso, Stravinsky, Graham, and Gandhi. New York: Basic Books.
Gimson, A. C. (1967). An introduction to the pronunciation of English. London: Arnold.
Gopnik, A., Meltzoff, A., Kuhl, P. (1999). The scientist in the crib: what early learning tells us about the mind. New York: Harper Collins Publishers.


Grosjean, François (2010-a). Bilingual: life and reality. Harvard University Press.
——— (2010-b). http://www.guardian.co.uk/education/2010.
Hadlich, Roger, Holton, James and Montes, Matias (1968). A drillbook of Spanish pronunciation. New York: Harper & Row, Publishers.
Handbook of the International Phonetic Association (1999). Cambridge: Cambridge University Press.
Hiiemae, Karen M. and Palmer, Jeffrey B. (2003). Tongue movements in feeding and speech. Critical Reviews in Oral Biology & Medicine, Vol. 14, 413–429.
Hyman, Larry (1975). Phonology. New York: Holt, Rinehart & Winston.
Jabbari, Ali A., van de Weijer, Jeroen, Safari, Parvin and Falaknaz, Farane (2012). Journal of Teaching Language Skills, 3/4, 59–76.
Jakobson, R., Fant, G. & Halle, M. (1969). Preliminaries to speech analysis. Cambridge, Massachusetts: M.I.T. Press.
Johnson, J. S. and Newport, E. (1989). Critical period effects in second language learning: the influence of maturational state on the acquisition of English as a second language. Cognitive Psychology, Vol. 21, 60–99.
Joseph, Rhawn (2011). Origin of thought: consciousness, language, egocentric speech and multiplicity of mind. Journal of Cosmology, Vol. 14.
Kanokpermpoon, Monthon (2007). Thai and English consonantal sounds: a problem or a potential for EFL learning? ABAC Journal, Vol. 27-1, 57–66.
Kelly, Spencer, Özyürek, Asli and Maris, Eric (2009). Two sides of the same coin: speech and gesture mutually interact to enhance comprehension. Psychological Science, Vol. 20, 1–8.
Kenyon, J. S. and Knott, T. A. (1953). A Pronouncing dictionary of American English. Springfield, Massachusetts: G & C Merriam Company.
Kissin, Benjamin (1986). Conscious and unconscious programs in the brain. New York: Plenum Medical Book Company.
Ladefoged, P. (1982). A course in phonetics. New York: Harcourt Brace Jovanovich.
Ladefoged, P. and Maddieson, Ian (1996). Sounds of the world’s languages. Cambridge, Massachusetts: Blackwell.

Larson, Christian D. (1912). Your forces and how to use them (http://www.sacred-texts.com/nth/yfhu/yfhu02.htm).

Laver, John (1980). The phonetic description of voice quality. Cambridge: Cambridge University Press.

——— (1994). Principles of phonetics. Cambridge: Cambridge University Press.

Lehiste, I. (1970). Suprasegmentals. Cambridge, Massachusetts: M.I.T.

Lenneberg, Eric H. (1967). Biological foundations of language. New York: John Wiley & Sons.

Levitt, Robert A. (1981). Physiological psychology. New York: Holt, Rinehart & Winston.

Loftus, Elizabeth (1980). Memory: surprising new insights into how we remember and why we forget. Reading, Mass.: Addison-Wesley Publishing Co.

Lowie, W. and Bultena, Sybrine (2007). Articulatory settings and the dynamics of second language speech production (http://www.phon.ucl.ac.uk/ptlc/proceedings/ptlcpaper_15e.pdf).

MacKay, I. R. A. (1978). Introducing practical phonetics. Boston: Little Brown & Company.

McKinney, James C. (2005). The diagnosis and correction of vocal faults: a manual for teachers of singing and for choir directors. Long Grove/Illinois: Waveland Press.

Martinet, André (1964). Elements of general linguistics. London: Faber & Faber.

Miller, Cynthia. Brain for success (http://brainforsuccess.com/howyourbrainwork.html).

Morley, Joan (1991). The pronunciation component in teaching English to speakers of other languages. TESOL Quarterly, Vol. 25, 481–520.

O’Connor, J. D. (1973). Phonetics. London: Penguin Books.

Odisho, Edward Y. (1977). Arabic /q/: a voiceless unaspirated uvular plosive. Lingua, Vol. 42, 343–347.

——— (1977). The opposition /t/ vs. /th/ in Neo-Aramaic. Journal of the International Phonetic Association, Vol. 7, 79–83.

——— (1979/a). An emphatic alveolar affricate. Journal of the International Phonetic Association, Vol. 9, 67–71.

——— (1979/b). Consonant clusters and abutting consonants. System, Vol. 7, 205–210.

——— (1981). Teaching Arabic emphatics to the English learners of Arabic. Papers and Studies in Contrastive Linguistics, Vol. 13, 275–280.

——— (1988a). The sound system of modern Assyrian (Neo-Aramaic). Wiesbaden: Otto Harrassowitz Verlag.

——— (1988b). Sibawaihi’s Dichotomy of ‘majhūra’ and ‘mahmūsa’ Revisited. Al-arabiyya, Vol. 21, 81–90.

——— (1990). Phonetic and phonological description of the labio-palatal and labio-velar approximants in Neo-Aramaic. (Wolfhart Heinrichs, ed.). Studies in Neo-Aramaic, 29–33. Atlanta, Georgia: Scholars Press.

——— (1992). Transliterating English in Arabic. Zeitschrift für arabische Linguistik, Vol. 24, 21–34.

——— (2002). Bilingualism: a salient and dynamic feature of ancient civilizations. Mediterranean Language Review, Vol. 14, 71–97.

——— (2003). Techniques of teaching pronunciation in ESL, bilingual and foreign language classes. München: Lincom-Europa.

——— (2004). A linguistic approach to the application and teaching of the English language. New York: Edwin Mellen Press.

——— (2005). Techniques of teaching comparative pronunciation in Arabic and English. New Jersey: Gorgias Press.

——— (2007/a). A Multisensory, Multicognitive Approach to Teaching Pronunciation. Revista de Estudos Linguisticos da Universidade do Porto, Vol. 2, 3–28.

——— (2007/b). Linguistic tips for Latino learners and teachers of English. New Jersey: Gorgias Press.

——— (2010). An Aerodynamic, Proprioceptive and Perceptual Interpretation of Sibawayhi’s Misplacement of /ط/ and /ق/ with Majhūra Consonants. Zeitschrift für arabische Linguistik, Vol. 52, 39–52.

——— (2013). Some primary sources of accent generation in the pronunciation of English by native Arabs. Nicht nur mit Engelszungen (Beiträge zur semitischen Dialektologie: Festschrift für Werner Arnold zum 60. Geburtstag, eds. R. Kuty, U. Seeger and Sh. Talay, 265–274). Wiesbaden: Harrassowitz Verlag.

Pennington, Martha C. and Richards, Jack C. (1986). Pronunciation revisited. TESOL Quarterly, Vol. 20/2, 207–225.

Petitto, Laura-Ann (2002). http://www.dartmouth.edu/~news/releases/2002/nov/110402a.html.

Pinker, S. (1994). The language instinct. New York: Harper Perennial Modern Classics.

Port, Robert F. (2007). The graphical basis of phones and phonemes. (Murray Munro and Ocke-Schwen Bohn, eds., 349–365). Second-language speech learning: the role of language experience in speech perception and production. Amsterdam: John Benjamins Publishing Co.

Repetti, Lori (2012). Consonant-final loanwords and epenthetic vowels in Italian. Catalan Journal of Linguistics, Vol. 11, 167–188.

Roach, Peter (1983). English phonetics and phonology. Cambridge: Cambridge University Press.

Selinker, Larry (1972). Interlanguage. International Review of Applied Linguistics, Vol. 10, 209–231.

Shiver, Elaine (2001). Brain development and mastery of language in the early childhood years (http://www.idra.org/IDRA_Newsletter/April_2001_Self_Renewing_Schools_Early_Childhood/Brain_Development_and_Mastery_of_Language_in_the_Early_Childhood_Years/).

Stockwell, R. P. and Bowen, J. (1965). The sounds of English and Spanish. Chicago: The University of Chicago Press.

Tortora, G. and Grabowski, S. (1996). Principles of anatomy and physiology (8th ed.). New York: HarperCollins College Publishers.

Vicentini, Alessandra (2003). The economy principle in language: notes and observations from early Modern English grammars (http://www.ledonline.it/mpw/).

Werker, J. F. (1995). Exploring developmental changes in cross-language speech perception. (Lila R. Gleitman and Mark Liberman, eds.). An invitation to cognitive science. Cambridge, Massachusetts: MIT Press.

Werker, Janet F. and Tees, Richard C. (2002). Cross-language speech perception: evidence for perceptual reorganization during the first year of life. Infant Behavior & Development, Vol. 25, 121–133.

Werker, J. F., Yeung, H. H., & Yoshida, K. A. (2012). How do infants become experts at native-speech perception? Current Directions in Psychological Science, Vol. 21/4, 221–226.

Wesson, Kenneth A. Neuroscience (http://www.sciencemaster.com/columns/wesson/wesson_part_03.ph).

Whitley, M. S. (1986). Spanish/English contrasts: a course in Spanish linguistics. Washington, D.C.: Georgetown University Press.

Zero to Three (2009). How the brain develops (https://www.childwelfare.gov/pubs/issue_briefs/brain_development/how.cfm).

Zheng, Yanli, Sproat, Richard, Gu, Liang, Shafran, Izhak, Zhou, Haolang, Su, Yi, Jurafsky, Dan, Starr, Rebecca, Yoon, Su-Youn (2005). Accent detection and speech recognition for Shanghai-accented Mandarin (http://www.isca-speech.org/archive/interspeech_2005/i05_0217.html).