Shadowing as a Practice in Second Language Acquisition: Connecting Inputs and Outputs [1 ed.] 1138485500, 9781138485501


English Pages 214 Year 2019


Shadowing as a Practice in Second Language Acquisition

Shadowing is a theoretically and empirically well-examined method to develop L2 learners’ listening comprehension (input effect); enhance their subvocal rehearsal mechanism in the phonological working memory for learning new words, formulas, and constructions (practice effect); simulate some stages of speech production (output effect); and develop metacognitive monitoring and control by their executive working memory (monitoring effect). In Japan and some other Asian countries, shadowing is a well-recognized, popular method of learning English and Japanese as L2, and this book offers the chance for anyone new to this method to benefit. Through the research contained within this book, readers will be armed with detailed and useful accounts of the four effects above (i.e., input, practice, output, and monitoring effects) from a theoretical and empirical viewpoint.

Shuhei Kadota is a Professor of Applied Linguistics, Graduate School of Language, Communication and Culture, Department of Law at Kwansei Gakuin University, Japan.


Routledge Research in Language Education

The Routledge Research in Language Education series provides a platform for established and emerging scholars to present their latest research and discuss key issues in Language Education. This series welcomes books on all areas of language teaching and learning, including but not limited to language education policy and politics, multilingualism, literacy, L1, L2 or foreign language acquisition, curriculum, classroom practice, pedagogy, teaching materials, and language teacher education and development. Books in the series are not limited to the discussion of the teaching and learning of English only.

Books in the series include:

Addressing Difficult Situations in Foreign-Language Learning: Confusion, Impoliteness, and Hostility
Gerrard Mugford

Translanguaging in EFL Contexts: A Call for Change
Michael Rabbidge

Quantitative Data Analysis for Language Assessment Volume I: Fundamental Techniques
Edited by Vahid Aryadoust and Michelle Raquel

Shadowing as a Practice in Second Language Acquisition: Connecting Inputs and Outputs
Shuhei Kadota

Further Language Learning in Linguistic and Cultural Diverse Contexts: A Mixed Methods Research in a European Border Region
Barbara Gross

For more information about the series, please visit www.routledge.com/RoutledgeResearch-in-Language-Education/book-series/RRLE

Shadowing as a Practice in Second Language Acquisition
Connecting Inputs and Outputs

Shuhei Kadota

First published 2019
by Routledge
2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN
and by Routledge
52 Vanderbilt Avenue, New York, NY 10017

Routledge is an imprint of the Taylor & Francis Group, an informa business

© 2019 Shuhei Kadota

The right of Shuhei Kadota to be identified as author of this work has been asserted by him in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library

Library of Congress Cataloging-in-Publication Data
Names: Kadota, Shuhei, author.
Title: Shadowing as a practice in second language acquisition : connecting inputs and outputs / by Shuhei Kadota.
Description: New York, NY : Routledge, [2019] | Series: Routledge research in language education | Includes bibliographical references and index.
Identifiers: LCCN 2018058026 | ISBN 9781138485501 (hardback) | ISBN 9781351049108 (ebook)
Subjects: LCSH: Second language acquisition–Methodology. | Second language acquisition.
Classification: LCC P118.2 .K33 2019 | DDC 418.0071–dc23
LC record available at https://lccn.loc.gov/2018058026

ISBN: 978-1-138-48550-1 (hbk)
ISBN: 978-1-351-04910-8 (ebk)

Typeset in Galliard
by Newgen Publishing UK

Contents

List of figures
List of tables
Preface
1 What is shadowing?
2 Shadowing for L2 listening comprehension
3 Shadowing for promoting L2 learnability
4 Shadowing for L2 speech production
5 Metacognitive monitoring and control
6 Establishing a new concept of practice in L2 acquisition
References
Index

Figures

1.1 An image of shadowing as a verbatim repetition task requiring learners to repeat what they hear as accurately as possible
1.2 An image of a shadowing experiment on selective attention in auditory phonetics
1.3 Quasi-multi-processing in daily communication
1.4 An example of multi-processing in a daily conversation
1.5 The sound wave (upper) and the results of loudness and pitch analyses (lower) of “There, the one on top of the hill” spoken by a native speaker of English in the book-attached CD
1.6 The sound wave (upper) and the results of loudness and pitch analyses (lower) of the shadowing of “There, the one on top of the hill” by a low-intermediate Japanese learner of English
1.7 An image of repeating
1.8 An image of oral reading
1.9 An image of parallel reading
1.10 Two major effects of shadowing training for L2 learners
1.11 Revised model of the effectiveness of shadowing training for L2 learners
2.1 A simplified outline of listening comprehension
2.2 Switching perception and comprehension in L2 listening
2.3 Processing operations involved in L2 listening comprehension
2.4 Depiction of properties of the lemma and lexeme for each lexical entry
2.5 An example of the restaurant schema
2.6 An image of continual “switch-on” perception during L2 shadowing
2.7 The McGurk effect: The sound being uttered is /ba/ but the movement of the mouth reflects /ga/
2.8 Brain activation as measured by fMRI in three subjects
2.9 Map of Brodmann areas
2.10 The sensori-motor dorsal and lexical–semantic ventral streams involved in speech perception
2.11 Photographs of 12- to 21-day-old infants imitating the facial expressions of an adult

2.12 An image of the premotor cortex (F5) in the macaque brain
2.13 Views of the monkey brain (left) and human brain (right): There is a region called F5 (left) that roughly corresponds to the region of Broca’s area (right) associated with language production
2.14 An illustration of how NIRS measures hemoglobin changes (taken from www.shimadzu.com/an/lifescience/imaging/nirs/nirs2.html)
2.15 A participant ready for the experiment in the soundproof room (left), and the NIRS holder placed on the left and right sides of the participant’s head with 48 measurement channels (right)
2.16 The NIRS used in the experiment (left) and the data collection being observed and controlled by experimenters in the next room (right)
2.17 The NIRS probes and measurement channels on the left side of a participant’s head
2.18 Brain activity according to NIRS data (oxy-Hb concentration) in the left (left) and right (right) hemispheres during shadowing (20 seconds after start)
2.19 Brain activity according to NIRS data (oxy-Hb concentration) in the left (left) and right (right) hemispheres during listening (20 seconds after start)
2.20 Listening comprehension scores of the three proficiency groups at the pre- and postintervention listening tests
2.21 Listening comprehension at pre-, mid-, and posttests
2.22 Rates of correctly reproduced words during shadowing at the pre-, mid-, and posttests
2.23 Articulation speed (number of syllables during oral reading) at the pre-, mid-, and posttests
2.24 Steps from shadowing training to the development of listening comprehension
2.25 The scores of the pre- and posttests for the treatment (experimental) and control groups
2.26 Improvement in reading comprehension rates (%) between pre- and posttests in 2009 and 2010 camps
2.27 Improvement in listening comprehension rates (%) between pre- and posttests in 2009 and 2010 camps
2.28 Improvement in listening comprehension scores between the pre- and posttests in both groups
2.29 Improvement in dictation exercise scores between the pre- and posttests in both groups
3.1 Three basic components of human information processing
3.2 Human memory system proposed by Atkinson and Shiffrin (1968) and Baddeley (1986)
3.3 Baddeley’s working memory: Revised 2015 model
3.4 Example of a phonetically similar and dissimilar letter sequence

3.5 Example of a sequence of monosyllabic and multiple-syllable words
3.6 Phonological loop (phonological working memory) system (Gathercole and Baddeley)
3.7 Phonological loop (phonological working memory) system (Logie)
3.8 The highest number of syllables participants correctly transcribed in the four conditions
3.9 The percentage of subjects who correctly transcribed the given sentences
3.10 Neuropsychological model of the phonological loop
3.11 Example of the two-second constraint in the phonological loop for three ESL learners
3.12 Language acquisition device (LAD), consisting of UG and LAF, proposed by Chomsky and others
3.13 Assumed causal relationship between vocabulary size and nonword repetition
3.14 Lesion sites for conduction aphasia, as well as for Wernicke’s and Broca’s aphasias
3.15 Mean RTs (ms) in the semantic relatedness judgment task for four stimulus pair types in English
3.16 Number of mean response errors in the semantic relatedness judgment task for four stimulus pair types in English
3.17 Mean RTs (ms) in the semantic relatedness judgment task for three stimulus pair types in Japanese kanji words
3.18 Number of mean response errors in the semantic relatedness judgment task for three stimulus pair types in Japanese kanji words
3.19 Effect of articulatory suppression on L2 reading comprehension
3.20 The effects of regular and irregular rhythmic beats on L2 reading comprehension
3.21 The effect of irregular rhythmic beats on phrase-sized reading in L2
3.22 Areas recruited for overt rehearsal by subtracting noise-level activation from overt rehearsal activation
3.23 Areas recruited for covert rehearsal obtained by subtracting noise-level activation from covert rehearsal activation
3.24 Areas obtained by subtracting covert from overt rehearsal activation
3.25 Two sessions in the experiment
3.26 Display of reading silently with subvocalization while tracing every word in the passage
3.27 Mean subvocal reading speed (wpm)
3.28 Mean subvocal reading time
3.29 Mean comprehension accuracy
3.30 Mean subvocal reading time
3.31 Mean comprehension accuracy
3.32 Long-term memory system
3.33 An example of classical conditioning

3.34 An example of operant conditioning
3.35 Two-way channel model from sensory register to long-term memory
3.36 From episodic to semantic memories through decontextualization
3.37 Hippocampus in the limbic system, which connects the perceptual information and the implicit memories in the cerebral cortex
3.38 From episodic to semantic to implicit memory
3.39 Correctly reproduced rates (%) in shadowing and phrase-/sentence repeating
3.40 The articulation duration of the heard segments (ms) in shadowing and phrase repeating
3.41 The repeated practice of phrase-level shadowing
4.1 High-speed processing in everyday conversation
4.2 Example of a phoneme arrangement error
4.3 Simplified L2 speech production model based on Kormos
4.4 Mean accuracy (%) of word pitch accent
4.5 Lexical accent accuracy (%) in oral reading of Japanese
4.6 Articulation speed (total uttered moras / time in seconds) in oral reading before and after shadowing and repeating
4.7 Lexical accent of hashi (橋: bridge) in the Tokyo and Osaka dialects
4.8 L1 Japanese speakers’ perception of how natural the lexical accent of L2 speakers and a speech synthesizer sounded
4.9 Mean deviation in mora duration (ms) of L2 recordings from the control speech before and after shadowing/repeating
4.10 Mean accuracy (%) of pitch accents before and after training
4.11 Sound intensity per syllable (dB) before and after training: Clause 1
4.12 Sound intensity per syllable (dB) before and after training: Clause 2
4.13 Pitch changes of syllables (Hz) before and after training: Clause 1
4.14 Pitch changes of syllables (Hz) before and after training: Clause 2
4.15 Multiple route model of aural word repetition based on conduction aphasia data
4.16 An image of personalized oral reading
4.17 Overall results of pre- and posttraining spoken English tests
4.18 Articulation speed data
4.19 Content data
4.20 Selective shadowing
4.21 Interactive shadowing
4.22 Correlation between shadowers’ GOP and the two subjective scores
4.23 Correlation between learners’ speech GOP and the two subjective scores
5.1 Example of metacognitive monitoring
5.2 Types of metacognitive knowledge
5.3 Metacognitive activity: Metacognitive monitoring and control
5.4 Major pre-, mid- and posttask metacognitive activities

5.5 Percentage of the brain occupied by the frontal association cortex in humans and other animals
5.6 Picture shown to patients with damaged frontal association cortex
5.7 An integrated conceptual framework for working memory in L2 acquisition
5.8 Example of the Simon task
5.9 Example of an n-back task
5.10 PWM and EWM hypothesis for L2 acquisition
5.11 Think-aloud protocol research
5.12 The images of monolinguals and bilinguals
6.1 Two fundamental research questions for L2 learning
6.2 An L2 teacher contemplates the two research questions
6.3 Input and output theories of L2 acquisition
6.4 Cognitive, utterance, and perceived fluencies
6.5 Typical example of event related potential (ERP) N400 appearing 400 ms after processing a critical word. CZ (the vertical axis) is the potential of an electrode placed at the center of the parietal lobe of the brain in the internationally recognized position for brain wave measurement.
6.6 Intervention to connect L2 input and output
6.7 Structurally feasible objects illustrated by Schacter, Cooper, and Delaney
6.8 Structurally impossible objects illustrated by Schacter, Cooper, and Delaney
6.9 Semantic priming
6.10 Input- and output-driven practice
6.11 A joint attentional frame for adult–child communication
6.12 A child can work out the intention of an adult saying “dog”
6.13 Children’s ability to create usage patterns
6.14 Typical early pivot schemas of children learning English
6.15 Results of four groups in Tests 1–3
6.16 Average RTs (ms) of ER group A in the four tests
6.17 Average RTs (ms) of control group B in the four tests
6.18 Sportsmen practicing
6.19 Simplified model of bilingual processing
6.20 The four pillars of L2 acquisition: Input processing, practice, output production, and monitoring
6.21 Four proposed effects of shadowing upon L2 acquisition
6.22 Emergence order of three effects of shadowing
6.23 Steps from shadowing training to development of listening comprehension
6.24 Hypothetical route map to L2 acquisition through shadowing

Tables

2.1 Relevant Brodmann area numbers and names
2.2 Major symptoms of various aphasia syndromes
2.3 Major types of brain imaging techniques
2.4 Comparison of NIRS data (oxy-Hb concentration) in the left hemisphere (LH) between shadowing and listening
2.5 Mean increase from pre- to posttest for Groups A and B
2.6 Mean increases from pre- to posttest for the bottom-up, pre-shadowing and top-down, post-shadowing groups
2.7 Pre- and posttest scores on all measures by Groups A and B
2.8 Pre- and posttest scores on all measures of the two treatment groups
3.1 Presented sentences, number of syllables, and projection time for each sentence
3.2 Symptoms of conduction aphasia
3.3 Articulation durations of the chunks in the first and sixth shadowing
3.4 Rates (%) of the four correctly recalled chunks
4.1 Shadowing and oral reading schedule
4.2 Five recordings of a Japanese sample passage read aloud by L2 speakers (Pre, Post, Dur, Pit, Dur+Pit)
4.3 Mean mora duration in pretraining (“pre-oral”) and posttraining (“post-oral”) recordings
4.4 Mean duration (ms) of both clauses and the whole passage before and after training
4.5 Mean range of pitch (in semitones) in Clauses 1 and 2 and the final words in each clause, before and after training
4.6 Two effects of shadowing on output speech
4.7 Types of formulaic sequence
4.8 Average scores of objective measures for comprehensibility and smoothness by NS-1, -2, -3
5.1 Main results of Yoshida et al.
5.2 Oral reading speed (syllables per minute), pitch range (semitones), holistic evaluation scores, and written recall scores
5.3 Attentional resource allocation results
5.4 Pearson’s correlation coefficients for all measurements
5.5 Major types of dementia
5.6 Age at onset of different types of dementia for monolinguals and bilinguals
5.7 Mean reaction times and accuracy under different conditions
6.1 Languages listed in order of increasing difficulty by the US Foreign Service Institute (FSI)
6.2 BASE scores of two ER groups (junior high school seventh and eighth graders) and one non-ER group (senior high school tenth graders)
6.3 Typical early pivot schemas of children learning English
6.4 Number of junior high school students who passed each grade of the Eiken test of Practical English Proficiency
6.5 Collocational continuum in English

Preface

Shadowing is a technique for enhancing second language (L2) acquisition, in which learners repeat speech aloud as they hear it, as precisely as possible, while continuing to listen attentively to the incoming speech. This is the definition of shadowing I would like to employ in this book.

Shadowing has sometimes been defined somewhat differently, with emphasis on the listening task. “Shadowing is an act or a task of listening in which the learner tracks the heard speech and repeats it as exactly as possible while listening attentively to the incoming information” (Tamai, 2005, p. 34). Shadowing is “an active and highly cognitive technique for EFL listening skill development, in which learners track heard speech and vocalize it simultaneously” (Hamada, 2017). Thus, shadowing has usually been considered a task for improving L2 listening comprehension through immediate “online” repetition of input speech.

However, my previous books suggest, based on theoretical and empirical research, that shadowing is effective in four different ways in L2 learning and teaching, particularly of English and Japanese (Kadota, 2007, 2012, 2014, 2015, and 2018).

1) Shadowing facilitates automatic perception of input speech, which leads to improvement of L2 listening skill (input effect).
2) Shadowing enhances L2 learners’ subvocal rehearsal rates in phonological working memory and accelerates intake or internalization of words, formulas, constructions, etc. (practice effect).
3) Shadowing promotes speaking in L2 by simulating parts of the cognitive process involved in speech production (output effect).
4) Shadowing develops L2 metacognitive monitoring and control, that is, the executive functions, by the executive working memory (monitoring effect).

The present book is intended to examine the cognitive psycholinguistic processes involved in shadowing input speech, based on empirical data obtained mainly from L2 learners of English and Japanese. Overall, the book is a theoretical and empirical attempt to demonstrate the effectiveness of shadowing practice in transforming L2 learners’ declarative, explicit knowledge into more proceduralized, automatized knowledge; this is a process many researchers now seem to be taking note of in investigating what is really important to acquiring L2 fluency.

In Japan and some other Asian countries, shadowing is already a recognized, popular method of L2 learning because it is acknowledged to be theoretically and empirically well grounded. The present book aims to fill the debated theoretical gap between the “input model” and the “output model” and to provide a rationale for an intermediate practice connecting the two models. In so doing, I believe that this book will provide useful new insights on the effectiveness of shadowing and will serve as a readable, innovative guidebook for learners, educators, and researchers around the world.

Finally, I would like to express my deepest gratitude to the anonymous reviewers of the proposal for this book as well as to the editor from Routledge, Ms. Samantha Phua. However, I am of course solely responsible for any remaining limitations and shortcomings.

Shuhei Kadota


1  What is shadowing?

1.1  What is shadowing?

Shadowing has usually been considered a technique for improving listening ability in a second language (L2), wherein learners track the heard speech and repeat it back verbally in as exact a manner as possible while continuing to listen attentively to incoming messages (see Tamai, 2005, p. 34; Hamada, 2017). An image of the technique is shown in Figure 1.1.

Figure 1.1 An image of shadowing as a verbatim repetition task requiring learners to repeat what they hear as accurately as possible

Shadowing was originally used as an experimental method in phonetics, particularly in auditory phonetic research on “selective attention,” to explore the mechanism underlying perception of human speech sounds. Typically, in such an experiment, the participants wear headphones while listening to two different messages, one in their right ear and one in their left, and are instructed to shadow (i.e., repeat verbally) the speech from the specified ear. Figure 1.2 depicts the typical shadowing experiment in auditory phonetics.

Figure 1.2 An image of a shadowing experiment on selective attention in auditory phonetics (Kadota et al., 2012, p. 33)

The purpose of such an experiment is to examine how shadowing performance is influenced by changes in the semantic content or sound quality (e.g., speed, pitch, loudness) of the message in the unattended ear. A typical finding from such experiments is that people do not completely ignore the unattended speech; instead, they process it roughly in a “gestalt-like” or holistic manner, even when focusing on shadowing the target message (Cherry, 1966; Kadota, 2007).

More practically, shadowing is used as a basic training method for simultaneous interpreters at schools of interpretation; beginner interpreters must first learn to listen and speak simultaneously before they begin to translate from one language to another.

Recently, shadowing has been acknowledged as an effective task for enhancing L2 acquisition or learning, mainly by developing listening comprehension, and is widely employed by numerous language teachers and educators in Japan for that reason (Kadota, 2007). Its widespread use reflects the fact that one of the more urgent problems for many learners of English as a second or foreign language in Japan, and in any other non-English-speaking country, is finding a good way of improving interactive oral language abilities such as listening and speaking.

More specifically, in daily communication, we must almost simultaneously perform three tasks: 1) understand the message addressed to us; 2) conceptualize the response; and 3) respond to the message. This multi-processing imposes a much greater cognitive load than single-task processing, particularly for L2 learners. It is illustrated in Figure 1.3, a simplified diagram based on Levelt’s (1993) model of speech comprehension and production. The following is an example illustrating this multiple processing in our daily communication (see Figure 1.4):

SPEAKER A: We have a party at my friend’s house next Saturday.
SPEAKER B: Sorry, on that day I have plans to go out for dinner with my wife.

[Figure 1.3 diagram labels: Conceptualizer, Formulator, Articulator, Parser, Acoustic–phonetic processor; overt speech; speech]

Figure 1.3 Quasi-multi-processing in daily communication

Figure 1.4 An example of multi-processing in a daily conversation (Kadota, 2015, p. 14)

In responding to Speaker A, Speaker B must perform all three of the aforementioned tasks almost simultaneously and within a short period of time after Speaker A’s utterance. This shows how essential quick, automatic responses are in real-time communication. Shadowing is assumed to be a good method of promoting the development of such automatic responses in L2 because it forces learners to practice the dual task of input speech perception and output production. Indeed, this cognitive duality of shadowing appears to create a situation similar to real-time human communication.

Relatedly, it is not uncommon to find ourselves repeating in our mind the name of a person we have just met or the name of a restaurant we have just entered in order to store them in our long-term memory. Although shadowing refers not to this use of the “internal voice,” or subvocalization, but rather to the practice of vocalized repetition, it does share some common features with subvocal speech for memorization. Furthermore, it seems very effective in promoting explicit and implicit memory formation of new words, chunks, and grammatical constructions. Although English and Japanese are phonetically distinct languages in terms of both segmental sounds (i.e., vowels and consonants) and nonsegmental sounds (i.e., speech rhythm, intonation), it is expected that shadowing can help Japanese learners overcome the many problems that they would encounter in L2 English acquisition.

1.2  A sample of L2 shadowing

Again, shadowing involves repeating speech immediately after hearing it; in other words, it is the online, immediate process of perceiving speech and repeating it back. The following is an example dialog included in a shadowing textbook for beginners (Tamai, 2008, p. 40):

A is talking to B (Chris), who is driving a car:
A: Wow, Chris, look at that house.
B: Which one? I just can’t look away.
A: There, the one on top of the hill. It has white and blue roof tiles. That’s cute. Could you get one like that for me?
B: Of course, honey, if you could wait about 20 years. Or would you like to marry a rich guy?

Figures 1.5 and 1.6 depict the sound waves and the results of loudness and pitch analyses of one sentence from this dialog (“There, the one on top of the hill.”), produced using Praat, a free software package for acoustic phonetic analysis (Kadota, 2015, pp. 69, 338). As we can see, the duration of the utterance of “There, the one on top of the hill” is almost the same in the original and shadowed speech, at 2.7 s and 2.5 s, respectively. The patterns of loudness (dB) change do not differ much either. However, the Hz ranges of pitch change differ remarkably: In the original CD speech, there is far greater intonational up-and-down variety in the Hz range than in the L2 shadowed speech. Interestingly, the very narrow pitch range of the latter speech is common to all Japanese learners of English, no matter how much conscious effort they put into imitating the original speech in their shadowing.
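Pitch ranges stated in Hz are hard to compare across speakers whose voices differ in baseline pitch, so such comparisons are often expressed in semitones, the unit also used for pitch range in Tables 4.5 and 5.2. The following is a minimal sketch of that conversion, assuming the F0 samples have already been exported from a pitch tracker such as Praat. The function name `pitch_range_semitones` and both F0 contours are hypothetical and invented for illustration; they are not the measurements behind Figures 1.5 and 1.6.

```python
import math


def pitch_range_semitones(f0_hz):
    """Return the pitch range of an F0 contour in semitones.

    f0_hz: iterable of fundamental-frequency samples in Hz. Unvoiced
    frames (None or non-positive values) are skipped.
    """
    voiced = [f for f in f0_hz if f is not None and f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced F0 samples")
    # One octave (a doubling of frequency) equals 12 semitones.
    return 12.0 * math.log2(max(voiced) / min(voiced))


# Hypothetical contours (Hz), invented for illustration only; 0 marks an
# unvoiced frame.
model_f0 = [110, 180, 220, 150, 0, 130, 105]
shadowed_f0 = [120, 135, 140, 128, 0, 125, 118]

model_range = pitch_range_semitones(model_f0)
shadowed_range = pitch_range_semitones(shadowed_f0)
print(f"model: {model_range:.1f} st, shadowed: {shadowed_range:.1f} st")
```

Because the semitone scale is logarithmic, a learner with a low-pitched voice and a model speaker with a high-pitched voice can be compared on the same footing, which a raw difference in Hz would not allow.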

Figure 1.5 The sound wave (upper) and the results of loudness and pitch analyses (lower) of “There, the one on top of the hill” spoken by a native speaker of English on the CD accompanying the book (Kadota, 2015, pp. 69, 338)

Figure 1.6 The sound wave (upper) and the results of loudness and pitch analyses (lower) of the shadowing of “There, the one on top of the hill” by a low-intermediate Japanese learner of English (Kadota, 2015, pp. 69, 338)

1.3  Repeating, oral reading, and parallel reading

There are several L2 learning tasks related to shadowing. The following are perhaps the most frequently used in the classroom.

1.3.1  Repeating

Repeating, like shadowing, is an oral repetition task often used in L2 learning classrooms; it involves having learners first listen to a message and then repeat it back during a sufficient pause (see Figure 1.7). It is often considered similar to, or essentially the same as, shadowing. However, Kadota (2007) suggested that, from a psycholinguistic viewpoint, they are cognitively different: Shadowing is assumed to be the “online” or immediate process of repeating speech, and due to this time pressure, learners must always focus on the input speech sounds without thinking excessively about their grammatical structures or meaning (at least at the beginning level). In contrast, during repeating, learners are given sufficient time to repeat the input speech during the provided silent pause, which ostensibly makes this process “off-line.” Thus, it is believed that learners perform a variety of cognitive tasks before they begin repeating, such as grammatical or semantic processing, rather than merely focusing on the input speech. We expect that, as learners become more accustomed to shadowing (and thus require increasingly less cognitive effort to perform it, to the point that it becomes automatic), they will be able to shadow the input speech while simultaneously processing the meaning of the input sentence and performing other such higher-level processing.

Notably, the difference in articulation latency between shadowing and repeating seems crucial; in shadowing, students are forced to focus more on the sounds of the input speech, whereas in repeating, the learner has sufficient time to analyze the input message syntactically and semantically (e.g., they engage in parsing and semantic proposition construction). This difference may lead us to the hypothesis that the two methods would differ in their effectiveness in promoting L2 acquisition. Moreover, for repeating, it would be necessary to examine empirically the length of pauses necessary for learners to repeat a given message as well as the optimal length of such pauses (e.g., less than 2 s). I will discuss the latter question later in Chapter 3, along with the examination of phonological working memory span (see also section 2.4.6 in Chapter 2 for empirical results comparing shadowing and repeating).

Figure 1.7 An image of repeating (Kadota et al., 2012, p. 24)

1.3.2  Oral reading

Whereas shadowing is the task of reproducing phonological representations constructed from perceived auditory input, oral reading, or “reading aloud,” is the task of articulating phonological representations coded from written visual input (see Figure 1.8). As with shadowing, oral reading for beginning L2 learners is essentially the “online” process of decoding written inputs into phonological codes without giving much attention to the sentence structure or meaning. Later, when learners have become proficient in the phonological coding of written words, they will be able to read aloud written text using fewer cognitive resources while simultaneously attending to and understanding the meaning of the input sentence, etc. Again, like shadowing, oral reading is characterized by cognitive duality: the task involves both phonological coding of written words and articulating them into output sounds.

Figure 1.8 An image of oral reading (Kadota et al., 2012, p. 25)


However, oral reading differs from shadowing in terms of the following important points:

1) While shadowing, learners are presented with auditory speech inputs, and thus are required to follow the input speech at the same speed as the speaker; in contrast, during oral reading, learners decide the reading speed by themselves.
2) This flexibility in reading speed for oral reading can sometimes cause individuals to mispronounce words both segmentally and nonsegmentally, as they have no speech input.
3) In general, shadowing is considered cognitively more challenging than oral reading. This suggests that, in shadowing, the linguistic material presented to beginning learners must be much easier than in oral reading.

Although oral reading is a traditional classroom task in L2 learning for helping learners develop silent reading comprehension, it is somewhat undervalued under the currently predominant “communicative approach” to L2 learning. Specifically, this approach does not regard oral reading highly because it does not allow for communicative interaction through the negotiation of meaning (Yonezaki and Ito, 2012). However, Kadota (2007) reevaluated the practice of oral reading, arguing that it facilitates the efficiency and automaticity of phonological coding of written inputs, which is crucial for later semantic comprehension. In this way, oral reading, like shadowing, has come to be recognized as an effective means of improving L2 learners’ speaking and oral production (see Chapter 4 for more details).

1.3.3  Parallel reading

Shadowing is based on auditory speech inputs and oral reading on written visual inputs. What happens, then, when the two tasks are combined? This is known as “parallel” or “synchronized reading” (i.e., shadowing with written text). Specifically, learners shadow an oral message while also reading the same text.
In other words, they are expected to read aloud a given text while listening to the model input speech (see Figure 1.9). As with shadowing and oral reading, it is considered an “online” task. Given the highly complex, dual-task nature of parallel reading, one might think it too difficult to actually perform. This is true in a strict sense. However, research in cognitive psychology has shown that humans are endowed with the capacity for “selective attention,” which makes it possible to focus on a single input without processing the unattended input. In other words, processing should be “switched off” for the unattended input automatically during a parallel reading task. Usually, this task is employed in the classroom before actually introducing shadowing.


Figure 1.9 An image of parallel reading (Kadota et al., 2012, p. 25)

1.4  The proposed effects of shadowing

So how does shadowing influence L2 learners? Tamai (2005) argued that for Japanese learners, shadowing is perhaps the best method of developing L2 listening comprehension. Later, Kadota (2007, p. 34, 2012, p. 135), in discussing the effectiveness of shadowing and oral reading, hypothesized two possible effects of shadowing training on L2 learning, particularly ESL learning, as follows.

I. Shadowing promotes automatic perception of input speech, which is a lower-level process preceding input comprehension, and thus accelerates the development of listening skills; this is called the “input effect” of shadowing.
II. Shadowing enhances the vocal and subvocal rehearsal speed of heard speech and promotes internalization (i.e., explicit and implicit memorization) of rehearsed words, chunks, grammatical constructions, etc.; this is named here the “practice effect” of shadowing. (See also Kadota, 2007, 2012, 2015 for a discussion of the effects of oral reading practice in ESL/EFL.)

These two effects are illustrated in Figure 1.10. To date, a certain amount of relevant data has accumulated concerning these two effects of shadowing for Japanese ESL/EFL learners. Overall, the literature on L2 learners of English or Japanese does offer support for how shadowing leads to the improvement of listening skills, particularly the


[Diagram: SHADOWING → automatic speech perception → improvement of listening skill; SHADOWING → acceleration of vocal and subvocal speech rate → internalization of lexical chunks, constructions, etc.]

Figure 1.10 Two major effects of shadowing training for L2 learners (based on Kadota, 2012, p. 135)

automatization of L2 speech perception. More specifically, this research has found the following:

1) Continued practice of shadowing leads to an increase in the percentage of correctly shadowed syllables until about the fifth practice session (Hori, 2008; Shiki, Mori, Kadota, and Yoshida, 2010).
2) It promotes English listening ability among university (Tamai, 2005) and high school EFL students (Mochizuki, 2006) in Japan.
3) It promotes Korean learners’ listening span for Japanese as an L2 (Sakoda and Matsumi, 2004, 2005).

These empirical results for L2 learners of English or of Japanese are taken to support the claim that shadowing improves listening skills, particularly through the automatization of L2 speech perception. We will consider in detail in Chapter 2 why these findings are relevant to the development of listening comprehension.

Regarding acceleration of subvocal rehearsal speed and efficiency, which underlies the other proposed effect of shadowing (the practice effect), there is similarly some support. More specifically, the research suggests effective transfer of shadowing training to learnability, which in turn promotes subvocal rehearsal speed and efficiency and therefore L2 learning.

1) Shadowing facilitates implicit memorization of the practiced (i.e., shadowed) phrases for Japanese ESL learners (e.g., Miyake, 2009b).
2) Shadowing practice, together with oral reading drills, develops Japanese-as-L2 skills, such as listening (e.g., SPOT), vocabulary, and dictation (Sakoda and Matsumi, 2004, 2005).


[Diagram: SHADOWING leads to four effects: automatic speech perception → improvement of listening skill; acceleration of vocal and subvocal speech rate → internalization of lexical chunks, constructions, etc.; simulation of sentence production process → improvement of speaking skill; enhancement of meta-cognitive monitoring and control → promotion of executive control]

Figure 1.11 Revised model of the effectiveness of shadowing training for L2 learners

3) Shadowing in L2 Japanese enhances students’ learnability by improving their memory spans according to the results of various tests, such as listening or digit span (Sakoda and Matsumi, 2004, 2005; Sakoda, 2006, 2010).

The main purpose of Chapter 3 is to explore why shadowing promotes students’ L2 learnability in the above ways, namely, why it helps to improve explicit and implicit learning (memory formation or memorization) of new lexical chunks, formulae, grammatical constructions, etc.

In addition to the above two effects, Kadota (2015, 2018) proposed two further plausible effects of shadowing: 1) the promotion of L2 speaking, called the “output effect,” and 2) the enhancement of learners’ metacognitive activities such as monitoring and controlling L2 processing, named the “metacognitive monitoring effect.” With the addition of these effects, the model depicted in Figure 1.10 can be expanded as follows (Figure 1.11).

III. Shadowing simulates the sentence production process in L2 and thereby facilitates the automaticity or efficiency of L2 speech production.

There was relatively little empirical research on the output effect concerning sentence production until recently. However, to our knowledge, the following findings offer support for it:

1) Shadowing increases the intensity of learners’ speech in oral reading of new material they have not practiced (e.g., Mori, 2011).
2) It accelerates learners’ articulation speed (e.g., Miyake, 2009a; Mori, 2011).
3) It widens the pitch (F0) range of learners’ speech (e.g., Miyake, 2009c; Hori, 2008; Mori, 2011).
4) Shadowing is effective in helping Chinese JSL learners acquire prosodic features such as Japanese pitch accents (Hayashi, 2014).


5) Shadowing is effective for alleviating stuttering blocks (A, Sakai, and Mori, 2014).

These findings indicate that shadowing stimulates the sentence production process, particularly in the “phonetic encoding” or “articulator” stages in L2 speech production (Kormos, 2006, p. 106). In Chapter 4, I will examine why and how shadowing is effective for promoting L2 speaking, particularly regarding the output simulation effect.

IV. Shadowing enhances L2 learners’ metacognitive activities in L2 processing.

Since the monitoring and controlling effect of shadowing has only recently been proposed (Kadota, 2018), there is almost no empirical research suggesting this effect, except for the following neurolinguistic finding. During shadowing, there may be greater cerebral activity in the frontal association cortex (Kadota et al., 2015a), which is supposed to support the brain’s executive functions or executive working memory. This leads us to the hypothesis that during shadowing learners are engaged in monitoring and controlling their repetition performance. This hypothetical effect of shadowing will be examined in Chapter 5.

Here, I offer the following outline of the chapters in this book. In Chapter 2, I explain in detail the process of listening comprehension and examine the hypothesized effect of shadowing on it in L2 (i.e., input effect). Then, I will report on past empirical research results in Japan concerning the development of speech perception in EFL and JFL. As you will see, there is a fair amount of empirical and experimental data on this effect of shadowing. Chapter 3 explores the effects of shadowing on implicit and explicit memory, in other words, on learnability (i.e., practice effect). In particular, I focus on the implicit learning effect of shadowing on L2 acquisition by way of “repetition” (or “direct priming”) and conceptualize the importance of vocal and subvocal rehearsal in L2 acquisition.
Furthermore, the chapter discusses the neurological connection (or interface) of explicit and implicit memory formation or learning. Chapter 4 discusses the output effect of shadowing: namely, its contribution to speaking or speech production, with an emphasis on the simulation effect of sentence production. Chapter  5 focuses on the most recent hypothesis regarding the effect of shadowing on metacognitive monitoring and control, reviewing relevant theoretical and empirical studies (i.e., monitoring effect). Finally, Chapter 6 consolidates the information so far discussed to offer a new theoretical framework of “practice” in L2 acquisition and learning connecting input processing and output production. I also refer to a hypothetical route map of L2 acquisition through shadowing-​based practice.


2  Shadowing for L2 listening comprehension

2.1  A simplified sketch of L2 listening comprehension

Understanding messages by listening to speech is such a natural and everyday activity that few of us stop to reflect on what is occurring in the mind during this process. However, it becomes evident how complex speech understanding is when we observe an individual with brain damage or a language comprehension disorder. A good example would be Wernicke’s aphasia, a “receptive type” of aphasia wherein patients typically find themselves unable to understand the meaning of spoken or written language addressed to them, even when they can produce speech with normal grammar and speed. Unlike native speakers, L2 learners are more liable to consider what is involved in the process of listening, as they often wish to identify the reason that they cannot understand a message given in their target L2, such as the following:

1) They lack the ability to perceive the sounds of the message.
2) They lack knowledge of the vocabulary.
3) They have imprecise or impractical knowledge of grammar to express the intended message.
4) They cannot retrieve the necessary background knowledge or schemata to aid in their comprehension.

In this way, whether in L2 or L1, only when we experience problems in communication do we pay more attention to the marvelous process of language. The study of listening, or spoken language comprehension, has close ties with the fields of phonetics, linguistics, cognitive psychology, and neuroscience. In these fields, there are numerous detailed and elaborate models proposing to explain how people recognize speech sounds to construct phonetic or phonological representations and then use these representations to process spoken language. Figure 2.1 offers a rough but easy-to-understand sketch of the major stages of listening comprehension based on these theories (proposed by Kohno, 1992, referring to Pimsleur, 1971).
The “filter device” depicted in the figure allows us to focus our auditory attention on a particular stimulus we want to perceive while attenuating


[Diagram: sensory input → filter device → echoic memory (preliminary auditory analysis stage) → short-term memory stage, comprising analysis by synthesis, prediction-testing, and the rehearsal buffer (primary information analysis stage), linked to long-term memory]

Figure 2.1 A simplified outline of listening comprehension (Kohno, 1992, p. 41)

perception of (or filtering out) the other stimuli. This is what allows us to focus on what one person is saying at a party where many others are also talking (which is why this phenomenon is often referred to as the “cocktail party effect”). “Echoic memory” is the auditory form of sensory memory (with the visual version being iconic memory; see Chapter 3 for a detailed description of the human memory system, including sensory memory). The filter device and echoic memory constitute the preliminary auditory analyses of input speech and produce an output called a “phonetic or phonological mental representation.” The central stage of listening comprehension is the short-term memory stage (i.e., working memory stage; see also Chapter 3), wherein primary information processing is conducted, namely, the processing that builds the ultimate semantic representation of the input stimuli. In this stage, there are two major processes: “analysis by synthesis” and “prediction-testing.”

“Analysis by synthesis” is, in its broadest sense, the process of collecting and using all relevant knowledge (both linguistic and nonlinguistic) stored in long-term memory to process speech input. Such knowledge includes:

1) Phonetic or phonological knowledge to analyze speech sounds, including sound changes such as /ænd/ → /ənd/, /nd/ (‘and’) or /hər/ → /ər/ (‘her’).
2) Grammatical or morpho-syntactic knowledge to construct a morpho-syntactic structure of the input sentence.


3) Vocabulary or lexical knowledge to access the words in the speech stimulus.
4) Contextual knowledge to extract the meaning of the utterance.
5) Background knowledge or schemata to predict the message.
6) Paralinguistic knowledge to infer the emotional information of the speaker.
7) Knowledge of nonverbal communication such as body language to complement the speech stimuli.

Among these, 1 to 4 constitute linguistic knowledge, whereas 5 to 7 constitute nonlinguistic knowledge. “Prediction-testing” not only includes collecting the relevant knowledge above but also involves much more active use of it to predict upcoming words and messages in processing speech.

2.2  The two stages of listening: Perception and comprehension

To determine the effects of shadowing training on L2 listening, it is necessary to distinguish the two major processes involved in it: ‘perception’ and ‘comprehension’. Individuals who are just beginning the process of L2 learning must consciously switch between perception and comprehension because these processes occupy an inordinate amount of cognitive resources; in this way, these two processes have a trade-off relationship. However, as perception becomes less conscious or more automatized through practice or habituation, increasingly greater cognitive resources can be allocated to comprehension. In other words, learners can begin to devote their attention solely to comprehension. Figure 2.2 illustrates this process of attentional shift from beginner to advanced learners.

In the initial stage (perception) of L2 listening comprehension, learners must identify which segmental (e.g., consonants and vowels) and prosodic (e.g., rhythm, intonation) speech sounds make up the incoming speech stimuli and then use these to construct either concrete phonetic or abstract phonological representations. However, in the higher-level comprehension stage, which follows the perception stage, various cognitive mental operations are performed to interpret the meaning of the speech input. These operations include 1) lexical, 2) syntactic, 3) semantic, 4) contextual, and 5) schema-based processing (see Figure 2.3 below).

1) Lexical processing refers to how we recognize and identify individual words in speech stimuli based on the phonetic or phonological representations built in the perception stage. Successful lexical processing requires consultation of the “mental dictionary,” or lexicon, and is usually accompanied by the retrieval of various properties of the words consulted, namely, their lemma (i.e., semantic and syntactic properties) and their lexeme (i.e., their morphological and phonological properties).
Figure 2.4 depicts these two types of information embedded in the retrieved words (i.e., the lexical entries in the mental lexicon).


[Diagram: for beginning learners with non-automatic perception, attention switches between perception and comprehension; for more advanced learners with automatic perception, attention is directed to comprehension]

Figure 2.2 Switching perception and comprehension in L2 listening (adapted from Samuels, 2006)

[Diagram: speech input → perception → comprehension, where comprehension comprises lexical, syntactic, semantic, contextual, and schema processing]

Figure 2.3 Processing operations involved in L2 listening comprehension (Kadota, 2007, p. 46)

Incidentally, when the speech input comprises nonwords or words that learners have never heard (i.e., unknown words), these can neither be retrieved nor used for meaning comprehension, since they are not stored in the mental lexicon. The exception to this is when the meaning of these words/nonwords can be predicted from contextual or schema processing (described below).

2) Syntactic processing (also called “parsing”) involves the construction of a grammatical or syntactic representation of a sentence (i.e., a mental representation of the structure and grammar of a sentence); this is a necessary


Figure 2.4 Depiction of properties of the lemma and lexeme for each lexical entry (from Kadota, 2015, p. 78, who adapted it from Levelt, 1989)

process for both L2 learners and L1 users. Evidence of this stage of syntactic processing can be demonstrated simply with the following example. The two sentences, “The man hit the woman” and “The woman hit the man,” share the same words but are arranged such that the agent–patient relationships depicted are reversed, leading the sentences to have different meanings. Additionally, “The colorless green idea sleeps furiously” (a sentence often cited in linguistic articles to distinguish between syntax and semantics) is semantically nonsensical but syntactically appropriate. Furthermore, ungrammatical sentences usually require more time to process than grammatical ones.

3) Semantic processing refers to judging whether a particular sequence of words is semantically appropriate or not. This involves determining the semantic features or components of each word in the sequence and confirming that they are congruent. For instance, the words “man,” “woman,” “boy,” and “girl” are supposed to consist of the following semantic features:

• man = [+human], [+male], [+adult]
• woman = [+human], [-male], [+adult]
• boy = [+human], [+male], [-adult]
• girl = [+human], [-male], [-adult]
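
The feature bundles listed above can be sketched as a small data structure. The following is only a toy illustration, not a model proposed in the literature cited here: the dictionary layout and the helper name `contrast` are assumptions introduced for this example, and only the three features given in the list are encoded.

```python
# Toy illustration only: binary semantic features for the four words in the text.
# True stands for "+" and False for "-".
FEATURES = {
    "man":   {"human": True, "male": True,  "adult": True},
    "woman": {"human": True, "male": False, "adult": True},
    "boy":   {"human": True, "male": True,  "adult": False},
    "girl":  {"human": True, "male": False, "adult": False},
}


def contrast(word_a, word_b):
    """Return the sorted names of the features on which two words differ."""
    a, b = FEATURES[word_a], FEATURES[word_b]
    return sorted(name for name in a if a[name] != b[name])


print(contrast("man", "woman"))  # ['male']
print(contrast("man", "boy"))    # ['adult']
print(contrast("man", "girl"))   # ['adult', 'male']
```

Comparing feature bundles in this way is one simple means of making explicit which feature distinguishes a given pair of words.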

In these examples, “man” contrasts with “woman” for the feature [±male], whereas “man” and “boy” contrast for [±adult]. We similarly analyzed the


semantic features of the earlier sentence “The colorless green idea sleeps furiously” to judge that it was semantically inappropriate. Ultimately, the goal of semantic processing is to construct a semantic or conceptual representation, which is composed of a set of “propositions.”

4) We tend to make good use of the contextual information that precedes the sentence currently being processed; this is termed “contextual processing.” Contextual information is particularly useful in understanding ambiguous sentences. This is often particularly necessary for English, which has a variety of words with two or more meanings (e.g., bank, court, organ). For such words, preceding information is crucial for understanding which of their meanings is being used. Take, for instance, the different uses of “organ” in the following sentences:

The main attraction of the concert was the organ.
The patient waited in a hospital for the organ transplant.

In the former sentence, “organ” refers to the large musical instrument, while in the latter, it refers to a part of the human body. Contextual processing can also be used to interpret structurally ambiguous (as opposed to lexically ambiguous) sentences, such as “The hostess greeted the guest with a smile.” This sentence can be interpreted in either of the following ways:

The smiling hostess greeted the guest.
The hostess greeted the smiling guest.

In structurally ambiguous sentences, the prosody of the sentence often gives listeners crucial hints as to the correct interpretation; however, this only works when the sentences are spoken rather than written.

5) “Schema” can be roughly considered as a kind of background information, often representing a generic, prototypical mental concept or frame. It is also called a “script” or a “scenario,” and is used to guide both L1 and L2 listening comprehension. Although schemas are formed via our individual experiences, they often have commonalities shared by learners.
Figure 2.5 shows the often-cited “restaurant schema,” which is formulated from generic knowledge commonly shared by people in English-speaking countries. This “restaurant schema” makes it possible for us to answer the questions that follow the short sample passage below:

John went to a restaurant. The waiter gave John a menu. The waiter came to the table. John ordered lobster. John was served quickly. John left a large tip.


Name: Restaurant
Props: Tables, Menu, Food, Bill, Money, Tip
Roles: Customer, Waiter/waitress, Cook, Cashier, Owner
Entry conditions: Customer is hungry; Customer has money
Results: Customer has less money; Owner has more money; Customer is not hungry

Scene 1: Entering
  Customer enters restaurant
  Customer looks for table
  Customer decides where to sit
  Customer sits down

Scene 2: Ordering
  Waitress brings menu
  Customer reads menu
  Customer decides on food
  Customer orders food
  Waitress gives food order to cook
  Cook prepares food

Scene 3: Exiting
  Waitress gives bill to customer
  Customer gives tip to waitress
  Customer goes to cashier
  Customer gives money to cashier
  Customer leaves restaurant

Figure 2.5 An example of the restaurant schema (adapted from Greene, 1986, p. 38)
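
How a schema of this kind can supply information that a passage leaves unstated may be sketched in code. The following is a minimal, hypothetical illustration rather than Greene's formalism: the names `RESTAURANT_SCHEMA`, `default_actor`, and `who` are assumptions made for this example, and the default for serving food is an assumed addition, since the figure does not list a serving event.

```python
# Hypothetical, simplified encoding of part of the restaurant schema.
RESTAURANT_SCHEMA = {
    "roles": ["customer", "waiter", "cook", "cashier", "owner"],
    # Role expected to perform each event when the passage is silent.
    # "serves_food" is an assumed default; the other two follow the scenes in the figure.
    "default_actor": {
        "brings_menu": "waiter",
        "serves_food": "waiter",
        "pays_bill": "customer",
    },
}


def who(event, stated, bindings):
    """Answer 'who did this?' from the passage if stated, else from the schema."""
    if event in stated:
        return stated[event], "stated"
    role = RESTAURANT_SCHEMA["default_actor"][event]
    # Use the person bound to the role if known, otherwise the bare role name.
    return bindings.get(role, "the " + role), "inferred"


# Facts taken from the John passage: the menu giver is stated, John is the customer.
stated = {"brings_menu": "the waiter"}
bindings = {"customer": "John"}

print(who("brings_menu", stated, bindings))  # ('the waiter', 'stated')
print(who("serves_food", stated, bindings))  # ('the waiter', 'inferred')
print(who("pays_bill", stated, bindings))    # ('John', 'inferred')
```

The "inferred" label corresponds to answers that a reader would qualify with "probably," because they come from the schema's defaults rather than from the passage itself.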


[Questions]
1) What did John eat? (Lobster)
2) Who gave John the menu? (The waiter)
3) Who gave John the lobster? (Probably the waiter)
4) Who paid the bill? (Probably John)
5) Why did John get a menu? (So he could order food)
6) Why did John give the waiter a large tip? (Because he was served quickly)

Similarly, take the following passage:

Mary heard the ice cream truck coming down the street. She remembered her birthday money and rushed into the house and locked the door.

In this passage, many readers would need to pause to think of a reason why she locked the door, because the phrase “the ice cream truck” in the first sentence activates a related schema in which the “locked the door” phrase does not fit (Greene, 1986, pp. 37–40). In summary, schema-driven comprehension, whether conscious or unconscious, is a typical cognitive process in language comprehension.

Researchers have proposed that these five operations making up comprehension do not proceed in a serial fashion; rather, they occur almost simultaneously. For instance, when we hear the sentence, “I went to the bank yesterday,” the meaning of “bank” cannot be lexically determined, but it can be interpreted via contextual processing:

I went to the bank to open a new account yesterday.
I went to the bank of a river to enjoy jogging yesterday.

This clearly suggests that the five operations interact with each other dynamically to achieve language comprehension.
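
The way context can settle the meaning of “bank” in the sentences above may be illustrated with a deliberately crude sketch. This is not a model of human contextual processing: the cue-word sets, the sense labels, and the function name `disambiguate` are all assumptions chosen for this example, and counting overlapping cue words is only one simple stand-in for the interaction described in the text.

```python
# Toy illustration: choose a sense of an ambiguous word from cue words in context.
# The senses and cue words below are assumptions made for this example.
SENSES = {
    "bank": {
        "financial institution": {"account", "money", "loan", "open"},
        "riverside": {"river", "jogging", "water", "shore"},
    },
}


def disambiguate(word, sentence):
    """Return the sense whose cue words best match the sentence, or None if undecided."""
    tokens = set(sentence.lower().replace(".", "").split())
    scores = {sense: len(cues & tokens) for sense, cues in SENSES[word].items()}
    best = max(scores.values())
    winners = [sense for sense, score in scores.items() if score == best]
    # No cue at all, or a tie between senses, leaves the word ambiguous.
    return winners[0] if best > 0 and len(winners) == 1 else None


print(disambiguate("bank", "I went to the bank yesterday."))
# None
print(disambiguate("bank", "I went to the bank to open a new account yesterday."))
# financial institution
print(disambiguate("bank", "I went to the bank of a river to enjoy jogging yesterday."))
# riverside
```

Without contextual cues the function returns `None`, mirroring the point that the meaning of “bank” cannot be determined lexically in the first sentence.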

2.3  Switching between perception and comprehension in listening

As noted previously, perception and comprehension do not operate simultaneously; chronologically, perception precedes comprehension because the latter process is not possible without perceiving the speech input. However, perception of the speech input need not be completely accurate for appropriate comprehension; the latter can occasionally compensate for inaccurate perception of the input. For instance, suppose two students are presented with the sentence, “We need to support her,” and Student A perceives the sentence as “We need supporter” whereas Student B perceives it as “We need support her.” Although neither


Figure 2.6 An image of continual “switch-​on” perception during L2 shadowing (Kadota, 2007, p. 55)

perception is correct, both students are likely to attempt to reach a correct understanding of the sentence by activating the relevant lexical, semantic, and contextual information. During shadowing practice, learners are not given sufficient time to appropriately utilize the linguistic and nonlinguistic information to predict the message; in other words, they cannot switch between perception and comprehension. Instead, they must concentrate solely on perceiving the speech input (see Figure 2.6 above); therefore, they only process the input speech to the point where the phonological representation is constructed. This is assumed to be the reason why shadowing leads to more successful automatization of speech perception for L2 learners.

2.4  Theoretical background of the effect of shadowing on speech perception

Based on the above, how can greater automaticity of speech perception in L2 be achieved? Krashen (1985) insisted that providing learners with a large amount of comprehensible input is a necessary and sufficient condition for L2 acquisition. Certainly, no researcher in L2 acquisition denies the importance of input processing or “input theory” (see Chapter 6) in L2 acquisition. However, there is ample evidence that it is difficult to develop perceptual automaticity solely through listening to sufficient L2 input. In Chapter 1, we discussed the “selective attention” mechanism of input perception, which makes it possible for us to focus on particular inputs while excluding unattended ones when more than one message is given to us. Although it has been shown that we do not completely ignore unattended speech inputs, we are able to effectively filter them out of processing. This involves the use


of the filter device mentioned in section 2.1. The fact that such a device exists suggests that learners cannot improve their speech perception through merely hearing L2 speech passively as background noise, such as while reading a book or talking with friends or family members. Instead, learners would need to concentrate on perceiving speech actively and consciously. Thus, to effectively enhance L2 learners’ speech comprehension ability, it would be necessary to precisely measure perceptual processing, in other words, what is going on in our minds when we are actively processing speech sounds. Understanding this process would help develop evidence-based methods of promoting L2 speech perception by shadowing for use by learners and teachers (i.e., input effect). To this end, we will now outline major relevant theories and data from recent studies, including the McGurk effect, the motor theory of speech perception, and the mirror neuron system.

2.4.1  McGurk effect

The McGurk effect is a perceptual phenomenon that indicates the close interaction between hearing and vision in speech perception. When we hear a person pronouncing the sound /ba/, our ear will naturally recognize that sound as /ba/. However, when we hear the same person speaking /ba/ but watch a video of the face of that person pronouncing /ga/, we will not perceive the sound correctly as /ba/ but rather as /da/ or /ga/. Figure 2.7 depicts this illusion. In sum, when the auditory perception of one sound is paired with the visual recognition of another sound, a different sound is perceived altogether. So what does the McGurk effect tell us about the mechanisms of human speech perception? It illustrates that we do not perceive speech solely by using our ears (i.e., the auditory system); instead, we construct an articulatory image of the sound in our minds and superimpose that image onto the sound we actually hear.
In other words, based on visual information regarding mouth movements, we articulate a sound and hear what we expect to hear, so to speak. This explains the well-known finding that, in situations with a large degree of background noise, speech inputs are easier to understand when the face of the speaker is visible.

2.4.2  Motor theory of speech perception

Our discussion above suggests that perceiving speech is highly related to producing it; this link is what led Liberman and his coworkers (1963, 1967) to propose the “motor theory of speech perception.” They insisted that, in speech perception, the human motor system is recruited in mapping or changing acoustic speech inputs into phonological codes. The theory appears to be consistent with the commonly supported notion that articulation is the prerequisite for perception and that “we cannot perceive sounds which we cannot pronounce.” For instance, we would need to know how to articulate /r/ in “right” and “rice” and /l/ in “light” and “lice” correctly in order to differentiate /r/ and /l/ perceptually.


Figure 2.7 The McGurk effect: The sound being uttered is /​ba/​but the movement of the mouth reflects /​ga/​(Kadota, 2014, p. 66)

Although this motor theory was epoch-making at the time of its development, it was largely disregarded by many researchers. However, in 2004, Wilson, Saygın, Sereno, and Iacoboni provided empirical data supporting the motor theory. They hypothesized that listeners, during sound perception, would show activation in the motor areas of their brain, a sign that they were recruiting the motor system to articulate the sounds. They carried out an fMRI (functional magnetic resonance imaging) experiment to investigate whether motor areas were indeed recruited in speech perception. In their experiment, ten English-speaking British adults passively listened to 16 blocks of monosyllabic nonwords while their brain activation was recorded using fMRI. Remarkably, regions such as the premotor cortex and the primary motor cortex showed activation in all ten subjects when they were passively listening to nonsense monosyllables as compared to a resting period. Figure 2.8 shows the areas activated by the task. More specifically, speech perception activated the premotor (BA6) and primary motor (BA4) areas of the brain, as well as the auditory areas (BA41 and BA42) in every one of Wilson et al.’s subjects; there were, of course, minor variations in activity among participants (i.e., motor-related activations were bilateral in four subjects but left- and right-lateralized in two and four subjects, respectively). Figure 2.9 provides a map of the Brodmann areas, while Table 2.1 shows some of the relevant region numbers and names.


Figure 2.8 Brain activation as measured by fMRI in three subjects (Wilson et al., 2004)


Figure 2.9 Map of Brodmann areas (Kadota, 2015, p. 314) (taken from http://spot.colorado.edu/~dubin/talks/brodmann/brodmann.html)

Table 2.1 Relevant Brodmann area numbers and names (Kadota, 2012, p. 78)

BA4         primary motor cortex
BA6         premotor and supplementary motor cortex
BA39        angular gyrus
BA22        Wernicke’s area
BA40        supramarginal gyrus
BA41, 42    primary auditory cortex
BA44, 45    Broca’s area

As expected, speech perception activated the auditory (BA41 and BA42) and Wernicke’s (BA22) areas of the brain, which are normally considered to be involved in speech perception. The interesting finding was the obvious activation of motor-related areas (BA4 and BA6), which suggests that neural correlates underlying motor activity for producing sounds are recruited during speech perception. This lends support to the idea that speech perception is obtained through superimposing articulatory images over the perceived input sounds. In other words, these results provide clear support for the motor theory of speech perception.

2.4.3 Roles of the ventral and dorsal streams in speech perception

The well-known group of syndromes called “aphasia” is characterized by various impairments in language processing; it is usually caused by acquired brain damage, such as cerebral infarction, brain tumors, cerebral hemorrhages, or traumatic brain injury. Aphasia usually consists of a loss of the ability to understand or produce speech without impairments to thinking or intelligence. Table 2.2 briefly summarizes the major symptoms of various aphasia syndromes.

Among the aphasia syndromes listed in Table 2.2, Wernicke’s aphasia is characterized by impairments in speech perception, including meaning comprehension, whereas speech production is intact. This aphasia syndrome is caused by damage to Wernicke’s area (mostly located in BA22), which is in the posterior part of the left hemisphere (see Figure 2.9 above). Wernicke’s aphasia led researchers to believe that Wernicke’s area is a major language perception center of the brain and underlies perception and comprehension of speech when listening to any language.

Recent developments in neurocognitive research, however, indicate that Wernicke’s area (BA22) alone is not responsible for speech comprehension. Instead, there appear to be two basic cerebral routes (or streams) of connected brain areas responsible for perceiving speech (Ward, 2010): “ventral” and “dorsal.” The ventral stream is concerned with lexical–semantic information processing. This processing stream is based on auditory–semantic processing rather than auditory–motor processing and contains Wernicke’s area, etc. The dorsal route, in contrast, is involved in articulatory motor control. This stream, which includes Broca’s area (BA44), the premotor and primary motor areas (BA6 and BA4), etc., is the hub that links the various aspects of speech processing (i.e., auditory, motor, and possibly visual) (Ward, 2010; Griffiths and Warren, 2002).
Table 2.2 Major symptoms of various aphasia syndromes (Obler and Gjerlow, 1999, p. 40)

Syndrome                         Speech                     Comprehension   Repetition   Naming    Lesion site
Broca’s aphasia                  poor, nonfluent            good            poor         poor      anterior
Wernicke’s aphasia               fluent, empty              poor            poor         poor      posterior
Conduction aphasia               fluent                     good            poor         poor      arcuate fasciculus
Anomic aphasia                   fluent, circumlocutions    good            good         poor      anywhere
Global aphasia                   virtually none             poor            poor         poor      large
Transcortical motor aphasia      little                     good            good         not bad   outside in frontal lobe
Transcortical sensory aphasia    fluent                     poor            good         poor      outside in parietal lobe

Recent studies using fMRI have suggested that the dorsal stream responds to silent articulation (i.e., thinking), and it is believed to be a neurocognitive basis for the articulatory (i.e., phonological) loop of working memory (see Chapter 3 for details). Figure 2.10 depicts the dual-stream model of speech perception. In the figure, the areas marked with horizontal and vertical stripes between the dorsal and ventral streams are also involved in the auditory analysis of speech input. Altogether, the dual-stream model provides clear support for the validity of the “motor theory of speech perception” described above.

Figure 2.10 The sensori-motor dorsal and lexical–semantic ventral streams involved in speech perception (Ward, 2010, p. 228). Labels in the figure: sensori-motor speech loop (nonsemantic repetition, motor-based speech perception?); angular gyrus (phonological buffer); Broca’s area (planning of speech production); semantic knowledge; speech recognition; part of Wernicke’s area (links auditory, motor, and visual aspects of speech?); Heschl’s gyrus (primary auditory cortex).

2.4.4 Mirroring system

In 1977, Meltzoff and Moore reported a phenomenon called “neonatal imitation,” in which 12- to 21-day-old infants appeared innately capable of imitating an adult’s facial expressions. This surprising ability is depicted in Figure 2.11. Meltzoff and Decety (2003, p. 492) insisted that “imitation is innate in human beings, which allows them to share behavioral states with other ‘like me’ agents.” In their original experiment, Meltzoff and Moore presented infants with several facial expressions and gestures sequentially and had their responses videotaped and scored by observers. They found it difficult to explain this imitation in terms of conditioning or learning just after birth and thus concluded that it was hardwired. They proposed that imitation is a way for infants to learn about adults.


Figure 2.11 Photographs of 12-​to 21-​day-​old infants imitating the facial expressions of an adult (Meltzoff and Decety, 2003, p. 492)

Although neonatal imitation has since aroused serious controversy among researchers, it is believed that, in spite of some expected individual differences, we all have some sort of innate capacity to imitate others and that this constitutes the basic framework for our learning abilities.

As far as L1 acquisition is concerned, “imitation” was thoroughly criticized by the well-known linguist Noam Chomsky in the late 1950s. He contended that L1, specifically the L1 grammar, is produced by an innate LAD (language acquisition device). Later on, Steven Pinker (1994), in his discussion of the nature of human language, also insisted that all humans share an innate “language instinct.” Both have persuasively argued that L1 acquisition is possible only by postulating the existence of an innate LAD, which comprises a “UG” (universal grammar) and a “parameter”; neither considered it to be the result of “imitation” or “analogy.”

In recent neurocognitive studies, however, imitation has been the subject of renewed interest as a mechanism underlying the acquisition of various behaviors, including language. This renewed interest has primarily been driven by the discovery of “mirror neurons,” which in turn led to the development of a new theory regarding an imitation-based learning mechanism in our brain. This new theory relies on the distinction between two types of imitation (see Ward, 2010, p. 164):

1) “Mimicry” refers to the superficial reproduction of behavior; more specifically, it is based on a sensorimotor, rather than a cognitive, level of information processing and does not consider the purpose of the behavior or the performer’s intention. A good example is a parrot’s repetition of speech: parrots mimic words that they hear without understanding the meaning of those words.

2) “Imitation,” however, is a more sophisticated, cognitive method of behavioral reproduction, as it is based on more detailed observation of the behavior’s purpose or the performer’s intention. When imitating actions or words, we consciously reproduce them via a deeper level of processing while simultaneously thinking of the underlying motivation for their production. A good example would be a comedian imitating the voice of a famous actor or popular singer. An important aspect of imitation is awareness of the underlying motivation of the action: if we were to imitate a person putting something into a cup and then were asked how we had done it, most of us would mention the motivation of the person we were imitating rather than the physical movement made (i.e., using the right or left hand).

Imitation is further believed to comprise two component processes, as Kadota (2012, pp. 151–152) suggests on the basis of the relevant research (see also Rizzolatti and Sinigaglia, 2006):

(a) Coding, which consists of analyzing the target activity and then decoding (or transforming) it into manageable elements (“motor acts”) corresponding to our motor repertoires.

(b) Rearrangement, which involves arranging the decoded elements in the appropriate order to form a natural sequence of the target activity.
When considering this distinction between imitation and mimicry, it is possible that the abovementioned criticism of imitation by linguists applies only to superficial mimicry. However, it is unclear at present which of these types of imitation actually underlies neonatal imitation; this is a matter for future discussion.

Now that we are familiar with the distinction between mimicry and imitation, we will move on to the underlying processes of imitation. Current research suggests that imitation is primarily governed by the “mirror neuron system” or “mirror system.” The mirror neuron system automatically reflects or “mirrors” perceived behaviors by generating motor representations of them, which in turn are activated during imitation of the perceived action. Furthermore, the mirror neuron system is believed to be involved in the integration of motor representations into relevant knowledge in long-term memory. It is arguable that this process underlies humans’ “learning by imitation,” which, despite being so harshly criticized by linguists, is believed to play a crucial role in human information acquisition. Nowadays, neuroscientists regard mirror neurons as essential for learning by imitation in both nonhuman primates and humans.

Mirror neurons were first discovered in the frontal cortex of macaques by Rizzolatti and colleagues at the University of Parma, Italy, in the 1990s. The mirror neurons were primarily found in F5 of the inferior frontal cortex, or the premotor area. Figure 2.12 shows the location of the premotor cortex (F5) in the macaque brain.

Figure 2.12 An image of the premotor cortex (F5) in the macaque brain (An illustration drawn with reference to Arbib, 2012, p. 108)

Since mirror neurons are found in the frontal (i.e., premotor) cortex of nonhuman primates, it is hypothesized that they are primarily located in the frontal cortex in humans as well. It has even been convincingly argued that there is a close relationship between the mirror neuron system and linguistic competence, given that the macaque premotor cortex (F5) roughly corresponds to the brain regions in humans associated with language production (i.e., Broca’s area; Yoshida, 2009; see also Table 2.2). It should be noted that the links between the human frontal cortex and the mirror system have been demonstrated only indirectly, through accumulated neuroscientific data obtained using fMRI and other recently developed neuroimaging techniques, rather than directly.

In summary, the mirror (neuron) system, particularly the one located in the frontal cortex, is believed to be the neural correlate of learning by imitation, including neonatal imitation (see Figure 2.11). Additionally, the system is hypothesized to underlie language acquisition in L1, and possibly in L2, by allowing repetition and imitation of speech sounds, words, phrases, and grammatical formulae (constructions).


Figure 2.13 Views of the monkey brain (left) and human brain (right): There is a region called F5 (left) that roughly corresponds to the region of Broca’s area (right) associated with language production (An illustration drawn with reference to Arbib, 2012, p. 88)

2.4.5 Cerebral activation by shadowing: A neuroscience study

Thus far, I have provided some theoretical background on why I believe shadowing influences speech perception, drawing on the McGurk effect, the motor theory of speech perception, and the mirror neuron system. Of most relevance to the concept of shadowing is the assumption that the mirror neuron system provides us with the capacity for learning by imitation. In what follows, I will outline a study investigating the brain mechanism underlying the actual practice of shadowing.

Kadota et al. (2015a) used NIRS (near-infrared spectroscopy) to investigate brain activity while Japanese EFL learners practiced shadowing or listened to an English passage. NIRS (also called “optical topography”) is one of the more recently developed brain imaging techniques and has attracted the attention of neuroscience researchers. Table 2.3 lists several brain imaging techniques currently employed in brain activity experiments.

Table 2.3 Major types of brain imaging techniques (adapted from Kadota, 2010b, p. 117 and based on Kawashima, 2003)

Technique                                         Measurement          Invasiveness      Experimental constraints
1) EEG: electroencephalogram                      electrical activity  no                minor
2) PET: positron emission tomography              radioactivity        radioactive       minor
3) fMRI: functional magnetic resonance imaging    blood oxygenation    electromagnetic   major
4) MEG: magnetoencephalography                    magnetic fields      no                major
5) NIRS: near-infrared spectroscopy               blood oxygenation    no                no

NIRS relies on hemoglobin, a compound in the blood. Hemoglobin scatters light, and “the ratio of infrared light absorbed to that scattered changes depending on the degree of hemoglobin binding with oxygen. NIRS measures this rate of change and the change in oxygenated hemoglobin concentration” (www.shimadzu.com/an/lifescience/imaging/nirs/nirs_top.html). Figure 2.14 depicts how NIRS probes (of which there are two types) function to measure the rate of hemoglobin change.

Figure 2.14 An illustration of how NIRS measures hemoglobin changes (taken from www.shimadzu.com/an/lifescience/imaging/nirs/nirs2.html). Labels in the figure: photo-transmitter probe (laser light); photo-receiver probe; 30 mm probe spacing; banana-shaped light path in living organism; brain surface data detected by reflective measurement.

The participants in Kadota et al. (2015a) were 28 paid Japanese EFL learners (11 males and 17 females) who volunteered for the experiment. They were either undergraduate or graduate students of various majors at private universities in Japan. To confirm their proficiency levels, they were administered the Oxford Quick Placement Test. Their scores ranged from 28 to 56 (out of 60), with an average of 45.7. This indicated that they tended to fall into the upper intermediate level (i.e., B2, or independent user) according to the CEFR (Common European Framework of Reference for Languages) criteria. Participants’ proficiency was therefore judged sufficient for the given shadowing and listening tasks.

Ten English passages were used as materials for the main tasks of shadowing and listening, while 20 additional passages were selected for the control task of silent reading before or after the main tasks. Each passage was followed by a multiple-choice comprehension question. The difficulty levels and word counts of the passages were confirmed to be statistically homogeneous (average Flesch readability score: 67.92; average word count: 104.5). The following is an example of the passages used for shadowing or listening:

We all tell lies for different reasons. We begin to tell simple lies when we become aware of the use and power of language, usually around age four. Since children are very curious, they may tell lies just to see what will happen. If the result is good, they will continue to lie until they are caught. They usually stop once the result is no longer pleasant. There are basically two types of lies, white lies and black lies. While white lies do little or no harm, black lies can cause serious damage to both the liar and the person who is lied to.

Participants were tested individually in a soundproof room while oxy-Hb (oxyhemoglobin) data were collected using the NIRS equipment mentioned above. The NIRS holder was placed on the left and right sides of the participant’s head, and a total of 48 measurement channels were used. The two pictures in Figure 2.15 illustrate the NIRS set-up in the soundproof room. Each participant completed the shadowing task for five passages and the listening task for another five. The order of the two tasks was counterbalanced: half of the participants began with shadowing and then proceeded to listening, and the other half began with listening and then proceeded to shadowing.
After shadowing or listening to each passage, participants were given the comprehension questions, which asked about the contents of the passages. While shadowing, participants’ voices were recorded on digital sound recorders for analysis of their performance.
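The counterbalanced task order described above can be illustrated with a short sketch. The text does not say how participants were allocated to the two orders, so the random shuffle, the fixed seed, and the function name `assign_task_orders` are assumptions made purely for illustration; only the group sizes (28 participants, split in half) come from the study description.

```python
import random


def assign_task_orders(participant_ids, seed=0):
    """Split participants into two equal groups with opposite task orders.

    ASSUMPTION: random allocation with a fixed seed; the source only states
    that half of the participants started with each task.
    """
    ids = list(participant_ids)
    if len(ids) % 2 != 0:
        raise ValueError("an even number of participants is needed for equal groups")
    rng = random.Random(seed)  # fixed seed keeps the allocation reproducible
    rng.shuffle(ids)
    half = len(ids) // 2
    orders = {}
    for pid in ids[:half]:
        orders[pid] = ("shadowing", "listening")
    for pid in ids[half:]:
        orders[pid] = ("listening", "shadowing")
    return orders


# 28 participants, as in the study; the IDs 1..28 are hypothetical.
orders = assign_task_orders(range(1, 29))
shadowing_first = sum(1 for order in orders.values() if order[0] == "shadowing")
print(shadowing_first, len(orders) - shadowing_first)  # prints "14 14"
```

Splitting the shuffled list in half guarantees equal group sizes, which is the property that lets order effects (such as fatigue or practice) cancel out across the two tasks.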

Figure 2.15 A participant ready for the experiment in the soundproof room (left), and the NIRS holder placed on the left and right sides of the participant’s head with 48 measurement channels (right) (Kadota et al., 2015a)


Figure 2.16 The NIRS used in the experiment (left) and the data collection being observed and controlled by experimenters in the next room (right) (Kadota et al., 2015a)

Figure 2.17 The NIRS probes and measurement channels on the left side of a participant’s head (Kadota et al., 2015). The figure shows eight transmitter probes (T1–T8) and eight receiver probes (R1–R8) arranged at 3 cm intervals over a 9 cm × 9 cm area, with measurement channels 01–24 located between them.

Figure 2.16 shows the NIRS used in the experiment and an image of the data collection equipment. Figure 2.17 illustrates the placement of the 24 measurement channels and 16 probes on the left side of a participant’s head. The arrangement of the NIRS probes and measurement channels was determined using the International ERP 10–20 system; specifically, Kadota et al. (2015a) selected the C3, T3, and F7 positions (Jasper, 1958). They also considered the probabilistic relationship between the Brodmann areas and the anatomical locations of the International 10–20 cortical projection points provided by Okamoto et al. (2004) to predict what brain functions the NIRS data from each measurement channel represent.

Figures 2.18 and 2.19 depict brain activation based on NIRS data (oxy-Hb data) in the left and right hemispheres during shadowing and listening. The two figures illustrate that shadowing leads to much greater activation in both the left and right hemispheres (Figure 2.18) than does listening (Figure 2.19). A more detailed comparison of left hemisphere activation between shadowing and listening is shown in Table 2.4.

Figure 2.18 Brain activity according to NIRS data (oxy-Hb concentration) in the left (left) and right (right) hemispheres during shadowing (20 seconds after start) (Kadota et al., 2015a)

The major results of Kadota et al. (2015a) were as follows:

1) In the motor area, there was no activation difference (in terms of oxy-Hb concentration) between shadowing and listening for CH (measurement channels) 3, 7, and 10; however, CH 14 showed significantly more brain activity during shadowing than during listening.

2) For all measurement channels attached to Broca’s area (CH 12, 15, and 16), shadowing elicited significantly greater brain activity than did listening.

3) Regarding the auditory area channels, significantly greater activity was found during shadowing than during listening only for CH 21; in contrast, there was no difference between the two tasks at CH 24.

4) CH 18, which was attached in the prefrontal area, showed greater activation during shadowing; however, there was no significant difference between the tasks for CH 22.


Figure 2.19 Brain activity according to NIRS data (oxy-​Hb concentration) in the left (left) and right (right) hemispheres during listening (20 seconds after start) (Kadota et al., 2015a)

Table 2.4 Comparison of NIRS data (oxy-Hb concentration) in the left hemisphere (LH) between shadowing and listening (Kadota et al., 2015a)

The Brain Area in LH       NIRS Channel Number   The Results of Statistical Analyses
Motor Area (BA4, BA6)      ch03                  ns.
                           ch07                  ns.
                           ch10                  ns.
                           ch14                  p=.005
Broca’s area               ch12                  p=.004
                           ch15                  p…
                           ch16                  …
Auditory area              ch21                  …
                           ch24                  …
Prefrontal area            ch18                  …
                           ch22                  …

…

… j -> g -> h in Figure 4.15): Phonological repetition additionally includes some formulation of phonological representation: perceiving the sounds of speech and to some extent grasping what vowel and consonant phonemes are contained within it. This type of repetition involves awareness of differences in pronunciation of the phrase “going to”.

3) Lexical repetition (via a -> b -> c -> k -> f -> g -> h in Figure 4.15): Repetition at the lexical level involves dividing the heard speech into words. However, the meaning of the words is still not well understood.

4) Semantic repetition (via a -> b -> c -> d -> e -> f -> g -> h in Figure 4.15): This is the normal route, in which we repeat input speech while processing and understanding the meaning of the aurally presented words. Native speakers without conduction aphasia can repeat L1 speech at this level.

Figure 4.15 Multiple route model of aural word repetition based on conduction aphasia data (Kojima, 2006, pp. 156–168). The figure shows acoustic, phonological, and lexical levels on both the input and output sides, joined at the top by a semantic level; the links between speech input and speech output are labeled a to k.

These multiple pathways to repeating aurally input words suggest that there are also several routes to L2 speech shadowing. Although Figure 4.15 above focuses on word-level repetition, we can envisage the following scenarios for shadowing at sentence and passage level. When initially learning shadowing, learners tend to engage in acoustic and phonological repetition (levels 1 and 2 above), that is, repeating the input speech without understanding the words or their meaning. However, as the learner becomes more accustomed to shadowing speech, repetition becomes gradually more automatic and easier to accomplish. The learner becomes able to understand the meaning of what is heard during shadowing, performing lexical and semantic repetition at levels 3 and 4 above.

It can be assumed that shadowing practice helps develop accurate and fluent speech. This is the level of speaking ability that is required in real-world communication; it is characterized by the almost simultaneous parallel processing of “listening,” “thinking,” and “speaking,” as described in the introduction of this book.

Consider the speech production process discussed in detail in section 4.2 of this chapter (Figure 4.3). This approximates to: (1) creating a message at the conceptualizer stage, (2) inputting that message to the formulator for lexico-grammatical and morpho-phonological encoding, and (3) pronouncing the output via the articulator. It would seem obvious that shadowing practice is not applicable to step (1) (message generation). However, it seems possible and even likely that, by simulating the sentence encoding and articulation stages of the L2 speech production process, shadowing should be excellent training for steps (2) and (3) (see Table 4.6).

Table 4.6 Two effects of shadowing on output speech

Effect 1    Simulating phonetic encoding and articulation in L2 speech
Effect 2    Simulating phonological coding and lexico-grammatical coding in L2 speech

You may wonder whether shadowing really does produce the above two effects. For beginners who have just started shadowing training, it is conceivable that the effect of shadowing is restricted to articulation and, at most, vocalization at the articulator stage. However, as repetition gradually becomes automatic, we would expect the cognitive load related to the task of shadowing to be reduced. Once it becomes easy to repeat input speech, the situation changes: shadowing training becomes effective in simulating speech production with semantic processing, a multitasking capability that is needed in real-life communication.
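The multiple-route model of repetition described earlier in this section can be made concrete by writing each route as a sequence of link labels. The sketch below encodes only the lexical and semantic routes, whose link sequences are given in full in the text; the acoustic and phonological routes are left out because their link sequences are not fully legible here. The mapping from links to processing levels (`LINK_DEPTH`) is an assumption inferred from the layout of Figure 4.15, and the function names are hypothetical.

```python
# Processing levels in the multiple-route model, from shallowest to deepest.
LEVELS = ["acoustic", "phonological", "lexical", "semantic"]

# Routes as sequences of link labels, copied from the text.
ROUTES = {
    "lexical": ["a", "b", "c", "k", "f", "g", "h"],
    "semantic": ["a", "b", "c", "d", "e", "f", "g", "h"],
}

# ASSUMPTION (inferred from the figure layout, not stated in the text):
# the deepest processing level that each link touches.
LINK_DEPTH = {
    "a": "acoustic", "h": "acoustic",
    "b": "phonological", "g": "phonological",
    "c": "lexical", "k": "lexical", "f": "lexical",
    "d": "semantic", "e": "semantic",
}


def deepest_level(route):
    """Return the deepest processing level reached along a route."""
    return max((LINK_DEPTH[link] for link in route), key=LEVELS.index)


def links_only_in(route_a, route_b):
    """Return the links that appear in route_a but not in route_b."""
    return [link for link in route_a if link not in route_b]


for name, route in ROUTES.items():
    print(name, "->", deepest_level(route))

# Links that distinguish semantic repetition from lexical repetition.
print(links_only_in(ROUTES["semantic"], ROUTES["lexical"]))  # prints ['d', 'e']
```

Under these assumptions, the two routes share the same entry and exit links and differ only in whether the signal passes through the semantic level (links d and e) or takes the direct lexical shortcut (link k), which mirrors the progression from lexical to semantic repetition that the text attributes to growing shadowing skill.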

4.5 Effectiveness of shadowing in L2 lexico-grammatical encoding

Lexico-grammatical encoding is the most important process in the acquisition of L2 speech production. How this encoding process is performed greatly influences fluency of speech. The author has previously proposed the following three processes (Kadota, 2012, pp. 256–257).

1) Rules-based language production, in which we apply syntactic and morphological rules to words retrieved from the mental lexicon to produce sentences. This is based upon the “open-choice principle,” which makes it possible to understand and produce “new” sentences we have not heard before. As Chomsky (1965) once suggested, it is a mechanism that characterizes the creative language capacity of human beings.

2) Structurally primed language production, which is based upon the “priming effect” of the syntactic structures already processed by the speaker. Here we produce sentences by reusing the same grammatical structures that we have already processed through reading or listening.

Priming is a research paradigm used in experimental cognitive psychology, in which the information in “primes” (words, sentences, pictures, etc.) presented earlier to the participants affects the processing of the subsequent “targets” (words, sentences, pictures, etc.), usually in a positive way. For example, suppose that, in a word-level priming task, a participant has to decide whether or not a presented target word actually exists (e.g., hospital: real word; hosbital: nonword). This lexical judgment may be affected by semantically related prime words (e.g., nurse) or by semantically unrelated prime words (e.g., teacher). It has been shown that the reaction time (RT) required for lexical judgment is significantly shorter when the prime presented beforehand is the word “nurse” than when it is the word “teacher” (Koike et al., 2003, p. 547). This is a typical example of lexical priming.

Morishita, Satoi, and Yokokawa (2010) presented Japanese students of English with one or other of the following two examples of sentence structures using the verb “give”:

   1) PO (prepositional object) structure: The driver gave the car to the mechanic.
   2) DO (direct object) structure: The driver gave the mechanic the car.

The participants were then required to complete the sentence “The patient showed …” without referring to the above examples again. There was a clear tendency to reuse the same structure as they had previously been given: either PO or DO. This suggests that Japanese learners of English form syntactic representations in their minds when they process sentence structures and reuse them to produce further sentences.

3) Formulaic language production, in which we use formulaic sequences such as idioms, word collocations, and sentence stems to create sentences. This is based upon the “idiom principle,” in which speakers can activate relevant prefabricated lexical units using fewer cognitive resources than would otherwise be needed.

A formulaic sequence (FS) is a unit of language that English native speakers usually store whole in their mental lexicon. It is argued that they use FSs in everyday conversation without even thinking about the constituent words. Examples are shown in Table 4.7.

Table 4.7 Types of formulaic sequence (Moon, 1997, p. 44)

Formulaic sequences   Examples
Compounds             freeze-dry, Prime Minister, long-haired, etc.
Phrasal verbs         Verbs like go, come, take, put, etc. + adverbial or prepositional particles like up, out, off, in, down, etc.
Idioms                kick the bucket, rain cats and dogs, spill the beans, etc.
Fixed phrases         of course, at least, in fact, by far, good morning, how do you do, etc., which fall outside the above three categories
                      Similes: dry as a bone, etc.
                      Proverbs: It never rains but it pours, enough is enough, etc.
Prefabs               Preconstructed, institutionalized phrases like the thing / fact / point is, that reminds me, I’m a great believer in…, etc.

Wray (2002) gives us the following working definition of an FS:

… a sequence, continuous or discontinuous, of words or other elements, which is, or appears to be, prefabricated: that is, stored and retrieved whole from memory at the time of use, rather than being subject to generation or analysis by the language grammar. [Wray, 2002, p. 9]

Wray (2002, pp. 14–18) goes on to propose a dual-function model of a mental lexicon comprising analytic and holistic systems.

1) The analytic system creates sequences out of small units using grammatical rules; it is flexible enough to interpret novel input and to produce original output.

2) The holistic system reduces processing requirements by storing and retrieving sequences as whole chunks.

She then suggests that the holistic system is indispensable to fluent L2 processing and production. Using formulaic sequences and reusing recently processed structures makes it much easier to construct sentences in a semi-automatic way with reduced cognitive load.

In Tomasello’s (2003, p. 7) usage-based view of L1 acquisition, adult language speakers are endowed with linguistic competence through “a structured inventory of linguistic constructions” rather than via linguistic rules. In other words, it is not linguistic rules that enable us to interpret and produce a sentence correctly but familiar, frequently (or less frequently) used constructions. Examples of such constructions range from simple morphemes like -ing and -ly to syntactic frameworks like Subject-Verb-Object-Object (e.g., Nick made Steffi a sandwich). Acquisition of a structured inventory of linguistic constructions is the final stage that young children go through in achieving L1 capabilities; see Chapter 6 for a more detailed discussion on the relevance of usage-based theory in L2 acquisition. It seems sensible that the constructions Tomasello draws attention to should be included among the formulaic sequences mentioned above, because the two are rather similar (see also Ellis and Wulff, 2015).

Of the three language processes outlined above, the second and third (structurally primed and formulaic language production) can significantly reduce the speaker’s cognitive load at the lexico-grammatical encoding stage, compared to relying solely on the first process (rules-based language production).
The linguistic knowledge employed in the first is considered to be consciously memorized as a result of formal learning, while the second is a typical case of implicit learning (i.e., memory formation) and the third seems likely to be acquired through repetitive priming or practice.

4.6 Effectiveness of shadowing in L2 speech: Empirical studies in ESL and JSL

No research to date has shown unambiguously that shadowing practice is effective in enhancing lexico-grammatical encoding as well as phonetic encoding and articulation. However, empirical data supporting the efficacy of shadowing can be found in studies on oral reading, which is often used alongside shadowing in the classroom. The oral reading technique considered most beneficial to L2 speech production is “personalized oral reading of a temporarily memorized text.” In this technique, according to Yonezaki and Ito (2012), “learners read aloud a text about some famous person, pretending as if they were the famous person themselves” (Yonezaki and Ito, 2012, p. 97). To clarify, learners might read text A below, about Princess Diana, converting pronouns to the first person as if the speaker actually were Princess Diana (see text B).

(A) In 1974 Diana went on to her mother’s old school, where her sisters were also students. By then, their mother wasn’t living in London, but in Scotland. She was kind to Diana, although they lived separately. She and her new husband, Peter, had a large farm on an island. Diana was looking forward to visiting it and had some lovely holidays there.

(B) In 1974 I went on to my mother’s old school, where my sisters were also students. By then, our mother wasn’t living in London, but in Scotland. She was kind to me, although we lived separately. She and her new husband, Peter, had a large farm on an island. I was looking forward to visiting it and had some lovely holidays there.

Learners might then read aloud another passage like text B, replacing words with different pronouns. The point is that, in personalized oral reading, learners have to take an original text and construct new sentences from it. Yonezaki and Ito (2012) combined this technique with a “read-and-look-up” task, in which learners read a sentence, temporarily memorize it, look up, and recite the sentence without seeing the text again. They point out that this exercise is similar to what happens in actual speech, so it is better oriented toward natural speech than is a standard oral reading exercise.

Figure 4.16 An image of personalized oral reading

Yonezaki and Ito (2012) examined the effect of this type of oral reading for high school learners of English in Japan. The learners were divided into two groups: a control group of 22 was given regular oral reading training, while an experimental group of 17 was given personalized oral reading combined with read-and-look-up tasks. The experimental group was also given questions to deepen their understanding, such as:

Q: Why did you decide to go to your mother’s old school?
Q: How did you feel when you and your mother lived separately?

The training sessions for both groups lasted 15 minutes each, four times a week over a period of two months. Before and after the training period, both groups were given spoken English tests in CALL (computer assisted language learning) classrooms; these consisted of answering questions (a) and (b) below and describing three pictures in English. Their speech was recorded.

(a) After you graduate from school, what do you want to study and do in the future?
(b) Suppose you win the lottery and get 100 million yen, what would you like to do?

Two teachers evaluated each participant’s speech on a scale from 1 to 5 in four different categories: sound intensity (loudness), articulation speed, content transmission, and grammatical accuracy. Figure 4.17 summarizes the results for the control and experimental groups.

Figure 4.17 Overall results of pre- and posttraining spoken English tests (vertical axis: score, 35 to 43; horizontal axis: Pre, Post; groups: Control, Experimental)

Figure 4.18 Articulation speed data (vertical axis: score, 8 to 12; horizontal axis: Pre, Post; groups: Control, Experimental)

Figure 4.19 Content data (vertical axis: score, 8 to 12; horizontal axis: Pre, Post; groups: Control, Experimental)

Clearly, the experimental group, which was given personalized oral reading alongside read-and-look-up training, improved its test scores far more than the control group, which was given standard oral reading training. There was no significant difference between the control and experimental groups in the sound intensity and grammatical accuracy ratings. However, clear differences were found in the evaluation of articulation speed and content transmission, as shown in Figures 4.18 and 4.19. To summarize, when L2 learners are given a course of personalized oral reading combined with read-and-look-up training, their speaking skills improve greatly, especially in terms of speed of articulation and transmission of content.


Figure 4.20 Selective shadowing

By using our ingenuity to refine and develop shadowing training, it should be possible to amplify its benefits in a similar way to that described above for personalized oral reading. For example, Furuta (2012) advocates a selective shadowing technique that improves dialog capability in L2 speech: one of a pair of students shadows a native English voice while the other identifies and repeats keywords (only) from the shadowing speech. For example:

STUDENT A:  It was very difficult for him to get along with other kids at school.
STUDENT B:  difficult, get along, kids, school

Furuta also proposed an interactive shadowing technique in which one partner (Student B) repeats the keywords in the shadowing speech and then adds brief comments, for example:

STUDENT A:  Schulz’s humor has touched millions of people.
STUDENT B:  Humor, touched, millions, people – I know. That’s great.
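For teachers who want to prepare keyword prompts for selective shadowing in advance, the keyword step in the first example above can be roughly approximated by filtering out function words. The sketch below is only an illustration under stated assumptions: the function-word list and the `keywords` helper are hypothetical and hand-picked, and they are not part of Furuta’s (2012) procedure, in which the learner chooses the keywords by ear.

```python
import re

# Hypothetical, hand-picked function-word list (an assumption for this sketch);
# a real lesson would tune it to the material being shadowed.
FUNCTION_WORDS = {
    "a", "an", "the", "it", "he", "she", "him", "her", "they", "them",
    "is", "was", "were", "has", "have", "had", "be", "been",
    "to", "of", "for", "with", "at", "in", "on", "by", "and", "or", "but",
    "very", "other", "that", "this",
}


def keywords(sentence):
    """Return the content words of a sentence, lowercased, in original order."""
    tokens = re.findall(r"[A-Za-z']+", sentence.lower())
    return [t for t in tokens if t not in FUNCTION_WORDS]


if __name__ == "__main__":
    model = "It was very difficult for him to get along with other kids at school."
    print(", ".join(keywords(model)))
```

A filter of this kind cannot recognize multiword units such as “get along,” so it is best treated as a first draft that the teacher then edits.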


Figure 4.21 Interactive shadowing

4.7  Effectiveness of shadowing by L1 speakers in assessing L2 speech

In this chapter so far, we have examined the “output effect” of shadowing training on L2 speech. That is, our focus has been to assess whether shadowing training in the L2 classroom improves learners’ speaking skills, particularly focusing on the phonetic encoding and articulation and the lexico-grammatical encoding stages of L2 speech production (see also Hamada, 2017, pp. 43–47). However, recently there has also been a move toward using native speakers’ shadowing of L2 learners’ speech to provide an objective assessment of the latter.

A major problem in learning L2 speech production is acquiring sufficiently accurate pronunciation to be understood in the target second language; a “foreign accent” based on the first language is difficult to lose. There is a need to assess objectively the degree of intelligibility of the learners’ pronunciation. It is extremely difficult to predict how native listeners will perceive speech in foreign language accents. To date, learners’ intelligibility has been largely assessed through native teachers’ experience and intuition. Inoue et al. (2018) introduced the technique of shadowing not by L2 learners but by native teachers. They asked the teachers to shadow L2 learners’ speech in an attempt to determine whether the learners’ shadowability is a reliable measurement of the intelligibility of their speech.

In this study, the target second language was Japanese, and the learners were Vietnamese. Six Vietnamese learners’ L2 Japanese speech was recorded through oral reading of an intermediate-level L2 Japanese textbook. Then three groups of 27 L1 Japanese listeners were required to shadow the recorded speech without practicing or imitating accented pronunciation. The three groups were: NS-1, who had never talked with Vietnamese; NS-2, students with a Vietnamese classmate; and NS-3, teachers of Japanese with expert knowledge of Vietnamese. After shadowing, they were given two subjective questions:

Q-1 (comprehensibility): How easily did you understand the speech?
Q-2 (smoothness): How easily did you shadow the speech?

The learners’ original speech and L1 shadowing thereof were assessed. DNN-based GOP scores were calculated for each original reading and shadowing. GOP refers to an automatic index of “goodness of pronunciation,” calculated by classical and generative Hidden Markov Models (Witt and Young, 2000; Dean, Minematsu, Yamauchi, and Hirose, 2009), introducing recent discriminative DNN (deep neural network) speech models that have been shown to improve error detection and proficiency prediction. The shadowing delay was also assessed by comparing the learner’s speech and its corresponding L1 shadowing and measuring the time gaps between individual phonemes. These objective scores were compared to the subjective scores awarded by the L1 shadowers.

Table 4.8 below shows the average scores of the two objective measures (DNN-based GOP for comprehensibility and shadowing delay for smoothness) for the three groups of shadowers. First, it was discovered that there is a significant difference in smoothness scores only between NS-1 and NS-3 and between NS-2 and NS-3 (p < 0.05). Teachers of Japanese with considerable knowledge of Vietnamese tended to be smoother in shadowing than the other L1 Japanese speakers.
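To make the idea of a shadowing delay more concrete, the following minimal sketch averages the time gaps between matching phonemes in a learner’s recording and in its L1 shadowing. It assumes that phoneme onset times (in seconds) have already been obtained and aligned one-to-one; the function name `shadowing_delay` and the onset values are hypothetical and are not taken from Inoue et al. (2018), whose exact alignment procedure is not described here.

```python
from statistics import mean


def shadowing_delay(learner_onsets, shadower_onsets):
    """Mean time gap (in seconds) between each phoneme in the learner's
    speech and the same phoneme in the shadower's speech.

    Both lists are assumed to hold onset times for the same phoneme
    sequence, aligned one-to-one.
    """
    if len(learner_onsets) != len(shadower_onsets):
        raise ValueError("phoneme sequences must be aligned one-to-one")
    return mean(s - l for l, s in zip(learner_onsets, shadower_onsets))


if __name__ == "__main__":
    # Invented onset times for four phonemes, for illustration only.
    learner = [0.00, 0.12, 0.25, 0.40]
    shadower = [0.50, 0.70, 0.85, 1.10]
    print(f"mean delay: {shadowing_delay(learner, shadower):.3f} s")
```

On this reading, a smaller and more stable delay would indicate that the shadower could follow the learner’s speech more smoothly.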
The correlation between the two kinds of scores, comprehensibility (SC) and smoothness (SS), was calculated for each participant, and the average correlation was found to be r = .68, considered moderate rather than high. Six out of the 27 shadowers showed very low correlations (r = .36), possibly because they used aberrant shadowing strategies. The correlations between the average GOP scores and the two subjective comprehensibility and smoothness scores were analyzed. The results are shown in Figure 4.22 for the L1 shadowers and in Figure 4.23 for the L2 learners.

Table 4.8 Average scores of objective measures for comprehensibility and smoothness by NS-1, -2, -3

                          NS-1   NS-2   NS-3
comprehensibility (SC)    4.13   4.20   4.24
smoothness (SS)           4.58   4.45   4.78
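The correlations reported in this section are correlation coefficients between paired sets of scores. As a reminder of what such a figure summarizes, here is a minimal sketch of the standard Pearson product-moment coefficient. It assumes that Pearson’s r is the statistic used (the text reports only the values), and the score lists are invented for illustration rather than taken from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    # Hypothetical ratings on a 7-point scale for six recordings.
    comprehensibility = [3, 4, 4, 5, 6, 7]
    smoothness = [2, 4, 5, 5, 5, 7]
    print(f"r = {pearson_r(comprehensibility, smoothness):.2f}")
```

A value near 1 means that recordings rated easy to understand were also rated easy to shadow; a value near 0 means the two ratings were largely unrelated.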

Figure 4.22 Correlation between shadowers’ GOP and the two subjective scores (GOP scores against comprehensibility scores: R = 0.73; GOP scores against smoothness scores: R = 0.73)

Figure 4.23 Correlation between learners’ speech GOP and the two subjective scores (GOP scores against comprehensibility scores: R = 0.63; GOP scores against smoothness scores: R = 0.50)

It is interesting that, even when the subjective scores were awarded not for shadowing but for learners’ original speech, the GOP scores correlate rather well with subjective scores of learners’ speech. To sum up, the experimental data represent a remarkably promising result: the study by Inoue et al. (2018) tends to suggest that native speakers’ shadowing can be an effective, automatized tool to evaluate L2 Japanese learners’ speech comprehensibility.

Starting with some of the basic features of everyday speech, this chapter provided an outline of L2 speech production through the analysis of errors (slips of the tongue) and of an L2 production model. We then discussed in detail the effects of shadowing training on the phonetic encoding/articulation and lexico-grammatical encoding stages of L2 speech production. We surveyed some practical studies of the effectiveness of shadowing in L2 English and finally looked into shadowing by L1 speakers as a way of assessing L2 speech production. The chapter focused on the output effects of shadowing, that is, how effective shadowing training can be in L2 speech production.


5  Metacognitive monitoring and control

5.1  Metacognition in L2 Acquisition

5.1.1  What is metacognition?

If you have never heard the words “metacognition” or “metacognitive monitoring,” you might discover from a dictionary that metacognition is “cognition about cognition” or becoming “aware of one’s awareness” through higher-order supercognitive processes. In the preface of her book on the psychology of reading and writing, Akita (2003, pp. 7–8) addresses the importance not only of repeated reading and writing exercises but of recognizing what is going on while we are reading and writing. She states that baseball players often evaluate their batting swings by watching themselves in the mirror. They not only do a lot of practice but repeatedly analyze their progress. Practice alone is not enough to be a good hitter; you need expert knowledge of how to hit a ball. The ability to monitor our own performance is important not only to athletes such as baseball players but also in many aspects of our daily life. Figure 5.1 shows a typical example of metacognitive monitoring.

This capacity for introspection is very important in learning L2; we need to be able to evaluate our performance and judge how well we are doing in L2 comprehension and production. This “monitoring” or “metacognitive monitoring” is a crucial determining factor of success or failure in L2 acquisition.

5.1.2  Types of metacognitive knowledge

Metacognitive ability is usually considered to comprise three components: metacognitive knowledge, metacognitive monitoring, and metacognitive control. Metacognitive knowledge (in this case, an accurate knowledge of L2 acquisition) is essential to the use of metacognition during L2 processing. There are three basic types of metacognitive knowledge (Flavell, 1979):

1) Knowledge of person(s).
2) Knowledge of task.
3) Knowledge of strategy.


Figure 5.1 Example of metacognitive monitoring

Figure 5.2 shows the types of metacognitive knowledge in more detail. Pintrich (2002) summarizes Flavell’s (1979) three types as follows.

1)  Knowledge about persons

Knowledge about persons is a very important part of metacognition. It includes knowledge about learners themselves, such as their own strengths and weaknesses. One example is that students who know that they generally do better on multiple-choice tests than on essay tests have some metacognitive self-knowledge about their test-taking ability, which seems useful to them when they give answers to the two different types of tests. Another type of knowledge about persons concerns the differences between individuals. An example is learners’ beliefs on differences between individuals: “She is much more motivated than me in mathematics,” “I am more skillful in communication with people from abroad than my brother,” etc. Knowledge about human cognition in general is also essential in that, if we confront a novel problem that is ill defined, then general problem-solving heuristics, such as syllogisms, may be very useful. Pintrich (2002) points out that, although self-knowledge in itself can be an important aspect of metacognitive knowledge, the accuracy of the knowledge seems to be most important for learning; inaccurate understanding often leads to misleading results.

Figure 5.2 Types of metacognitive knowledge (Sannomiya, 2008, p. 9, 2017, p. 143). [Diagram: metacognitive knowledge divides into knowledge about persons (knowledge about selves, knowledge about differences between individuals, knowledge about generalized human cognition), knowledge about cognitive tasks, and knowledge about strategies.]

2)  Knowledge about cognitive tasks

Knowledge of tasks includes knowledge that different tasks can be more or less difficult and may require different cognitive strategies. For example, a recall task is more difficult than a recognition task, because the former requires a deeper search of memory to retrieve the relevant information than merely selecting the correct answer from a number of alternatives.

3)  Knowledge about strategies

Strategic knowledge concerns general strategies for learning, thinking, and problem solving. It includes strategies learners use, for example, to memorize text and extract meaning from it so that they can understand what they hear in the classroom. Pintrich (2002) suggests that, alongside general learning strategies,

students have various metacognitive strategies that seem useful to them in planning and regulating their learning and thinking: setting subgoals or asking themselves questions as they read, for example.

Let us think about the use of these three kinds of metacognitive knowledge in L2 reading comprehension.

a) Self-knowledge: e.g., “I am rather good at reading English, but not good at listening, so I read a lot more than I listen.”
b) Knowledge about differences between individuals: e.g., “My friend Ryo has a larger vocabulary than I do, so I have to consult a dictionary more often than he does.”
c) Knowledge about human cognition in general: e.g., “It is important to read English with a purpose.”
d) Knowledge about the task: e.g., “English text with many unknown words is harder to read, so I take longer than normal to read it.”
e) Knowledge about strategy: e.g., “It is important to predict the content of a passage from its title before we start reading.”

5.1.3  Metacognitive activity: Monitoring and control

Metacognitive knowledge, however accurate and useful, is not helpful unless it is actually used during L2 learning and processing. Practical use of metacognitive knowledge, known as “metacognitive activity,” consists of metacognitive monitoring and metacognitive control (see Figure 5.3 below). Metacognitive monitoring might include, for example, self-evaluation of whether our comprehension of L2 reading is going well or not. Metacognitive control would then involve modifying the comprehension process according to the result of metacognitive monitoring. So, when reading text, we might sometimes step outside our task to look at it from the outside, as it were, and try to be aware of what we are doing. If monitoring leads us to judge that it is taking much longer to finish reading a text than we anticipated, we might increase our reading speed or decide not to process all the words but just skim the important cues while skipping the detail.

Figure 5.3 Metacognitive activity: Metacognitive monitoring and control (Sannomiya, 2008, p. 9). [Diagram: metacognitive activity comprises metacognitive monitoring (awareness, feeling, prediction, inspection, evaluation, etc. on cognition) and metacognitive control (goal setting, planning, modification, etc. on cognition).]

Metacognitive monitoring and control may be carried out not only online during reading but also offline immediately after finishing. Alternatively, or additionally, it may be performed just before the next time you start reading. Figure 5.4 shows a variety of pre-, mid- and posttask metacognitive monitoring and control activities.

Metacognitive monitoring and control should occur before, during, and after a given task. However, Sannomiya (2008) points out that metacognitive activity is more difficult in the midst of the task, since we are deeply involved in processing the meaning. While it is possible to pause a reading task to carry out some monitoring and control, it is much more difficult to conduct metacognitive activity while trying to listen to and understand somebody else’s speech. When listening to speech, we usually have to conduct metacognitive activity pre- and posttask. During a shadowing task, we are forced to hear our own voice throughout, whether we like it or not. Metacognitive monitoring and control in the midst of shadowing, as opposed to other tasks such as listening, have to be done

automatically and subconsciously. I will return to this in detail at the end of this chapter.

Figure 5.4 Major pre-, mid- and posttask metacognitive activities (Sannomiya, 2008, p. 10)

5.1.4  Metacognition and the frontal association cortex

We will now look at the neural correlate underlying metacognitive activities. The frontal association cortex has been confirmed as the region of the brain responsible for higher-order cognition, including thinking, learning, and emotions. Figure 5.5 compares the proportion of the brain occupied by the frontal association cortex in different animals. In the human brain, the frontal association cortex occupies approximately 30% of the brain: much more than in animals such as cats (4%), dogs (7%), and monkeys (12%). The human frontal association cortex is known to continue to develop until around the age of 20 and then to decline with age. By the age of 80, it will be less than half its size at 20 years old.

It is known that human intelligence, memory, perception, and language are hardly affected even if the frontal association cortex is damaged by a brain hemorrhage, infarction, tumor, or other cause. Instead, it is likely that we suffer from a higher-order dysfunction related to our brain’s “executive functions.” Russian psychologist Luria (1973) showed patients with damage to the frontal association cortex a rather complex picture (Figure 5.6) and asked them to describe it in words (see also Sannomiya, 2008, p. 208). These patients typically responded as follows. Seeing the sign “Danger,” they immediately said something like “high voltage” or “infected area.” Observing the man running to help the drowning boy, they said: “It’s a war.” Another response, upon seeing the church tower, was “It’s the Kremlin.” Understanding a complex picture like Figure 5.6 requires careful examination to first construct a framework for what the picture conveys and then to validate the framework against the actual details shown in the picture.
It is necessary to suppress early responses based on fragments of the image or individual cues and then to interpret the message of the picture as a whole. Luria (1973) concluded that processes like this are quite difficult for patients with frontal association cortex impairment. This suggests that inhibitory control is one of the basic functions of the frontal association cortex.

Figure 5.5 Percentage of the brain occupied by the frontal association cortex in humans and other animals

Figure 5.6 Picture shown to patients with damaged frontal association cortex (Kashima, 1999, p. 217)

5.1.5  Monitoring and executive functions of working memory

In section 3.1 of Chapter 3, we discussed a recent model of working memory by Baddeley (2012). We also evaluated the role of the phonological loop as a language acquisition center. However, further recent research on working memory has focused more on its monitoring and executive functions. These are mechanisms that are constantly working during day-to-day life. For example, conscious memory stores all the actions we need to perform before leaving home in the morning, such as closing the windows, unplugging electrical devices, picking up our wallet and watch, and closing the front door.

Figure 5.7 An integrated conceptual framework for working memory in L2 acquisition (Wen, 2016, p. 83). [Diagram: phonological working memory (PWM), comprising a phonological short-term store and articulatory rehearsal and measured by simple memory span tasks (e.g., D-span, W-span, NWR-span); executive working memory (EWM), comprising updating, switching, and inhibition and measured by complex memory span tasks (e.g., R-span, O-span, N-back-span); visuospatial working memory (VWM); and a long-term store (declarative/procedural memory) holding L1 competence (mental lexicon/word-specific information, formulaic sequences/chunks, morphosyntactic constructions, rule-governed knowledge) and L2 proficiency/knowledge (L2 lexis, formulaic sequences/chunks, morphosyntactic constructions, rule-based/metalinguistic knowledge).]

Monitoring and executive functions are considered to play a crucial role in L2 processing and acquisition. Thus Wen (2016) and Wen et al. (2015) propose an integrated model of working memory designed especially for L2 acquisition, as shown in Figure 5.7. This model is essentially an expanded version of two components of Baddeley’s (2002, 2012) models: the phonological loop and the central executive, referred to in Wen’s model as “phonological working memory” (PWM) and “executive working memory” (EWM), respectively. These are currently recognized as the two key components most relevant to L2 acquisition. PWM, which consists of a phonological short-​term store and a subvocal articulatory rehearsal system, functions as a key “language acquisition center” (see section 3.1 of Chapter 3) and is believed to play a decisive role in acquiring newly encountered L2 phonological patterns. The size of PWM is often determined by simple memory span tasks, such as number or word span tasks or nonword repetition (NWR). EWM, however, has attention regulating and executive control functions consisting principally of memory updating, task shifting, and inhibition. It has long been suggested that the frontal cortex governs the central executive functions of working memory (see section 3.1 of Chapter 3). Executive functions roughly correspond to the following.

1) Inhibitory control: Inhibiting one of two competing tasks, e.g., when bilinguals inhibit processing of one of their two languages.
2) Task shifting: Shifting between two tasks, e.g., from adding numbers to subtracting them.
3) Memory updating: Updating working memory according to new information/instructions, e.g., during the n-back task in working memory experiments.

Inhibitory control is usually measured using the “Simon task” depicted in Figure 5.8. In this task, participants are shown a fixation point and then presented with either a red or a green circle to the left or right of that point. They are required to judge whether the circle is red or green by pressing one of two keys as quickly as possible (x or / in Figure 5.8). Participants are generally found to be faster and more accurate when the stimulus (a red or green circle) appears on the same side as the required response key (a congruent trial) than when it appears on the opposite side (an incongruent trial).

Memory updating is typically measured using an n-back task as shown in Figure 5.9. In this task, participants are presented with a sequence of letters on a computer display and are required to indicate when the current letter matches the letter presented n steps earlier in the sequence (n is typically 2 or 3). For example, for a 3-back task using the letter sequence “C, X, Q, q, N, a, n, C, A” in Figure 5.9, when the second “C” is presented the participants must remember that it is not the same as the letter “N” from three letters earlier, whereas when the letter “A” is presented, they must remember that it is the same as the “a” three letters earlier. The n-back

task measures participants’ central executive capacity for continuously refreshing the working memory buffer.

Figure 5.8 Example of the Simon task (adapted from Kadota, 2018, p. 144)

Figure 5.9 Example of an n-back task (letters presented one at a time over time, with 3-back conditions 1 and 2 marked)

EWM has been shown to be involved in complex, higher-order interpretation processes in L2 comprehension and production. Examples of higher-order processing include detecting the antecedents of pronouns, resolving anaphora, and deciding how to attach a relative clause. EWM capacity is therefore mainly measured by complex span tests like reading span, operation span, and n-back span.

We will not focus here on the visuospatial sketchpad or visual working memory (VWM) or the recently added episodic buffer in Baddeley’s L2 working memory model described in section 3.1 of Chapter 3. It seems likely that they play a minor role in L2 processing and acquisition.

Wen (2016) argued that the PWM–EWM distinction hypothesis adopts a developmental perspective of L2 proficiency: PWM plays a greater role at an elementary level, while EWM becomes more prominent as learners become more proficient (see Figure 5.10). PWM is involved in retaining and acquiring new words (vocabulary), word sequences (formulae), and rules (grammar). In L2 learning, EWM is more involved in attentional and executive functions, mainly in metacognitive monitoring and control, including perception and self-correction. It is likely, as Wen (2016) suggests, that EWM affects processing and performance related areas of attentional resource allocation in L2 comprehension, production, and interaction. The developmental model by Wen (2016) is very promising and may provide us with an important perspective on the interactive relationship between domain specific PWM and domain general EWM. The author expects it will be a good jumping-off point for many researchers and educators working in L2 acquisition.
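The two executive-function measures described earlier in this section, the n-back task and the Simon task, can be made more concrete with a short sketch. It is an illustration under stated assumptions rather than the software used in any of the studies cited: the function names are hypothetical, letters are compared without regard to case (as in the “A”/“a” example in the text), and the reaction times are invented.

```python
from statistics import mean


def n_back_targets(letters, n):
    """Return the positions (0-based) at which the current letter matches
    the letter presented n steps earlier, ignoring upper/lower case."""
    return [
        i
        for i in range(n, len(letters))
        if letters[i].lower() == letters[i - n].lower()
    ]


def simon_effect(trials):
    """Mean reaction time on incongruent trials minus mean reaction time
    on congruent trials.

    Each trial is (stimulus_side, response_side, reaction_time_ms); a trial
    is congruent when the circle appears on the same side as the key that
    has to be pressed.
    """
    congruent = [rt for stim, resp, rt in trials if stim == resp]
    incongruent = [rt for stim, resp, rt in trials if stim != resp]
    return mean(incongruent) - mean(congruent)


if __name__ == "__main__":
    sequence = ["C", "X", "Q", "q", "N", "a", "n", "C", "A"]
    print("3-back targets at positions:", n_back_targets(sequence, 3))

    # Invented reaction times (ms) for four trials.
    trials = [
        ("left", "left", 420),
        ("right", "right", 440),
        ("left", "right", 480),
        ("right", "left", 500),
    ]
    print("Simon effect (ms):", simon_effect(trials))
```

A larger positive Simon effect means that the participant was slowed more by the irrelevant location of the stimulus, which is usually read as weaker inhibitory control.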

5.2  Research into L2 metacognitive monitoring and control

Metacognitive activity supporting our learning mechanisms in various ways has been researched from the perspective of cognitive psychology. One technique often employed is to capture learners’ think-aloud protocols; learners are required to describe their mental processes out loud as they are performing learning tasks.

Figure 5.10 PWM and EWM hypothesis for L2 acquisition (Wen, 2016, p. 110). [Diagram: for L2 beginners, phonological working memory (PWM: phonological short-term store and articulatory rehearsal mechanism; simple memory span tasks, e.g., NWR-span) relates to acquisition/developmental aspects: efficiency of acquiring novel word forms and retention of sequences of forms/chunks, covering lexis (vocabulary), formulaic sequences/chunks, and morphosyntactic constructions (grammar). For intermediate L2 learners, executive working memory (EWM: attention regulating/allocation through updating, switching, and inhibition; complex memory span tasks, e.g., R-span, O-span) relates to offline processing and online performance: noticing/monitoring/self-repair (efficiency of encoding and retrieval) in comprehension (listening and reading), production (speaking and writing), and interpreting (comprehension and production).]

Next, we will review an important study capturing think-aloud protocol data from L2 processing and acquisition.

5.2.1  Think-aloud protocols in L2 acquisition

Yoshida et al. (1997) analyzed L2 learners’ think-aloud protocols while filling in the blank spaces in an English language cloze test. Think-aloud protocol analysis is a process-oriented research technique that allows researchers to gain insights into learners’ strategies. Participants are asked to monitor what they are doing or thinking and to verbalize their actions and thoughts aloud under controlled conditions. This elaborate introspective psychological method was devised and developed by Ericsson and Simon (1984, 1993).

A cloze test is a written test devised by Taylor (1953) to measure the readability of a printed text. Words are removed from a passage of text according to a fixed principle: every nth (e.g., fifth, sixth, or seventh) word is removed regardless of its function or meaning. Application of cloze tests has been extended to measurement of reading comprehension and overall L2 proficiency (Kadota, 1985). The basic assumption behind cloze testing is the established gestalt concept of human perception. People tend to perceive things not as a sum of separate parts, but as an organized, meaningful whole. Thus an incomplete image may be perceived as a complete one, a phenomenon referred to by gestaltists as the “principle of closure.” Guessing the missing words in a cloze test is claimed to be similar to a viewer’s completion of an imperfect visual image (Taylor, 1953).

The two main scoring procedures for each blank in a cloze test are the exact word method and the acceptable word method. If the reader can fill in the identical word to that removed from the text, it indicates a degree of correlation between the language used by the writer and the reader. The acceptable

word method is more message oriented; it counts any word that fully fits the surrounding context as a correct answer.

Participants in Yoshida et al.’s research were 65 first-year university students (29 males and 36 females) learning English as a foreign language in Japan. Figure 5.11 depicts a typical face-to-face think-aloud session; the participant (on the left) is verbally reporting what he is thinking while completing a cloze test, and the researcher (on the right) is listening to and recording the report. Two English passages were used in the cloze tests. One of them, entitled “Crime in New York” (419 words), was as follows.


Figure 5.11 Think-​aloud protocol research
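The fixed-ratio cloze procedure and the two scoring methods described in this section can be sketched as follows. This is a minimal illustration, not the materials of Yoshida et al. (1997): the function names, the sample sentence, and the list of acceptable alternatives are all hypothetical, and in practice the acceptable words for each blank are decided by the test designer.

```python
def make_cloze(text, n):
    """Delete every nth word of a text (fixed-ratio deletion).

    Returns the gapped text and the list of deleted words.
    """
    gapped, answers = [], []
    for position, word in enumerate(text.split(), start=1):
        if position % n == 0:
            gapped.append("____")
            answers.append(word)
        else:
            gapped.append(word)
    return " ".join(gapped), answers


def score_cloze(responses, answers, acceptable):
    """Return (exact %, acceptable %) for one participant.

    `acceptable` holds, for each blank, a set of lowercase alternative
    words judged to fit the context; the original word always counts.
    """
    exact = 0
    accepted = 0
    for response, answer, alternatives in zip(responses, answers, acceptable):
        is_exact = response.lower() == answer.lower()
        exact += is_exact
        accepted += is_exact or response.lower() in alternatives
    total = len(answers)
    return 100 * exact / total, 100 * accepted / total


if __name__ == "__main__":
    passage = ("The quick brown fox jumps over the lazy dog "
               "near the quiet river bank today")
    gapped, answers = make_cloze(passage, 5)
    print(gapped)
    print(answers)

    # One invented participant; "by" is treated as acceptable for blank 2.
    responses = ["jumps", "by", "now"]
    acceptable = [set(), {"by", "beside"}, set()]
    exact_pct, acceptable_pct = score_cloze(responses, answers, acceptable)
    print(f"exact: {exact_pct:.1f}%  acceptable: {acceptable_pct:.1f}%")
```

Because the acceptable word method counts every exact answer as well, its score can never be lower than the exact word score for the same responses.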

From the second paragraph onwards in each passage, a total of 10 words (9 content and 1 function word) were deleted. Participants’ strategies in filling in the blanks were identified and classified as follows: CW:  Thinking about the passage’s main themes. CP:   Thinking about contextual constraints preceding the blanks. CN:  Thinking about contextual constraints following the blanks. M:   Translating sentences with blanks into Japanese. I:   Using idiomatic or formulaic knowledge. B:   Using background information or schemata. G:   Using grammatical knowledge. The main results are summarized in Table  5.1. These include a measure of the order in which participants filled in the blanks, the percentage of correct responses using both the exact word and acceptable word scoring methods, the number of different strategies identified through the verbal protocols (CW, CP, CN, etc.), and the mean number of strategies used for each blank. Some major conclusions were as follows: 1) Participants attempted to fill in the blanks as they went through the passage, so the blanks were largely filled in the order that they appeared. 2) Contextual constraints following the blanks (CN), translating sentences into Japanese (M), using grammatical knowledge (G), and contextual constraints preceding the blanks (CP) were strategies preferred by many participants. Researchers usually refer to these clues as local processing strategies.

Table 5.1 Main results of Yoshida et al. (1997)

Order = cloze filling order (mean); Exact = score (%) by exact word (mean); Acceptable = score (%) by acceptable word (mean); CW to G and Total = no. of strategies used; Mean = no. of strategies used per participant (mean).

Blank no.  Order  Exact  Acceptable  CW  CP  CN   M   I   B   G  Total  Mean
 1           1.5   90.9        90.9   2   3   3   8   0   0   8     24   2.2
 2           3.3   72.7        72.7   0   1   9   4   0   0   4     18   1.6
 3           5.5   72.7        72.7   0   7   5   6   1   0   5     24   2.3
 4           4.5  100.0       100.0   0   2  11   1   0   0   3     17   1.5
 5           6.5   90.9        90.9   0   3   8   6   0   2   2     21   1.9
 6           7.1    9.1         9.1   2   4   7   6   2   2   3     26   2.4
 7           8.9   36.6        45.5   4   3   2   6   0   3   3     21   2.0
 8          11     36.6        45.5   2   4   4   6   0   2   4     22   2.1
 9          11      0.0        72.7   5   2   1   4   1   3   3     19   1.7
10          12      9.1         9.1   5   5   4   6   0   0   1     21   1.9

3) The correct response rate is significantly higher for blanks 1–5 than for blanks 6–10 by both scoring methods. It was also noted that contextual strategies were used slightly more for blanks 6–10 than for 1–5. This suggests that participants mostly used local processing strategies (contextual constraints) for the easier blanks but also needed more general contextual clues for the more difficult blanks. Although all the students, including those observed to be slow learners, turned to both local and general contextual clues when confronted with the more difficult cloze blanks, this did not mean they always gave correct answers.

In summary, collecting think-aloud protocol data is an effective way to investigate L2 learners’ metacognitive monitoring, one of the functions of executive working memory discussed in 5.1.5.

5.2.2  Developing metacognitive monitoring by shadowing training

In section 2.4 of Chapter 2, we discussed one of the notable findings of Kadota et al.’s (2015a) NIRS experiment, that shadowing resulted in greater activity than did listening in the channels attached to the left frontal association cortex (e.g., BA9, 10, 46, and 47). The corresponding regions in the right frontal association cortex were also found to be activated during shadowing (Kadota et al., 2016). What do these activations of the frontal association areas signify? The executive functions of inhibitory control, task shifting, and updating working memory are governed by the frontal association cortex (see 5.1.5). According to current research that takes an information processing approach

to cognition, executive functions are closely linked to metacognitive activity in language processing (Osaka, 2002; Sannomiya, 2008). Many researchers have pointed out that people find it difficult to engage in metacognitive control or other cognitive tasks while simultaneously engaged in language comprehension. This is because language processing uses up considerable cognitive resources, leaving no extra resources available for metacognitive activity. However, in shadowing, which involves two concurrent tasks (listening and speaking), learners find it relatively easy to maintain awareness of their own shadowing performance. We can hypothesize that learners’ conscious repetition during shadowing enhances metacognitive awareness of performance. The fact that there is greater activity in the frontal association cortex during shadowing than during listening (as shown in Kadota et al., 2015a) gives neuroscientific support to this hypothesis, suggesting that it would be promising to investigate this topic further. Considerably more empirical data are needed to establish the effects of shadowing training on the development of metacognitive monitoring and control and the benefit to L2 processing and learning.

5.2.3  Effect of metacognitive control in L2 oral reading

Reading aloud is an exercise favored by many L2 English teachers and one that has been shown empirically to be effective in developing L2 fluency (Suzuki and Kadota, 2012, etc.). Mori (2013) described unique empirical research into the relationship between oral reading performance, text comprehension, and metacognitive control by L2 learners in Japan. The metacognition in question is how learners allocate their attentional resources between pronunciation and speed of oral reading on the one hand and processing the meaning of the passage on the other. There were 32 participants in Mori’s research, mainly undergraduate and graduate students. They were given the following three tasks.
1) English placement test: Oxford Quick Placement Test Version 1 (2001) [marked from 0 to 60].
2) Oral reading of four different passages, recorded by the researcher to determine
   a) Speed (syllables per minute).
   b) Pitch range (semitones).
   c) Holistic evaluation of oral reading [marked from 1 to 5]
   followed by written recall of each passage [marked from 0 to 10].
3) Attentional resource control to clarify participants’ awareness of their focus on pronunciation, speed, and comprehension of each passage [marked from 1 to 5].

After completing the Oxford Quick Placement Test, participants were asked to read each passage aloud while making sure they understood the meaning. They were then immediately given a written recall test in Japanese to test their comprehension and recall of the content. Finally, they were questioned about their

Table 5.2 Oral reading speed (syllables per minute), pitch range (semitones), holistic evaluation scores, and written recall scores

Measure                              Abbreviation   N    Mean     SD
(1) Oral reading speed (syll/min)    (RS)           32   162.25   26.137
(2) Pitch change (semitone)          (PC)           32    19.07    4.690
(3) Holistic evaluation (1–5)        (HE)           32     2.83     .411
(4) Written recall (1–10)            (WR)           32     5.51    1.656

Table 5.3 Attentional resource allocation results

Focus           Abbreviation   Mark range   N    Mean   SD
Pronunciation   (AP)           (1–5)        32   3.37   .792
Speed           (AS)           (1–5)        32   3.08   .849
Meaning         (AM)           (1–5)        32   3.98   .581

allocation of attentional resources to pronunciation, speed, and comprehension at the time of oral reading. Tables 5.2 and 5.3 show the results for oral reading performance and recall and for attentional resource allocation, respectively. The main findings were as follows.

1) The average oral reading speed was 162.25 syllables per minute or 103.2 words per minute.
2) Holistic evaluation indicates little comprehension during oral reading (mean HE score: 2.83).
3) The participants paid slightly more attention to the meaning of the passage than to pronunciation or reading speed.

Table 5.4 shows the correlation coefficients among all the measurements. There was a strong correlation between RS (speed) and the HE (holistic evaluation) score [.667], but no significant correlation between RS (speed) and WR (written recall) [.315]. With regard to attentional control, AS (focus on reading speed) was found to correlate significantly, and negatively, with HE [−.351], but not with WR [−.182]. Likewise, although there was a significant correlation between AM (focus on meaning) and WR (written recall) [.387], no correlation was found between AM and HE [.000].
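The significance labels attached to these coefficients can be checked with a short calculation. The following is a minimal sketch, not Mori’s (2013) own analysis: it assumes the usual conversion of a Pearson coefficient into a t statistic with N − 2 degrees of freedom, and it hard-codes two-tailed critical t values for 30 degrees of freedom taken from a standard t table. The function names `t_from_r` and `stars` are hypothetical.

```python
import math

# Two-tailed critical t values for df = 30, taken from a standard t table.
# These are an assumption of this sketch; they are not given in the text.
T_CRIT_DF30 = {0.05: 2.042, 0.01: 2.750}


def t_from_r(r: float, n: int) -> float:
    """Convert a Pearson correlation r computed from n pairs into a t statistic."""
    df = n - 2
    return r * math.sqrt(df / (1.0 - r * r))


def stars(r: float, n: int = 32) -> str:
    """Return '**', '*', or '' in the style of the note under Table 5.4."""
    if n != 32:
        raise ValueError("critical values are hard-coded for N = 32 (df = 30)")
    t = abs(t_from_r(r, n))
    if t >= T_CRIT_DF30[0.01]:
        return "**"
    if t >= T_CRIT_DF30[0.05]:
        return "*"
    return ""


# Coefficients quoted in the paragraph above.
reported = {
    "RS-HE": 0.667,
    "RS-WR": 0.315,
    "AS-HE": -0.351,
    "AS-WR": -0.182,
    "AM-WR": 0.387,
    "AM-HE": 0.000,
}

for pair, r in reported.items():
    print(f"{pair}: r = {r:+.3f}, t(30) = {t_from_r(r, 32):+.2f} {stars(r)}")
```

With only 32 participants, a coefficient needs an absolute value of roughly .35 to reach the .05 level, which is why .351 and .387 are marked as significant while .315 is not.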

Table 5.4 Pearson’s correlation coefficients for all measurements (N=32)

Measure               Abbreviation   (PT)     (RS)     (PC)    (HE)     (WR)    (AP)   (AS)    (AM)
Placement test        (PT)           1.00
Oral reading speed    (RS)           .499**   1.00
Pitch change          (PC)           .376*    .385*    1.00
Holistic evaluation   (HE)           .388*    .667**   .271    1.00
Written recall        (WR)           .248     .315     .326    .386*    1.00
Attentional control
  Pronunciation       (AP)           −.029    −.051    .001    −.048    .091    1.00
  Speed               (AS)           −.092    −.261    .080    −.351*   −.182   .107   1.00
  Meaning             (AM)           .172     .025     .291    .000     .387*   .088   −.025   1.00

**p < .01, *p < .05 (two-tailed test)

The three attentional resource allocations showed very little correlation with each other, possibly because there is a trade-off between them; this needs to be investigated further. Mori’s (2013) findings are important in that they show how what learners attempt to focus on during oral reading may determine how effective this form of training is in L2 acquisition. If learners pay more attention to reading speed, the overall quality of oral reading as assessed by native speakers tends to be much improved. However, paying more attention to comprehension may enhance recall. Suzuki (2005) summarizes the major effects of oral reading in L2 learning as follows (Kadota, 2015, p. 231; Suzuki and Kadota, 2012, etc.).

Effect 1: Helping learners to convert written words into pronunciation (i.e., phonological coding).
Effect 2: Helping learners to comprehend textual content.
Effect 3: Helping learners to internalize words, formulae, structures, etc. into long-term memory.
Effect 4: Helping learners to acquire the ability to make themselves understood by others through “oral interpretation.”

It seems reasonable to assume that articulation-focused oral reading will contribute to Effect 1 and, to some extent, Effect 4, whereas comprehension-focused oral reading will contribute more to Effect 2. To summarize, since oral reading involves two processes being performed simultaneously (phonological encoding of text into speech and comprehension of meaning), it can be argued that learners’ metacognitive monitoring and control may play an important role in developing L2 proficiency and fluency.

5.3  Dementia and bilingualism

5.3.1  What is dementia?

Dementia is defined as

    A syndrome – usually of a chronic or progressive nature – in which there is deterioration in cognitive function (i.e. the ability to process thought) beyond what might be expected from normal aging. It affects memory, thinking, orientation, comprehension, calculation, learning capacity, language, and judgement. Consciousness is not affected. The impairment in cognitive function is commonly accompanied, and occasionally preceded, by deterioration in emotional control, social behavior, or motivation. (www.who.int/news-room/fact-sheets/detail/dementia)

Table 5.5 shows the major types of dementia.

Table 5.5 Major types of dementia (summarized by the author from www.healthline.com/health/types-dementia)

Alzheimer’s disease: The most common type of dementia, accounting for between 60% and 80% of cases. Symptoms: Early signs include depression, forgetting names and recent events. People suffering from the disease also have trouble speaking and walking.

Frontotemporal dementia: A term used to describe several types of dementia. The front and side parts of the brain, which control language and behavior, are affected. Also known as Pick’s disease. Symptoms: These typically include loss of inhibition, loss of motivation, and compulsive behavior. People also have problems with language (speech), including forgetting the meaning of common words.

Vascular dementia: The second most common type of dementia after Alzheimer’s disease; it is caused by a lack of blood flow to the brain resulting from stroke, etc. Symptoms: Inability to plan the steps needed to complete a task is a likely early symptom, along with confusion and disorientation. Later on people have trouble completing tasks or concentrating for long periods of time.

Dementia with Lewy bodies (DLB): Protein deposits in nerve cells cause memory loss and disorientation. Symptoms: People suffering from DLB also experience visual hallucinations and have trouble falling asleep at night or during the day. They might also faint, get lost, or become disoriented. The disease shares many symptoms with Alzheimer’s (e.g., difficulty walking, feeling weak).

Mixed dementia: Some people have more than one type of dementia. This is common, the most frequent combination being vascular dementia and Alzheimer’s. It is reported that up to 45% of people with dementia have mixed dementia.

5.3.2  Association between bilingualism and age at onset of dementia

It is much more difficult to accomplish automatic comprehension and speech production processes using an L2 than an L1. It is important to allocate our attentional resources optimally to each component of L2 communication that needs to be processed in parallel (see Figure 1.3 in Chapter 1). This is usually done in executive working memory (EWM). In 5.1.5, we discussed the roles of the monitoring and executive functions in L2 processing and acquisition, suggesting that using an L2 in daily conversation is good training for our EWM. Alladi et al. (2013) researched the apparent effect of bilingualism on the age at onset of different types of dementia. They analyzed the case records of 648 dementia patients, of whom 391 were bilingual speakers. The aims of the research were as follows:

Table 5.6 Age at onset of different types of dementia for monolinguals and bilinguals (Alladi et al., 2013)

                         Age at onset of dementia
Dementia subtype   No.   Monolingual               Bilingual                p value
AD dementia        240   65.4 (10.0); 39.5–92.0    68.6 (9.6); 40.0–89.0    0.013
FTD                116   55.6 (10.5); 31.0–78.0    61.6 (9.0); 39.0–83.0    0.001
VaD                189   57.0 (10.7); 37.0–84.0    69.0 (8.2); 51.0–80.0    0.506
DLB                 55   66.7 (11.0); 57.0–84.0    69.0 (8.2); 51.0–80.0    0.506
Mixed dementia      48   70.1 (10.0); 49.0–83.0    71.5 (7.7); 57.0–85.0    0.608

Abbreviations: AD = Alzheimer’s disease; DLB = dementia with Lewy bodies; FTD = fronto-temporal dementia; VaD = vascular dementia. Data are presented as mean (SD); range, unless otherwise stated.

1) To compare the age at onset of first symptoms between monolingual and bilingual patients.
2) To examine the influence of number of languages spoken, education, occupation, and other potentially interacting variables.

Table 5.6 presents the ages at onset of different types of dementia for monolinguals and bilinguals. The major observations were as follows:

1) Bilingual patients developed dementia 4.5 years later than monolingual ones. Significant differences in age at onset were found across Alzheimer’s disease (AD), fronto-temporal dementia (FTD), and vascular dementia (VaD); the differences were also observed in illiterate patients.
2) There was no additional benefit to speaking more than two languages.
3) The effect of bilingualism on age at dementia onset was independent of other factors such as education, sex, occupation, and urban vs rural dwelling.

This delay in onset of dementia was attributed to bilingualism enhancing executive functions such as inhibitory control and task shifting in EWM (see 5.1.5 of this chapter). Kroll (2015) summarized recent important findings on the association between bilingualism and executive functions as follows:

1) For bilingual speakers, the two languages are not separate but active and competing with each other in their minds. She stated that bilinguals are “mental jugglers.”
2) L1 use is affected significantly by L2 use.
3) Using two languages actively has a major impact on the executive functions and EWM in the frontal association cortex of the brain.

Bialystok (2015) uses previous empirical comparisons between bilinguals and monolinguals to conclude:

1) Bilinguals generally have a smaller vocabulary than monolinguals. Their lexical–phonological–conceptual connections are also found to be weaker, and their reaction times when reading aloud a word on a computer screen are slower.
2) However, bilinguals’ executive capabilities using EWM tend to be superior to those of monolinguals. For example, Bialystok et al. (2005) reported research on the Simon task, comparing (a) children, (b) young adults, (c) middle-aged adults, and (d) older adults. They concluded that bilinguals in age groups (a), (c), and (d) had faster response times than monolinguals, though no significant difference was observed for the young adult group (b).

Bak et al. (2014) at the University of Edinburgh investigated the relationship between IQ test scores of children aged 11 and their own Moray House Tests on inference ability, conducted 62 years later at the age of 73. They found that (1) to enhance the executive functions in the frontal cortex it is necessary to use the two languages on a daily basis, and (2) being bilingual from an early age is not a deciding factor in EWM development. We can conclude that bilinguals using two languages on a daily basis can enhance their executive functions, delaying the age at onset of any dementia in later life (see Figure 5.12).

5.4  Metacognitive monitoring, executive function, and L2 acquisition

Bilingual people have been shown to have enhanced executive functions and monitoring, but it can be argued that similar executive enhancement can also be seen in L2 learners and users who use two languages (L1 and L2) on a daily basis. This is particularly true for cognitively challenging L2 acquisition, where the target L2 is linguistically distant from the learner’s L1 in grammar, pronunciation, and so forth (see Table 6.1 of Chapter 6 for language difficulty rankings). Bialystok et al. (2005) investigated the degree of difference between the two languages of bilinguals as a factor in enhancing one type of executive function (i.e., inhibitory control). The participants were ten English monolinguals (L1 English), ten French–English bilinguals (L1 French), and nine Cantonese–English bilinguals (L1 Cantonese). They were given the Simon task to examine their inhibitory control, an important indicator of executive control and EWM (see Figure 5.8). In the congruent condition, the correct response key was on the same side of the screen as the stimulus, and in the incongruent condition, the reverse was true; in control tests the same stimuli were presented in the center of the screen. Reaction time (RT) data are shown in Table 5.7. The main findings were as follows:

Figure 5.12 The images of monolinguals and bilinguals

Table 5.7 Mean reaction times and accuracy under different conditions (Bialystok et al., 2005, p. 44)

                     Control                Congruent              Incongruent
Group                RT (SD)    %Accuracy   RT (SD)    %Accuracy   RT (SD)    %Accuracy
Monolingual (N=10)   425 (76)   97          479 (99)   96          499 (94)   93
French (N=10)        415 (64)   98          457 (77)   97          475 (80)   96
Cantonese (N=9)      348 (46)   95          378 (65)   93          397 (92)   91
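The group means in Table 5.7 can be summarized with a few lines of arithmetic. The sketch below is illustrative only and is not part of Bialystok et al.’s (2005) analysis: it assumes that each group’s three mean RTs are listed in the order control, congruent, incongruent, it works on group means alone, and it says nothing about statistical significance. The names `interference_cost` and `overall_mean` are hypothetical; the incongruent minus congruent difference is the measure often called the Simon effect.

```python
# Mean reaction times (ms) per condition, copied from Table 5.7.
# Assumption: values for each group are ordered control, congruent, incongruent.
MEAN_RT_MS = {
    "Monolingual": {"control": 425, "congruent": 479, "incongruent": 499},
    "French-English": {"control": 415, "congruent": 457, "incongruent": 475},
    "Cantonese-English": {"control": 348, "congruent": 378, "incongruent": 397},
}


def interference_cost(rt: dict) -> int:
    """Incongruent minus congruent mean RT (ms), often called the Simon effect."""
    return rt["incongruent"] - rt["congruent"]


def overall_mean(rt: dict) -> float:
    """Unweighted mean RT (ms) across the three conditions."""
    return sum(rt.values()) / len(rt)


for group, rt in MEAN_RT_MS.items():
    print(
        f"{group}: interference cost = {interference_cost(rt)} ms, "
        f"overall mean RT = {overall_mean(rt):.1f} ms"
    )
```

On these group means, the cost of an incongruent trial is about the same in all three groups (18 to 20 ms), while the Cantonese–English group responds faster in every condition.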

1) In general, RTs tended to be faster for congruent than for incongruent tests.
2) The Cantonese–English bilingual group were faster than either monolinguals or French–English bilinguals, whose results were similar.

Although Bialystok et al. said that they had no explanation for the faster RTs of the Cantonese group, it seems safe to suggest that the daily need to manage two languages that are linguistically dissimilar may lead to significant systematic changes in frontal executive functions. It seems plausible to claim that people who use a cognitively challenging L2 on a daily basis benefit from better training of their executive functions and EWM than those using a less challenging L2.

This chapter has dealt with metacognition in L2 acquisition, introducing and defining the different types of metacognitive knowledge and activity, and then discussing the relationship between the frontal association cortex of the brain and EWM. We have reviewed important studies on metacognitive monitoring and control in L2 acquisition, focusing particularly on shadowing and oral reading tasks. Finally, from the standpoint of enhancing executive functions and EWM, we briefly reviewed the apparent relationship between bilingualism and the age of onset of dementia. This may provide an impetus for researchers to look more deeply into particularly cognitively challenging L2 acquisition, where the linguistic distance between L1 and L2 is great and where it is essential to introduce tasks designed to enhance L2 executive functions, such as shadowing.


6  Establishing a new concept of practice in L2 acquisition

6.1  Outlining a likely innate learning system

There is a vast amount of published material on L2 acquisition, as much of it approaching the subject from an L1 as from an L2 perspective. There is a recent consensus that it is important to study both L1 and L2 acquisition (including aspects of bilingualism, trilingualism, etc.) to best exploit the true relationship between language and the human mind in learning and teaching.

6.1.1  Our learning system

The L2 acquisition process has been hotly debated from many angles. The author believes that we can best proceed by attempting to answer two fundamental research questions (RQs) (Kadota, 2012, 2014, 2015), as follows (see also Figure 6.1).

RQ 1: What inputs are indispensable for L2 acquisition to take place?
RQ 2: What activities best help learners apply the innate learning system of the human mind to L2 acquisition?

RQ 1 is a very basic question about the types, quantities, and levels of language inputs. People who have mastered a foreign language have often received a large amount of spoken and written L2 input. Here we are concerned with how we can increase the amount of input that learners hear or see, bearing in mind that learners must be able to understand all the input for it to be useful. In other words, we need both a sufficient amount of target language input and for it to be at an appropriate level for the learner.

RQ 2 concerns how we humans acquire language. What educational methods are effective in assisting L2 acquisition? For example, simply memorizing vocabulary and grammar and translating written sentences into English will not on their own enable us to speak English fluently. Figure 6.1 shows a hypothetical innate learning system; it essentially consists of the following components (Kadota, 2014).

1) The auditory and/or visual receptors, in which inputs such as language are received.

[Diagram: Language inputs (Research question 1) → Learning device (perception, comprehension, memory, internalization) (Research question 2) → Language competence]

Figure 6.1 Two fundamental research questions for L2 learning (Kadota, 2015, p. 26)

Figure 6.2 An L2 teacher contemplates the two research questions

2) The comprehension component, which analyzes the inputs and understands their meaning.
3) The short-term storage or working memory system, which temporarily holds inputs that have been understood in the brain.
4) The input intake system, which internalizes these inputs by transferring the information into long-term memory (LTM).

Figure 6.2 is an image of an L2 teacher contemplating these research questions. This chapter will first discuss what L2 acquisition research has revealed so far regarding RQ 1.

6.1.2  Input theory

A famous pioneer of second language theory, Stephen Krashen, proposed the well-known “monitor model” of L2 acquisition, in which the following five hypotheses were proposed (Krashen, 1982).

1) The input hypothesis suggests that language acquisition occurs when learners are exposed to L2 inputs that can be understood but that contain i + 1 level content, where i represents the level of language already acquired and the “+ 1” refers to language (words, grammatical forms, pronunciation, etc.) that is one step beyond that level. It states that comprehensible input is the “necessary but also sufficient condition” for language acquisition.

2) The acquisition-learning hypothesis. Acquisition is the subconscious process of learning an L1, which is in principle an implicit learning based on a large amount of spoken input. L2 learning, however, is usually a process of explicit knowledge formation. The linguistic knowledge obtained in explicit learning in the classroom is not necessarily useful in actual communication; rather, it serves as an aid to monitoring actual L2 use. Only implicit knowledge acquired through actual use of the target language is really practical for real-world communication. Thus acquisition and learning are not linked so much as separate from each other, a hypothesis known as a “non-interface” position (see 3.3.2 of Chapter 3 for a discussion of the interface between explicit and implicit learning in psychological terms). DeKeyser (2007), who supports the importance of practice to input processing and output production in L2 acquisition, argues that there is a level of automatized explicit knowledge in between explicit and implicit knowledge:

Explicit Knowledge > Automatized Explicit Knowledge > Implicit Knowledge

It seems that a realistic or rather an ultimate goal for most L2 learners should be to achieve a high degree of automaticity, however difficult this may be (DeKeyser, 2007, p. 289).

3) The monitor hypothesis is similar to the acquisition-learning hypothesis in that the L2 knowledge memorized by explicit learning plays a very limited role (monitoring, etc.) in actual communication (see Chapter 5).

4) The natural order hypothesis claims that grammatical knowledge and rules are acquired in a certain natural order that is common to all L2 learners regardless of their age and native language. For instance, grammatical knowledge such as the rules for constructing question sentences, adding -s to third-person singular verbs in the present tense, or the use of auxiliary verbs is acquired following a fixed, predictable sequence. It is well known that the grammatical features that are easiest to learn explicitly are not always the first to be acquired. For instance, the -s added to a third-person singular verb is often dropped in spontaneous speech even by advanced learners, possibly because it is not generally crucial in conveying a message.

5) The affective filter hypothesis refers to emotional, attitudinal factors that prevent learners from acquiring language even when plenty of appropriate

input is available. For example, a learner might feel demotivated because of a dislike for the teacher. Learners with a high affective filter tend not to assimilate any language input into their learning systems; a low affective filter is a prerequisite for successful learning.

It can be argued that these five hypotheses define the major requirements for successful second language acquisition.

6.1.3  Output theory

However, on the basis of immersion programs used for English–French bilingual education in Canada, Swain (1995, 2005) notes that a large amount of input processing does not guarantee good L2 acquisition, particularly not an accurate understanding of grammatical and morphological language forms. She emphasizes the importance of output production on the grounds that, while input processing promotes comprehension of a language, it does not necessarily develop its accurate use. Output production can be seen as drills to practice accurate manipulation of L2. Swain proposes an output hypothesis to complement the above input hypothesis, as shown schematically in Figure 6.3. In Figure 6.3, the learning system is depicted as a mechanism to perceive linguistic information, store it, and internalize it as knowledge in long-term memory. While input theory assumes that language capacity can be acquired simply by processing a large amount of i + 1 comprehensible inputs, output theory asserts that it is important to add output tasks to follow on from comprehension of the inputs. According to output theory, “forced output” tasks should yield the following benefits.

1) Learners should notice a gap between what they can understand and what they can produce with their current L2 knowledge.
2) Learners should get corrective feedback on their output from listeners.

[Diagram. Input theory: Language inputs → Learning system → Language competence. Output theory: Language inputs → Learning system → Language competence, with output activities added.]

Figure 6.3 Input and output theories of L2 acquisition

3) Learners should become more aware of formal language properties such as grammar.
4) Learners should come to pay more attention to L2 grammatical forms and, as a result, improve their input comprehension in terms of both listening and reading.

6.1.4  Interactional theory

In addition to the two theories discussed above there is also an interactional hypothesis, which argues that interactions between peers deepen the level of comprehension of inputs and so enhance L2 acquisition.

LEARNER A: You should’n’ve eaten so much ice cream.
LEARNER B: Sorry?
LEARNER A: You should not have eaten so much ice cream.
(Muranoi, 2006, p. 46)

In the above interaction, Learner B felt Learner A’s utterance was a bit strange, so B asked A to repeat it, whereupon A produced an accurate grammatical form. A’s incorrect utterance was corrected. Though it might seem trivial, a brief interaction like this forces learners to reconfirm their knowledge of L2 grammatical rules and to reconstruct their own “interlanguage.”

6.2  Psycholinguistic competence: An index for proceduralization of language use

What component skills are needed to attain interactional skills or “communicative competence”? Canale and Swain (1980) suggested that communicative competence consists of the following components:

1) Grammatical competence: To be able to understand and produce previously unheard and unspoken sentences correctly based on L2 grammatical knowledge.
2) Sociolinguistic competence: To be able to use words that are well suited to context and surroundings.
3) Discourse competence: To be able to make use of discourse markers such as pronouns, paraphrases, and ellipsis to construct a coherent spoken or written discourse.
4) Strategic competence: To be able to continue conversation even if you cannot recall the most appropriate words and expressions quickly enough. Strategies to do this include paraphrasing and repetition, which help the speaker to get through immediate difficulties.
5) Psycholinguistic competence: The cognitive fluency to carry on conversation smoothly without any communication breakdown. The author advocated the inclusion of psycholinguistic competence into the communicative

competences in 2009 (Kadota, 2009) and has been emphasizing its importance ever since (Kadota, 2012, 2014). The author defines psycholinguistic competence as the capacity to process input speech and generate a response quickly, consistently, fluently, and automatically within a response time of at most one second and preferably within 400–500 ms.

Segalowitz (2010) proposed the concepts of “cognitive fluency,” “utterance fluency,” and “perceived fluency,” all of which are shown in Figure 6.4. Cognitive fluency is the basic infrastructure required to generate speech. It includes utterance planning and utterance assembly, which are cognitive indicators of how fluently we can produce sentences. Utterance fluency is the ability to keep speech fluent by, for example, changing articulation speed, self-correction (repair), or paraphrasing using different expressions. Perceived fluency refers to the degree of cognitive fluency that listeners anticipate on the basis of utterance fluency.

The concept of psycholinguistic competence is based on Segalowitz’s (2010) notion of cognitive fluency. It represents the degree of proceduralization of L2 knowledge, namely to what extent L2 vocabulary, grammar, and so on are

[Diagram: domain of cognitive fluency (cognitive functions underlying production: utterance planning, utterance assembling, integration and execution); domain of utterance fluency (features of the utterance: measurable temporal, repair, and other features of oral production; communicatively acceptable utterance); domain of perceived fluency (inferences about cognitive fluency based on perception of utterance fluency)]

Figure 6.4 Cognitive, utterance, and perceived fluencies (Segalowitz, 2010, p. 50)

retrieved from long-term memory without the user being conscious of the retrieval process.

Neuroscientific studies using electroencephalographs to measure event-related potentials (ERPs) during language processing tend to support the existence of psycholinguistic competence. Suppose a sentence containing a deviation from semantic prediction is presented to participants word by word, auditorily or visually. It is known that a specific negative potential N400 appears about 400 ms after a critical word that causes a meaning deviation is encountered. Figure 6.5 shows a typical N400 potential after a semantically deviant word presented at 0 μV (zero microvolts on the vertical axis) is processed. Kohno et al. (2007, p. 107) used the following example sentences to illustrate ERP N400:

Jenny put the sweet in her pocket after the lesson.
Jenny put the sweet in her mouth after the lesson.

The unexpected use of the word “pocket” instead of “mouth” in the first sentence caused an ERP N400 response to the critical word due to semantic deviation. ERP data suggest that psycholinguistic competence in processing input and reacting within a few hundred milliseconds is a prerequisite for effective real-time communication.

Of the five components of communicative competence described above, the greatest efforts to date have been put into achieving the first (i.e., grammatical) competence. This is because grammatical competence enables L2 learners to comprehend and produce previously unheard and unspoken sentences by retrieving the relevant grammatical rules, particularly for L2 learners of English. Grammatical competence has usually been achieved through traditional, explicit

Cz

–6 N400

Amplitude (µV)

–4 –2 0 2 4 6 0

200 400 Time (ms)

600

Figure 6.5 Typical example of event related potential (ERP) N400 appearing 400 ms after processing a critical word (Ward, 2010, p.  238). CZ (the vertical axis) is the potential of an electrode placed at the center of the parietal lobe of the brain in the internationally recognized position for brain wave measurement.

162

162  New concept of practice in L2 acquisition learning of declarative L2 knowledge. However, it is much more difficult to implicitly acquire automatic L2 manipulation ability, which has been shown to be essential in everyday communication.

6.3  Establishing a new concept of practice in L2 acquisition

6.3.1  Practice connecting input processing and output production

Earlier in this chapter I explained briefly the interactional hypothesis of L2 acquisition, which is based on both output and input theory. Prerequisites for the kind of interaction exemplified in subsection 6.1.4 are: (1) that L2 inputs are comprehensible, and (2) that learners are motivated to produce outputs to a certain level of proficiency. This book, as its title suggests, emphasizes the importance of practice in connecting L2 input processing and output production, particularly in interaction. Such practice is especially important for learners whose target language is not used around them on a daily basis and whose L1 is linguistically distant from the target L2. These learners need a large amount of L2 practice because of its crucial role in “rehearsing” aspects of future output speech. Explicit L2 knowledge must be converted into automatic, implicit L2 knowledge that can be used in actual communication, and this is done by practicing! I believe strongly that diligent practice is the intervention technique that can best bridge the gap between L2 input comprehension and output production, as is necessary to acquire psycholinguistic competence. Practice enhances the basic cognitive fluency underlying these two skills (Figure 6.6).

[Diagram labels: Input theory, PRACTICE, Output theory.]

Figure 6.6 Intervention to connect L2 input and output (Kadota, 2018)

6.3.2  Old concept of practice

Practice was a traditional concept in the behavioral-psychology-based audiolingual approach to L2 learning of the 1950s and 1960s. Behavioral psychology was also a well-known theoretical framework for explaining L1 acquisition. We will therefore briefly examine the behavioral approach to L1 acquisition before moving on to the concept of practice in L2 acquisition.

(1) The stimulus-response-reinforcement model of L1 acquisition consists of constructing two different kinds of mental association: one between stimulus and response and one between response and reinforcement. The former association, based on classical conditioning, will gradually disappear without regular reinforcement. A new association is known to be encouraged or discouraged by “reward” or “punishment,” a phenomenon known as “operant conditioning” (see 3.3 in Chapter 3). This stimulus-response-reinforcement model of learning can be applied to L1 acquisition: if a mother points to a cat and says the word “cat,” her infant child makes an association between the word and the animal through classical conditioning. In operant conditioning, if the child says the word “cat” clearly while looking at a cat, the child might receive a “reward” such as praise or a smile. However, if the child produces an incorrect response, such as “car,” the parent’s response might be less positive, perhaps involving correction.

(2) The stimulus-response-reinforcement model of L2 acquisition became infamous as the “pattern drill” of the audiolingual approach of the 1950s and 1960s. The following is a typical example, with cues given to learners shown in italics.

I play tennis every day. I play tennis every day.
Do you play tennis every day? Do you play tennis every day?
Did you play tennis yesterday? Did you play tennis yesterday?

Learners were first required to repeat a sentence aloud and then to amend it slightly according to a given set of cues, for example, changing it to question form or using “yesterday” instead of “every day” in the example above. The pattern drill was designed as an exercise to make the use of different sentence structures habitual for learners.

Pattern drills were much criticized on the grounds that they were very mechanical and noncontextual (context independent). Therefore, when a generative linguistics approach emerged, it quickly became mainstream in L2 teaching in place of structural linguistics. Pattern drills became obsolete, and more communication-oriented techniques were adopted by many L2 teachers.

6.3.3  A new concept of practice: Practice as a priming process

Instead of making new associations between stimulus and response as in the old concept of practice outlined above, the author has proposed a new type of practice using “priming” as its underlying theoretical framework (Kadota, 2012, 2014). In Chapter 3, we discussed a type of human memory known as a perceptual representation system; evidence for perceptual learning actually comes from priming studies.


Figure 6.7 Structurally feasible objects illustrated by Schacter, Cooper, and Delaney (1990, p. 368)

Priming refers to information becoming easier to access if it has recently been encountered and processed. For example, in 3.3 of Chapter 3 we saw that fragments of already encountered words were easier to fill in than full-word blanks (e.g., “OR__GE” as “ORANGE”). Ward (2010, pp. 185–186) reviewed some empirical neural studies and found that neural correlate activity seems to be less apparent during secondary, primed processing than during primary, unprimed processing. He confirmed that priming involves cerebral regions related to perception. Schacter, Cooper, and Delaney (1990) suggested that people asked to judge the structural plausibility of various three-dimensional objects could only be primed for the feasible objects, not for the unfeasible ones (see Figures 6.7 and 6.8). This suggests that our perceptual system works solely for plausible objects, since only they are known to enhance priming for implicit memory formation.

Figure 6.8 Structurally impossible objects illustrated by Schacter, Cooper, and Delaney (1990, p. 368)

Priming is usually considered to be a framework for implicit, procedural memory formation (McDonough and Trofimovich, 2009). There are three main types of priming, as follows (see also 4.5 in Chapter 4 for syntactic priming).

1) Semantic priming: The processing of a target word or phrase is facilitated due to a person’s prior experience with a semantically related prime word (see Figure 6.9).
2) Syntactic priming: Speakers tend to produce a particular syntactic structure after recent exposure to that structure.
3) Repetition priming: Processing of a spoken or written word is enhanced due to a person’s prior experience processing the same word.

[Diagram sequence: prime word (e.g., nurse or teacher), blank screen, target word (e.g., hospital).]

Figure 6.9 Semantic priming (Kadota, 2015, p. 129)

This book emphasizes practice redefined by priming, as described above, and particularly by repetition priming. This priming-based concept of practice is presumed to play a significant role in connecting L2 input processing and output production. An important characteristic of this new concept of practice is that it is context dependent: it provides learners with ample opportunity to extract meaning contextually, in sharp contrast with the old audiolingual approach to practice.

The significance of connecting input processing and output production through priming-based practice cannot be stressed too much, particularly for L2 learners whose L1 is linguistically distant from the L2 target. L2 English learners whose L1 is very different from English include speakers of Japanese, Korean, and Chinese. Table 6.1, drawn up by the US Foreign Service Institute (FSI), lists languages in order of difficulty in terms of the typical time a native English speaker would need to learn them as an L2. Learners might be expected to reach “general professional proficiency in speaking” and “general professional proficiency in reading” after the listed study times. Naturally we should keep in mind that “this ranking only shows the view of the Foreign Service Institute (FSI)” (see www.effectivelanguagelearning.com/language-guide/language-difficulty).

Table 6.1 Languages listed in order of increasing difficulty by the US Foreign Service Institute (FSI)

Category I: 23–24 weeks (575–600 hours). Languages closely related to English
  Afrikaans, Danish, Dutch, French, Italian, Norwegian, Portuguese, Romanian, Spanish, Swedish

Category II: 30 weeks (750 hours). Languages similar to English
  German

Category III: 36 weeks (900 hours). Languages with linguistic and/or cultural differences from English
  Indonesian, Malaysian, Swahili

Category IV: 44 weeks (1,100 hours). Languages with significant linguistic and/or cultural differences from English
  Albanian, Amharic, Armenian, Azerbaijani, Bengali, Bosnian, Bulgarian, Burmese, Croatian, Czech, *Estonian, *Finnish, *Georgian, Greek, Hebrew, Hindi, *Hungarian, Icelandic, Khmer, Lao, Latvian, Lithuanian, Macedonian, *Mongolian, Nepali, Pashto, Persian (Dari, Farsi, Tajik), Polish, Russian, Serbian, Sinhala, Slovak, Slovenian, Tagalog, *Thai, Turkish, Ukrainian, Urdu, Uzbek, *Vietnamese, Xhosa, Zulu

Category V: 88 weeks (2,200 hours). Languages which are exceptionally difficult for native English speakers
  Arabic, Cantonese (Chinese), Mandarin (Chinese), *Japanese, Korean

* Languages preceded by asterisks are usually more difficult for native English speakers to learn than other languages in the same category.

According to Table 6.1, it takes approximately 2,200 hours for native English speakers to learn one of the most distantly related languages (Category V, i.e., Japanese, Korean, Chinese, or Arabic), whereas they need just 575–600 hours to learn Danish, Dutch, French, or Italian. Conversely, one would expect native Japanese, Korean, Chinese, and Arabic speakers to have the greatest difficulty in learning English. It therefore seems particularly important that these groups have sufficient practice in connecting input processing and output production.

6.3.4  Practice for automatized explicit knowledge: DeKeyser

Figure 3.38 in 3.3 of Chapter 3 suggests that there is a route from episodic memory via explicit, semantic memory to a subconscious implicit memory and that repeated exposure to L2 makes it possible for learners to proceduralize their explicit L2 knowledge so that they can use it in a subconscious, implicit manner. However, DeKeyser (2007) notes that the concept of L2 practice remains largely unexamined from a theoretical point of view and points out that automatized language knowledge is not exactly the same as implicit language knowledge. While implicit memory or knowledge is always defined with reference to lack of awareness, lack of awareness is not a requirement for automaticity (DeKeyser, 2007, p. 4). Therefore, in terms of L2 practice goals, it is theoretically more useful to understand the fine distinctions between declarative, proceduralized, and automatized knowledge rather than just between declarative and procedural knowledge. We can interpret this as suggesting that a high degree of automatic L2 knowledge, rather than totally proceduralized implicit knowledge, is a realistic ultimate goal for most L2 learners (see also DeKeyser, 2007, p. 289). Whatever the ultimate goal, it seems clear that introducing extensive reading and listening to the L2 toolkit would help learners to attain a more automatized or implicit method of L2 processing, as discussed in the following section.

6.4  Extensive reading/listening and shadowing as practice connecting inputs and outputs

In Chapter 2, we discussed Krashen’s (1985) L2 input hypothesis and his view that providing learners with a large amount of comprehensible input is the “necessary but also sufficient condition” for successful L2 acquisition. However, it seems important to clarify what exactly “a large amount of input” means in regard to L2 acquisition; the author believes this needs to be investigated seriously. The author posits here that extensive reading and extensive listening can constitute “sufficient input-driven practice” according to input theory. By contrast, in Chapter 4 we discussed the effects of shadowing training on the phonetic encoding/articulation and lexico-grammatical encoding stages of L2 speech production. The effect of shadowing was described as an output effect and can be posited as a potential source of “sufficient output-driven practice” according to output theory. Figure 6.10 shows these two types of practice.

[Diagram labels: 1) Input theory; 2) Practice; 3) Output theory; Extensive reading & listening; Shadowing & oral reading.]

Figure 6.10 Input- and output-driven practice (Kadota, 2015, p. 36)

We will now discuss these two types of practice: input-driven and output-driven.

6.4.1  Extensive reading as input-driven practice

L2 learning requires repeated exposure to the target language so that learners can make the best use of their linguistic knowledge in a subconscious manner. We would expect this to be brought about by repetition priming, as discussed above (see also Figure 3.38 in Chapter 3 for the links between episodic, semantic, and implicit procedural memory). There have been many studies exploring the effects of extensive reading (ER) on L2 acquisition. In this section, I will consider the importance of input-driven practice by focusing on ER rather than extensive listening. Two mainstream views of ER’s effectiveness have emerged and can be summarized as follows (Day and Bamford, 1998; Krashen, 1982; etc.):

1) ER is a good way to improve reading proficiency.
2) ER is an effective method of building linguistic competence, including reading ability, vocabulary, writing, and spelling skills.

Most ER researchers take the second of these views. Although there have been relatively few carefully controlled experimental studies, Grabe (2009, p. 313) reports that the major findings so far regarding the effectiveness of ER in L2 English learning are that:

1) Reading ability is strongly related to the amount of reading undertaken.
2) Long-term ER training, specifically, leads to vocabulary growth.
3) ER is more motivating for L2 learners than traditional textbook-centered reading.

Let us examine an important study into the effectiveness of ER in L2 English learning (Furukawa, 2007). The participants were: (a) L2 English learners who started ER at junior high school, (b) those who started ER at elementary school, and (c) a control group of senior high school learners (tenth graders) who did not undertake ER. All three groups were given the Basic Assessment of Communicative English (BASE) test, designed for tenth graders. The results of the test are shown in Table 6.2.

Table 6.2 BASE scores of two ER groups (junior high school seventh and eighth graders) and one non-ER group (senior high school tenth graders)

                        Group (a)   Group (b)   Group (a)   Group (b)   Group (c)
                        7th grade   7th grade   8th grade   8th grade   10th grade
Listening                   61          69          74          94          52
Grammar & vocabulary        42          51          59          75          51
Reading                     51          63          68          88          45
Total                      154         183         201         257         148

The results are as follows:

1) The two ER groups (a) and (b) achieved significantly higher BASE scores during their eighth grade (the second year of junior high school) than the non-ER group (c).
2) Even during the seventh grade (first year of junior high school), ER groups (a) and (b) tended to be better than the tenth-grade control group (c) at listening and reading.

While at seventh grade the ER groups were only better than tenth graders at listening and reading, after another year of ER, they were better at grammar and vocabulary as well. The order in which ER improves language performance (first listening and reading comprehension, and subsequently knowledge such as grammar and vocabulary) is similar to that found in L1 acquisition. Grabe (2017) also points out that ER promotes implicit L2 learning by improving word recognition of known vocabulary, syntactic processing, and automatic inference. Learners are exposed to discourse structures from many sources, which helps to develop comprehension strategies in reading, such as inference.

To add to the discussion above, the author would like to draw attention to an effect dubbed “quasi-priming” (Kadota, 2014, pp. 58–59). During ER, learners will often encounter unfamiliar words and phrases and will usually either look them up in a dictionary or guess their meaning from context. As ER continues, the probability of encountering those same words and phrases again is quite high. This means that you have to repeatedly retrieve the relevant lexical information from long-term memory. However, very often these words and phrases will be used in a different context from the one in which you first processed them. While you are still retrieving the same lexical information, the different context means the effect is slightly different from that of repetition priming and can be better defined as something like “quasi-repetition priming.”
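The counting behind quasi-repetition can be sketched in a few lines of Python. This is only an illustration, not part of the account in Kadota (2014): the function name `quasi_repetitions`, the sample sentences, and the rule that a re-encounter counts as a quasi-repetition whenever the surrounding sentence differs from every earlier one are all assumptions made for the example.

```python
def quasi_repetitions(sentences, targets):
    """Return {word: (encounters, quasi_repetitions)} for each target word.

    Assumption for illustration: a re-encounter is a "quasi-repetition"
    when the word appears in a sentence not seen with that word before.
    """
    encounters = {word: 0 for word in targets}
    quasi = {word: 0 for word in targets}
    seen_contexts = {word: set() for word in targets}

    for sentence in sentences:
        context = sentence.strip().lower()
        # Crude tokenization: split on whitespace, strip surrounding punctuation.
        tokens = {tok.strip(".,;:!?\"'()").lower() for tok in sentence.split()}
        for word in targets:
            if word in tokens:
                # Count a quasi-repetition only for a re-encounter in a new context.
                if encounters[word] > 0 and context not in seen_contexts[word]:
                    quasi[word] += 1
                encounters[word] += 1
                seen_contexts[word].add(context)

    return {word: (encounters[word], quasi[word]) for word in targets}


# Hypothetical sentences standing in for a graded reader.
reader = [
    "The old lantern hung by the door.",
    "She lit the lantern and went outside.",
    "A lantern glowed at the end of the street.",
    "The old lantern hung by the door.",
]

result = quasi_repetitions(reader, ["lantern", "street"])
print(result)
```

With these sample sentences, "lantern" is met four times and two of the re-encounters occur in new contexts; the fourth sentence repeats the first exactly, which would be closer to plain repetition priming.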


These quasi-repetitions are important for L2 learners, since they should occur in proportion to the frequency of use of the words and phrases concerned. Forming a robust and refined repertoire of words and phrases through quasi-repetitive processing is an important role of ER as input-driven practice, and perhaps also of extensive listening. In summary, the benefits of ER and extensive listening are probably derived from subconscious, implicit acquisition of L2 words and phrases rather than explicit learning and are likely to be due to input-driven practice using quasi-repetition priming.

6.4.2  Usage-based theory as applied to extensive reading

So far in this chapter, we have discussed the effects of ER from two key perspectives: input theory and quasi-repetition priming. However, these effects may also be closely related to recently defined usage-based theories of L1 and L2 acquisition.

6.4.2.1  Usage-based theory in L1 acquisition

How do people store language in their minds? This question has been addressed by many linguists and psychologists all over the world. In the 20th-century generative linguistics of Noam Chomsky and others, it was assumed that stored rules of syntax, morphology, phonology, and so forth were used to interpret and produce sentences correctly. Human language was said to be “a rules-based system.” However, Tomasello’s usage-based model posits that what people store in their minds are not rules of language but “a structured inventory of frequently and less frequently used linguistic constructions” that are retrieved to comprehend and produce sentences (Tomasello, 2003, p. 7). Examples of such constructions range from simple morphemes like -ing and -ly to syntactic frameworks like subject-verb-object-object (e.g., Nick made Steffi a sandwich) (Ellis and Wulff, 2015). This knowledge is essentially implicit in nature and is retrieved automatically rather than consciously from long-term memory (see Figure 3.35 in Chapter 3). It is claimed that attaining an inventory of constructions is the final goal for infants in acquiring their first language. We might also assume that Tomasello’s inventory of constructions includes the fixed expressions and formulae that have been noted by L2 researchers in recent years. I will discuss this later in this chapter.

Discussing the usage-based model in L1 acquisition, Tomasello proposed a “social-cognitive system”: a powerful general learning system based on human interactive instinct. This is made up of two mechanisms: (1) intention reading, and (2) pattern finding.

(1) Intention reading by a child is possible if an adult first establishes a joint attentional frame. A 9- to 12-month-old child will follow the gaze of an adult caregiver, that is, look in the same direction. Figure 6.11 depicts a joint attentional frame constructed in this way that can function as common ground for basic adult–child communication. Once a joint attentional frame is established, a child should come to understand the intention of words spoken to it by an adult. The child knows that adults are separate from itself and that their speech is intended to convey a message (see Figure 6.12 below). The child’s next step may be “culture learning”, which happens when the child imitates the actions of the adult. These steps are believed to be a fundamental mechanism preprogrammed into the human brain. It is believed that children use a similar social-cognitive mechanism, including imitation, to acquire their native language.

[Diagram labels: (a) Perceptual situation; (b) Joint attentional frame; (c) Ref. event.]

Figure 6.11 A joint attentional frame for adult–child communication (Adapted from Tomasello, 2003, p. 26)

(2) Pattern finding, another important prerequisite for L1 acquisition, is a form of statistical learning. Children are known to use pattern recognition in the acquisition of both phonology and grammar. For example, Marcus et al. (1999) made seven-month-old infants listen repeatedly to three-syllable nonsense words with an ABB pattern (e.g., wi-di-di or de-li-li) for three minutes. The infants were then given a different ABB nonsense word like ba-po-po and also corresponding AAB-type (ba-ba-po) and ABA-type (ba-po-ba) “words.” The children preferred “words” with the same structure (ABB) as they had initially been listening to. Tomasello (2003) argued that people are equipped with an innate statistical learning ability that they apply to speech they hear, thus acquiring L1 constructions such as specific pivot schemas. Figure 6.13 depicts a child’s ability to create usage patterns. At about one year old, a child might use a single word such as “milk” to express a number of different meanings. The child then progresses to the pivot schema stage, using two- or three-word patterns that are used repeatedly, for example, all …: all broken, all clean, all done (Tomasello, 2003, p. 116).
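The syllable patterns used in the Marcus et al. (1999) study described above can be made concrete with a small Python sketch. The function name `syllable_pattern` and the hyphen-separated notation are assumptions made for illustration; the sketch only labels the abstract pattern of a nonsense word and makes no claim about how infants compute it.

```python
def syllable_pattern(word):
    """Label the abstract pattern of a hyphen-separated nonsense word.

    Each new syllable gets the next letter (A, B, C, ...); a repeated
    syllable reuses the letter it was first given.
    """
    labels = {}
    pattern = ""
    for syllable in word.lower().split("-"):
        if syllable not in labels:
            labels[syllable] = chr(ord("A") + len(labels))
        pattern += labels[syllable]
    return pattern


# Training items and test items taken from the examples in the text.
training = ["wi-di-di", "de-li-li"]
test_items = ["ba-po-po", "ba-ba-po", "ba-po-ba"]

training_patterns = {syllable_pattern(w) for w in training}
same_structure = [w for w in test_items if syllable_pattern(w) in training_patterns]

print(training_patterns)
print(same_structure)
```

Here only ba-po-po shares the ABB structure of the training items, even though none of its syllables appeared during training, which is why the task is taken to tap pattern finding rather than memory for particular syllables.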


Figure 6.12 A child can work out the intention of an adult saying “dog”

Figure 6.14 and Table 6.3 show typical pivot schemas of children learning English as an L1. After the pivot schema stage, children progress to “item-based construction” and will speak using a verb-dependent syntax, a stage often described in terms of the “verb island hypothesis.” This stage is followed by “abstract syntactic construction,” where the verbs are no longer isolated and can be used in a variety of constructions. Eventually, children become capable of expressing a message freely using as many constructions as are necessary.

Figure 6.13 Children’s ability to create usage patterns (Ibbotson and Tomasello, 2016, pp. 71–75)

Figure 6.14 Typical early pivot schemas of children learning English (Tomasello, 2003, p. 116)

Table 6.3 Typical early pivot schemas of children learning English (Tomasello, 2003, p. 116)

more car, more cereal, more cookie, more fish
no bed, no down, no fix
other bib, other bread, other milk
all broke, all buttoned, all clean, all done
clock on there, up on there, hot in there, milk in there
all done milk, all done now, all gone juice

6.4.2.2  Usage-based theory in L2 acquisition

Usage-based theory is embodied in L2 acquisition as the “associative–cognitive model,” which recent research sees as a promising framework in L2 acquisition (Suzuki and Kadota, 2012, pp. 388–391). Ellis and Wulff (2015) list the major prerequisites of this approach as follows:

1) Both L2 learning and L1 acquisition are mainly based on extensive exposure to language inputs.
2) Learners induce rules from language inputs via cognitive mechanisms used for a variety of learning tasks, not just language learning.
3) In any language learning, acquiring different constructions is crucial to pairing forms with meanings/functions. Constructions correspond to:
   a) Words, like a squirrel.
   b) Morphological affixes, such as -ing attached to a verb.
   c) Idioms such as I cannot get my head around this (= I do not fully understand this).
   d) Abstract syntactic frames, such as subject-verb-object-object (e.g., Nick gave his mother a letter) or passive constructions (e.g., A cake was baked for Jessica).
4) Associative learning (i.e., mapping forms and meanings of constructions) is a central component of L2 learning. Mapping can range from easy (e.g., sandwich = slice of meat and/or cheese between two slices of bread) to complex: -ing forms, articles, etc. are difficult to learn because they have many semantic functions depending on context.
5) Learning is exemplar based: it depends on subconscious, implicit learning based on statistical analyses, such as how often particular examples occur together with which phrases.

The usage-based approach to language acquisition provides a model of implicit L2 learning, which ER familiarizes us with through “quasi-priming.” The above five prerequisites specified by the usage-based L2 theorists are claimed to have much in common with basic assumptions employed by ER educators. The underlying essence of both acquisition models seems to be implicit learning enhanced by input-driven practice.

6.4.3  Shadowing as output-driven practice

We have already discussed shadowing training in L2 in some detail, identifying four main effects: (1) input effect, (2) practice effect, (3) output effect, and (4) metacognitive monitoring effect. In contrast to input-driven practice such as ER and extensive listening, I would argue that the practice aspect of shadowing is an output-driven practice effect; its benefit is in enhancing internalization of lexical chunks and constructions through an accelerated vocal and subvocal rehearsal rate.

Nakanishi and Ueda (2011) investigated the combined practice of ER and shadowing using the following research questions.

RQ 1: Can ER practice improve students’ reading comprehension?
RQ 2: Can shadowing practice accelerate the effects of ER practice?

The participants were 87 college students enrolled in CALL-based EFL classes in Japan. Group C1 (21 students) and Group C2 (24 students) practiced traditional translation-based reading of English texts, while Group ER (22 students) practiced ER and Group ER&S (20 students) practiced both ER and shadowing in every class. All students were given three English reading comprehension tests: the first immediately before the study, the second four months in, and the final test after the full one-year course. Figure 6.15 shows the results for the four groups:

[Line graph: mean scores (vertical axis, 28 to 42) for groups C1, C2, ER, and ER & S across Tests 1, 2, and 3 (horizontal axis).]

Figure 6.15 Results of four groups in Tests 1–3 (Nakanishi and Ueda, 2011)

The results lead to the following conclusions:

1) Regarding RQ 1, no significant differences in progress were detected between the traditional and ER groups in any of the three tests, suggesting that ER has no more effect on comprehension than traditional translation-based reading.
2) The most noticeable progress was made by the ER-plus-shadowing group, though the difference was not significant. The average improvement in test scores for each group from Test 1 to Test 3 was 7.76 for C1, 6.83 for C2, 7.35 for ER, and 8.04 for ER&S.

A more compelling study of combined ER and shadowing practice was conducted by Onimaru (2014). She compared the pass grades in the Eiken test of Practical English Proficiency of two groups of Japanese junior high school students, an ER-only group and an ER-plus-shadowing group, over a period of three years. Table 6.4 shows the number of students in each group who passed each grade of the Eiken.

Table 6.4 Number of junior high school students who passed each grade of the Eiken test of Practical English Proficiency (Onimaru, 2014, p. 41)

              Extensive reading (ER) only     Extensive reading + shadowing (ER+SH)
Eiken grade   2     Pre-2   3     4           2     Pre-2   3     4
1st year      1     2       9     0           0     3       10    30
2nd year      0     1       26    9           2     10      37    97
3rd year      0     9       61    61          4     25      57    1
Total         1     12      96    70          6     38      104   128
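The totals reported in Table 6.4 can be checked with a short Python sketch. The assignment of figures to columns (grades 2, Pre-2, 3, and 4 within each group, in that order) is an assumption based on the order in which the table lists them, and the variable names are hypothetical.

```python
# Passes per Eiken grade for the 1st, 2nd, and 3rd year, as read from Table 6.4.
# Assumption: columns run 2, Pre-2, 3, 4 for ER only, then the same for ER+SH.
passes = {
    "ER": {"2": [1, 0, 0], "Pre-2": [2, 1, 9], "3": [9, 26, 61], "4": [0, 9, 61]},
    "ER+SH": {"2": [0, 2, 4], "Pre-2": [3, 10, 25], "3": [10, 37, 57], "4": [30, 97, 1]},
}

# Three-year total for each group and grade.
totals = {
    group: {grade: sum(by_year) for grade, by_year in grades.items()}
    for group, grades in passes.items()
}

# Does the ER+SH group have more passes than the ER-only group at every grade?
er_sh_ahead = all(
    totals["ER+SH"][grade] > totals["ER"][grade] for grade in totals["ER"]
)

for group, by_grade in totals.items():
    print(group, by_grade)
print("ER+SH ahead at every grade:", er_sh_ahead)
```

Under this reading of the columns, the yearly figures add up to the totals printed in the table, and the ER-plus-shadowing totals exceed the ER-only totals at each of the four grades.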

At all Eiken grades, the ER-plus-shadowing group outperformed the ER-only group.

Okayama (2015, pp. 12–14, 25–27) introduced her own concept of ER-like shadowing training, as follows.

1) Students shadow a very large amount of input speech.
2) Students skip words they cannot easily shadow.
3) Students change the input speech if they find it too difficult.

Some teachers use a variant of shadowing combined with ER known as shadoku (Suzuki, 2016). Shadoku provides learners with images as well as audio so their mental processing becomes “three-dimensional” (audio, visual, and voice). The main benefit of shadoku is considered to be that it motivates students by immersing them more deeply in the learning experience.

All in all, research so far suggests that the output-driven practice of English language shadowing combined with the input-driven practice of ER is likely to be effective in enhancing students’ L2 reading ability. However, further research is needed to confirm this.

6.4.4  Implicit acquisition of formulae

We discussed the effect of shadowing on the acquisition of formulaic sequences to enhance L2 lexico-grammatical encoding in section 4.5 of Chapter 4. A similar effect seems likely from combining ER and extensive listening as input-driven practice. Imamura (2011) furnishes convincing evidence of the effect of ER on the acquisition of formulaic sequences by Japanese learners of English. Study participants were 47 high school students divided into two groups: 23 students in Group A and 24 in Group B. Group A students chose their favorite English graded readers and read them as extensively as they wished outside class for approximately four months. Group B was the control group, in which students answered questions given in English textbooks outside class over the same four-month period. Both groups were tested before and after the study period as follows.

[Chart: pretest and posttest RTs for Tests (a), (b), (c), and (d); vertical axis 600 to 1,600 ms.]

Figure 6.16 Average RTs (ms) of ER group A in the four tests

Test (a): A lexical decision test, in which participants had to decide whether or not a word on a PC screen was real (e.g., power, enter (real words); voon (not a real word)).
Test (b): A collocation test, in which participants had to decide whether or not a verb-plus-noun word sequence was correct (e.g., cook dinner; do trees).
Test (c): A word order test, in which participants had to decide whether or not a sequence of three or four words was grammatically acceptable (e.g., as far as, as a result; day the at, bus on go).
Test (d): An antonym test, in which participants had to decide whether or not two given words were antonyms (e.g., black-white; carry-sleep).

The response times (RTs, in milliseconds) in the four tests are shown in Figure 6.16 for Group A and in Figure 6.17 for Group B, based on the descriptive data given in Imamura (2011). The major findings were as follows:

1) The control Group B showed no significant difference between the pre- and poststudy results in any of the four tests.
2) The ER Group A showed significantly faster response times in tests (a) and (b) after the study period than before.
3) The ER Group A responded faster in tests (c) and (d) after the study period, but the difference was not statistically significant.

Figure 6.17 Average RTs (ms) of control group B in the four tests (pretest vs. posttest for Tests (a)–(d); vertical axis 600–1,800 ms)

Table 6.5 Collocational continuum in English (Wray, 2002, p. 63)

                                            Free combinations   Restricted collocations   Figurative idioms       Pure idioms
Lexical composites: Verb + noun             blow a trumpet      blow a fuse               blow your own trumpet   blow the gaff
Grammatical composites: Preposition + noun  under the table     under attack              under the microscope    under the weather

We have been looking at ER as input-driven practice to develop learners’ ability to judge lexical collocations. Table 6.5 represents Wray’s (2002) “collocational continuum,” which describes several types of collocation in English, ranging from free combinations to pure idioms. The acquisition of this kind of knowledge seems crucial for the attainment of psycholinguistic competence.

The importance of practice, as we discussed in 6.3, is demonstrated not just in the acquisition of L2 automaticity or psycholinguistic competence but also, of course, in many games and activities such as golf and baseball (see Figure 6.18). The effects of both input- and output-driven practice on L2 acquisition cannot be overemphasized. In this chapter so far, we have discussed its importance in detail with reference to behavioral learning paradigms such as stimulus-response-reinforcement, the priming process, automatization of explicit knowledge, and forms of practice, including ER, extensive listening, and shadowing.

Figure 6.18 Sportsmen practicing

6.5 Summary and perspectives

6.5.1 Locating shadowing in a simplified model of bilingual processing

It is well known that the two languages of bilingual speakers are not separate but active simultaneously in competition with each other (see Chapter 5). So too are the languages of L2 learners and users. The author has produced (Kadota, 2010a, p. 20) a simplified representation of the mechanisms that might be involved in L2 and L1 processing (Figure 6.19). This model gives a rough idea of what is likely to be going on during processes such as shadowing, oral reading, and picture naming. L2 processing is shown on the left and L1 processing on the right, since learners are assumed always to process L2 by reference to their L1 processing. The model attempts to show roughly how speech and visual inputs and outputs are connected for both languages.

Figure 6.19 Simplified model of bilingual processing (Kadota, 2010a, p. 20)
Components labeled in the figure: (a) L2 oral input; (b) L2 oral output; (c) L2 phonological representation; (d) L2 orthographic representation; (e) L2 visual input; (f) L2 visual output; (g) conceptual representation; (h) image input (picture, etc.); (i) L1 oral input; (j) L1 oral output; (k) L1 phonological representation; (l) L1 orthographic representation; (m) L1 visual input; (n) L1 visual output. The figure also marks inter-language mapping and secondary routes.

This model is based on the following two speculative hypotheses:

1) Although the degree of automaticity differs greatly between L2 and L1 processing, the mental representations are the same for both.
2) A shared conceptual (semantic) representation exists independently from L2- and L1-specific representations.

The model proposes a number of intralanguage processing routes specific to either L2 or L1 as well as interlanguage processing routes for both languages. So in L2 shadowing, a phonological representation of input speech enables the learner to repeat speech by the shortest route simply by copying the sounds “parrot fashion” with no reference to a conceptual representation (route a->c->b, see also Figure 4.15 in Chapter 4). However, as we get used to shadowing, the cognitive resources required gradually decrease, and it becomes more automatized. It then becomes possible to access a conceptual representation (route a->c->g->c->b) during shadowing. Once semantic shadowing is possible, it is no longer mindless repetition but involves simultaneously analyzing linguistic information including grammatical, textual, semantic, and pragmatic cues. Shadowing practice at this level not only improves speaking accuracy and fluency (see 4.4 of Chapter 4) but helps the learner to acquire new L2 linguistic forms: words, formulae, and more complex linguistic structures.

In L2 oral reading, we construct an orthographic representation of visual input and then perform phonological encoding using either the grapheme–phoneme correspondence route or the whole word route (see Chapter 3 for more detail on phonological coding). The result of this coding is a phonological representation that can be used to construct a conceptual representation. As with parrot-fashion shadowing, oral reading is possible simply by pronouncing a phonological representation without reference to meaning (route e->d->c->b, see also Figure 4.15 in Chapter 4); this is often called “eye-to-mouth reading.” However, as we become accustomed to L2 phonological coding, we become able to construct meaning (a conceptual representation) while we are reading aloud, and so, as in shadowing, we acquire new linguistic information.

Figure 4.15 in Chapter 4 and Figure 6.19 above also allow us to envisage the following processing routes:

3) Dictation: a->c->d->f or a->c->g->c->d->f
4) Transcription: e->d->f or e->d->c->g->c->d->f
5) Spoken picture naming: h->g->c->b
6) Written picture naming: h->g->c->d->f
7) L2-to-L1 interpretation (spoken translation): a->c->g->k->j
8) L2-to-L1 written translation: e->d->c->g->k->l->n
9) L1-to-L2 interpretation: i->k->g->c->b
10) L1-to-L2 written translation: m->l->k->g->c->d->f
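The route notation used above can be made concrete with a small sketch. The following Python fragment is offered only as a reading aid under stated assumptions, not as part of the model itself: it records the component labels (a)–(n) of Figure 6.19 and a selection of the routes quoted in the text as plain strings, and the helper names (parse_route, describe_route, accesses_meaning) are hypothetical.

```python
# Component labels (a)-(n) as given for Figure 6.19.
COMPONENTS = {
    "a": "L2 oral input",
    "b": "L2 oral output",
    "c": "L2 phonological representation",
    "d": "L2 orthographic representation",
    "e": "L2 visual input",
    "f": "L2 visual output",
    "g": "conceptual representation",
    "h": "image input",
    "i": "L1 oral input",
    "j": "L1 oral output",
    "k": "L1 phonological representation",
    "l": "L1 orthographic representation",
    "m": "L1 visual input",
    "n": "L1 visual output",
}

# A selection of routes quoted in the text (task names are shorthand).
ROUTES = {
    "parrot-fashion shadowing": "a->c->b",
    "semantic shadowing": "a->c->g->c->b",
    "eye-to-mouth reading": "e->d->c->b",
    "spoken picture naming": "h->g->c->b",
    "L2-to-L1 interpretation": "a->c->g->k->j",
}


def parse_route(route):
    """Split 'a->c->b' into ['a', 'c', 'b'], rejecting unknown labels."""
    steps = route.split("->")
    for step in steps:
        if step not in COMPONENTS:
            raise ValueError(f"unknown component: {step!r}")
    return steps


def describe_route(route):
    """Spell out a route using the component names."""
    return " -> ".join(COMPONENTS[step] for step in parse_route(route))


def accesses_meaning(route):
    """True if the route passes through the conceptual representation (g)."""
    return "g" in parse_route(route)


if __name__ == "__main__":
    for task, route in ROUTES.items():
        tag = "semantic" if accesses_meaning(route) else "non-semantic"
        print(f"{task} ({tag}): {describe_route(route)}")
```

Run as a script, this prints each task with its route spelled out, which makes it easy to see that parrot-fashion shadowing and eye-to-mouth reading bypass the conceptual representation (g), whereas semantic shadowing passes through it.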

6.5.2 IPOM: Input, practice, output, monitoring

As mentioned above, L2 shadowing is considered to help learners construct a phonological representation of speech input and to imitate or semantically reproduce that representation. In this chapter we have seen that a large amount of comprehensible input and the attainment of a certain level of speech production skill are prerequisites for successful L2 acquisition, as suggested by input and output theory. We have discussed how learners also need ‘practice’ connecting input processing and output production, especially when the target L2 is linguistically quite distant from the L1. Chapter 5 showed the importance of metacognitive monitoring and control in L2 acquisition. Therefore, we can characterize the key components of successful L2 acquisition identified in mainstream L2 research as IPOM (input processing, practice, output production, and monitoring), as shown in Figure 6.20. This book has consistently supported the view that shadowing practice is exactly the sort of training needed to enhance IPOM.

Figure 6.20 The four pillars of L2 acquisition: Input processing, practice, output production, and monitoring (Kadota, 2018)

6.5.3 Summary of the four effects of shadowing

This book proposes four main effects of L2 shadowing.

1) An input effect (Chapter 2), which automatizes perception of L2 speech sounds by imitation or semantic reproduction of input speech, improving L2 listening comprehension. This effect appears to tie in with such phenomena and theories as the McGurk effect, the motor theory of speech perception, and the human mirror system.

2) A practice effect (Chapter 3), which enhances the internalization of new L2 words, formulae, and constructions by training the learner to rehearse speech in phonological working memory (PWM).
3) An output effect (Chapter 4), which improves both the phonetic articulation and the lexico-grammatical encoding stages of L2 speech production.
4) A monitoring effect (Chapter 5), which develops the L2 metacognitive executive functions of executive working memory (EWM) by monitoring the learner’s own shadowing speech.

Figure 6.21 presents the four main effects of shadowing upon L2 acquisition (Kadota, 2018, p. 186).

Figure 6.21 Four proposed effects of shadowing upon L2 acquisition (Kadota, 2018, p. 186)
Labels in the figure: SHADOWING; automatic speech perception, improvement of listening skill; acceleration of vocal and subvocal speech rate, internalization of lexical chunks, constructions, etc.; simulation of sentence production, improvement of speaking skill; enhancement of metacognitive monitoring and control, promotion of executive control.

The practice effect can be divided into three subeffects (Chapter 3) as follows.

a) Shadowing is not intended to promote conscious learning of linguistic items but is simply an imitation or semantic reproduction of L2 input speech. However, new linguistic items are acquired subconsciously during shadowing, particularly at the higher, semantic level; this is an implicit practice effect.
b) When learners consciously try to memorize new linguistic items, they often use phonological working memory (PWM) without any vocally expressed rehearsal. Shadowing enhances internal, subvocal rehearsal in PWM and helps to transfer learned items subconsciously into long-term memory (LTM).
c) The practice effect of shadowing is not limited to memorizing new items. When shadowing a variety of texts, the learner encounters words, phrases, and structures in different contexts. Repeated retrieval of relevant information from the mental lexicon causes a repetition priming or “quasi-repetition priming” effect as in extensive reading and listening (see subsection 6.4.1).

Of the four effects discussed, most research on L2 shadowing training to date has focused on the input effect. This book has reviewed not only this research but also the latest empirical evidence supporting shadowing as an important technique in L2 acquisition.

6.5.4 Order of emergence of input, practice, and output effects

We cannot assume that the first three of the four effects of shadowing emerge at approximately equal time intervals as the learner continues to practice. However, there may be a fixed emergence order as shown in Figure 6.22. The fourth (monitoring) effect may be considered to support all the other effects throughout L2 acquisition. Figure 6.23 (recalling Figure 2.24 of Chapter 2) shows four steps in developing listening skills. It can be argued that shadowing training promotes, first and foremost, accurate repetition of input speech, and that the resultant increase in articulation speed eventually leads to skill in L2 listening comprehension.

6.5.5 Hypothetical route map to L2 acquisition through shadowing

Looking at the emergence order of the input, practice, and output effects of shadowing and considering that the input effect has stages of development, we can plot a tentative “route map” to L2 acquisition via continuous shadowing training (Figure 6.24), using Nishizawa’s (2006, pp. 14–15) “route map of one-million-word extensive reading” as a reference. The following rough guide to the stages along the way shows the approximate number of words shadowed in brackets.

Figure 6.22 Emergence order of three effects of shadowing (Kadota and Tamai, 2017, p. 51)

1) Shadowing training
2) Promoting repetition accuracy
3) Accelerating articulation speed
4) Developing listening comprehension
Figure 6.23 Steps from shadowing training to development of listening comprehension

1 Starting shadowing training
2 Promoting repetition accuracy
3 Accelerating articulation speed
4 Developing listening comprehension: An input effect
5 Internalizing new words, formulas, constructions: A practice effect
6 Enhancing speech production: An output effect
Figure 6.24 Hypothetical route map to L2 acquisition through shadowing

Stage ❶: Start of shadowing training, at which point it is most important to enjoy shadowing. Shadowing seems to work well when combined with working out, walking, jogging, etc. [0 words]
Stage ❷: Achieving full repetition accuracy as you become more comfortable with repeating input speech. [10,000 words]
Stage ❸: Increasing articulation rate (words per minute); at this stage shadowing has become almost automatic. [30,000 words]
Stage ❹: The input effect of shadowing starts to improve listening comprehension. By now there is virtually no discrepancy between reading and listening comprehension. If you understand a written passage by reading silently, you will now be able to understand the same passage by listening to it. [100,000 words]
Stage ❺: Internalizing new linguistic items. You can now memorize and acquire unfamiliar words, formulae, and structures almost subconsciously through the practice effect. [300,000 words]
Stage ❻: The ultimate goal of L2 shadowing training, by which point shadowing is simulating grammatical encoding and articulation in speech production: an output effect. [1,000,000 words]
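As a rough illustration of how the bracketed word counts might be used, the following Python sketch looks up the stage a learner has reached from a cumulative count of words shadowed. It rests on an assumption not stated in the text, namely that each bracketed figure is the approximate lower bound for entering that stage; the names STAGES and stage_for are hypothetical, and the stage labels are paraphrased from the list above.

```python
from bisect import bisect_right

# (approximate words shadowed, stage label), in ascending order.
# Assumption: each figure is read as the point at which the stage begins.
STAGES = [
    (0, "Stage 1: starting shadowing training"),
    (10_000, "Stage 2: achieving repetition accuracy"),
    (30_000, "Stage 3: increasing articulation rate"),
    (100_000, "Stage 4: input effect (listening comprehension)"),
    (300_000, "Stage 5: practice effect (internalizing new items)"),
    (1_000_000, "Stage 6: output effect (speech production)"),
]

THRESHOLDS = [threshold for threshold, _ in STAGES]


def stage_for(words_shadowed):
    """Return the label of the highest stage whose threshold has been reached."""
    if words_shadowed < 0:
        raise ValueError("words_shadowed must be non-negative")
    index = bisect_right(THRESHOLDS, words_shadowed) - 1
    return STAGES[index][1]


if __name__ == "__main__":
    for count in (0, 12_000, 250_000, 1_000_000):
        print(f"{count:>9,} words: {stage_for(count)}")
```

Because the word counts in the text are described only as a rough guide, any such lookup should be read as indicative rather than as a precise diagnostic of a learner's progress.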

Conclusions

This book has reviewed the theoretical and empirical evidence available to date that shadowing can enhance L2 acquisition. However, our understanding of the role of shadowing and related tasks is still far from complete. I believe that learning more about shadowing or even simple repetition (a crude method of teaching words to children) should help us improve what we do in the L2 classroom in the future. I do hope that this book has succeeded in providing a fresh insight into the mysteries of L2 acquisition and new methods of L2 learning and teaching.

There are many L2 researchers and teachers who do not appreciate the need for practice to fill the gap between input processing and output production. I suspect one reason for this might be that the major L2 acquisition theories (input, output, interaction, etc.) and approaches such as task-based learning (TBL), content and language integrated learning (CLIL), and focus on form have mostly been advanced by researchers whose L1 is English or a closely related language. All the above approaches are very well grounded theoretically and empirically, and their proponents include many of the internationally best-known L2 researchers and educators. However, the author hopes the insights in this book will help bring a new recognition of how important the seemingly prosaic approach of practice is, not just to teachers and learners in Asia whose L1s are the furthest removed from English and related European languages, but to other nationalities too. “Practice makes perfect!”


References

A, R., and Hayashi, R. (2012). Accuracy of Japanese pitch accent rises during and after shadowing training. In Q. Ma, H. Ding, and D. Hirst (Eds.), Proceedings of the 6th International Conference on Speech Prosody (pp. 214–217). Shanghai: Tongji University.
A, R., Hayashi, R., and Kitamura, T. (2015). Crucial prosodic features in Japanese learners’ pronunciation: Naturalness judgments of synthetic speech. Journal of the Phonetic Society of Japan, 19, 37–42.
A, R., Sakai, N., and Mori, K. (2014). Frequency of prevention (blocking) of stuttering: A comparison between shadowing and repetition. Paper Presented at the 28th General Meeting of the Phonetic Society of Japan. Tokyo: Tokyo University of Agriculture and Technology.
Akita, K. (2003). Psychology of reading and writing. Kyoto: Kitaoji Shobo.
Alladi, S., Bak, T. H., Duggirala, V., Surampudi, B., Shailaja, M., Shukla, A. K., Chaudhuri, J. R., and Kaul, S. (2013). Bilingualism delays age at onset of dementia, independent of education and immigration status. Neurology, 81, 1938–1944.
Arbib, M. A. (2012). How the brain got language: The mirror system hypothesis. New York: Oxford University Press.
Atkinson, R. C., and Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. In K. W. Spence and J. T. Spence (Eds.), The psychology of learning and motivation: Advances in research and theory (Vol. 2, pp. 89–195). New York: Academic Press.
Atkinson, R. C., and Shiffrin, R. M. (1971). The control of short-term memory. Scientific American, 225, 82–90.
Baddeley, A. D. (1986). Working memory. New York: Oxford University Press.
Baddeley, A. D. (2000). The episodic buffer: A new component of working memory? Trends in Cognitive Sciences, 4, 417–423.
Baddeley, A. D. (2002). Is working memory still working? European Psychologist, 7, 85–97.
Baddeley, A. D. (2012). Working memory: Theories, models, and controversies. Annual Review of Psychology, 63, 1–29.
Baddeley, A. D. (2015). Working memory in second language learning. In Z. Wen, M. B. Mota, and A. McNeill (Eds.), Working memory in second language acquisition and processing (pp. 17–28). Bristol: Multilingual Matters.
Baddeley, A. D., and Hitch, G. (1974). Working memory. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 8, pp. 47–90). New York: Academic Press.
Baddeley, A. D., Thomson, N., and Buchanan, M. (1975). Word length and the structure of short-term memory. Journal of Verbal Learning and Verbal Behavior, 14, 575–589.
Bak, T. H., Nissan, J. J., Allerhand, M. M., and Deary, I. J. (2014). Does bilingualism influence cognitive aging? Annals of Neurology, 75, 959–969.

Bear, M., Connors, B. W., and Paradiso, M. A. (2007). Neuroscience: Exploring the brain (3rd ed.). Philadelphia, PA: Lippincott Williams and Wilkins.
Bialystok, E., Craik, F. I. M., Grady, C., Chau, W., Ishii, R., Gunji, A., and Pantev, C. (2005). Effect of bilingualism on cognitive control in the Simon task: Evidence from MEG. NeuroImage, 24, 40–49.
Bialystok, E. (2015). Bilingualism: Consequences for mind and brain. A Keynote Lecture Delivered at the 60th Annual Conference of the International Linguistic Association, New York, Teachers College, Columbia University.
Breen, M. (2014). Empirical investigations of the role of implicit prosody in sentence processing. Language and Linguistics Compass, 8, 37–50.
Breen, M., and Clifton, C. Jr. (2011). Stress matters: Effects of anticipated lexical stress on silent reading. Journal of Memory and Language, 64, 153–170.
Brown, G. D. A., and Hulme, C. (1995). Modeling item length effects in memory span: No rehearsal needed? Journal of Memory and Language, 34, 594–621.
Canale, M., and Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics, 1, 1–47.
Carroll, J. B., and Sapon, S. (1959). Modern language aptitude test: Form A. New York: Psychological Corporation.
Cherry, C. (1966). On human communication. Cambridge, MA: MIT Press.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, N. (1988). Language and problems of knowledge: The Managua lectures. Cambridge, MA: MIT Press.
Craik, F., and Lockhart, R. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 11, 671–684.
Day, R. R., and Bamford, J. (1998). Extensive reading in the second language classroom. Cambridge: Cambridge University Press.
Dean, L., Minematsu, N., Yamauchi, Y., and Hirose, K. (2009). Analysis and comparison of automatic language proficiency assessment between shadowed sentences and read sentences. Proceedings of ISCA International Workshop on Speech and Language Technology in Education (pp. 37–40). Wroxall Abbey Estate: Warwickshire.
DeKeyser, R. M. (2007). Practice in a second language: Perspectives from applied linguistics and cognitive psychology. Cambridge: Cambridge University Press.
Ellis, N. C., and Wulff, S. (2015). Usage-based approaches to SLA. In B. VanPatten and J. Williams (Eds.), Theories in second language acquisition: An introduction. New York: Routledge.
Ellis, R. (1994). The study of second language acquisition. Oxford: Oxford University Press.
Ellis, R., Loewen, S., Elder, C., Erlam, R., Philp, J., and Reinders, H. (2009). Implicit and explicit knowledge in second language learning, testing and teaching. Bristol: Multilingual Matters.
Ericsson, K. A., and Simon, H. A. (1984). Protocol analysis: Verbal reports as data. Cambridge, MA: MIT Press.
Ericsson, K. A., and Simon, H. A. (1993). Protocol analysis: Verbal reports as data (2nd ed.). Cambridge, MA: MIT Press.
Ewoldt, C. (1981). A psycholinguistic description of selected deaf children reading in sign language. Reading Research Quarterly, 17, 58–89.
Flavell, J. H. (1979). Metacognition and cognitive monitoring: A new area of cognitive developmental inquiry. American Psychologist, 34, 906–911.
Fodor, J. D. (2002). Psycholinguistics cannot escape prosody. Proceedings of Speech Prosody 2002 Conference. Aix-en-Provence, France.
Fromkin, V. A. (1971). The non-anomalous nature of anomalous utterances. Language, 47, 27–52.

Funahashi, S. (2000). Brain mechanism of working memory. In N. Osaka (Ed.), Brain and working memory (pp. 21–49). Kyoto: Kyoto University Press.
Furth, H. G. (1966). Thinking without language: Psychological implications of deafness. New York: Free Press.
Furukawa, A. (2007, November 29). Extensive reading special: Win 3 English books by reading 100 books. Daily Yomiuri, 20.
Furuta, A. (2012). Development of remedial teaching material on shadowing for high school ESL students. Paper Presented at the JACET Reading Research Sig. December 2012 Meeting. Osaka: Kwansei Gakuin University.
Futatsuya, K. (1999). Changing and understanding how to teach. Tokyo: Gakugei Tosho.
Gathercole, S. E., and Baddeley, A. D. (1993). Working memory and language. Hove, UK: Lawrence Erlbaum.
Grabe, W. (2009). Reading in a second language: Moving from theory to practice. Cambridge: Cambridge University Press.
Grabe, W. (2017). Where’s real extensive reading in the adult ESL/EFL curriculum? Paper Presented at the 2017 TESOL Annual Convention. Seattle: Sheraton Seattle and Washington State Convention Center.
Greene, J. (1986). Language understanding: A cognitive approach. Milton Keynes: Open University Press.
Griffiths, T. D., and Warren, J. D. (2002). The planum temporale as a computational hub. Trends in Neurosciences, 25, 348–353.
Gvion, A., and Friedmann, N. (2012). Phonological short-term memory in conduction aphasia. Aphasiology, 26, 579–614.
Hamada, Y. (2011). Improvement of listening comprehension skills through shadowing with difficult materials. The Journal of Asia TEFL, 8, 139–162.
Hamada, Y. (2014). The effectiveness of pre- and post-shadowing in improving listening comprehension skills. The Language Teacher, 38, 3–10.
Hamada, Y. (2017). Teaching EFL learners shadowing for listening. London: Routledge.
Hawkins, R. (2005). Synaptic plasticity and learning. 28th annual postgraduate review course, basic and clinical neurosciences held from December 17, 2005 to March 11, 2006, New York, Columbia University.
Hayashi, R. (2014). How does shadowing training change pronunciation by JSL learners? In H. Yokokawa, N. Sadato, and H. Yoshida (Eds.), How does proficiency in foreign language develop?: Investigating automatization in the processing of linguistic information (pp. 157–179). Tokyo: Shohakusha.
Hill, L. A. (1965). Elementary Stories for Reproduction 1. London: Oxford University Press.
Hori, T. (2008). Exploring shadowing as a method of English pronunciation training. A doctoral dissertation submitted to The Graduate School of Language, Communication and Culture, Kwansei Gakuin University.
Hulstijn, J. H. (2001). Intentional and incidental second language vocabulary learning: A reappraisal of elaboration, rehearsal and automaticity. In P. Robinson (Ed.), Cognition and second language instruction (pp. 258–286). Cambridge: Cambridge University Press.
Hulstijn, J. H. (2006). Psycholinguistic perspectives on second language acquisition. In J. Cummins and C. Davison (Eds.), The international handbook on English language teaching (pp. 701–713). Norwell, MA: Springer.
Ibbotson, P., and Tomasello, M. (2016). Language in a new key. Scientific American, 315, 71–75.
Imamura, K. (2011). Does extensive reading increase the recognition speed of words, collocations, formulaic sequences, and antonyms? Language Education and Technology, 48, 185–214.

Inoue, Y., Kabashima, S., Saito, D., Minematsu, N., Kanamura, K., and Yamauchi, Y. (2018). A Study of Objective Measurement of Comprehensibility through Native Speakers' Shadowing of Learners' Utterances. Proc. Interspeech 2018: 1651–1655.
Iwata, S. (1987). Brain and communication. Tokyo: Asakura Shoten.
Jasper, H. H. (1958). The ten-twenty electrode system of the international federation. Electroencephalography and Clinical Neurophysiology, 10, 371–375.
Kadota, S. (1982). Some psycholinguistic experiments on the process of reading comprehension. Journal of Assumption Junior College, 9, 49–70.
Kadota, S. (1984). Subvocalization and processing units in silent reading. Journal of Assumption Junior College, 11, 29–58.
Kadota, S. (1985). Cloze procedure and processing of discourse. Journal of Assumption Junior College, 12, 13–39.
Kadota, S. (1986). The process of speech production: An analysis of pauses. New Currents in English and English Literature (pp. 381–398). Tokyo: New Current International.
Kadota, S. (1987). The role of prosody in silent reading. Language Sciences, 9, 185–206.
Kadota, S. (1997). An assessment of concurrent articulation in processing visually and aurally presented sentences. Journal of Language Processing, 1, 32–44.
Kadota, S. (2002). How phonology works in L2 reading comprehension. Tokyo: Kurosio Shuppan.
Kadota, S. (2006). Reading and phonological processes in English as a second language. Tokyo: Kurosio Shuppan.
Kadota, S. (2007). The science of shadowing and oral reading. Tokyo: CosmoPier.
Kadota, S. (2009). Shadowing and oral reading as an interface between input and output tasks. Proceedings of the 35th Tottori Conference of the Japan Society of English Language Education (pp. 69–71). The Japan Society of English Language Education.
Kadota, S. (2010a). Lexical processing and its modularity in L2. In H. Kimura, T. Kimura, and O. Shiki (Eds.), Theory and practice in reading and writing: Nurturing independent learning (pp. 74–89). Tokyo: Taishukan-Shoten.
Kadota, S. (2010b). An introduction to SLA research: How to conduct research in L2 processing and acquisition. Tokyo: Kurosio Shuppan.
Kadota, S. (2012). The science of shadowing, oral reading, and English language acquisition. Tokyo: CosmoPier.
Kadota, S. (2014). Twelve tips for successful English-as-second-language learners. Tokyo: CosmoPier.
Kadota, S. (2015). The science of shadowing, oral reading, and communication in English as L2. Tokyo: CosmoPier.
Kadota, S. (2018). Mechanism of becoming a fluent L2 speaker: How shadowing enhances language acquisition. Tokyo: SB Creative.
Kadota, S., Hase, N., Nakanishi, H., Shiki, O., Nakano, Y., Noro, T., and Kawasaki, M. (2016). Shadowing as a practice in second language acquisition: Psycholinguistic and neurolinguistic viewpoints. Colloquia held at PacSLRF 2016, Tokyo: Chuo University.
Kadota, S., and Ishikawa, K. (2005). Do Japanese EFL learners activate phonology in reading English words and Japanese kanji? JACET Bulletin, 40, 55–75.
Kadota, S., Kawasaki, M., Nakanishi, H., Shiki, O., Hase, N., Nakano, Y., Noro, T., and Kazai, K. (2015a). The effect of shadowing on the subvocal rehearsal in L2 reading: An experiment using NIRS for Japanese EFL learners. Paper Presented at the FLEAT 6 Conference. Cambridge, MA: Harvard University.
Kadota, S., Kawasaki, M., Shiki, O., Hase, N., Nakano, Y., Noro, T., Nakanishi, H., and Kazai, K. (2015b). The effect of shadowing on the subvocal rehearsal in L2 reading: A behavioral experiment for Japanese EFL learners. Language Education and Technology, 52, 163–177.

Kadota, S., Shibahara, T., Takase, A., and Yoneyama, A. (2012). Shadowing for speaking English: Connecting listening to speaking. Tokyo: CosmoPier.
Kadota, S., and Tamai, K. (2004). The shadowing. Tokyo: CosmoPier.
Kadota, S., and Tamai, K. (2017). The shadowing (2nd ed.). Tokyo: CosmoPier.
Kawashima, R. (2003). Brain imaging of higher order functions. Tokyo: Igaku-Shoin.
Kohno, M. (1992). Reconstruction of English class (2nd ed.). Tokyo: Tokyo-Shoseki.
Kohno, M., Ikari, Y., Ishikawa, K., Kadota, S., Murata, J., and Yamane, S. (2007). Explorations into the mechanism of language and cognition. Tokyo: Sanseido.
Koike, I., Kohno, M., Tanaka, H., Mizutani, O., Ide, S., Suzuki, H., and Tanabe, Y. (2003). Kenkyusha dictionary of applied linguistics. Tokyo: Kenkyusha.
Kojima, T. (2006). A physio-psychological examination of speech repetition. Higher Brain Function Research, 26, 156–168.
Kormos, J. (2006). Speech production and second language acquisition. Mahwah, NJ: Lawrence Erlbaum Associates.
Krashen, S. (1982). Principles and practice in second language acquisition. Oxford, UK: Pergamon.
Krashen, S. (1985). The input hypothesis: Issues and implications. New York: Longman.
Kroll, J. F. (2015). Bilingualism transforms language, cognition, and the brain. A Lecture Delivered at the Two-Day Workshop on Bilingualism and Executive Function, New York, City University of New York Graduate Center.
Lee, N., Mikesell, L., Joaquin, A. D. L., Mates, A. W., and Schumann, J. H. (2009). The interactional instinct: The evolution and acquisition of language. Oxford: Oxford University Press.
Lesch, M. F., and Pollatsek, A. (1993). Automatic access of semantic information by phonological codes in visual word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 285–294.
Lesch, M. F., and Pollatsek, A. (1998). Evidence for the use of assembled phonology in accessing the meaning of printed words. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 573–592.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Levelt, W. J. M. (1993). The architecture of normal spoken language use. In G. Blanken, E. Dittman, H. Grimm, J. Marshall, and C. Wallesch (Eds.), Linguistic disorders and pathologies: An international handbook (pp. 1–15). Berlin: de Gruyter.
Levy, B. A. (1977). Reading: Speech and meaning processes. Journal of Verbal Learning and Verbal Behavior, 16, 623–638.
Liberman, A. M., Cooper, F. S., Harris, K. S., and MacNeilage, P. F. (1963). A motor theory of speech perception. Proceedings of the Symposium on Speech Communication Seminar (Vol. 2, Paper D3). Stockholm: Royal Institute of Technology.
Liberman, A. M., Cooper, F. S., Shankweiler, D. P., and Studdert-Kennedy, M. (1967). Perception of the speech code. Psychological Review, 74, 431–461.
Linton, M. (1990). Transformations of memory in everyday life. In U. Neisser (Ed.), Memory observed: Remembering in natural contexts (2nd ed., pp. 107–118). San Francisco: Freeman and Company.
Locke, J. (1978). Phonemic effects in the silent reading of hearing and deaf children. Cognition, 6, 175–187.
Logie, R. H. (1995). Visuo-spatial working memory. Hillsdale, NJ: Lawrence Erlbaum Associates.
Long, M. (1996). The role of the linguistic environment in second language acquisition. In W. C. Ritchie and T. K. Bhatia (Eds.), Handbook of research on language acquisition (Vol. 2, pp. 413–468). New York: Academic Press.

Luria, A. R. (1973). Essence of neuropsychology: Brain functions. (Translated into Japanese by H. Kashima, 1999.) Tokyo: Sozo Publishing.
MacIntyre, T. (2009). Reading explorer 2. Boston: Heinle Cengage Learning.
Marcus, G., Vijayan, S., Bandi Rao, S., and Vishton, P. M. (1999). Rule learning by seven-month-old infants. Science, 283, 77–80.
Marslen-Wilson, W. D. (1985). Speech shadowing and speech comprehension. Speech Communication, 4, 55–73.
Matthei, E., and Roeper, T. (1983). Understanding and producing speech. London: Fontana.
Matsui, T. (2011). Bottom-up shadowing practice at Junior High School. Paper Presented at the Spring Research Meeting of LET Kansai Chapter. Kyoto: Doshisha Women’s College.
McDonough, K., and Trofimovich, P. (2009). Using priming methods in second language research. London: Routledge.
McGuigan, F. J. (1970). Covert oral behavior during the silent performance of language tasks. Psychological Bulletin, 74, 309–326.
Meltzoff, A. N., and Decety, J. (2003). What imitation tells us about social cognition: A rapprochement between developmental psychology and cognitive neuroscience. Philosophical Transactions, 358, 491–500.
Miyake, S. (2009a). A study on the change of reproduction rate and of utterance speed after the repetition training by Japanese learners of English. Journal of Japan Society of Speech Sciences, 10, 51–69.
Miyake, S. (2009b). Cognitive processes in phrase shadowing: Focusing on articulation rate and shadowing latency. JACET Journal, 48, 15–28.
Miyake, S. (2009c). A study on the speech repetition by Japanese learners of English. Paper Presented at the Joint Meeting of JACET Reading Research Sig. and Graduate School of Language Communication Culture. Osaka: Kwansei Gakuin University.
Mochizuki, H. (2006). Application of shadowing to TEFL in Japan: The case of junior high school students. Studies in English Language Teaching, 29, 29–44.
Mochizuki, H. (2011). Effective shadowing training in English Classes in Japan and resolution of intracerebral language processing mechanism while shadowing: With simple LL system and NIRS. A 2011 Symposium on Biological and Advanced Information Processing, Nagaoka, Nagaoka University of Technology.
Mochizuki, H. (2014a). Mechanism elucidation for cerebral English language processing while shadowing, repeating and listening: An experimental brain research using NIRS. Paper Presented at the 14th Annual Conference of Japan Second Language Association (J-SLA). Nishinomiya: Kwansei Gakuin University.
Mochizuki, H. (2014b). Resolution of intracerebral language processing mechanism during shadowing training with NIRS. Paper Presented at the AILA World Congress 2014. Brisbane: Brisbane Convention Center.
Moon, R. (1997). Vocabulary connections: Multi-word items in English. In N. Schmitt and M. McCarthy (Eds.), Vocabulary: Description, acquisition and pedagogy (pp. 40–63). Cambridge: Cambridge University Press.
Mori, Y. (2011). Shadowing with oral reading: Effects of combined training on the improvement of Japanese EFL learners’ prosody. Language Education and Technology, 48, 1–22.
Mori, Y., Kadota, S., Shiki, O., and Yoshida, S. (2010). Various factors concerning reproduction rate in shadowing and repeating: Focusing phonological loop system. Paper Presented at The 36th Osaka Conference of the Japan Society of English Language Education. Suita: Kansai University.

Mori, M. (2013). Predictors of comprehension in oral reading and its attentional control in L2: An empirical study of Japanese EFL learners. A Master of Arts Thesis Presented to The Graduate School of Language, Communication, and Culture, Kwansei Gakuin University.
Morishita, M., Satoi, H., and Yokokawa, H. (2010). Verbal lexical representation of Japanese EFL learners: Syntactic priming during language production. Journal of the Japan Society for Speech Sciences, 11, 29–43.
Mukoyama, Y. (2010). Direction of language aptitude research: The recent trends of SLA research in connection with teaching practice. Paper Reported at the JASLA symposium. Chiba: Reitaku University.
Mukoyama, Y. (2013). The role of language aptitude in second language acquisition. Tokyo: CoCo Shuppan.
Muranoi, H. (2006). SLA research and second language learning and teaching. Tokyo: Taishukan-Shoten.
Nakanishi, T., and Ueda, A. (2011). Extensive reading and the effect of shadowing. Reading in a Foreign Language, 23, 1–16.
Nation, I. S. P. (2001). Learning vocabulary in another language. Cambridge: Cambridge University Press.
Neath, I., and Surprenant, A. M. (2003). Human memory (2nd ed.). Belmont: Wadsworth/Thomson Learning.
Nishizawa, H. (2006, April). Basics of extensive reading: 0 to a million words. The Magazine for Extensive Listening (EL) and Extensive Reading (ER). Tokyo: Cosmopier.
Obler, L. K., and Gjerlow, K. (1999). Language and the brain. Cambridge: Cambridge University Press.
Ohta, N. (2016). Cognitive psychology and education. In N. Ohta and Y. Sakuma (Eds.), The crosspoint between English education and cognitive psychology. Kyoto: Kitaoji Shobo.
Okamoto, M., Dan, H., Sakamoto, K., Takeo, K., Shimizu, K., Kohno, S., Oda, I., Isobe, S., Suzuki, T., Kohyama, K., and Dan, I. (2004). Three-dimensional probabilistic anatomical cranio-cerebral correlation via the international 10–20 system oriented for transcranial functional brain mapping. NeuroImage, 21, 99–111.
Okayama, Y. (2015). Introduction to ER-like shadowing. Tokyo: Cosmopier.
Okura, S. (2012). The effects of repetition on the performance of shadowing and repeating tasks for Japanese EFL learners. A Master of Arts thesis presented to The Graduate School of Language, Communication, and Culture, Kwansei Gakuin University.
Onimaru, H. (2014). The effect of combined practice of extensive reading and shadowing. The Magazine for Extensive Listening (EL) and Extensive Reading (ER): Special Issue. Tokyo: Cosmopier.
Osaka, M. (2002). Working memory as a memo pad in the brain. Tokyo: Shinyosha.
Otsuki, M. (2012). Diagnosis of conduction aphasia. Education and Training committee of Japan Society of Higher Brain Dysfunction (ed.), Conduction Aphasia: Repetition, STM Disorder, Phonological Paraphasia (pp. 3–24). Tokyo: Shinko Igaku Shuppan.
Perfetti, C. A. (1999). The cognitive science of word reading: What has been learned from comparisons across writing systems? A Lecture Delivered at the 2nd International Conference on Cognitive Science and the 16th Annual Meeting of the Japanese Cognitive Science Society Joint Conference, Tokyo, Waseda University.
Pimsleur, P. (1971). Some aspects of listening comprehension. In G. Taggart and L. J. Chatagnier (Eds.), Laboratories de langues: Orientations nouvelles (pp. 106–114). Montreal: Bordas-Aquila.

Pinker, S. (1994). The language instinct: How the mind creates language. New York: HarperCollins.
Pintrich, P. R. (2002). The role of metacognitive knowledge in learning, teaching, and assessing. Theory into Practice, 41, 219–225.
Rayner, K., Pollatsek, A., Ashby, J., and Clifton, C. Jr. (2011). Psychology of reading (2nd ed.). New York: Psychology Press.
Rizzolatti, G., and Sinigaglia, C. (2006). Mirrors in the brain: How our minds share actions and emotions. Oxford: Oxford University Press.
Sakoda, K. (2006). A study of shadowing to transform “knowing” into “using”: A comparison with oral reading and transcribing. Proceedings of the 2006 International Conference on Japanese-Language Teaching (pp. 99–100). New York: Columbia University.
Sakoda, K. (2010). A practical study on shadowing for Japanese Learners: Toward developing JSL performance based on L2 acquisition research. Acquisition of Japanese as a Second Language, 13, 5–21.
Sakoda, K., and Matsumi, N. (2004). A basic research on shadowing in Japanese-as-L2 teaching: Classroom activities transforming “knowing” into “using.” Proceedings of the 2004 National Conference on Japanese-Language Teaching (pp. 223–224). Niigata: Niigata University.
Sakoda, K., and Matsumi, N. (2005). A basic research on shadowing in Japanese-as-L2 teaching 2: What we can learn from the comparison with oral reading. Proceedings of the 2005 National Conference on Japanese-Language Teaching (pp. 241–242). Kanazawa: Kanazawa University.
Sakoda, K., and Furumoto, H. (2008). Toward enhancing output in L2 acquisition research: Is an appropriate teaching material for shadowing i+1 or i-1? Proceedings of the 2008 International Conference on Japanese-Language Teaching (pp. 393–396). Busan: Busan University of Foreign Studies.
Samuels, S. J. (2006). Toward a model of reading fluency. In S. J. Samuels and A. E. Farstrup (Eds.), What research has to say about fluency instruction. Newark, DE: International Reading Association.
Sannomiya, M. (2008). Metacognition: Higher-order cognitive function supporting the learning capacity. Kyoto: Kitaoji Shobo.
Sannomiya, M. (2017). Psychology of misunderstanding: Metacognition of human communication. Kyoto: Nakanishiya Shuppan.
Schacter, D. L., Cooper, L. A., and Delaney, S. M. (1990). Implicit memory for visual objects and the structural description system. Bulletin of the Psychonomic Society, 28, 367–372.
Segalowitz, N. (2010). Cognitive bases of second language fluency. New York: Routledge.
Shiki, O., Mori, Y., Kadota, S., and Yoshida, S. (2010). Exploring differences between shadowing and repeating practices: An analysis of reproduction rate and types of reproduced words. Annual Review of English Language Education in Japan (ARELE), 21, 81–90.
Schumann, J. H., Crowell, S. E., Jones, N. E., Lee, N., Schuchert, S. A., and Wood, L. A. (2004). The neurobiology of learning: Perspectives from second language acquisition. Mahwah, NJ: Lawrence Erlbaum Associates.
Schumann, J. H. (2010). Applied linguistics and the neurobiology of language. In R. B. Kaplan (Ed.), The Oxford handbook of applied linguistics (2nd ed., pp. 244–259). Oxford: Oxford University Press.
Skehan, P. (1998). A cognitive approach to language learning. Oxford: Oxford University Press.
Slowiaczek, M. L., and Clifton, C. Jr. (1980). Subvocalization and reading for meaning. Journal of Verbal Learning and Verbal Behavior, 19, 573–582.

Soma, Y. (1997). Cerebral foundation of phonological (articulatory) loop. Higher Brain Function Research, 17, 149–154.
Sugito, M. (1985). Osaka-Tokyo accent dictionary (CD version). Tokyo: Maruzen.
Suzuki, J. (2005). Integrating theory and practice in English-as-FL learning: To develop the students’ proficiency with insufficient basic capacity. Lecture Delivered at The 50th Kansai English and English Literature Society, Nishinomiya, Kwansei Gakuin University.
Suzuki, J., and Kadota, S. (2012). A handbook of English oral reading instruction: From phonics to shadowing. Tokyo: Taishukan Shoten.
Suzuki, Y. (2016). The practice of Shadoku (shadowing + extensive reading) and its effectiveness. Paper Presented at The 16th National Meeting of Japan Extensive Reading Association (JERA). Tokuyama: National Institute of Technology, Tokuyama College.
Swain, M. (1995). Three functions of output in second language learning. In G. Cook and B. Seidlhofer (Eds.), Principle and practice in applied linguistics (pp. 125–144). Cambridge: Cambridge University Press.
Swain, M. (2005). The output hypothesis: Theory and research. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 471–484). London: Routledge.
Taft, M. (1991). Reading and the mental lexicon. Hillsdale, NJ: Lawrence Erlbaum Associates.
Takeuchi, T. (2008). EFL listeners’ perceived use of overt and covert rehearsal: Event-related fMRI. Paper Presented at the AILA World Congress 2008. Essen: Messe Essen.
Takeuchi, T. (2010). On the effects of proficiency in overt and covert rehearsal. Paper Presented at The 36th Osaka Conference of the Japan Society of English Language Education. Suita: Kansai University.
Tamai, K. (2005). A study on the effectiveness of shadowing as a method of teaching listening. Tokyo: Kazama Shobo.
Tamai, K. (2008). The shadowing: Super-introduction. Tokyo: CosmoPier.
Taylor, W. L. (1953). “Cloze procedure”: A new tool for measuring readability. Journalism Quarterly, 30, 415–433.
Tomasello, M. (2003). Constructing a language: A usage-based theory of language acquisition. Cambridge, MA: Harvard University Press.
Tulving, E. (1972). Episodic and semantic memory. In E. Tulving and W. Donaldson (Eds.), Organization of memory (pp. 381–403). New York: Academic Press.
van Orden, G. C. (1987). A rows is a rose: Spelling, sound, and reading. Memory and Cognition, 15, 181–198.
Ward, J. (2010). A student’s guide to cognitive neuroscience (2nd ed.). New York: Psychology Press.
Watanabe, H., and Yokokawa, H. (2015). Use of lexical stress information in silent reading and speech production by Japanese learners of English: Evidence from eye movement measurements and naming tasks. JACET Journal, 59, 111–129.
Wen, Z. (2016). Working memory and second language learning. Bristol: Multilingual Matters.
Wen, Z., Mota, M. B., and McNeill, A. (Eds.) (2015). Working memory in second language acquisition and processing. Bristol: Multilingual Matters.
Wilson, S. M., Saygın, A. P., Sereno, M. I., and Iacoboni, M. (2004). Listening to speech activates motor areas involved in speech production. Nature Neuroscience, 7, 701–702.
Witt, S. M., and Young, S. J. (2000). Phone-level pronunciation scoring and assessment for interactive language learning. Speech Communication, 30, 95–108.
Wray, A. (2002). Formulaic language and the lexicon. Cambridge, UK: Cambridge University Press.

Yonezaki, M., and Ito, H. (2012). A study on the effectiveness of oral reading activities to improve speaking ability. JACET Journal, 55, 93–110.
Yoshida, H. (2009). Initial stage of novice word learning by vocal imitation and repetition: fMRI study. A doctoral dissertation submitted to Kobe University.
Yoshida, S. (1997). Strategies in answering cloze items: An analysis of learners’ think-aloud protocols. JACET Bulletin, 28, 207–222.

Index

Note: Page numbers in italics indicate figures and bold indicate tables.
aphasia 26, 26; Broca’s aphasia 63–64, 64; conduction aphasia 63–65, 64, 64, 119–21, 121; Wernicke’s aphasia 13, 26, 63–64, 64
Baddeley, A.D.: working memory model of 52–53, 53, 138
bilingualism: dementia onset age and 150–53, 151, 153; executive functioning, metacognitive monitoring and L2 acquisition and 152–54, 153; shadowing in simplified model of bilingual processing 180–81, 181
brain imaging techniques 31–32, 32, 32. See also NIRS (near-infrared spectroscopy)
Broca’s aphasia 63–64, 64
Broca’s area 35–38, 36
classical conditioning 88–89, 88–89
communication: multi-processing of tasks in daily 1, 3, 3–4, 99–100, 100
communicative competence 159–62, 160–62
conduction aphasia 63–65, 64, 64, 119–21, 121
decontextualization and 90; experiment on homophone word pairs in Japanese EFL learners and 69, 71; on oral reading 8–9; phonological loop model and 55; study of brain activity of Japanese EFL learners using shadowing and 31, 34–37, 145
DeKeyser, R.M. 157, 167

dementia: bilingualism and age of onset of 150–53, 151, 153; defined 149; types of 150
echoic memory 14, 14
English as a Second Language (ESL) 4; empirical studies on shadowing effectiveness for listening comprehension and 38–50, 39–43, 44, 45, 46, 48, 49; usage-based theories in acquisition of English 170–72, 171–74, 174. See also L2 acquisition
executive functioning 138–41, 139–42; bilingualism and 152–54, 153
explicit learning 90–94, 91–92, 94
explicit memory formation 90–94, 91–92, 94
extensive reading (ER): acquisition of formulaic sequences by Japanese learners of English and 177–80, 178–79, 179; L2 acquisition and 168–70, 169; shadowing as output driven practice and 175–77, 176, 177. See also practice, L2 acquisition and
Hayashi, R. 106–10, 113, 119
human information processing system 51, 52; human memory system 51–61, 52–60
Imamura, K. 177–78
implicit learning 90–94, 91–92, 94; L2 shadowing and Japanese EFL learners and 94–98, 95–97, 98

implicit memory formation 90–94, 91–92, 94
input theory 21; input-driven practice for L2 acquisition and 167–68, 168; L2 acquisition and 155–58, 158; of L2 shadowing 9, 10
interactional theory 159
Kormos, J. 104, 106
Krashen, S. 21, 62, 156, 167
L1 acquisition: major stages of production of 104–106, 105; usage-based theories in 170–72, 171–74, 174
L2 acquisition: bilingualism and 152–54, 153; executive functioning and working memory in 138–41, 139–42; explicit and implicit learning and 90–94, 91–92, 94; extensive reading (ER) and 168–70, 169; hypothetical route map to through shadowing 184, 185, 186; input theory and 155–58, 158; interactional theory and 159; metacognitive monitoring and 132, 138–41, 138–42; output theory and 158, 158–59; research questions for 155–56, 156; think-aloud protocols and 142–45, 144, 145; usage-based theories in 172, 174–75. See also English as a Second Language (ESL); practice, L2 acquisition and
L2 listening comprehension 13, 14; automaticity of speech perception in 21; empirical studies on shadowing effectiveness for listening comprehension and 38–50, 39–43, 44, 45, 46, 48, 49; filter device in 13–14, 14; perception and comprehension in 15–20, 16, 17, 19; perception and comprehension switching in listening 20–21, 21; shadowing and 9; short-term memory stage (analysis by synthesis) 14, 14; short-term memory stage (prediction-testing) 14, 14–15
L2 phonetic encoding and articulation: shadowing’s effect on 106–19, 108, 108, 110–111, 112, 113–118, 117
L2 reading comprehension, metacognitive knowledge and 135
L2 sentence production through shadowing practice 119–22, 121, 121

L2 shadowing 1, 182; automatic responses in L2 and 4; bilingualism and 180–81, 181; brain activation in Broca’s area and 35–38, 36; effect of on lexico-grammatical encoding 122–24, 123; effect on L2 phonetic encoding and articulation 106–19, 108, 108, 110–111, 112, 113–118, 117; effectiveness of in promoting vocal and subvocal rehearsal in 67–85, 69–71, 73–75, 77–78, 80–84; empirical studies on shadowing effectiveness for listening comprehension and 38–50, 39–43, 44, 45, 46, 48, 49; example of 4; implicit learning in Japanese EFL learners and 94–98, 95–97, 98; input effect of 9, 10, 182; L2 sentence production through shadowing practice 119–22, 121, 121; practice effect of shadowing 9, 10; sound waves and results of a loudness and pitch analysis 4, 5; study of brain activation during 31–37, 32, 32, 33, 34, 35, 36, 36; theoretical background of effect of on speech automaticity 21–22. See also shadowing
L2 speech production 99; lexico-grammatical encoding and 122–24, 123; oral reading and empirical studies in ESL and JSL 124–29, 125–29; outline of 101; speech encoding errors and 101–104, 103
language acquisition center, phonological loop as 61, 61–63, 63
language learning aptitude: phonological working memory capacity as 65–67
Levelt, W.J.M. 104; model of speech comprehension and production of 3, 3
lexico-grammatical encoding 122–24, 123
listening comprehension disorders 13
long-term memory 85–88, 86; classical and operant conditioning and 88–89, 88–89; two-way channel model from sensory register to 89–90, 90. See also memory
McGurk effect 22, 23
memory: executive functioning and working memory in L2 acquisition 138–41, 139–42; explicit memory formation 90–94, 91–92, 94; human memory system 51–61, 52–60; implicit memory formation 90–94, 91–92, 94; phonological working memory capacity as language learning aptitude 65–67; working memory model of Baddeley 52–53, 53, 138. See also long-term memory; phonological loop (phonological working memory)
metacognition: bilingualism and 152–54, 153; frontal association cortex and 137–38, 137–38; metacognitive knowledge 132–35, 134; metacognitive monitoring and L2 acquisition and 132, 138–41, 138–42; metacognitive monitoring and metacognitive control 135–36, 135–37; metacognitive monitoring by shadow training 145–46; oral reading and 146–49, 147, 148; think-aloud protocols in L2 acquisition 142–45, 144, 145
mirroring system of speech perception 27–30, 28, 30
Mori, M. 146
Mori, Y. 85, 115, 118–19
motor theory of speech perception 22–25, 24, 25, 25
NIRS (near-infrared spectroscopy) 31–37, 32, 32–36, 36, 145. See also brain imaging techniques
Ohta, N. 90
operant conditioning 88–89, 88–89
oral reading 7, 7–8, 124–29, 125–29, 181–82; effect of metacognitive control and 146–49, 147, 148
output theory 158, 158–59; output driven practice for L2 acquisition and 167–68, 168; shadowing as output driven practice 175–77, 176, 177
parallel reading 8, 8
phonological loop (phonological working memory) 54–60, 54–61, 57; as language acquisition center 61, 61–63, 63; as language learning aptitude 65–67; long-term memory and 59–61, 60. See also memory

Praat (free software for acoustic phoneticians) 4
practice, L2 acquisition and 162, 162–67, 164–65, 166, 179, 180, 182–83, 186; for automatized explicit knowledge (DeKeyser) 167; input- and output-driven practice 167–68, 168; practice effect of shadowing 9, 10. See also extensive reading (ER); L2 acquisition
pronunciation assessment 129–31, 130, 131
psycholinguistic competence 159–62, 160–62
quasi-priming and repetition 169–70
repeating 6, 6–7
shadoku 177
shadowing: automatic responses in L2 and 4; defined 1, 4; image of technique of 2; L2 listening comprehension and 9; metacognitive monitoring by shadow training 145–46; typical shadowing experiment in auditory phonetics 2. See also L2 shadowing
Simon task 140, 140
sound waves, results of a loudness and pitch analysis and 4, 5
speech automaticity: theoretical background of effect of shadowing on 21–22
speech encoding errors 101–104, 103
speech perception: dorsal and ventral streams involved in 26, 26–27, 27; McGurk effect and 22, 23; mirroring system of 27–30, 28, 30; motor theory of 22–25, 24, 25, 25
think-aloud protocols in L2 acquisition 142–45, 144, 145
usage-based theories: in L1 acquisition 170–72, 171–74, 174; in L2 acquisition 172, 174–75
Wen, Z. 141
Wernicke’s aphasia 13, 26, 26, 63–64, 64